trustyape

Reverse Engineering with GDB

GNU Debugger for beginners




What is GDB and why do we need it?

GDB, also known as the GNU Debugger, is free software written by Richard Stallman in 1986 and released under the GNU GPL (General Public License). Its main task is to control program execution, giving you full control over its flow. It lets you set breakpoints on specific lines of code, so you can stop execution and inspect the program's state: you can examine variables, registers, and memory. You can single-step through code, print a backtrace, and jump to different addresses. Inspecting the stack is also possible. Overall, GDB is a powerful tool for troubleshooting and fixing software bugs, which makes it an essential part of software development, debugging, reverse engineering, and bypassing security measures.


The list of supported languages is extensive and keeps growing from year to year. As of now, GDB can successfully debug: Ada, Assembly, C, C++, D, Fortran, Go, Objective-C, OpenCL, Modula-2, Pascal, and Rust.


Program to play with.

Before we start, let's write a simple C program that will allow us to showcase some of the most useful commands in GDB.

In short, the first if statement just checks whether you are running the program with one argument; if not, it simply prints how you should use it. banner() is a call to a function outside of our main() function; you don't have to worry about it, it's just there to make things look good. Then we declare a character pointer variable named "nickname", which holds the argument we passed when starting the program (argv[1] is our argument, argv[0] is the name of our program). Then we call another function.

And that's our processArgument function, which simply checks whether we entered the correct nickname to retrieve the flag.
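Since the source listing is not reproduced here, the following is a minimal sketch of the structure just described, not the actual martian source. The banner text, the nickname "example_nick", the return values, and the name martian_main (standing in for main() so the snippet can sit next to other code) are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real banner(); it only prints decoration. */
static void banner(void)
{
    puts("*** Martian Defence ***");
}

/* Sketch of processArgument: compares the nickname with a hardcoded string.
 * "example_nick" is a placeholder, not a string from the real binary. */
static int processArgument(const char *nickname)
{
    if (strcmp(nickname, "example_nick") == 0) {
        puts("Access granted.");
        return 1;
    }
    puts("Wrong nickname.");
    return 0;
}

/* Stand-in for main(): argv[0] is the program name, argv[1] the nickname. */
int martian_main(int argc, char *argv[])
{
    if (argc != 2) {
        printf("Usage: %s <nickname>\n", argv[0]);
        return 1;
    }
    banner();
    char *nickname = argv[1];
    return processArgument(nickname) ? 0 : 2;
}
```

Renaming martian_main to main turns the sketch into a standalone program that can be compiled with gcc and loaded into gdb like the real binary.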


Program binary and source code (don't cheat, don't look inside until you have learned how to examine a binary without having the source) are available at https://github.com/trustyape/Martian-Defence in the Reverse_Engineering_with_GDB folder:

martian MD5 hash: ee256b1c5853eb804d480f70c48dd739

martian.exe MD5 hash: 32ca3321396ae131dd4d4dd9eb84b5df


So now we have our little program that we can test, debug, and maybe exploit a little. First, let's compile it so we have a binary file to run our tests on. If you are compiling your own program with the intention of debugging it, use one of the flags below.

root@md> gcc -g

Produces debugging information in the operating system's native format and includes it in the executable binary.

root@md> gcc -ggdb

Produces debugging information specifically intended for gdb, using the most expressive format available, so it can be more extensive than what “-g” generates.

root@md> gcc -ggdb3

Produces extra debugging information such as macro definitions. The trailing digit is the debug level appended to “-ggdb”: level 2 is the default, and level 3 adds the extra information.


But for our purpose, we are not going to use any of these flags while compiling. Let's treat our program as a binary we got off the internet. Let's see what we can find and modify without the help of debugging information.


To start debugging type:

root@md> gdb program_name 

or:

root@md> gdb --args program_name argument1 argument2 ...

The first command starts GDB with our binary loaded, without any arguments. We can add them later. The second one uses the --args option to load the program together with specific command-line arguments, which are passed to it when you type {run}. Without --args, GDB would treat the extra words as a core file or process ID instead of program arguments.

Everything you find between curly brackets "{ }" can be typed into the gdb command line; those are valid commands for the debugger.


The three most important commands of GDB are quit, run, and kill. {quit} or {q} simply terminates the debugger, {run} or {r} starts our program inside the debugger, and {kill} or {k} terminates the program currently being debugged while leaving the debugger itself running.


Leaving all of that aside, let's run our program: type {run} and press return, or type run followed by your arguments and then hit enter {run argument1}:

We can see that the program wants an argument, so let's give it our nickname.

It still seems like nothing happened, besides this oddly familiar banner? Or did the program work just like it would without debugging? Let me fix that. Let's try disassembling the main function of our program. Type {disassemble *main}:

Better? Looks like gibberish? That's AT&T syntax, which might be a little overwhelming for a beginner, so let's change it to something more readable. With the set command you can change quite a lot of things in gdb; for a start, switch to Intel syntax with {set disassembly-flavor intel} and then disassemble our program again:

What we can see here is our main function in assembly language. How do I know that? Look at the “printf”, “main”, "banner", and “processArgument” names: we saw them in our C code. Then “cmp” and “je” are our if statement: cmp compares two operands holding our values, and je jumps to a different part of the code depending on the result of that comparison. Let's examine another simple piece of code below:

Then translate it to assembly:

As you can see now, these two different-looking code snippets do exactly the same thing: one is written in the C programming language, the other in x86 assembly. But what can we use it for? Oh, glad you asked. So far we just disassembled our program to see what's inside, without inspecting any registers or the stack. To do that, we need to go through our code step by step. First of all, let's tell our debugger where we want to start (or more specifically, stop).
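To make the link between an if statement and cmp plus a conditional jump concrete, here is a small sketch. The function name is made up, and the assembly in the comments is only the typical shape an unoptimized x86-64 build produces; the exact registers and jump instruction depend on the compiler and flags.

```c
#include <stdio.h>

/* Hypothetical example function, not taken from the tutorial binary. */
int same_value(int a, int b)
{
    /* Typically becomes: cmp <a>, <b>   then   jne <past the if body> */
    if (a == b) {
        puts("equal");      /* reached only when the jump is not taken */
        return 1;
    }
    puts("not equal");      /* target of the conditional jump */
    return 0;
}
```

Compiling a file with this function and running {disassemble same_value} in gdb is a quick way to check what your own compiler emits.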


Breakpoints.

Breakpoints are like little stops for our debugger, so it knows where we want to pause and look deeper into the internals of our code. We can set a simple breakpoint with {break *main}, or the shorter version {b *main}; this will stop our program just before the first instruction of the main function.

We can break on a line number if we want to inspect a specific line of our code (be careful, as the program needs to be compiled with the -g or -ggdb flag so that gdb can load line information). Other forms are break on address {break *0xaddress}, break on function plus offset {break *my_function+54}, where 54 is the offset in bytes, and conditional break {break location if condition}, where you specify a location and a condition that needs to be true to actually trigger the breakpoint. Run it now:

We can see that execution stopped at the main function; we can see our breakpoint number 1 and the address in main() where we stopped. Not sure it is your breakpoint? To check which breakpoints we set before, type {info breakpoints}. This lists all breakpoints existing in this debugging session and also tells us how many times each one was hit.

You can delete all of them with {del breakpoints} or {delete breakpoints}, and a single one with {del breakpoint 1} or {delete breakpoint 1}, where 1 is its id (Num 1). For learning purposes I've added another two breakpoints and deleted the one with Num 2. Then I deleted all of them to return to where we were.


Registers - print and examine.

Now, we still haven't seen anything, so let's rerun our program with an argument (you can input whatever you like, I used my nick), then examine the registers with {info reg} or {info registers}:

That's more like it! Now we can see what is actually in our registers. From the top we can see our general-purpose registers like rax, rbx, etc. (rax – 64 bit, eax – lower 32 bits, ax – lower 16 bits, ah/al – upper/lower 8 bits of ax), and we can see our stack pointer “rsp” and instruction pointer “rip”. Pay attention to rip, as it always points to the next instruction to be executed.
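The relationship between rax and its smaller views can be sketched with plain bit masks. This is only an illustration of which bits each name covers; the struct and function names are hypothetical, and real code does not access registers this way.

```c
#include <stdint.h>

/* Sub-register views of a 64-bit value, mirroring rax/eax/ax/ah/al. */
struct reg_views {
    uint64_t rax;   /* all 64 bits */
    uint32_t eax;   /* lower 32 bits */
    uint16_t ax;    /* lower 16 bits */
    uint8_t ah;     /* bits 8..15, the upper half of ax */
    uint8_t al;     /* bits 0..7, the lower half of ax */
};

struct reg_views split_register(uint64_t value)
{
    struct reg_views v;
    v.rax = value;
    v.eax = (uint32_t)(value & 0xFFFFFFFFu);
    v.ax = (uint16_t)(value & 0xFFFFu);
    v.ah = (uint8_t)((value >> 8) & 0xFFu);
    v.al = (uint8_t)(value & 0xFFu);
    return v;
}
```

For the value 0x1122334455667788 this gives eax = 0x55667788, ax = 0x7788, ah = 0x77 and al = 0x88, which you can compare with {p/x $eax}, {p/x $ax} and {p/x $al} in gdb.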


What if we want to examine just one register, or the one assembly instruction that is about to run? Well, we have the {print} command, or {p} for short, which prints the value of a given expression. Try it on one of the registers: type {print $rax} (we use $ to reference a register and its value):

You can also format the output by following the command with {/} and a format letter: {p/t} – binary, {p/c} – integer and its character representation, {p/f} – floating point, {p/s} – string, {p/x} – hexadecimal. And that's just the tip of the iceberg: more advanced expressions let you print the current value of a struct member {print var->attr} or the first few elements of an array {print *arr@len}, or cast $rsp to a pointer and dereference it {p *(long *) $rsp}, and many more. You don't have to worry about them now; we won't need them for this simple tutorial.
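As a rough picture of what the binary and hexadecimal format letters do to a number, here is a small sketch. The function names are hypothetical, and GDB's own output can differ in details, for example for negative or typed values.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes value in base 2 without leading zeros, similar in spirit to {p/t}.
 * out must have room for at least 65 bytes. */
void format_binary(uint64_t value, char *out)
{
    char tmp[65];
    int n = 0;
    do {
        tmp[n++] = (char)('0' + (value & 1u));  /* collect lowest bit first */
        value >>= 1;
    } while (value != 0);
    for (int i = 0; i < n; i++) {
        out[i] = tmp[n - 1 - i];                /* reverse into final order */
    }
    out[n] = '\0';
}

/* Writes value as 0x-prefixed lowercase hex, similar in spirit to {p/x}. */
void format_hex(uint64_t value, char *out, size_t out_size)
{
    snprintf(out, out_size, "0x%llx", (unsigned long long)value);
}
```

With these helpers, 5 is rendered as 101 in binary and 255 as 0xff in hexadecimal.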


Examine, on the other hand, treats the value you give it as an address, dereferences it, and displays the memory found there in the format you specify. We can compare them both; {p $rip} first:

Then try: {x/i $rip} which means examine {x} an instruction {i} at $rip:

You can clearly see the difference now: print gives you just the value of the expression ($rip in our case), basically the address of the instruction that rip is pointing to, while examine also shows the instruction at this address (push rbp).


With the examine command you can also use a number of formatters; we already used one in the previous command. To use formatting, follow {x} with {/}: {x/x} – hexadecimal, {x/d} – signed decimal, {x/u} – unsigned decimal, {x/f} – floating point, {x/t} – binary, {x/o} – octal, {x/c} – char, {x/a} – address, and {x/i} – instruction. We know that last one already, right?


Another neat feature is the ability to show a number of instructions that come after the one we are examining; try {x/i $rip} and then {x/10i $rip}

vs

Nice, ain't it? But behold, that's not all: you can also specify the size of each unit, using {b} – byte, {h} – halfword (2 bytes), {w} – word (4 bytes), {g} – giant (8 bytes), right after the count and before the format letter. Test it on our stack: {x/4x $rsp} vs {x/4gx $rsp} vs {x/4bx $rsp}.


{x/4x $rsp} - shows 4 consecutive values of the size you used last; it was 8 bytes for me (I ran x/4gx first to show that this command reuses the previous size)

{x/4bx $rsp} - shows 4 consecutive 1-byte values; running {x/4x} after this command gives the same output, as it defaults to the last used size

{x/4wx $rsp} - shows 4 consecutive 4-byte values
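The three size letters read the same bytes, only grouped differently. The sketch below mimics that grouping for a little-endian machine such as x86-64, where the byte at the lowest address is the least significant one; read_unit is a hypothetical helper, not a GDB function.

```c
#include <stddef.h>
#include <stdint.h>

/* Combines `size` bytes (1, 2, 4 or 8) starting at p into one value,
 * treating the lowest address as the least significant byte. */
uint64_t read_unit(const unsigned char *p, size_t size)
{
    uint64_t value = 0;
    for (size_t i = 0; i < size; i++) {
        value |= (uint64_t)p[i] << (8 * i);
    }
    return value;
}
```

For memory holding the bytes 78 56 34 12 ef cd ab 90, a byte-sized read of the first unit gives 0x78, a word-sized read gives 0x12345678, and a giant-sized read gives 0x90abcdef12345678.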

What if, instead of examining the memory location, we want to print that value? Back to our print command: print our stack pointer {p $rsp}, then cast it to a long pointer and dereference it {p *(long *) $rsp}, and finally print the result as an address {p/a *(long *) $rsp}.
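The cast-and-dereference expression is ordinary C. A minimal sketch of the same idea, with a hypothetical function name and an address supplied by the caller instead of taken from $rsp:

```c
#include <stdint.h>

/* Treats a raw address as a pointer to long and reads the value stored
 * there. The address must point at a valid, suitably aligned long. */
long read_long_at(uintptr_t address)
{
    return *(const long *)address;
}
```

Passing the address of a local long variable returns that variable's value, just as {p *(long *) $rsp} returns whatever 8 bytes currently sit on top of the stack.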

Stepping Instructions.

Everything up to this point happened with our program stuck at the first breakpoint, so we should probably move on. To do this we have four commands. To move one machine instruction at a time we use {si} – step instruction, which follows calls into the called function. {s} – step runs until the next source line and steps into any function called on the way, so it needs debug information to be useful. {n} – next and {ni} – next instruction do the same as {s} and {si} respectively, but without descending into functions. Out of the four, let us focus on {ni}, which steps one instruction at a time at the machine level.

For convenience I've rerun the program. Looks like not a lot happened? Not exactly: the address changed, indicating that we are at a different place in our code. But we would need to manually check every register that interests us, unless we use another helpful command, {display}; try {display/i $rip}

Now every time we step, gdb will display that info for us. Go on and try it with {ni}; step until you find the first “call” instruction. If you use {ni} again now, it will run the whole function without showing you its code: jump in, execute, and jump out.

Continue until you find another call; this time it will be processArgument. Remember from before: n and ni do not step into functions, s and si do. So instead of {ni}, try {si} now, and then follow with {ni}. Just pressing “Enter/Return” repeats the last command. With {si} we step into the function and see its code.

Notice how our <main+87> changed to <processArgument>, which indicates we are inside the function. Continue with {ni}, or {finish} if you are not patient.

Now that's rude. Obviously we typed the wrong nickname. But which one is correct? A note on the command we just used: {finish} runs the rest of the current function and stops just after it returns, unless a breakpoint or program termination comes first.


Typing {continue}, or {c} for short, resumes execution until the next breakpoint; if we do not have any more breakpoints, the program runs until it reaches the end of the code and terminates.


Restart and happy ending!

It is time to start again; in real assignments you will probably start over numerous times.

Before we type {run nickname} and smash that enter, we are going to set another breakpoint. In the disassembly of our main function we saw another function lurking near the bottom of the displayed code. There is also banner, but we can assume it is just a function that prints the actual banner, so the next one in line is processArgument, and that is where our new breakpoint goes (for example {break processArgument}).

Now run the program until it stops at our new breakpoint; don't forget to provide an argument {run argument} instead of just running the code {run}. If you still have breakpoints from before, you can remove them with {delete} followed by their numbers, or just use {continue}, which will continue program execution (showing the banner) until the next breakpoint. Check where we are in our code with {x/10i $rip}.

Just by looking at these few lines of code you can assume we are inside the function, {0x5555555551c0 <processArgument+4>: sub rsp,0x10}, where sub rsp, 0x10 allocates space for our variables on the stack. If you don't know assembly this may look a little bit scary, but don't worry; look for another function call, <strcmp@plt>, which is string compare. Using {ni}, step through the instructions until you reach that call. Use display to save yourself from retyping the previous command over and over {display/10i $rip}; to remove a display setting, type {undisplay}. Again, after you use {ni} once you can simply hit enter to repeat the last command. The little arrow on the left, "=>", indicates which instruction is going to be executed next.

strcmp is a simple string comparison. Before this call we can see two values being moved into the registers rsi and rdi (source index register and destination index register); under the x86-64 Linux calling convention, rdi carries the first argument and rsi the second. Let's peek inside with {x/s $rdi} and {x/s $rsi}.

Whoa! That's our argument and another string hardcoded in the program. We compare them and, depending on the result, jump to a different place in our code. If they are not equal, our program terminates, but what happens when they are equal? And what hides behind the next call in processArgument? Find out with the {ni} command. Notice that after <test eax, eax> we encountered <jne 0x5555555551f6 <processArgument+58>>, jump if not equal. strcmp returns 0 only when the strings match; our argument was not equal to the one stored in the code, so we jump to <processArgument+58>, skipping a series of instructions. Step until the next <strcmp@plt>, where {x/s $rsi} results in <CerealKiller>. So we have more than one argument that yields some kind of output.
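The call, test and jne sequence seen here maps onto a very small piece of C. The sketch below uses a hypothetical function name and variables named after the registers purely for readability; it returns 1 exactly when the jne would be taken.

```c
#include <string.h>

/* Returns 1 when the jne after `test eax, eax` would be taken,
 * i.e. when strcmp returned a nonzero value (strings differ). */
int jne_taken(const char *rdi, const char *rsi)
{
    int eax = strcmp(rdi, rsi);   /* call strcmp@plt, result lands in eax */
    return eax != 0;              /* test eax, eax ; jne ... */
}
```

With "robert" as the argument and "CerealKiller" as the hardcoded string the jump is taken; with two identical strings it is not, and execution falls through into the instructions that were skipped before.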

If you recall, we used the {set} command to set our disassembly flavor to intel, but that's not the only thing we can set in gdb. We can alter registers {set $rax = 5}, the stack {set *(long *) $rsp = 0x1337}, create our own variables {set $awesome_variable = $rdi}, and even change the instruction pointer {set $rip = *main + 45}. Instead of terminating our program and manually rerunning it with the correct argument, we can change the value of the register that stores it with {set $rdi = $rsi} or {set $rdi = "CerealKiller"}, and then {c} to continue.

Boom! We managed to overwrite our incorrect argument with the correct one, and that sent us to the output we were looking for! Without knowing the right nickname, we managed to find it in our debugger and successfully exploit a previously restricted part of this simple program.

Not really the end, go and train! Go and have fun with it: there are more comparisons inside processArgument in our little program, and there is a flag in there that you can find the same way we found our first two legit nicknames! Looking through the source code is cheating! This is one of the first steps into reverse engineering and binary exploitation; there is more, a lot more, so go and discover!
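Why the register trick works can be modelled in a few lines of C: once both strcmp arguments point at the same string, the comparison cannot fail. This is only a model with a hypothetical function name; in the debugger the assignment happens to the live register, not to a C variable.

```c
#include <string.h>

/* Models {set $rdi = $rsi} performed just before the strcmp call. */
int compare_after_patch(const char *user_argument, const char *expected)
{
    const char *rdi = user_argument;  /* first strcmp argument */
    const char *rsi = expected;       /* second strcmp argument */
    rdi = rsi;                        /* the effect of {set $rdi = $rsi} */
    return strcmp(rdi, rsi);          /* both point at one string: result 0 */
}
```

Whatever nickname was typed, the function returns 0, which is exactly the result the program treats as a correct nickname.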


Extra info.

GDB can run scripts and plugins.


GDB scripts can automate and customize the debugging process. They allow you to create custom debugging workflows, automate repetitive tasks, and extend GDB's functionality.

Usage:

gdb -x my_script.gdb ./program_name

In a script we can use all gdb commands, and some special ones.

break *main
command
	silent
	p $rip
end
run robert
q

Teaching you how to write them is out of scope for this tutorial, but you probably recognize some of these commands, as we used them before.

GDB plugins: the two most popular are GEF and PWNDBG.


pwndbg

gef


I myself started with GEF, but they are both good and simplify our job immensely (for example, they have scripts to generate patterns and check the offset of a pattern, which is super useful with buffer overflows). Below is a snippet of our little program in gdb with the gef plugin.

It may look a little bit cluttered, but the longer you play with gdb, the quicker you will notice that all that info is helpful, and having it displayed at every step saves you a lot of time.


That's all folks!

Trustyape signing off, take it easy!


