Andreas,

I am glad you are interested in finding out more. Unfortunately, there aren't a 
ton of other documents you can read. The one thing I can do is point out what 
LLDB does differently from other typical debuggers.

1 - Types are converted from debug info back into correct clang types.

Most debuggers tend to make up their own internal type representation that is 
more geared toward how the information is represented in the debug info and 
also to how the debugger's expression parser will want to use that type 
information.

LLDB converts DWARF back into real clang types. It currently makes an AST 
context per module (executable, shared library, or other loadable code 
container) and lazily populates the AST as needed by expressions.

2 - We use the compiler as our expression parser

Because we convert all of our types into clang types, we can use clang as our 
expression parser. LLDB just pretends to be a precompiled header when the 
compiler is parsing an expression, and we can answer all of the precompiled 
header queries for information based on the current execution context (which 
frame we have selected in which thread of a specific process). The other 
benefit of this approach is that if clang adds C++0x support, we already have 
support for it in our expression parser just by updating to the latest version 
of clang. It also allows us to support any feature currently supported by 
clang. Other debuggers must add new runtime features or modify their 
expression parsers each time a new language feature is added. This also allows 
us to have expression local variables:

(lldb) expression
for (int i=0; i<10; i++)
  (int)printf("i = %i\n", i);

i = 0
i = 1
i = 2
i = 3
i = 4
i = 5
i = 6
i = 7
i = 8
i = 9
(lldb) 

Note that "i" here is an expression local. If you had code like:

  int main ()
  {
      int i=0;
->    return 0;
  }

And you were stopped at the "return 0;" statement, and evaluated the above 
multi-line expression, it is as if you added a new lexical block scope:

  int main ()
  {
      int i=0;
      {
        for (int i=0; i<10; i++)
          (int)printf("i = %i\n", i);
      }
->    return 0;
  }
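
To make the scoping concrete, here is a small C++ sketch of the same idea. It is 
only an illustration, not LLDB code: the function name and the reference 
parameter are made up for the example, and a string stands in for printf() so 
the result can be inspected.

```cpp
#include <string>

// Mirrors the rewritten main() above: the inner "i" lives in its own block
// scope and shadows the frame's "i" without modifying it.
// (run_with_expression_local and outer_after are hypothetical names.)
std::string run_with_expression_local(int &outer_after) {
    int i = 0;              // the variable that belongs to the stopped frame
    std::string output;
    {
        // Stand-in for the user's multi-line expression.
        for (int i = 0; i < 3; i++)
            output += "i = " + std::to_string(i) + "\n";
    }
    outer_after = i;        // the frame's "i" is untouched by the loop
    return output;
}
```

After the call, the caller's view of the frame variable is unchanged, which is 
the behavior you would expect from an expression local.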


3 - The expression parser can JIT the code you have for expressions

We use clang to JIT the code for expressions and run it directly in the 
process you are debugging. We also let clang handle all of the ABI issues when 
it comes to calling functions, which is typically a place where debuggers can 
mess up. For example, if you have an expression like:

(lldb) expr c4 = complex_add (c1, c2) + c3

We actually JIT up a function that takes a single "data *" as a parameter 
(which is easy for debuggers to figure out how to call) and we define data as:

struct data
{
   complex c1; // variable used as first arg to complex_add() function
   complex c2; // variable used as second arg to complex_add() function
   complex c3; // variable to add to result of complex_add() function
   complex result; // result of the expression (which will be "c4")
};

Then we JIT up a function that stores its result through the pointer:

void
$___lldb_expr (data *data_ptr)
{
    data_ptr->result = complex_add (data_ptr->c1, data_ptr->c2) + data_ptr->c3;
}


Why is this important? Because now none of our debugger plug-ins need to know 
how and where to put arguments to functions. We let clang handle the current 
ABI issues and let it place the variables in registers or on the stack as 
needed, including dealing with the return type from functions. This keeps the 
debugger from being in the business of having to know the current ABI for the 
current target (a big source of bugs in debugger expression parsers).
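
The same pattern can be tried out as ordinary C++. This is a sketch under 
assumptions, not what LLDB actually generates: std::complex<double> stands in 
for the "complex" type, complex_add() is a stand-in for a function in the 
debugged program, and expr_args and lldb_expr_sketch are names invented here.

```cpp
#include <complex>

using complex_t = std::complex<double>;

// Stand-in for a function that already exists in the debugged program.
complex_t complex_add(complex_t a, complex_t b) { return a + b; }

// Every input and the result travel through one struct (hypothetical name).
struct expr_args {
    complex_t c1;      // first argument to complex_add()
    complex_t c2;      // second argument to complex_add()
    complex_t c3;      // value added to the result of complex_add()
    complex_t result;  // where the expression result is stored
};

// The wrapper has a trivial signature: one pointer in, nothing returned.
// The compiler, not the caller, decides how complex_add()'s arguments and
// return value are passed.
void lldb_expr_sketch(expr_args *args) {
    args->result = complex_add(args->c1, args->c2) + args->c3;
}
```

The caller only has to write the struct into memory, pass one pointer, and read 
"result" back afterwards, which is the part a debugger can do reliably.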

4 - JIT'ed code can be used for more complete expression validation

We write our own helper functions that we JIT up and can copy into the process 
we are debugging. We can post-process the Intermediate Representation (IR) we 
get after we compile an expression and put extra checks into your expressions. 
So for an expression like:

(lldb) expr 2 + pt_ptr->x + pt2_ptr->y

We can actually rewrite this expression to use our "void 
*pointer_validation(void *)" function so we would actually run:

2 + pointer_validation (pt_ptr)->x + pointer_validation (pt2_ptr)->y

And if either "pt_ptr" or "pt2_ptr" was invalid, we can stop the expression 
early and let the user know that a pointer was invalid. This can help to detect 
issues, especially when a bad pointer might point to memory just before valid 
memory and the field access could actually put you back into valid memory.
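
Here is a rough source-level sketch of that idea. The real rewriting happens on 
the IR, so treat this only as an analogy: the template helper, the list of 
valid ranges, and register_valid() are all assumptions made up for the example, 
and a real check would ask the process which memory is readable.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

struct point { int x; int y; };

// Hypothetical list of [begin, end) address ranges considered valid.
static std::vector<std::pair<std::uintptr_t, std::uintptr_t>> g_valid_ranges;

// Hypothetical helper: mark the bytes of one object as valid memory.
void register_valid(const void *p, std::size_t size) {
    auto begin = reinterpret_cast<std::uintptr_t>(p);
    g_valid_ranges.emplace_back(begin, begin + size);
}

// Returns the pointer unchanged if a whole T fits inside a valid range,
// otherwise stops the expression early by throwing.
template <typename T>
T *pointer_validation(T *p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    for (const auto &range : g_valid_ranges)
        if (addr >= range.first && addr + sizeof(T) <= range.second)
            return p;
    throw std::runtime_error("invalid pointer in expression");
}

// The rewritten form of: 2 + pt_ptr->x + pt2_ptr->y
int checked_expr(point *pt_ptr, point *pt2_ptr) {
    return 2 + pointer_validation(pt_ptr)->x + pointer_validation(pt2_ptr)->y;
}
```

Because the check looks at the pointer itself rather than the address of the 
field, a pointer that sits just before valid memory is rejected even when the 
field it names would land inside valid memory.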


5 - LLDB parses debug information lazily.

Debuggers take a lot of different approaches to how they parse debug info. 
GDB tends to parse everything a compile unit at a time. LLDB will parse only 
what it needs as it needs it. If you only touch one function in a compile unit 
with 100 functions, we will parse only that function and the types needed 
for that one function. This can help save on memory footprint.
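
The parse-on-demand idea can be sketched in a few lines of C++. This is a toy 
model under assumptions, not LLDB's data structures: LazyCompileUnit, 
ParsedFunction, and the counter are invented names, and "parsing" here just 
means building a small record the first time a function is asked for.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of parsing one function's debug info.
struct ParsedFunction { std::string name; };

class LazyCompileUnit {
public:
    explicit LazyCompileUnit(std::vector<std::string> names)
        : names_(std::move(names)) {}

    // Parses a function the first time it is requested and caches the result.
    // Returns nullptr if the compile unit has no such function.
    const ParsedFunction *find(const std::string &name) {
        auto hit = parsed_.find(name);
        if (hit != parsed_.end())
            return &hit->second;
        for (const auto &candidate : names_) {
            if (candidate == name) {
                ++parse_count_;  // the only place any parsing work happens
                return &parsed_.emplace(name, ParsedFunction{name}).first->second;
            }
        }
        return nullptr;
    }

    std::size_t parse_count() const { return parse_count_; }
    std::size_t function_count() const { return names_.size(); }

private:
    std::vector<std::string> names_;                // unparsed index of names
    std::map<std::string, ParsedFunction> parsed_;  // filled in on demand
    std::size_t parse_count_ = 0;
};
```

With 100 names in the unit and one lookup, parse_count() stays at 1 no matter 
how many times the same function is requested again.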

6 - LLDB can run multiple debug sessions simultaneously:

(lldb) target create /tmp/server.exe
(lldb) breakpoint set --name main
(lldb) run
Process 1000 launched: '/tmp/server.exe' (x86_64)
...
(lldb) target create /tmp/client.exe
(lldb) breakpoint set --name main
(lldb) run
Process 1001 launched: '/tmp/client.exe' (x86_64)
...
(lldb) target list 
Current targets:
  target #0: /tmp/server.exe ( arch=x86_64-apple-darwin, platform=localhost, pid=1000, state=stopped )
* target #1: /tmp/client.exe ( arch=x86_64-apple-darwin, platform=localhost, pid=1001, state=stopped )
(lldb) target select 0
(lldb) run
(lldb) target select 1
(lldb) run

LLDB can also run binaries for different architectures from the same debugger 
so you could debug a local server and a remote client for a different 
architecture on a remote machine in the same session. 

7 - LLDB is built around plug-ins

This means no matter what you are debugging, you always have access to other 
plug-ins for different architectures. So you can use any of the supported 
disassemblers from any target. Below we create an x86_64 target and debug it, 
and then use the ARM disassembler on memory in that x86_64 process:


(lldb) target create /tmp/arm-compiler-on-x86_64
(lldb) breakpoint set --name main
(lldb) run
Process 1000 launched: '/tmp/arm-compiler-on-x86_64' (x86_64)
(lldb) disassemble --arch armv7 --count 32 0x12020300

GDB only has a disassembler for the architecture it was built for and cannot 
cross disassemble.
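
One way to picture this is a registry that hands out disassemblers by 
architecture name, independent of the current target. The sketch below is an 
assumption about the general shape only: every class and function name in it is 
invented for the example and LLDB's real plug-in interfaces are much richer.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical plug-in interface.
class Disassembler {
public:
    virtual ~Disassembler() = default;
    virtual std::string arch() const = 0;
};

// Minimal plug-in that only remembers which architecture it handles.
class NamedDisassembler : public Disassembler {
public:
    explicit NamedDisassembler(std::string arch) : arch_(std::move(arch)) {}
    std::string arch() const override { return arch_; }
private:
    std::string arch_;
};

class DisassemblerRegistry {
public:
    using Factory = std::function<std::unique_ptr<Disassembler>()>;

    void add(const std::string &arch, Factory factory) {
        factories_[arch] = std::move(factory);
    }

    // Lookup depends only on the requested architecture, never on the
    // architecture of the target being debugged.
    std::unique_ptr<Disassembler> create(const std::string &arch) const {
        auto it = factories_.find(arch);
        if (it == factories_.end())
            return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Registers the two architectures used in the session above.
DisassemblerRegistry make_registry() {
    DisassemblerRegistry registry;
    registry.add("x86_64", [] {
        return std::unique_ptr<Disassembler>(new NamedDisassembler("x86_64"));
    });
    registry.add("armv7", [] {
        return std::unique_ptr<Disassembler>(new NamedDisassembler("armv7"));
    });
    return registry;
}
```

Asking the registry for "armv7" while debugging an x86_64 process works the 
same as asking for "x86_64", because all plug-ins are always loaded.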

There are many more important architectural differences, but I believe that I 
have outlined the important big differences above.

Greg Clayton


On Nov 10, 2011, at 9:22 AM, Andreas Donig wrote:

> Hello everybody,
> 
> I'm an undergrad student of computer science doing some research in 
> debuggers. I've read J.B. Rosenberg's "How Debuggers Work", the GDB Internals 
> manual and went through the documentation on LLDB's website and now I'm 
> craving for more. I'd greatly appreciate if you'd let me know about other 
> relevant documentation or anything else you could suggest as a reading.
> 
> Best regards
> Andreas
> _______________________________________________
> lldb-dev mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

