Re: [lldb-dev] Debugging JIT-compiled code with LLVM

Greg Clayton Thu, 18 Nov 2010 10:08:15 -0800

On Nov 17, 2010, at 11:35 PM, Simon Ask Ulsnes wrote:

> 2010/11/18 Greg Clayton <[email protected]>:
>> LLDB currently doesn't yet have any support for JIT'ed code, though I would 
>> be happy to work with you if you wanted to get that working in LLDB.
> 
> This was my fear. But I'll need some kind of debugger for my own
> language anyway, so I'd be happy to implement this in LLDB.


Great!

> Could you briefly outline the general steps necessary for adding
> support for JIT'ed code? As you mention, I would expect the procedure
> to look similar to how dlopen()ed dylibs are registered, but I might
> be wrong.

A few questions on how this would be debugged (not worrying about the JIT yet):
1 - When you are debugging this, are you going to want to step through your new 
source code files or generated C/C++ sources? 
2 - If you want to debug sources that you produce, will this be like debugging 
lex/yacc code where a bunch of #line and #file directives are used to map C/C++ 
code to your proprietary source code?

If you are going to be debugging standard i386/x86_64 code, then you won't need 
to subclass lldb_private::Process. 

In order to support JIT'ed code, we just need a way to communicate between a 
running program and the debugger. Setting a breakpoint, as the JIT support in 
GDB does, is fine for this, since that is how the dynamic loader plug-in for 
macosx currently works. We can probably get away with being able to 
register additional dynamic loader plug-ins with the current Process. To 
elaborate a bit, let's look at how the dynamic loaders work for shared libraries. 
Currently each process has a pluggable dynamic loader plug-in that gets loaded 
prior to launch by the Process subclasses in "Process::WillLaunch()", or prior 
to attaching in "Process::WillAttachToProcessWithID (lldb::pid_t pid)" and 
"Process::WillAttachToProcessWithName (const char *process_name, bool 
wait_for_launch)". So any process can re-use an abstract dynamic loader plugin. 
The pseudo code looks like:

class Process
{
...
        std::auto_ptr<DynamicLoader> m_dynamic_loader_ap;
};

When the WillLaunch or WillAttach functions are overridden in a Process 
subclass (see ProcessGDBRemote for an example), the override finds a dynamic 
loader by its plug-in name:

    m_dynamic_loader_ap.reset(
        DynamicLoader::FindPlugin(this, "dynamic-loader.macosx-dyld"));

Since the ProcessGDBRemote plug-in is currently for MacOSX debugging, we know 
to look up the dynamic loader using a specific name.

After a dynamic loader plug-in is installed, it will get a callback after 
attaching or launching:


void
ProcessGDBRemote::DidLaunch ()
{
    DidLaunchOrAttach ();
    if (m_dynamic_loader_ap.get())
        m_dynamic_loader_ap->DidLaunch();
}

This gives the dynamic loader plug-in a chance to install its breakpoint and 
assign a callback to that breakpoint. When a breakpoint has a callback 
associated with it, the callback gets called synchronously when the 
breakpoint is hit, which allows you to load/unload shared libraries (see 
DynamicLoaderMacOSXDYLD for example code).
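
To make the callback flow a bit more concrete, here is a small stand-alone 
sketch. It does not use the real LLDB classes: ToyBreakpoint, ToyDynamicLoader, 
the 0x1000 address and the "image-N" names are all made up for illustration, 
and the real plug-in reads the image list out of process memory rather than 
inventing names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a debugger breakpoint; not the LLDB class.
class ToyBreakpoint
{
public:
    // The callback returns true if the process should stop for the user.
    typedef std::function<bool (uint64_t hit_addr)> Callback;

    explicit ToyBreakpoint (uint64_t addr) : m_addr (addr) {}

    void SetCallback (Callback callback) { m_callback = std::move (callback); }

    // Called by the (imaginary) debugger core when the process hits m_addr.
    // The callback runs synchronously, as part of handling the hit.
    bool Hit ()
    {
        if (m_callback)
            return m_callback (m_addr);
        return true; // no callback: behave like an ordinary breakpoint
    }

private:
    uint64_t m_addr;
    Callback m_callback;
};

// Hypothetical loader that records a new image each time its breakpoint fires.
class ToyDynamicLoader
{
public:
    ToyDynamicLoader () : m_break (0x1000) {} // made-up notification address

    // Mirrors the DidLaunch() step: install the breakpoint callback.
    void DidLaunch ()
    {
        m_break.SetCallback ([this] (uint64_t) {
            // A real plug-in would read the image list from the process here.
            m_images.push_back ("image-" + std::to_string (m_images.size ()));
            return false; // auto-continue; the user never sees this stop
        });
    }

    ToyBreakpoint &GetBreakpoint () { return m_break; }
    const std::vector<std::string> &GetImages () const { return m_images; }

private:
    ToyBreakpoint m_break;
    std::vector<std::string> m_images;
};

// Usage: after DidLaunch(), each hit adds one image and asks to continue.
inline void DemoLaunchAndHit ()
{
    ToyDynamicLoader loader;
    loader.DidLaunch ();
    bool should_stop = loader.GetBreakpoint ().Hit ();
    assert (!should_stop);
    assert (loader.GetImages ().size () == 1);
}
```

Returning false from the callback is the design choice that matters here: the 
stop is consumed by the plug-in and the process keeps running.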

We could allow the Process class to have more than one dynamic loader plug-in 
since loading JIT code is very similar to loading shared libraries:

class Process
{
...
        std::vector<DynamicLoaderSP> m_dynamic_loaders;
};

where DynamicLoaderSP is a shared pointer typedef...

This would allow us to have a standard system dynamic loader, and one or more 
JIT dynamic loader plug-ins.

The JIT'ed dynamic loader plug-in would do the same kind of thing the macosx 
one does: it will set a breakpoint, install a callback and react to that 
breakpoint callback as needed.
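
As a rough sketch of what holding several loaders could look like (this is 
not existing LLDB code; ToyLoader, ToyNamedLoader, ToyProcess and 
AddDynamicLoader are hypothetical names, and the real DynamicLoader interface 
has more to it), the Process would just forward the launch notification to 
every registered plug-in:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified loader interface.
class ToyLoader
{
public:
    virtual ~ToyLoader () {}
    virtual void DidLaunch () = 0;
};

typedef std::shared_ptr<ToyLoader> ToyLoaderSP;

// One class stands in for both the system loader and a JIT loader.
class ToyNamedLoader : public ToyLoader
{
public:
    explicit ToyNamedLoader (const std::string &name)
        : m_name (name), m_launched (false) {}

    // A real plug-in would set its breakpoint and callback here.
    void DidLaunch () override { m_launched = true; }

    const std::string &GetName () const { return m_name; }
    bool IsLaunched () const { return m_launched; }

private:
    std::string m_name;
    bool m_launched;
};

class ToyProcess
{
public:
    void AddDynamicLoader (const ToyLoaderSP &loader_sp)
    {
        assert (loader_sp); // refuse empty shared pointers
        m_dynamic_loaders.push_back (loader_sp);
    }

    // Every registered loader gets the same notification, in order.
    void DidLaunch ()
    {
        for (const ToyLoaderSP &loader_sp : m_dynamic_loaders)
            loader_sp->DidLaunch ();
    }

    std::size_t GetNumDynamicLoaders () const
    {
        return m_dynamic_loaders.size ();
    }

private:
    std::vector<ToyLoaderSP> m_dynamic_loaders;
};
```

With this shape the system loader and any JIT loaders are treated uniformly, 
so adding a JIT plug-in does not require changes to the launch/attach path.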

Inside LLDB we will need to think about how we want to represent JIT'ed code. 
There are a few options, but first let's look at how shared libraries are 
represented. Any executable or shared library is represented by a Module. 
Module objects have ObjectFile objects (abstracted object file readers for ELF 
and mach-o) and a SymbolFile for reading debug symbols. We will want to 
represent JIT'ed code by making a new Module, possibly a special one that owns 
all of the JIT'ed code in a process from a specific JIT. So the clang JIT'ed 
code might require us to make a DynamicLoaderClangJIT DynamicLoader subclass, 
which would create a new module with a fake name, "<ClangJIT>", that we could 
add any information to. As new JIT'ed code gets added, new functions and data 
would get added to the "<ClangJIT>" object file (symbol table symbols and new 
sections) and symbol file (if we have debug info for the JIT'ed code). Another 
way would be to let the JIT define logical modules, in case you want to 
organize your JIT'ed code a bit more by creating many different Clang JIT 
modules. Either way, all of this work will be done by the 
DynamicLoaderClangJIT class.
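
A minimal sketch of the single "<ClangJIT>" module idea follows. Nothing here 
is real LLDB API: ToySymbol and ToyJITModule are hypothetical, and a real 
Module would go through ObjectFile and SymbolFile rather than a plain map. It 
only shows symbols being added as code is emitted and then looked up by 
address:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical symbol record for one JIT'ed function.
struct ToySymbol
{
    std::string name;
    uint64_t addr;
    uint64_t size;
};

// Hypothetical module that grows as the JIT emits more code.
class ToyJITModule
{
public:
    explicit ToyJITModule (const std::string &name) : m_name (name) {}

    const std::string &GetName () const { return m_name; }

    // Called each time the JIT reports a newly emitted function.
    void AddFunction (const std::string &name, uint64_t addr, uint64_t size)
    {
        assert (size > 0);
        m_symbols[addr] = ToySymbol{name, addr, size};
    }

    // Return the symbol whose [addr, addr + size) range contains pc,
    // or nullptr if no JIT'ed function covers that address.
    const ToySymbol *FindSymbolContaining (uint64_t pc) const
    {
        // First symbol that starts strictly after pc.
        std::map<uint64_t, ToySymbol>::const_iterator pos =
            m_symbols.upper_bound (pc);
        if (pos == m_symbols.begin ())
            return nullptr;
        --pos; // last symbol starting at or before pc
        const ToySymbol &symbol = pos->second;
        return (pc - symbol.addr < symbol.size) ? &symbol : nullptr;
    }

private:
    std::string m_name;
    std::map<uint64_t, ToySymbol> m_symbols; // keyed by start address
};
```

The "many logical modules" variant would simply be several of these objects, 
one per unit the JIT chooses to define.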

> 
> GDB has a hook for JIT'ed code, but you are right that it only works
> for ELF binaries. The approach there is that GDB sets a breakpoint in
> an extern function, which LLVM calls when emitting code, giving GDB a
> chance to load the symbols. Would a different approach in LLDB be
> desirable, or does that seem OK to you?

That should work, see above comments.
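
For reference, the JIT side of the GDB hook looks roughly like the sketch 
below. The declarations are written from memory of the "JIT Compilation 
Interface" section of the GDB manual, so please check them against your GDB 
version before relying on them. RegisterJITObject is a hypothetical helper 
name, and the noinline attribute and empty asm are GCC/Clang specific.

```cpp
#include <cassert>
#include <cstdint>

extern "C" {

typedef enum
{
    JIT_NOACTION = 0,
    JIT_REGISTER_FN,
    JIT_UNREGISTER_FN
} jit_actions_t;

// One entry per in-memory object file handed to the debugger.
struct jit_code_entry
{
    struct jit_code_entry *next_entry;
    struct jit_code_entry *prev_entry;
    const char *symfile_addr;
    uint64_t symfile_size;
};

struct jit_descriptor
{
    uint32_t version;
    uint32_t action_flag; // one of jit_actions_t
    struct jit_code_entry *relevant_entry;
    struct jit_code_entry *first_entry;
};

// The debugger sets its breakpoint in this function. The empty asm is
// only there to keep the compiler from optimizing the call away.
void __attribute__ ((noinline)) __jit_debug_register_code ()
{
    __asm__ volatile ("");
}

// The debugger reads this descriptor when the breakpoint is hit.
struct jit_descriptor __jit_debug_descriptor = { 1, 0, nullptr, nullptr };

} // extern "C"

// Hypothetical helper: link a new entry at the head of the list and
// notify the debugger by calling the breakpoint function.
void RegisterJITObject (jit_code_entry *entry, const char *symfile,
                        uint64_t symfile_size)
{
    assert (entry != nullptr);
    entry->symfile_addr = symfile;
    entry->symfile_size = symfile_size;
    entry->prev_entry = nullptr;
    entry->next_entry = __jit_debug_descriptor.first_entry;
    if (entry->next_entry)
        entry->next_entry->prev_entry = entry;
    __jit_debug_descriptor.first_entry = entry;
    __jit_debug_descriptor.relevant_entry = entry;
    __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
    __jit_debug_register_code ();
}
```

A JIT dynamic loader plug-in in LLDB would sit on the other side of this: its 
breakpoint callback would read the descriptor and the relevant entry out of 
the process and update its module accordingly.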
> 
> According to the LLVMdev list, LLVM did not emit DWARF data for JIT'ed
> code as of March 2009 — I'm not sure if this has changed, though I
> suspect it hasn't, so to get this to work I guess there is also a bit
> of work to be done on the LLVM side of things.

Agreed, it would be great to be able to get DWARF for JIT'ed code.

> - Simon

Let me know if you need any explanation on anything mentioned above.

Greg

_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
