> On Feb 24, 2015, at 2:36 PM, David Blaikie <[email protected]> wrote:
>
> On Mon, Feb 23, 2015 at 3:45 PM, Adrian Prantl <[email protected]> wrote:
>
>> On Feb 23, 2015, at 3:37 PM, David Blaikie <[email protected]> wrote:
>>
>> On Mon, Feb 23, 2015 at 3:32 PM, Adrian Prantl <[email protected]> wrote:
>>
>>> On Feb 23, 2015, at 3:14 PM, David Blaikie <[email protected]> wrote:
>>>
>>> On Mon, Feb 23, 2015 at 3:08 PM, Adrian Prantl <[email protected]> wrote:
>>>
>>>> On Feb 23, 2015, at 2:59 PM, David Blaikie <[email protected]> wrote:
>>>>
>>>> On Mon, Feb 23, 2015 at 2:51 PM, Adrian Prantl <[email protected]> wrote:
>>>>
>>>> > On Jan 20, 2015, at 11:07 AM, David Blaikie <[email protected]> wrote:
>>>> >
>>>> > My vague recollection from the previous design discussions was that these module references would be their own 'unit', COMDAT'd, so that we don't end up with the duplication of every module reference in every unit linked together when linking debug info?
>>>> >
>>>> > I think in my brain I'd been picturing this module reference as being an extended fission reference (fission skeleton CU + extra fields for users who want to load the Clang AST module directly and skip the split CU).
>>>>
>>>> Apologies for letting this rest for so long.
>>>>
>>>> Your memory was of course correct, and I didn't follow up on this because I had convinced myself that the fission reference would be completely sufficient. Now that I've been thinking some more about it, I don't think that it is sufficient in the LTO case.
>>>>
>>>> Here is the example from http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html:
>>>>
>>>> foo.o:
>>>>   .debug_info.dwo
>>>>     DW_TAG_compile_unit
>>>>       // For DWARF consumers
>>>>       DW_AT_dwo_name ("/path/to/module-cache/MyModule.pcm")
>>>>       DW_AT_dwo_id   ([unique AST signature])
>>>>
>>>>   .debug_info
>>>>     DW_TAG_compile_unit
>>>>       DW_TAG_variable
>>>>         DW_AT_name "x"
>>>>         DW_AT_type (DW_FORM_ref_sig8) ([hash for MyStruct])
>>>>
>>>> In this example it is clear that foo.o imported MyModule, because its DWO skeleton is there in the same object file. But if we deal with the result of an LTO compilation, we will end up with many compile units in the same .debug_info section, plus a bunch of skeleton compile units for _all_ imported modules in the entire project. We thus lose the ability to determine which of the compile units imported which module.
>>>>
>>>> Why would we need to know which CU imported which modules? (I can imagine some possible reasons, but I'm wondering what you have in mind.)
>>>
>>> When the debugger is stopped at a breakpoint and the user wants to evaluate an expression, it should import the modules that are available at this location, so the user can write the expression from within the context of the breakpoint (e.g., without having to fully qualify each type, etc.).
>>>
>>> I'm not sure how much current debuggers actually worry about that (& this may differ from lldb to gdb to other things, of course).
>>> I'm pretty sure that, at least for GDB, a context in one CU is as good as one in another (at least without split-dwarf, type units, etc. - with those, things sometimes end up overly restrictive, as the debugger won't search everything properly).
>>>
>>> E.g.: if you have a.cpp: int main() { } and b.cpp: void func() { }, and you run 'start' in gdb (which breaks at the beginning of main), you can still run 'p func()' to call func, even though there's no declaration of it in a.cpp, etc.
>>
>> LLDB would definitely care (since it uses clang for the expression evaluation, supporting these kinds of features is really straightforward there). By importing the modules (rather than searching through the DWARF), the expression evaluator gains access to additional declarations that are not there in the DWARF, such as templates. But since clang modules are not namespaces, we can't generally "import the world" as a debugger would usually do.
>>
>> Sorry, not sure I understand this last sentence - could you explain further?
>>
>> I imagine it would be rather limiting for the user if, from this file, they could only use expressions that are valid in this file - it wouldn't be uncommon to want to call a function from another module/file/etc. to aid in debugging.
>
> Usually LLDB's expression evaluator works by creating a clang AST type out of a DWARF type and inserting it into its AST context. We could pre-populate it with the definitions from the imported modules (with all sorts of benefits as described above), but that only works if no two modules conflict. If the declaration can't be found in any imported module, LLDB would still import it from DWARF in the "traditional" fashion.
>
> But it would import it from DWARF in other TUs rather than use the module info, just because the module wasn't directly referenced from this TU? That would seem strange to me. (It sounds like you would lose debug info fidelity unnecessarily, by falling back to DWARF even though there are modules with the full-fidelity info.)
I think it's reasonable to expect full fidelity for everything that is available in the current TU, and to have the normal DWARF-based debugging capabilities for everything beyond that. But we can only ever provide full fidelity if we have the list of imports for the current TU.

> Would it be reasonable to use the accelerator table/index to look up the types, and then, if the type is in the module, use the module rather than the DWARF stashed alongside it? (So the comdat'd split-dwarf skeleton CU for the module would have an index to tell you what names are inside it, but if you got an index hit you'd just look at the module instead of loading the split-dwarf debug info in the referenced file.)

I don't think this approach would work for templates and enumerator values; they aren't in the accelerator tables to begin with. It would also be slower if the declaration is available in a module.

-- adrian

> - David
>
> -- adrian
>>
>> -- adrian
>>>>
>>>> I think it really is necessary to put the info about the module imported into the compile unit that imported it. Or is there a way to do this using the fission capabilities that I'm not aware of?
>>>>
>>>> -- adrian
>>>>
>>>> > [rambling a bit more along those lines:
>>>> > This would work fine in the case of the module (now an object file) containing all the static debug info.
>>>> > The future step, when we put IR/object code in a module to be linked into the final binary, we could put the skeleton CU in that object file that's being linked in (then we wouldn't need to COMDAT it) or, optionally, link in the debug info itself (skipping the indirection through the external file) if a standalone debug-info executable was desired.]
>>>>
>>>> > On Tue, Jan 20, 2015 at 9:39 AM, Adrian Prantl <[email protected]> wrote:
>>>> > As a complementary part of the module debugging story, here is a proposal to list the imported modules in the debug info. This patch is not about efficiency, but rather enables a cool debugging feature:
>>>> >
>>>> > Record the clang modules imported by the current compile unit in the debug info. This allows a module-aware debugger (such as LLDB) to @import all modules visible in the current context before evaluating an expression, thus making available all declarations in the current context (that originate from a module) and not just the ones that were actually used by the program.
>>>> >
>>>> > This implementation uses existing DWARF mechanisms as much as possible by emitting a DW_TAG_imported_module that references a DW_TAG_module, which contains the information necessary for the debugger to rebuild the module. This is similar to how C++ using declarations are encoded in DWARF, with the difference that we're importing a module instead of a namespace.
>>>> > The information stored for a module includes the umbrella directory, any config macros passed in via the command line that affect the module, and the filename of the raw .pcm file. Why include all these parameters when we have the .pcm file? Apart from module cache volatility, there is no guarantee that the debugger was linked against the same version of clang that generated the .pcm, so it may need to regenerate the module while importing it.
>>>> >
>>>> > Let me know what you think!
>>>> > -- adrian
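
For illustration only, the encoding proposed above could look roughly like the sketch below, written in the same notation as the foo.o example earlier in the thread. The attribute names DW_AT_LLVM_include_path, DW_AT_LLVM_config_macros, and DW_AT_LLVM_pcm_file, as well as the paths and macros, are placeholders invented for this sketch rather than taken from the actual patch; the point is simply that the DW_TAG_imported_module sits inside the compile unit that did the importing:

  bar.o:
    .debug_info
      DW_TAG_compile_unit                  // the CU that imported the module
        DW_TAG_module
          DW_AT_name               ("MyModule")
          DW_AT_LLVM_include_path  ("/path/to/MyModule")   // umbrella directory
          DW_AT_LLVM_config_macros ("-DFOO")               // command-line macros that affect the module
          DW_AT_LLVM_pcm_file      ("/path/to/module-cache/MyModule.pcm")
        DW_TAG_imported_module
          DW_AT_import             (reference to the DW_TAG_module above)

This mirrors how a C++ using directive is encoded (a DW_TAG_imported_module under its enclosing scope), and because the import is a child of the importing CU, the per-CU list of imports survives an LTO link that merges many compile units and module skeletons into a single .debug_info section.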
