On Tue, Feb 24, 2015 at 2:56 PM, Adrian Prantl <[email protected]> wrote:
> On Feb 24, 2015, at 2:36 PM, David Blaikie <[email protected]> wrote:
>
> On Mon, Feb 23, 2015 at 3:45 PM, Adrian Prantl <[email protected]> wrote:
>
>> On Feb 23, 2015, at 3:37 PM, David Blaikie <[email protected]> wrote:
>>
>> On Mon, Feb 23, 2015 at 3:32 PM, Adrian Prantl <[email protected]> wrote:
>>
>>> On Feb 23, 2015, at 3:14 PM, David Blaikie <[email protected]> wrote:
>>>
>>> On Mon, Feb 23, 2015 at 3:08 PM, Adrian Prantl <[email protected]> wrote:
>>>
>>>> On Feb 23, 2015, at 2:59 PM, David Blaikie <[email protected]> wrote:
>>>>
>>>> On Mon, Feb 23, 2015 at 2:51 PM, Adrian Prantl <[email protected]> wrote:
>>>>
>>>>> > On Jan 20, 2015, at 11:07 AM, David Blaikie <[email protected]> wrote:
>>>>> >
>>>>> > My vague recollection from the previous design discussions was that
>>>>> > these module references would be their own 'unit' COMDAT'd so that we
>>>>> > don't end up with the duplication of every module reference in every
>>>>> > unit linked together when linking debug info?
>>>>> >
>>>>> > I think in my brain I'd been picturing this module reference as
>>>>> > being an extended fission reference (fission skeleton CU + extra
>>>>> > fields for users who want to load the Clang AST module directly and
>>>>> > skip the split CU).
>>>>>
>>>>> Apologies for letting this rest for so long.
>>>>>
>>>>> Your memory was of course correct and I didn’t follow up on this
>>>>> because I had convinced myself that the fission reference would be
>>>>> completely sufficient. Now that I’ve been thinking some more about it, I
>>>>> don’t think that it is sufficient in the LTO case.
>>>>>
>>>>> Here is the example from
>>>>> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html:
>>>>>
>>>>> foo.o:
>>>>>   .debug_info.dwo
>>>>>     DW_TAG_compile_unit
>>>>>       // For DWARF consumers
>>>>>       DW_AT_dwo_name ("/path/to/module-cache/MyModule.pcm")
>>>>>       DW_AT_dwo_id ([unique AST signature])
>>>>>
>>>>>   .debug_info
>>>>>     DW_TAG_compile_unit
>>>>>       DW_TAG_variable
>>>>>         DW_AT_name "x"
>>>>>         DW_AT_type (DW_FORM_ref_sig8) ([hash for MyStruct])
>>>>>
>>>>> In this example it is clear that foo.o imported MyModule because its
>>>>> DWO skeleton is there in the same object file. But if we deal with the
>>>>> result of an LTO compilation we will end up with many compile units in the
>>>>> same .debug_info section, plus a bunch of skeleton compile units for _all_
>>>>> imported modules in the entire project. We thus lose the ability to
>>>>> determine which of the compile units imported which module.
>>>>
>>>> Why would we need to know which CU imported which modules? (I can
>>>> imagine some possible reasons, but wondering what you have in mind)
>>>>
>>>> When the debugger is stopped at a breakpoint and the user wants to
>>>> evaluate an expression, it should import the modules that are available at
>>>> this location, so the user can write the expression from within the context
>>>> of the breakpoint (e.g., without having to fully qualify each type, etc).
>>>
>>> I'm not sure how much current debuggers actually worry about that - (&
>>> this may differ from lldb to gdb to other things, of course). I'm pretty
>>> sure at least for GDB, a context in one CU is as good as one in another (at
>>> least without split-dwarf, type units, etc - with those sometimes things
>>> end up overly restrictive as the debugger won't search everything properly).
>>>
>>> eg: if you have a.cpp: int main() { }, b.cpp: void func() { } and you
>>> run 'start' in gdb (which breaks at the beginning of main) you can still
>>> run 'p func()' to call the func, even though there's no declaration of it
>>> in a.cpp, etc.
>>>
>>> LLDB would definitely care (as it is using clang for the expression
>>> evaluation, supporting these kinds of features is really straightforward
>>> there). By importing the modules (rather than searching through the DWARF),
>>> the expression evaluator gains access to additional declarations that are
>>> not there in the DWARF, such as templates. But since clang modules are not
>>> namespaces, we can’t generally “import the world” as a debugger would
>>> usually do.
>>
>> Sorry, not sure I understand this last sentence - could you explain
>> further?
>>
>> I imagine it would be rather limiting for the user if they could only use
>> expressions that are valid in this file from the file - it wouldn't be
>> uncommon to want to call a function from another module/file/etc to aid in
>> debugging.
>>
>> Usually LLDB’s expression evaluator works by creating a clang AST type
>> out of a DWARF type and inserting it into its AST context. We could
>> pre-populate it with the definitions from the imported modules (with all
>> sorts of benefits as described above), but that only works if no two
>> modules conflict. If the declaration can’t be found in any imported module,
>> LLDB would still import it from DWARF in the “traditional” fashion.
>
> But it would import it from DWARF in other TUs rather than use the module
> info just because the module wasn't directly referenced from this TU? That
> would seem strange to me.
> (you would lose debug info fidelity (by falling back to DWARF even though
> there are modules with the full fidelity info) unnecessarily, it sounds
> like)
>
> I think it’s reasonable to expect full fidelity for everything that is
> available in the current TU, and having the normal DWARF-based debugging
> capabilities for everything beyond that. But we can only ever provide full
> fidelity if we have the list of imports for the current TU.
>
> Would it be reasonable to use the accelerator table/index to look up the
> types, then if the type is in the module you could use the module rather
> than the DWARF stashed alongside it? (so the comdat'd split-dwarf skeleton
> CU for the module would have an index to tell you what names are inside it,
> but if you got an index hit you'd just look at the module instead of
> loading the split-dwarf debug info in the referenced file)
>
> I don’t think this approach would work for templates and enumerator values;

Not sure why enumerator values are an issue - but templates (& all manner of
other things that don't make it into the index, unfortunately), sure.

> they aren’t in the accelerator tables to begin with. It would also be
> slower if the declaration is available in a module.

Though you're rapidly going to end up loading a lot of modules in (as you go
up & down a stack printing various things you'll cross into other TUs & load
more modules).

For a standard DWARF consumer, it seems fine to just have a comdat'd skeleton
CU for a module without the need for other CUs to mention which module CUs
they reference (but I could be wrong here) & that's the design we originally
discussed.
It would seem unfortunate to bloat every CU with a non-deduplicable list of
every module it references, but if that's necessary for a serialized AST
aware debugger, it might be fine to have it as an option (so long as it can
be turned off) & may still benefit from that list not being the authoritative
module reference, but a /very/ terse reference to it so all the extra flags &
stuff can be in the deduplicable comdat (& to keep it as consistent as
possible between the flag (on/off) codepaths for this extra data). Maybe a
FORM_block (?) of fixed-size hashes of all the modules back-to-back, so it's
as small as possible?

But I wouldn't mind spending some more time discussing whether there's a
better way to keep these things streamlined/symmetric/the same between
modular and non-modular debug info.

- David

> -- adrian
>
> - David
>
>> -- adrian
>>
>>> -- adrian
>>>
>>>>> I think it really is necessary to put the info about the module
>>>>> imported into the compile unit that imported it. Or is there a way to do
>>>>> this using the fission capabilities that I’m not aware of?
>>>>>
>>>>> -- adrian
>>>>>
>>>>> > [rambling a bit more along those lines:
>>>>> > This would work fine in the case of the module (now an object file)
>>>>> > containing all the static debug info
>>>>> > The future step, when we put IR/object code in a module to be linked
>>>>> > into the final binary, we could put the skeleton CU in that object
>>>>> > file that's being linked in (then we wouldn't need to COMDAT it) or,
>>>>> > optionally, link in the debug info itself (skipping the indirection
>>>>> > through the external file) if a standalone debug info executable was
>>>>> > desired]
>>>>> >
>>>>> > On Tue, Jan 20, 2015 at 9:39 AM, Adrian Prantl <[email protected]> wrote:
>>>>> > As a complementary part of the module debugging story, here is a
>>>>> > proposal to list the imported modules in the debug info.
>>>>> > This patch is not about efficiency, but rather enables a cool
>>>>> > debugging feature:
>>>>> >
>>>>> > Record the clang modules imported by the current compile unit in the
>>>>> > debug info. This allows a module-aware debugger (such as LLDB) to
>>>>> > @import all modules visible in the current context before evaluating
>>>>> > an expression, thus making available all declarations in the current
>>>>> > context (that originate from a module) and not just the ones that were
>>>>> > actually used by the program.
>>>>> >
>>>>> > This implementation uses existing DWARF mechanisms as much as
>>>>> > possible by emitting a DW_TAG_imported_module that references a
>>>>> > DW_TAG_module, which contains the information necessary for the
>>>>> > debugger to rebuild the module. This is similar to how C++ using
>>>>> > declarations are encoded in DWARF, with the difference that we're
>>>>> > importing a module instead of a namespace.
>>>>> > The information stored for a module includes the umbrella directory,
>>>>> > any config macros passed in via the command line that affect the
>>>>> > module, and the filename of the raw .pcm file. Why include all these
>>>>> > parameters when we have the .pcm file? Apart from module cache
>>>>> > volatility, there is no guarantee that the debugger was linked against
>>>>> > the same version of clang that generated the .pcm, so it may need to
>>>>> > regenerate the module while importing it.
>>>>> >
>>>>> > Let me know what you think!
>>>>> > -- adrian
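
To make the "FORM_block of fixed-size hashes back-to-back" idea from this reply a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not anything proposed or implemented in the thread: it assumes each module is identified by a 64-bit signature (the same width as DW_AT_dwo_id), picks little-endian byte order arbitrarily, and models the deduplicated COMDAT'd skeleton CUs as a plain dictionary. The function names, the file names foo.c and bar.c, the second module Other.pcm, and the id values are all made up for the example.

```python
import struct

# Assumption: one module reference is a 64-bit id, the same width as
# DW_AT_dwo_id. The byte order and every name here are illustrative only.
ID_FORMAT = "<Q"
ID_SIZE = struct.calcsize(ID_FORMAT)  # 8 bytes per module reference


def encode_module_refs(module_ids):
    """Pack module ids back to back into one opaque byte block."""
    return b"".join(struct.pack(ID_FORMAT, module_id) for module_id in module_ids)


def decode_module_refs(block):
    """Split a byte block back into its list of 64-bit module ids."""
    if len(block) % ID_SIZE != 0:
        raise ValueError("block length is not a multiple of the id size")
    return [
        struct.unpack_from(ID_FORMAT, block, offset)[0]
        for offset in range(0, len(block), ID_SIZE)
    ]


# Stand-in for the deduplicated per-module skeleton CUs: one record per
# module in the linked output, keyed by id. The ids are hypothetical.
skeleton_cus = {
    0x1111111111111111: {"dwo_name": "/path/to/module-cache/MyModule.pcm"},
    0x2222222222222222: {"dwo_name": "/path/to/module-cache/Other.pcm"},
}

# After LTO, many compile units share one .debug_info section. In this
# sketch each CU carries only its own terse block of module ids.
compile_units = {
    "foo.c": encode_module_refs([0x1111111111111111]),
    "bar.c": encode_module_refs([0x1111111111111111, 0x2222222222222222]),
}


def modules_imported_by(cu_name):
    """Resolve a CU's id block against the deduplicated module records."""
    ids = decode_module_refs(compile_units[cu_name])
    return [skeleton_cus[module_id]["dwo_name"] for module_id in ids]


if __name__ == "__main__":
    for name in compile_units:
        print(name, modules_imported_by(name))
```

With this layout the per-CU cost is 8 bytes per imported module, while paths, flags and other details stay in the single deduplicated record for each module. A consumer that does not care about imports can ignore the block entirely.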
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
