> On Aug 23, 2017, at 2:06 PM, Jan Kratochvil via lldb-dev 
> <lldb-dev@lists.llvm.org> wrote:
> 
> Hello,
> 
> at least Fedora Linux distribution uses DWZ to reduce DWARF debug info size:
>       https://fedoraproject.org/wiki/Features/DwarfCompressor
> 
> It is DWARF optimization - not really a compression.  One can find it by
> DW_TAG_partial_unit/DW_TAG_imported_unit:
> <0><b>: Abbrev Number: 103 (DW_TAG_partial_unit)
>    <c>   DW_AT_stmt_list   : 0x0
>    <10>   DW_AT_comp_dir    : (alt indirect string, offset: 0xe61c)
> <1><14>: Abbrev Number: 45 (DW_TAG_imported_unit)
>    <15>   DW_AT_import      : <alt 0xb>
> ...
> 
> I have already made some attempt for its implementation:
>       https://people.redhat.com/jkratoch/lldb-2017-08-13.patch
>       https://people.redhat.com/jkratoch/lldb-2017-08-13imp.patch
> 
> But I found that approach as a dead-end because LLDB expects all the DIEs from
> a CU really belong to the same DWARFCompileUnit.  Contrary to LLDB
> expectations the patch above creates different DWARFCompileUnit for each
> DW_TAG_partial_unit.  This the patch solves for cross-DIE references but then
> it ends up with:
>       tools/clang/lib/AST/DeclBase.cpp:75:
>       Assertion `!Parent || &Parent->getParentASTContext() == &Ctx' failed.
> as GetCompUnitForDWARFCompUnit() is returning different clang context for DIEs
> from DWZ supplementary files (different SymbolFileDWARF) vs. base file CUs.
> I tried to generate alternative user_id_t to always refer to originating CU in
> the base file but it is more and more complicated.
> 
> Therefore I would like a new approach to keep all the DIEs from
> a DW_TAG_compile_unit incl. all its imported DW_TAG_partial_unit in the same
> DWARFCompileUnit.  So far I wanted to prevent expansion/copy of all
> DW_TAG_partial_unit m_die_array data into each of its parent
> DW_TAG_compile_unit as it may be a performance hit.
> 
> But then I am not sure whether it is worth it - when LLDB does fully populate
> m_die_array?

It expands them in:

size_t DWARFCompileUnit::ExtractDIEsIfNeeded(bool cu_die_only);

It will either only make the top level CU DIE, or it will parse all DIEs.
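
As a rough illustration of that two-level behavior, here is a toy model. ToyCompileUnit and ToyDIE are made-up names, not LLDB classes, and returning the count of newly parsed DIEs is just an assumption of this sketch; the real function works on the raw .debug_info data.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed DIE (not LLDB's DWARFDebugInfoEntry).
struct ToyDIE {
  std::string tag;
};

// Hypothetical stand-in for DWARFCompileUnit.
class ToyCompileUnit {
public:
  explicit ToyCompileUnit(std::vector<std::string> raw_tags)
      : m_raw_tags(std::move(raw_tags)) {}

  // Parses only the top level CU DIE when cu_die_only is true, otherwise
  // parses every DIE. Already parsed DIEs are never parsed again.
  // Returns how many DIEs this call added to m_die_array.
  size_t ExtractDIEsIfNeeded(bool cu_die_only) {
    const size_t wanted =
        cu_die_only ? std::min<size_t>(1, m_raw_tags.size()) : m_raw_tags.size();
    const size_t before = m_die_array.size();
    for (size_t i = before; i < wanted; ++i)
      m_die_array.push_back(ToyDIE{m_raw_tags[i]});
    return m_die_array.size() - before;
  }

  size_t NumParsedDIEs() const { return m_die_array.size(); }

private:
  std::vector<std::string> m_raw_tags; // pretend this is .debug_info
  std::vector<ToyDIE> m_die_array;     // filled lazily
};
```

In this model, asking for the CU DIE only and later asking for everything parses just the remaining DIEs on the second call.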

>  Currently it always has to as on non-OSX platforms it is using
> DWARFCompileUnit::Index(). But as I plan to implement DWARF-5 .debug_names
> index (like __apple_* index) maybe LLDB then no longer needs to populate
> m_die_array and so just expanding all DW_TAG_partial_unit into a single
> m_die_array for each DW_TAG_compile_unit is fine?

So I skimmed the documentation and gathered that DWARF type info might 
be stored in other DWARF files and referenced from the current file.

SymbolFileDWARFDebugMap is an example of how we do things on macOS. We have one 
clang::ASTContext in the SymbolFileDWARFDebugMap, and multiple external .o 
files (where each is a SymbolFileDWARF instance) that contain unlinked DWARF. 
Each SymbolFileDWARF instance will have:

  void SymbolFileDWARF::SetDebugMapModule(const lldb::ModuleSP &module_sp);

called to indicate it is actually part of the SymbolFileDWARFDebugMap. Then 
there are functions that check the debug map file and return the 
UniqueDWARFASTTypeMap or the TypeSystem from the SymbolFileDWARFDebugMap if we 
have one:


UniqueDWARFASTTypeMap &SymbolFileDWARF::GetUniqueDWARFASTTypeMap() {
  SymbolFileDWARFDebugMap *debug_map_symfile = GetDebugMapSymfile();
  if (debug_map_symfile)
    return debug_map_symfile->GetUniqueDWARFASTTypeMap();
  else
    return m_unique_ast_type_map;
}

TypeSystem *SymbolFileDWARF::GetTypeSystemForLanguage(LanguageType language) {
  SymbolFileDWARFDebugMap *debug_map_symfile = GetDebugMapSymfile();
  TypeSystem *type_system;
  if (debug_map_symfile) {
    type_system = debug_map_symfile->GetTypeSystemForLanguage(language);
  } else {
    type_system = m_obj_file->GetModule()->GetTypeSystemForLanguage(language);
    if (type_system)
      type_system->SetSymbolFile(this);
  }
  return type_system;
}

This allows one master DWARF file to use a bunch of other DWARF files to make 
one cohesive debug info. That is one approach you could use, but you would need 
to not affect the SymbolFileDWARFDebugMap with any changes you made.

One other idea is to keep all DWARF files separate and stand alone. Your main 
DWARF file with one or more DW_TAG_imported_unit, and every file referenced by 
a DW_TAG_imported_unit, would each be its own SymbolFileDWARF. Any reference to 
a DW_FORM_ref_alt would turn into a forward declaration in the current 
SymbolFileDWARF, so the ASTContext in each SymbolFileDWARF wouldn't know 
anything about the types, but we would need to add the DW_TAG_imported_unit 
object files to the target, and _they_ would know about any types they own. 
This way you could have multiple libraries that are the main top level DWARF 
files refer to a bunch of common DW_TAG_imported_unit files with type info, and 
those files would only be loaded once. We would rely on LLDB being able to 
track down the forward declared types later when the variables need to get 
displayed. We already have logic to do that.
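
To make the forward declaration idea concrete, here is a minimal sketch. All names (ToyTypeDecl, ToyModuleTypes, FindCompleteType) are hypothetical and this is not how LLDB's lookup is actually implemented; it only shows one file holding an incomplete type while another file in the target owns the full definition.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical type record: a forward declaration has is_complete == false.
struct ToyTypeDecl {
  std::string name;
  bool is_complete;
  unsigned byte_size;
};

// Hypothetical stand-in for the types known to one SymbolFileDWARF.
class ToyModuleTypes {
public:
  void AddType(const ToyTypeDecl &decl) { m_types[decl.name] = decl; }

  // Returns the type with this name, or nullptr if this file has none.
  const ToyTypeDecl *Find(const std::string &name) const {
    auto pos = m_types.find(name);
    return pos == m_types.end() ? nullptr : &pos->second;
  }

private:
  std::map<std::string, ToyTypeDecl> m_types;
};

// Searches every file in the "target" for a complete definition of name.
// Forward declarations are skipped. Returns nullptr if none is found.
const ToyTypeDecl *
FindCompleteType(const std::vector<const ToyModuleTypes *> &target_files,
                 const std::string &name) {
  for (const ToyModuleTypes *file : target_files) {
    const ToyTypeDecl *decl = file->Find(name);
    if (decl && decl->is_complete)
      return decl;
  }
  return nullptr;
}
```

The main file would only ever see the incomplete "Foo", and the complete one is found in the shared file when a variable of that type has to be displayed.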

The other approach I might suggest is to write a DWARF linker, maybe using 
LLVM's DWARF classes (see llvm-dsymutil sources) that takes the top level DWARF 
and all DW_TAG_imported_unit files and combines them all back into one large 
DWARF file. Then debugging will just work. You get the type uniquing for all 
compile units in the current top level DWARF file. This won't help you if you 
are looking to share the debug info between multiple shared libraries.

One other idea is to let each DWARF file be separate, and when you need a type 
from a DW_TAG_imported_unit you load that file as stand alone and copy the type 
from its clang::ASTContext into the main SymbolFileDWARF's AST context. We copy 
types all the time in expressions as each one has its own AST context.
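
A toy version of copying a type between contexts might look like the following. ToyASTContext, ToyClangType and ImportType are made-up names for this sketch and have nothing to do with the real clang importer API; the point is only that the destination context ends up owning its own copy and that repeated imports reuse it.

```cpp
#include <map>
#include <string>

// Hypothetical type record owned by exactly one context.
struct ToyClangType {
  std::string name;
  unsigned byte_size;
};

// Hypothetical stand-in for a clang::ASTContext that owns its types.
class ToyASTContext {
public:
  // Creates the type in this context, or returns the one already there.
  const ToyClangType *Define(const std::string &name, unsigned byte_size) {
    auto result = m_types.emplace(name, ToyClangType{name, byte_size});
    return &result.first->second;
  }

  // Returns the type owned by this context, or nullptr if it has none.
  const ToyClangType *Lookup(const std::string &name) const {
    auto pos = m_types.find(name);
    return pos == m_types.end() ? nullptr : &pos->second;
  }

private:
  // std::map nodes are stable, so returned pointers stay valid.
  std::map<std::string, ToyClangType> m_types;
};

// Copies src_type into dst unless dst already has a type with that name.
// The returned pointer always refers to a type owned by dst.
const ToyClangType *ImportType(ToyASTContext &dst,
                               const ToyClangType &src_type) {
  if (const ToyClangType *existing = dst.Lookup(src_type.name))
    return existing;
  return dst.Define(src_type.name, src_type.byte_size);
}
```

Here the source context would belong to the DW_TAG_imported_unit file and dst to the main SymbolFileDWARF, so parent and child declarations always live in the same context.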

So there are many solutions. I would vote for linking the DWARF into a single 
file, much like we do with llvm-dsymutil on Mac, but that really depends on 
whether type uniquing is desired within a single DWARF file rather than across 
many shared libraries that all reference common DW_TAG_imported_unit files.

Greg

