Re: [lldb-dev] Resolving dynamic type based on RTTI fails in case of type names inequality in DWARF and mangled symbols

Anton Gorenkov via lldb-dev Tue, 19 Dec 2017 12:33:18 -0800

Tamas, Greg, thank you. I got the idea of how it should work without accelerator tables, but I still cannot figure out how to use/update the existing accelerator tables. So let me walk through it once again:
1. It is necessary to perform the lookup by mangled name (as all we initially have is the mangled "vtable for ClassName" symbol).
2. All the existing Apple accelerator tables (e.g. apple_types) have demangled and unqualified names as keys.
3. It is not always possible to get the original demangled type name from the mangled one (e.g. for templates parametrized with enums the demangled one is Impl<(TagType)0> vs. the original Impl<TagType::Tag1>, but there are more complex cases).

Thus, I don't see how adding DW_AT_linkage_name to the vtable member of a class (or even to the class itself) could help, as it still won't be possible to resolve a DIE by the mangled type name. However, possible solutions are:
1. Generate a separate accelerator table: mangled name of the vtable member of a class => DIE;
2. Build an index on startup by iterating through apple_types and gathering a map: mangled name => DIE.
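
For illustration, the "mangled name => DIE" map from option 2 could be sketched as below. This is only a sketch under assumptions, not LLDB code: `TypeEntry`, `BuildMangledIndex` and `FindDie` are hypothetical names, and the sketch assumes each type entry already carries the mangled vtable symbol name, which is exactly the piece of data under discussion here. The sample symbol in the comment is patterned on the one quoted in this thread, with the Itanium `_ZTV` (vtable) prefix.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record for one type seen while walking a type table.
struct TypeEntry {
  std::string vtable_mangled_name; // e.g. "_ZTV4ImplIL7TagType0EE"; empty if the type has no vtable
  std::uint64_t die_offset;        // offset of the type DIE in the debug info
};

using MangledIndex = std::unordered_map<std::string, std::uint64_t>;

// Build the "mangled vtable name => DIE offset" map once, at startup.
MangledIndex BuildMangledIndex(const std::vector<TypeEntry> &entries) {
  MangledIndex index;
  for (const TypeEntry &entry : entries) {
    if (!entry.vtable_mangled_name.empty())
      index.emplace(entry.vtable_mangled_name, entry.die_offset);
  }
  return index;
}

// Direct hash lookup by mangled name; returns false when nothing is indexed.
bool FindDie(const MangledIndex &index, const std::string &mangled,
             std::uint64_t &die_offset) {
  const auto it = index.find(mangled);
  if (it == index.end())
    return false;
  die_offset = it->second;
  return true;
}
```

The cost is the one raised earlier in the thread: filling this map means visiting every type entry before the first lookup.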


Greg, did you mean some of these or something else?

Thanks,
Anton.

19.12.2017 19:39, Greg Clayton wrote:

I agree with Tamas. The right way to do this is to add the DW_AT_linkage_name to the class. Apple accelerator tables have many different forms, but one is a mapping of type name to exact DIE offset (in the __DWARF segment in the __apple_types section). If the mangled name was added to the class, then the Apple accelerator tables would have it. So when a lookup happens with these tables around, we do a very quick hash lookup, and we find the exact DIE (or DIEs) we need. Entries for classes in the Apple accelerator tables have both the mangled and raw class name as entries pointing to the same DIE since lookups don't usually happen via mangled names. LLDB also knows how to pull names apart and search correctly, so if someone tries to look up a type with "a::b::MyClass", we will chop that up into "MyClass" and do a lookup on that. We might get many, many different "MyClass" results back (a::c::MyClass, ::MyClass, b::MyClass), but then we cull those down by making sure any matches have a matching decl context of "a::b::". For mangled names, it is easy and just a direct lookup.
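
The basename lookup followed by decl-context culling can be sketched roughly as follows. This is a simplified illustration, not the actual LLDB implementation: `TypeDie`, `TypeTable`, `SplitQualifiedName` and `FindTypes` are hypothetical names, and the splitting is deliberately naive (it does not handle "::" inside template arguments).

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical table entry: one DIE plus the scope it was declared in.
struct TypeDie {
  std::string decl_context; // "a::b::" for a::b::MyClass, "" for a global type
  std::uint64_t die_offset;
};

// Keyed by the unqualified name, like the type table described above.
using TypeTable = std::unordered_multimap<std::string, TypeDie>;

// Split "a::b::MyClass" into the context "a::b::" and the basename "MyClass".
void SplitQualifiedName(const std::string &qualified, std::string &context,
                        std::string &basename) {
  const std::string::size_type pos = qualified.rfind("::");
  if (pos == std::string::npos) {
    context.clear();
    basename = qualified;
  } else {
    context = qualified.substr(0, pos + 2);
    basename = qualified.substr(pos + 2);
  }
}

// Hash lookup on the basename, then keep only matching decl contexts.
std::vector<std::uint64_t> FindTypes(const TypeTable &table,
                                     const std::string &qualified) {
  std::string context, basename;
  SplitQualifiedName(qualified, context, basename);
  std::vector<std::uint64_t> matches;
  const auto range = table.equal_range(basename);
  for (auto it = range.first; it != range.second; ++it) {
    if (it->second.decl_context == context)
      matches.push_back(it->second.die_offset);
  }
  return matches;
}
```

With a table holding a::b::MyClass, a::c::MyClass and a global MyClass, a query for "a::b::MyClass" visits all three candidates but returns only the first.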
The Apple accelerator tables are only enabled for Darwin targets, but there is nothing to say we couldn't enable these for other targets in ELF files. It would be a quick way to gauge the performance improvement that these accelerator tables provide for Linux. Currently Linux will completely index the DWARF, but it will load the DWARF, index it, and unload the DWARF so we don't hog memory for things we don't need loaded yet. We must manually index the DWARF because the DWARF accelerator tables are really not accelerator tables; they are random indexes of related data (names in no particular order, addresses in no particular order). These tables are also not complete, so no debugger can rely on them. For example, ".debug_pubtypes" is for "public" types only. ".debug_pubnames" is a random name table with only public functions (no static functions or functions in anonymous namespaces). So the DWARF accelerator tables can't be used by debuggers.
There is now a modified version of the Apple accelerator tables in the DWARF standard that can provide the same data as the Apple versions, but I don't believe anyone has added this support to any compilers yet. So for simplicity, we can try things out with the Apple accelerator tables and see how things go.
Another solution involves using llvm-dsymutil, a DWARF linker that is used on Apple platforms. It is a tool that is normally run on executables where the DWARF is left in the .o files and linked later into final DWARF files. This tool also has a "--update" option that takes a linked dSYM file and updates the accelerator tables in case they change over time, or in case an older version of llvm-dsymutil didn't add everything that was needed to the tables due to a bug. So another way we can try this out is to modify llvm-dsymutil to work with ELF files and have it generate and add the Apple accelerator tables to the ELF files. This is nice because it allows us to use DWARF that is generated by any compiler (no need for the compiler to support making the accelerator tables). This would be a great way to try out the accelerator tables without requiring compiler changes.
The short term solution is to validate that the Apple accelerator tables work and do speed debugging up by a large amount. The long term solution is to have clang start emitting the new DWARF accelerator tables and modify LLDB to support and use those tables.
Let me know if there are any questions on any of this.

Greg Clayton
On Dec 19, 2017, at 5:35 AM, Tamas Berghammer via lldb-dev <[email protected] <mailto:[email protected]>> wrote:
Hi,
I thought most compilers still emit DW_AT_MIPS_linkage_name instead of the standard DW_AT_linkage_name, but I agree that if we can, we should use the standard one.
Regarding performance, we have 2 different scenarios. On Apple platforms we have the Apple accelerator tables to improve load time (might work on FreeBSD as well), while on other platforms we index the DWARF data (DWARFCompileUnit::Index) to effectively generate accelerator tables in memory, which is a faster process than fully parsing the DWARF (currently we only parse function DIEs and we don't build the clang types). I think an ideal solution would be to have the vtable name stored in DWARF so the DWARF data is standalone, and then have some accelerator tables to be able to do fast lookup from mangled symbol name to DIE offset. I am not too familiar with the Apple accelerator tables, but if we have anything that maps from mangled name to DIE offset then we can add a few entries to it to map from mangled vtable name to type DIE or vtable DIE.
Tamas
On Mon, Dec 18, 2017 at 9:02 PM xgsa <[email protected] <mailto:[email protected]>> wrote:
    Hi Tamas,
    First, why DW_AT_MIPS_linkage_name, but not just
    DW_AT_linkage_name? The latter is standardized and currently
    generated by clang, at least on x64.
    Second, this doesn't help to solve the issue, because this will
    require parsing all the DWARF types during startup to build a map,
    which breaks the lazy DWARF loading performed by lldb. Or am I
    missing something?
    Thanks,
    Anton.
    18.12.2017, 22:59, "Tamas Berghammer" <[email protected]
    <mailto:[email protected]>>:
    Hi Anton and Jim,

    What do you think about storing the mangled type name or the
    mangled vtable symbol name somewhere in DWARF in the
    DW_AT_MIPS_linkage_name attribute? We are already doing it for
    the mangled names of functions so extending it to types
    shouldn't be too controversial.

    Tamas

    On Mon, 18 Dec 2017, 17:29 xgsa via lldb-dev,
    <[email protected] <mailto:[email protected]>> wrote:

        Thank you for clarification, Jim, you are right, I
        misunderstood a little bit what lldb actually does.

        It is not that the compiler can't be fixed, it's about the
        fact that relying on the correspondence of mangled and demangled
        forms is not reliable enough, so we are looking for more
        robust alternatives. Moreover, I am not sure that such fuzzy
        matching could be done based just on the class name, so it will
        require reading more DIEs. Taking into account that, for
        instance, in our project there are quite a lot of such types, it
        could noticeably slow down the debugger.

        Thus, I'd like to mention one more alternative and get your
        feedback, if possible. Actually, what is necessary is the
        correspondence between the mangled and demangled vtable symbol.
        Possibly, is it worth preparing a separate section during
        compilation (like e.g. apple_types), which would store this
        correspondence? It would work fast and be more reliable than
        the current approach, but it will certainly increase debug
        info size (however, I cannot estimate what the exact increase
        would be, e.g. in percent).

        What do you think? Which solution is preferable?

        Thanks,
        Anton.

        15.12.2017, 23:34, "Jim Ingham" <[email protected]
        <mailto:[email protected]>>:
        > First off, just a technical point. lldb doesn't use RTTI
        to find dynamic types, and in fact works for projects like
        lldb & clang that turn off RTTI. It just uses the fact that
        the vtable symbol for an object demangles to:
        >
        > vtable for CLASSNAME
        >
        > That's not terribly important, but I just wanted to make
        sure people didn't think lldb was doing something fancy with
        RTTI... Note, gdb does (or at least used to do) dynamic
        detection the same way.
        >
        > If the compiler can't be fixed, then it seems like your
        solution [2] is what we'll have to try.
        >
        > As it works now, we get the CLASSNAME from the vtable
        symbol and look it up in the list of types. That is
        pretty quick because the type names are indexed, so we can
        find it with a quick search in the index. Changing this over
        to a method where we do some additional string matching
        rather than just using the table's hashing is going to be a
        fair bit slower because you have to run over EVERY type
        name. But this might not be that bad. You would first look
        it up by exact CLASSNAME and only fall back on your fuzzy
        match if this fails, so most dynamic type lookups won't see
        any slowdown. And if you know the cases where you get into
        this problem you can probably further restrict when you need
        to do this work so you don't suffer this penalty for every
        lookup where we don't have debug info for the dynamic type.
        And you could keep a side-table of mangled-name -> DWARF
        name, and maybe a black-list for unfound names, so you only
        have to do this once.
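
The lookup order described above (exact name first, fuzzy match only as a fallback, with a side-table and a black-list so the expensive path runs once per name) might be sketched like this. It is a sketch under assumptions rather than lldb code: `DynamicTypeResolver` and its two callbacks are hypothetical stand-ins for the real type index and for whatever fuzzy matcher ends up being used.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical resolver from a demangled vtable class name to a DWARF type name.
class DynamicTypeResolver {
public:
  // A lookup returns true and fills the output name when a type is found.
  using Lookup = std::function<bool(const std::string &, std::string &)>;

  DynamicTypeResolver(Lookup exact, Lookup fuzzy)
      : exact_(std::move(exact)), fuzzy_(std::move(fuzzy)) {}

  bool Resolve(const std::string &class_name, std::string &dwarf_name) {
    // Fast path: hashed lookup by exact name, no slowdown for the common case.
    if (exact_(class_name, dwarf_name))
      return true;
    // Side-table of names already matched by the slow path.
    const auto cached = side_table_.find(class_name);
    if (cached != side_table_.end()) {
      dwarf_name = cached->second;
      return true;
    }
    // Black-list of names that are known not to match anything.
    if (not_found_.count(class_name) != 0)
      return false;
    // Slow path: has to consider every type name, so remember the outcome.
    if (fuzzy_(class_name, dwarf_name)) {
      side_table_[class_name] = dwarf_name;
      return true;
    }
    not_found_.insert(class_name);
    return false;
  }

private:
  Lookup exact_;
  Lookup fuzzy_;
  std::unordered_map<std::string, std::string> side_table_;
  std::unordered_set<std::string> not_found_;
};
```

Repeated queries for the same unmatched or fuzzily matched name then cost one hash probe instead of another full scan.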
        >
        > This estimation is based on the assumption that you can do
        your work just on the type names, without having to get more
        type information out of the DWARF for each candidate match.
        A solution that relies on realizing every class in lldb so
        you can get more information out of the type information to
        help with the match will defeat all our attempts at lazy
        DWARF reading. This can cause quite long delays in big
        programs. So I would be much more worried about a solution
        that requires this kind of work. Again, if you can reject
        most potential candidates by looking at the name, and only
        have to realize a few likely types, the approach might not
        be that slow.
        >
        > Jim
        >
        >>  On Dec 15, 2017, at 7:11 AM, xgsa via lldb-dev
        <[email protected] <mailto:[email protected]>>
        wrote:
        >>
        >>  Sorry, I probably shouldn't have used HTML for that
        message. Converted to plain text.
        >>
        >>  -------- Original message --------
        >>  15.12.2017, 18:01, "xgsa" <[email protected]
        <mailto:[email protected]>>:
        >>
        >>  Hi,
        >>
        >>  I am working on an issue where, in C++ programs, for some
        complex cases with templates, showing the dynamic type based on
        RTTI in lldb doesn't work properly. Consider the following
        example:
        >>  enum class TagType : bool
        >>  {
        >>     Tag1
        >>  };
        >>
        >>  struct I
        >>  {
        >>     virtual ~I() = default;
        >>  };
        >>
        >>  template <TagType Tag>
        >>  struct Impl : public I
        >>  {
        >>  private:
        >>     int v = 123;
        >>  };
        >>
        >>  int main(int argc, const char * argv[]) {
        >>     Impl<TagType::Tag1> impl;
        >>     I& i = impl;
        >>     return 0;
        >>  }
        >>
        >>  For this example clang generates type name
        "Impl<TagType::Tag1>" in DWARF and "__ZTS4ImplIL7TagType0EE"
        when mangling symbols (which lldb demangles to
        Impl<(TagType)0>). Thus when in
        ItaniumABILanguageRuntime::GetTypeInfoFromVTableAddress()
        lldb tries to resolve the type, it is unable to find it.
        More cases and the detailed description why lldb fails here
        can be found in this clang review, which tries to fix this
        in clang [1].
        >>
        >>  However, during the discussion around this review [2],
        it was pointed out that DWARF names are expected to be close
        to sources, which clang does perfectly, whereas mangling
        algorithm is strictly defined. Thus matching them on
        equality could sometimes fail. The suggested idea in [2] was
        to implement more semantically aware matching. There is
        enough information in the DWARF to semantically match
        "Impl<(TagType)0>" with "Impl<TagType::Tag1>", as enum
        TagType is in the DWARF, and the enumerator Tag1 is present
        with its value 0. I have some concerns about the performance
        of such a solution, but I'd like to know your opinion about
        this idea in general. In case it is approved, I'm going to
        work on implementing it.
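
To make the idea concrete, here is a rough sketch of such matching for the enum case only. It rewrites a DWARF-style name into the demangled spelling by substituting each enumerator with its cast form, so the two strings can then be compared for equality. `Enumerator` and `NormalizeDwarfName` are hypothetical names; the enumerator list is assumed to have been read from the DWARF, and the textual replacement is naive (it does not guard against one enumerator name being a prefix of another, nor handle the more complex cases).

```cpp
#include <string>
#include <vector>

// Hypothetical enumerator record, as it could be read from the DWARF.
struct Enumerator {
  std::string enum_name; // "TagType"
  std::string name;      // "Tag1"
  long long value;       // 0
};

// Rewrite "Impl<TagType::Tag1>" into "Impl<(TagType)0>".
std::string NormalizeDwarfName(std::string name,
                               const std::vector<Enumerator> &enumerators) {
  for (const Enumerator &e : enumerators) {
    const std::string from = e.enum_name + "::" + e.name;
    const std::string to = "(" + e.enum_name + ")" + std::to_string(e.value);
    std::string::size_type pos = 0;
    while ((pos = name.find(from, pos)) != std::string::npos) {
      name.replace(pos, from.size(), to);
      pos += to.size(); // continue after the text just inserted
    }
  }
  return name;
}
```

The performance concern shows up here as well: the enumerator list has to come from parsed enum DIEs, so each candidate type may require reading more DWARF than a plain name comparison would.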
        >>
        >>  So what do you think about type names inequality and the
        suggested solution?
        >
        >>  [1] - https://reviews.llvm.org/D39622
        >>  [2] - http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20171211/212859.html
        >>
        >>  Thank you,
        >>  Anton.
        >>  _______________________________________________
        >>  lldb-dev mailing list
        >> [email protected] <mailto:[email protected]>
        >> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
