Re: [Lldb-commits] [PATCH] Patch for LLDB demangler for demangling upon actual language

Greg Clayton Fri, 30 Jan 2015 16:15:59 -0800

I also want to stress that I believe all the places that manually check for 
"_Z" and the like should switch over to using methods on Mangled, so we can 
avoid the issues you are seeing.
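
To make that concrete, here is a minimal sketch of the kind of prefix check I 
mean, kept in one place instead of being hand-rolled at each call site. The 
names ManglingScheme, GuessManglingScheme and IsMangled are made up for 
illustration; they are not existing methods on Mangled.

```cpp
#include <string>

// Hypothetical names for illustration only; not the actual LLDB API.
enum class ManglingScheme { None, Itanium, MSVC };

// Classify a symbol name purely from its prefix:
//   "_Z" -> Itanium C++ ABI mangling
//   "?"  -> Microsoft C++ mangling
// Anything else (e.g. a plain C function name) is treated as unmangled.
inline ManglingScheme GuessManglingScheme(const std::string &name) {
  if (name.compare(0, 2, "_Z") == 0)
    return ManglingScheme::Itanium;
  if (!name.empty() && name[0] == '?')
    return ManglingScheme::MSVC;
  return ManglingScheme::None;
}

// Convenience wrapper: is this name mangled at all?
inline bool IsMangled(const std::string &name) {
  return GuessManglingScheme(name) != ManglingScheme::None;
}
```

Callers would then ask IsMangled(name) or switch on the returned scheme rather 
than comparing against "_Z" themselves. The sketch ignores platform details 
such as the extra leading underscore on Darwin symbol names.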


> On Jan 30, 2015, at 4:03 PM, Greg Clayton <[email protected]> wrote:
> 
> 
>> On Jan 30, 2015, at 3:50 PM, Zachary Turner <[email protected]> wrote:
>> 
>> I agree that if memory is important then we should use the opportunity to 
>> reduce memory usage rather than keeping it the same by changing stuff.  But 
>> the reason I asked leads into my next question.
>> 
>> I've been thinking about mangling and demangling for a while and how it 
>> relates to Windows.  I see a lot of code all over the place that manually 
>> inspects mangled names, and usually the code is all custom and handrolled.  
>> (If you're interested I can point you to a bunch of examples).  I don't like 
>> this way of doing things and I think it's generally fragile.  There should 
>> be one place that's responsible for anything to do with mangling.  All these 
>> places that are inspecting strings for _Z or ? should just be calling some 
>> class to ask it about the properties of this string.
> 
> Exactly, why can't we just look at the mangled name and look for the prefix 
> and return the language we calculate?
> 
>> The most sensible place to do that, to me, seems like the ABI.  So I'm 
>> imagining that there's a Mangler base class, and then from that there is an 
>> ItaniumCppMangler, a MsCppMangler, and let's say perhaps a JavaMangler for 
>> the purposes of this CL.  Maybe they share some code, but that's not the 
>> important part.
> 
> Doesn't Windows actually have 2 forms of mangling? Itanium + the $ mangling?
> 
>> 
>> ABI provides a method called getMangler().  It returns a singleton instance 
>> (which for Windows would be an MsCppMangler, and for everyone else would be 
>> an ItaniumCppMangler).
> 
> Again, why do we need to get so fancy? I would prefer to avoid this if we can 
> just try demangling if it starts with one of the mangling prefixes. 
> 
>> 
>> In the Symbol class, then, all you need to store is the mangled name.  
> 
> And you need to know if the name is mangled in the first place. C function 
> names have no mangling, so if you store the name you can store the name + a 
> flag to say is this mangled.
> 
>> Implement a method in Symbol called getMangler() which looks at m_comp_unit
> 
> With no debug info we have no compile unit and no way to figure out which 
> compile unit a symbol came from. So you can't associate symbols with compile 
> units. Symbols come from symbol tables in the object file; compile units, 
> functions, blocks and variables come from debug info, which may or may not 
> live in the object file. So whatever you do, just know symbols do not refer to 
> compile units and won't store any compile unit info inside them. You can 
> always take your symbol address and look it up in the debug info and then 
> associate things that way, but there should be no direct reference.
> 
>> and either gets the ABI and calls getCppMangler (if Lang is C++) or a null 
>> mangler (if Lang is C)  or a java mangler (if Lang is Java), etc.
> 
> Again, you can't associate symbols with compile units. So we need something 
> else. Again, can't we just look at the prefix and know how to demangle it?
> 
>> 
>> Then, just call the method on it.
>> 
>> All this seems complicated, but the advantage is that now this logic is 
>> abstracted for anyone else who wants to use it.  
> 
> It was abstracted before when we were relying on the prefix to be able to 
> demangle. Are you saying this isn't possible now?
> 
>> The Mangler interface could provide such methods as IsGuardVariable() or 
>> IsFunction() that things like the interpreter could use by getting the 
>> correct mangler from the ABI, for example.  
> 
> Again, this is a question for the Mangled class to answer based solely on the 
> mangled name itself. I would prefer to stick to looking at mangling prefixes 
> if we can. If not, let me know why we can't.
> 
>> And all of the places in the code that currently have hardcoded mangler 
>> checks could be made to work in the presence of ABI differences and language 
>> differences easily.
> 
> This can be switched to asking the mangled name for its language which will 
> be calculable from the mangled name prefix.
>> 
>> And this doesn't impose any memory penalty on Symbol (and actually reduces 
>> the footprint of each Symbol by the size of 1 pointer)
> 
> I would prefer to save this memory to make symbol tables more efficient. We 
> can also change lldb_private::Symbol values to use file addresses only and 
> then convert them to lldb_private::Address values on the fly using the 
> section list of the object file.
> 
> So you will need to prove that the Mangled class function that calculates the 
> language is costly by showing it causing slowdowns in a sampling tool before 
> we add the space to a class that is used all over.
> 
> Greg
> 
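
On the file-address point in the quoted message, the idea can be sketched 
roughly as follows: a symbol stores only a file address, and a 
section-relative address is computed on demand from the section list. 
SectionInfo, ResolvedAddress and ResolveFileAddress are hypothetical stand-ins 
for illustration; they are not the real lldb_private::Section or 
lldb_private::Address classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not the real lldb_private classes.
struct SectionInfo {
  std::uint64_t file_addr; // start address of the section in the object file
  std::uint64_t size;      // size of the section in bytes
};

struct ResolvedAddress {
  int section_index;    // index into the section list, or -1 if not found
  std::uint64_t offset; // offset from the start of that section
};

// Convert a bare file address into a (section, offset) pair on the fly by
// finding the section whose address range contains it.
inline ResolvedAddress
ResolveFileAddress(const std::vector<SectionInfo> &sections,
                   std::uint64_t file_addr) {
  for (std::size_t i = 0; i < sections.size(); ++i) {
    const SectionInfo &s = sections[i];
    if (file_addr >= s.file_addr && file_addr - s.file_addr < s.size)
      return {static_cast<int>(i), file_addr - s.file_addr};
  }
  return {-1, 0};
}
```

The linear scan is only there to keep the sketch short; the point is that the 
per-symbol storage shrinks to a single integer, and the lookup cost is paid 
only when a full address is actually needed.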


_______________________________________________
lldb-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
