[PATCH] D93747: Rename debug linkage name with -funique-internal-linkage-names

Sriraman Tallam via Phabricator via cfe-commits Wed, 06 Jan 2021 11:58:04 -0800

tmsriram added a comment.

In D93747#2482726 <https://reviews.llvm.org/D93747#2482726>, @hoy wrote:


> In D93747#2482519 <https://reviews.llvm.org/D93747#2482519>, @tmsriram wrote:
>
>> In D93747#2481494 <https://reviews.llvm.org/D93747#2481494>, @dblaikie wrote:
>>
>>> In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote:
>>>
>>>> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram 
>>>> wrote:
>>>>
>>>>> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram 
>>>>> wrote:
>>>>>
>>>>>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie 
>>>>>> wrote:
>>>>>>
>>>>>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy wrote:
>>>>>>>
>>>>>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, @dblaikie 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break 
>>>>>>>>> <function name>" but instead demangling and stripping off the extras 
>>>>>>>>> from the linkage name, and since it can't demangle this uniquified 
>>>>>>>>> name (unlike the mangled name used when using the overloadable 
>>>>>>>>> attribute) that degrades the debugger user experience? I'd have to 
>>>>>>>>> test that in more detail/with some hand-hacked DWARF.
>>>>>>>>
>>>>>>>> Yes, I think in your case the linage name can be demangled by the 
>>>>>>>> debugger. In my previous experiment, the uniquefied names could not be 
>>>>>>>> demangled therefore I was not able to breakpoint.
>>>>>>>
>>>>>>> Ah, did some more testing. Seems it's because it isn't a classically 
>>>>>>> mangled name.
>>>>>
>>>>> The simplest fix I can think of is to emit the base 10 number of the 
>>>>> md5hash.  That would preserve all the existing uniqueness and be 
>>>>> demangler friendly.  @hoy @dblaikie  WDYT?
>>>>
>>>> I think using the base 10 form of md5hash is a smart workaround. Thanks 
>>>> for the suggestion.
>>>
>>> Sure - wouldn't mind having some wording from the Itanium ABI/mangling 
>>> rules that explain/corroborate what we've seen from testing to ensure we 
>>> are using it correctly and there aren't any other ways that might be more 
>>> suitable, or lurking gotchas we haven't tested.
>>
>> Yes, makes sense. Also, like @dblaikie  pointed out in D94154 
>> <https://reviews.llvm.org/D94154> it is now important that we don't generate 
>> the DW_AT_linkage_name for C style names using this suffix as it will not 
>> demangle.  I think this makes it important that we only update this field if 
>> it already exists.
>
> Yes, it makes sense to do the renaming only if the linkage name preexists to 
> not break the debuggability. Unfortunately we won't be able to support C with 
> this patch.

There is nothing needed to support C right?  overloadable attribute will mangle 
with _Z so it will work as expected?  What did I miss?

>>>>>>> It might be possible for this uniquifying step to check if the name is 
>>>>>>> mangled (does it start with "_Z") and if it isn't mangled, it could 
>>>>>>> mangle it (check the length of the name, then "_Z<length><name>v") and 
>>>>>>> add the unique suffix. That looks to me like it preserves the original 
>>>>>>> debugging behavior?
>>>>>>>
>>>>>>> (has the problem that we don't actually know the mangling scheme at 
>>>>>>> this point - we do know it up in clang (where it might be Itanium 
>>>>>>> mangling or Microsoft mangling), might point again to the possibility 
>>>>>>> this feature is more suitable to implement in the frontend rather than 
>>>>>>> a middle end pass. And also the 'v' in the mangling is 'void return 
>>>>>>> type', which is a lie - not sure if we could do something better there)
>>>>
>>>> Doing name unification in the frontend sounds like the ultimate solution 
>>>> and since the frontend has all the knowledge about the name mangler. I 
>>>> think for now we can go with the solution @tmsriram suggested, i.e., using 
>>>> the base 10 form of md5 hash.
>>>
>>> Any general idea of how long "for now" would be? It doesn't seem like 
>>> putting this off is going to make things especially better & seems like 
>>> more bug fixes/changes/etc are being built around the solution as it is at 
>>> the moment. So I'd be hesitant to put off this kind of restructuring too 
>>> long.
>
> I'm not sure. It looks like implementation in the front end has more 
> complexity as discussed in the other thread D94154 
> <https://reviews.llvm.org/D94154>.
>
>>>>>>> I think it's pretty important that we don't break tradeoff 
>>>>>>> debuggability for profile accuracy. It'll make for a difficult tradeoff 
>>>>>>> for our/any users.
>>>>>>
>>>>>> Agreed, I didn't expect this.
>>>>>>
>>>>>>> (side note: using the original source file name seems problematic - I 
>>>>>>> know in google at least, the same source file is sometimes built into 
>>>>>>> more than one library/form with different preprocessor defines, so this 
>>>>>>> may not produce as unique a name as you are expecting?)
>>>>>>
>>>>>> It was a best effort and I think the hashing can be improved by using 
>>>>>> more signals other than just the module name if needed.  For hard data 
>>>>>> though, this does significantly improve performance which clearly comes 
>>>>>> from better profile attribution so it does something.
>>>
>>> I'm OK with the idea that this helped the situation - but it doesn't seem 
>>> very principled/rigorous. Rather than a sliding scale of "better" I think 
>>> it'd be worth considering in more detail what would be required to make 
>>> this correct.
>>>
>>> Using the object file name may be more robust - also not perfect for all 
>>> build systems (one could imagine a distributed build system that /might/ be 
>>> able to get away with having the distributed builders always build a file 
>>> named x.o - only to have the distribution system rename the file to its 
>>> canonical name on the main machine before linking occurs - but I think 
>>> that's not how most distributed build systems work, and most build systems 
>>> provide a unique object file name across a build?) but maybe moreso than 
>>> using the input file name? (at least I think for google/blaze the object 
>>> filename is genuinely unique)
>>
>> So we are using the full path name of the source.  We could also combine the 
>> object file name into the hash to make it more robust. I can work on this 
>> patch independently if that is alright.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93747/new/

https://reviews.llvm.org/D93747

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D93747: Rename debug linkage name with -funique-internal-linkage-names

Reply via email to