Re: [lldb-dev] Huge mangled names are causing long delays when loading symbol table symbols

2018-01-24 Thread Greg Clayton via lldb-dev

> On Jan 24, 2018, at 4:14 PM, Zachary Turner wrote:
> 
> That's true, but shouldn't it be possible to demangle up until the last point 
> you got something meaningful?  (I don't know the details of itanium mangling, 
> just assuming this is possible)

Anywhere you cut the string, many things can go wrong. I think this would fall 
under "start to demangle the string and, if the output buffer goes over a 
certain length, abort the demangling", which is solution #4 from my original 
email.
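
A minimal sketch of why cutting the mangled string is fragile, assuming a GCC-
or Clang-style toolchain that ships <cxxabi.h>. The helper name
demangle_or_empty and the sample symbol are made up for illustration;
abi::__cxa_demangle is the standard C++ ABI entry point. Truncating in the
middle of a length-prefixed identifier leaves input the demangler rejects:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled name, or an empty string when
// __cxa_demangle reports an error for the input.
std::string demangle_or_empty(const std::string &mangled) {
  int status = 0;
  // Passing null buffer/length asks the runtime to malloc() the result.
  char *demangled =
      abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr)
    return std::string();
  std::string result(demangled);
  std::free(demangled);  // The caller owns the malloc()ed buffer.
  return result;
}
```

With this helper, "_ZN3foo3barEv" comes back as "foo::bar()", while the
truncated "_ZN3foo3ba" comes back empty, because the length prefix 3 promises
more characters than remain in the string.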


___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Huge mangled names are causing long delays when loading symbol table symbols

2018-01-24 Thread Zachary Turner via lldb-dev
That's true, but shouldn't it be possible to demangle up until the last
point you got something meaningful?  (I don't know the details of itanium
mangling, just assuming this is possible)

On Wed, Jan 24, 2018 at 3:54 PM Greg Clayton wrote:

> If you just cut off the string, then it might not demangle without an
> error if you truncate the mangled string at a specific point...


Re: [lldb-dev] Huge mangled names are causing long delays when loading symbol table symbols

2018-01-24 Thread Greg Clayton via lldb-dev
If you just cut off the string, then it might not demangle without an error if 
you truncate the mangled string at a specific point...

> On Jan 24, 2018, at 3:52 PM, Zachary Turner wrote:
> 
> What about doing a partial demangle?   Take at most 1024 (for example) 
> characters from the mangled name, demangle that, and then display ... at the 
> end.



Re: [lldb-dev] Huge mangled names are causing long delays when loading symbol table symbols

2018-01-24 Thread Zachary Turner via lldb-dev
What about doing a partial demangle?   Take at most 1024 (for example)
characters from the mangled name, demangle that, and then display ... at
the end.



[lldb-dev] Huge mangled names are causing long delays when loading symbol table symbols

2018-01-24 Thread Greg Clayton via lldb-dev
I have an issue where I am debugging a C++ binary that is around 250MB in size. 
It contains some mangled names that are crazy:

_ZNK3shk6detail17CallbackPublisherIZNS_5ThrowERKNSt15__exception_ptr13exception_ptrEEUlOT_E_E9SubscribeINS0_9ConcatMapINS0_18CallbackSubscriberIZNS_6GetAllIiNS1_IZZNS_9ConcatMapIZNS_6ConcatIJNS1_IZZNS_3MapIZZNS_7IfEmptyIS9_EEDaS7_ENKUlS6_E_clINS1_IZZNS_4TakeIiEESI_S7_ENKUlS6_E_clINS1_IZZNS_6FilterIZNS_9ElementAtEmEUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZZNSL_ImEESI_S7_ENKUlS6_E_clINS1_IZNS_4FromINS0_22InfiniteRangeContainerIiSI_S7_EUlS7_E_SI_S6_EUlS7_E_SI_S6_EUlS7_E_SI_S6_EUlS7_E_SI_S6_EUlS7_E_EESI_S7_ENKUlS6_E_clIS14_EESI_S6_EUlS7_E_EERNS1_IZZNSH_IS9_EESI_S7_ENKSK_IS14_EESI_S6_EUlS7_E0_ESI_DpOT_EUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZNS_5StartIJZNS_4JustIJS19_S1C_EEESI_S1F_EUlvE_ZNS1K_IJS19_S1C_EEESI_S1F_EUlvE0_EEESI_S1F_EUlS7_E_SI_S6_EUlS7_E_St6vectorIS6_SaIS6_EERKT0_NS_12ElementCountEbEUlS7_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlOS3_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlvE_EES1G_S1O_E25ConcatMapValuesSubscriberEEEDaS7_

This de-mangles to something that is 72MB in size and takes 280 seconds (try 
running "time c++filt -n" on the above string).

There are probably many symbols like this in this binary. Currently LLDB 
de-mangles all names in the symbol table so that we can chop up the names: we 
need to know function base names, and we might be able to classify a base name 
as a method or function for breakpoint categorization.

My question is: how do we work around such issues in LLDB? A few solutions I 
can think of:
1 - time each name demangle and, if it takes too long, somehow stop de-mangling 
similar symbols or symbols over a certain length
2 - allow a setting that says "don't de-mangle names that start with..." and 
the setting has a list of prefixes
3 - have a setting that turns off de-mangling symbols over a certain length all 
of the time, with a default of something like 256 or 512
4 - modify our FastDemangler to abort if the de-mangled string goes over a 
certain limit to avoid bad cases like this

#1 would still mean we get a huge delay (like 280 seconds) when starting to 
debug this binary, but might prevent multiple symbols from adding to that 
delay...

#2 would require debugging once and then knowing which symbols took a while to 
de-mangle. If we time each de-mangle, we could warn that there are large 
mangled names and print them, so the user knows which prefixes to add to the 
setting.
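
The prefix check in #2 could look something like the following sketch. The
function name matches_skip_prefix is hypothetical, and the prefix list stands
in for whatever a real setting would store:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if `mangled` starts with any configured prefix.
// Note that an empty prefix matches every name.
bool matches_skip_prefix(const std::string &mangled,
                         const std::vector<std::string> &prefixes) {
  for (const std::string &prefix : prefixes) {
    // compare() clamps the range, so a name shorter than the prefix
    // simply compares unequal.
    if (mangled.compare(0, prefix.size(), prefix) == 0)
      return true;
  }
  return false;
}
```

For the symbol in this thread, a prefix such as "_ZNK3shk6detail" would match,
and an unrelated name such as "_ZN3foo3barEv" would not.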

#3 would disable de-mangling of long names at the risk of not de-mangling names 
that are just over the limit but would have been cheap to de-mangle.
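
A minimal sketch of #3, assuming a toolchain that provides <cxxabi.h>. The
function demangle_if_short and the constant kMaxMangledLength are hypothetical
names, and 512 is simply one of the defaults floated above:

```cpp
#include <cxxabi.h>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical default cutoff, picked from the "256 or 512" suggestion.
constexpr std::size_t kMaxMangledLength = 512;

// Demangles `mangled` unless it is longer than `max_length`. When the name is
// skipped, or when demangling fails, the mangled name is returned unchanged.
std::string demangle_if_short(const std::string &mangled,
                              std::size_t max_length = kMaxMangledLength) {
  if (mangled.size() > max_length)
    return mangled;
  int status = 0;
  char *demangled =
      abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr)
    return mangled;
  std::string result(demangled);
  std::free(demangled);  // __cxa_demangle allocates with malloc().
  return result;
}
```

The check costs one length comparison per symbol, which is why it avoids the
long delay entirely, at the price of leaving every over-limit name mangled.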

#4 requires that our FastDemangle code can decode the mangled string. The fast 
de-mangler currently aborts on tricky de-mangling and we fall back onto 
__cxa_demangle from the C++ library, which does not have a cutoff on length...
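
The mechanism behind #4 can be illustrated with a toy example. This is not
FastDemangle code: BoundedBuffer and expand_nested are hypothetical, and the
recursive "Wrap<...>" expansion only stands in for a real demangler writing
its output. The point is that every write goes through a size check, so the
work stops as soon as the output would pass the limit:

```cpp
#include <cstddef>
#include <string>

// Hypothetical output buffer that refuses to grow beyond `limit` bytes.
class BoundedBuffer {
public:
  explicit BoundedBuffer(std::size_t limit) : limit_(limit) {}

  // Appends `text`; returns false and latches the overflow flag if the
  // result would exceed the limit.
  bool append(const std::string &text) {
    if (overflowed_ || data_.size() + text.size() > limit_) {
      overflowed_ = true;
      return false;
    }
    data_ += text;
    return true;
  }

  bool overflowed() const { return overflowed_; }
  const std::string &str() const { return data_; }

private:
  std::size_t limit_;
  bool overflowed_ = false;
  std::string data_;
};

// Toy stand-in for a demangler's recursive expansion: writes `depth` levels
// of "Wrap<...>" around "int" and bails out on the first failed append.
bool expand_nested(unsigned depth, BoundedBuffer &out) {
  if (depth == 0)
    return out.append("int");
  if (!out.append("Wrap<"))
    return false;
  if (!expand_nested(depth - 1, out))
    return false;
  return out.append(">");
}
```

With a 64-byte limit, expand_nested(2, ...) succeeds and produces
"Wrap<Wrap<int>>"; with a 16-byte limit, expand_nested(10, ...) returns false
after three levels instead of producing the whole string.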

Can anyone else think of any other solutions?

Greg Clayton








Re: [lldb-dev] LLVM Social - Paris: January 30th, 2018

2018-01-24 Thread Arnaud Allard de Grandmaison via lldb-dev
Reminder: the next LLVM social in Paris will happen next week, on January
30th, 2018.

Everyone interested in LLVM, Clang, lldb, Polly, lld, ... is invited to
join.

Event details, including registration (free but mandatory) at
http://www.meetup.com/LLVM-Clang-social

This meetup's agenda:
 - Dimitri Gerin will present his initial thinking on "vlang : a C++ RTL
simulator and VHDL convertor".
 - Adrien Guinet, Serge Guelton and Juan Manuel Martinez will talk about
the "Challenges when building an LLVM bitcode obfuscator", based on their 4
years of experience building an industrial-strength code obfuscator for C/C++
and Objective-C.

Looking forward to meeting you!
-- Arnaud de Grandmaison, Duncan Sands, Sylvestre Ledru

On Mon, Jan 8, 2018 at 3:41 PM, Arnaud Allard de Grandmaison <
arnaud.ad...@gmail.com> wrote:

> The next LLVM social in Paris will happen on January 30th, 2018.
>
> Everyone interested in LLVM, Clang, lldb, Polly, lld, ... is invited to
> join.
>
> Event details, including registration (free but mandatory) at
> http://www.meetup.com/LLVM-Clang-social
>
> For this meetup, Adrien Guinet, Serge Guelton and Juan Manuel Martinez
> will talk about the "Challenges when building an LLVM bitcode obfuscator",
> based on their 4 years of experience building an industrial-strength code
> obfuscator for C/C++ and Objective-C.
>
> Looking forward to meeting you!
> -- Arnaud de Grandmaison, Duncan Sands, Sylvestre Ledru
>