Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

Yury Gribov Thu, 03 Aug 2017 05:53:22 -0700

On Thu, Aug 3, 2017 at 1:22 PM, H.J. Lu <[email protected]> wrote:
> On Wed, Aug 2, 2017 at 9:29 PM, Yury Gribov <[email protected]> 
> wrote:
>> On 02.08.2017 23:04, H.J. Lu wrote:
>>>
>>> On Wed, Aug 2, 2017 at 1:56 PM, Yury Gribov <[email protected]>
>>> wrote:
>>>>
>>>> On 02.08.2017 21:48, H.J. Lu wrote:
>>>>>
>>>>>
>>>>> On Wed, Aug 2, 2017 at 1:39 PM, Yury Gribov
>>>>> <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On 02.08.2017 20:02, Jeff Law wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 08/02/2017 12:47 PM, Jakub Jelinek wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 07/17/2017 01:23 AM, Yuri Gribov wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I've rebased the previous patch to trunk per Andrew's suggestion.
>>>>>>>>>> Original patch description/motivation/questions are in
>>>>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is his stuff used for exception handling?  If so, doesn't that make
>>>>>>>>> the
>>>>>>>>> performance a significant concern (ie that msync call?)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> For the performance issue, I wonder if it wouldn't be better to just
>>>>>>>> compile the unwinder twice, basically include everything needed for
>>>>>>>> the
>>>>>>>> verification backtrace in a single *.c file with special defines, and
>>>>>>>> not export anything from that *.o file except the single entrypoint.
>>>>>>>> That would also handle the ABI problems.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yea.
>>>>>>>
>>>>>>> Given that the vast majority of the time the addresses are valid, I
>>>>>>> wonder if we could take advantage of that property to keep the
>>>>>>> overhead
>>>>>>> lower in the most common cases.
>>>>>>>
>>>>>>>> For the concurrent calls of other threads unmapping stacks of running
>>>>>>>> threads (that is where most of the verification goes against), I'm
>>>>>>>> afraid
>>>>>>>> that is simply non-solveable, even if you don't cache anything, it
>>>>>>>> will
>>>>>>>> still be racy.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Absolutely -- I think solving this stuff 100% reliably without races
>>>>>>> isn't possible.  I think the question is can we make this
>>>>>>> significantly
>>>>>>> better.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> All,
>>>>>>
>>>>>> First of all, thanks for the feedback.  This is indeed not a 100%
>>>>>> robust
>>>>>> solution, just something to allow tools like Asan to produce more sane
>>>>>> backtraces on crash (currently when stack is corrupted they would crash
>>>>>> in
>>>>>> the unwinder without printing at least partial backtrace).
>>>>>>
>>>>>
>>>>> FWIW, glibc 2.26 is changed to abort as soon as possible when stack is
>>>>> corrupted:
>>>>>
>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=21752
>>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=12189
>>>>>
>>>>> There is very little point to unwind stack when stack is corrupted.
>>>>
>>>>
>>>>
>>>> I'd argue that situation with __stack_chk_fail is very special.  It's
>>>> used
>>>> in production code where you'd want to halt as soon as corruption is
>>>> detected to prevent further damage so even printing message is an
>>>> additional
>>>> risk.  Debugging tools (e.g. sanitizers) are different and would prefer
>>>> to
>>>> print as many survived frames as possible.
>>>>
>>>
>>> But stack unwinder in libgcc is primarily for production use, not for
>>> debugging.
>>
>>
>> I can see it used in different projects to print backtraces on error (bind9,
>> SimGrid, sanitizers).  backtrace(3) also sits on top of libgcc unwinder and
>> seems to be meant primarily for debugging (at least many projects use it for
>> this purpose:
>> https://codesearch.debian.net/search?q=backtrace_symbols&perpkg=1). Same for
>> libbacktrace which, as its README mentions, is meant to "print a detailed
>> backtrace when an error occurs or to gather detailed profiling information".
>
> But it shouldn't be performed on a corrupted stack.  When a stack is
> corrupted, there is just no way to reliably unwind it.


Why not? Attached patch shows that it's possible (up to a point of
corruption of course).

> It isn't a problem
> for a debugger which is designed to analyze corrupted program.

Yes but forcing all applications that would like to print helpful
message on stack corruption to employ gdb sounds less efficient that
providing fail-safe APIs in standard library...

>> Validation is also performed by libunwind (at least in some cases) which
>> libgcc unwinder mimics.
>>
>>
>>> Use it to unwind corrupted stack isn't appropriate.   Santizer can try
>>> similar
>>> similar to valgrind to invoke a debugger to unwind corrupted stack.
>>
>>
>> -Y
>
>
>
> --
> H.J.

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

Reply via email to