[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2023-03-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

Andrew Pinski  changed:

   What|Removed |Added

 CC||dimitri at ouroboros dot rocks

--- Comment #19 from Andrew Pinski  ---
*** Bug 109198 has been marked as a duplicate of this bug. ***

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2022-10-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #18 from Stas Sergeev  ---
(In reply to Stas Sergeev from comment #5)
> And its running on a stack previously
> poisoned before pthread_cancel().

And the reason for that is because
the glibc in use is the one not built
with -fsanitize=address. When it calls
its __do_cancel() which has attribute
"noreturn", __asan_handle_noreturn()
is not being called. Therefore the
canceled thread remains with the
poison below SP.
I believe the glibc re-built with asan
would not exhibit the crash.

Note: all URLs above where I was pointing
to the code, now either are a dead links
or point to wrong lines. Its quite a shame
that such a bug remains unfixed after a
complete explanation was provided, but now
that explanation is rotten...

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2022-02-11 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #17 from Stas Sergeev  ---
I sent the small patch-set here:
https://lore.kernel.org/lkml/20220126191441.3380389-1-st...@yandex.ru/
but it is so far ignored by kernel developers.
Someone from this bugzilla should give me an
Ack or Review, or this won't float.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2022-01-25 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #16 from Stas Sergeev  ---
I think I'll propose to apply something like this to linux kernel:

diff --git a/kernel/signal.c b/kernel/signal.c
index 6f3476dc7873..0549212a8dd6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -4153,6 +4153,7 @@ do_sigaltstack (const stack_t *ss, stack_t *oss, unsigned
long sp,
if (ss_mode == SS_DISABLE) {
ss_size = 0;
ss_sp = NULL;
+   ss_flags = SS_DISABLE;
} else {
if (unlikely(ss_size < min_ss_size))
ret = -ENOMEM;

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2022-01-25 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #15 from Stas Sergeev  ---
(In reply to Martin Liška from comment #14)
> Please report to upstream as well.

I'd like some guidance on how should that
be addressed, because that will allow to
specify the upstream.
I am not entirely sure that linux is doing
the right thing, and I am not sure man page
even makes sense saying that:
---
The old_ss.ss_flags may return either of the following values:

   SS_ONSTACK
   SS_DISABLE
   SS_AUTODISARM
---

... because what I see is the return of
"SS_DISABLE|SS_AUTODISARM", which is what I
write to flags for probing.
This is cludgy.
Does anyone know what fix should that get?

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2022-01-25 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #14 from Martin Liška  ---
(In reply to Stas Sergeev from comment #13)
> Found another problem.
> https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/asan/asan_posix.
> cpp#L53
> The comment above that line talks about
> SS_AUTODISARM, but the line itself does
> not account for any flags. In a mean time,
> linux returns SS_DISABLE in combination
> with flags, like SS_AUTODISARM. So the
> "!=" check should not be used.
> 
> My app probes for SS_AUTODISARM by trying
> to set it, and after that, asan breaks.
> This is quite cludgy though.
> Should the check be changed to
> if (!(signal_stack.ss_flags & SS_DISABLE))
> or maybe linux should not return any flags
> together with SS_DISABLE?
> man page talks "strange things" on that subject.

Please report to upstream as well.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error and pthread_cancel

2022-01-25 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #13 from Stas Sergeev  ---
Found another problem.
https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/asan/asan_posix.cpp#L53
The comment above that line talks about
SS_AUTODISARM, but the line itself does
not account for any flags. In a mean time,
linux returns SS_DISABLE in combination
with flags, like SS_AUTODISARM. So the
"!=" check should not be used.

My app probes for SS_AUTODISARM by trying
to set it, and after that, asan breaks.
This is quite cludgy though.
Should the check be changed to
if (!(signal_stack.ss_flags & SS_DISABLE))
or maybe linux should not return any flags
together with SS_DISABLE?
man page talks "strange things" on that subject.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

Andrew Pinski  changed:

   What|Removed |Added

 CC||contino at epigenesys dot com

--- Comment #12 from Andrew Pinski  ---
*** Bug 103978 has been marked as a duplicate of this bug. ***

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-20 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #11 from Stas Sergeev  ---
The third bug here seems to be
that __asan_handle_no_return:
https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/asan/asan_rtl.cpp#L602
also calls sigaltstack() before
unpoisoning stacks. I believe this
makes the problem much more reproducible,
for example the test-case with longjmp()
is likely possible too. I've found about
that instance by trying to call
__asan_handle_no_return() manually as a
pthread cleanup handler, in a hope to
work around the destructor bug. But it
appears __asan_handle_no_return() does
the same thing.
So the fix should be to move this line:
https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/asan/asan_rtl.cpp#L607
above PlatformUnpoisonStacks() call.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-19 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #10 from Martin Liška  ---
(In reply to Stas Sergeev from comment #9)
> (In reply to Martin Liška from comment #8)
> > Please report the problem to upstream libsanitizer project:
> > https://github.com/llvm/llvm-project/issues
> 
> I already did:
> https://github.com/google/sanitizers/issues/1171#issuecomment-1015913891
> But URL is different, should I also report
> that to llvm-project?

That location is fine, however, they have a duplicated bugzilla.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-19 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #9 from Stas Sergeev  ---
(In reply to Martin Liška from comment #8)
> Please report the problem to upstream libsanitizer project:
> https://github.com/llvm/llvm-project/issues

I already did:
https://github.com/google/sanitizers/issues/1171#issuecomment-1015913891
But URL is different, should I also report
that to llvm-project?

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-19 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

Martin Liška  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #8 from Martin Liška  ---
Please report the problem to upstream libsanitizer project:
https://github.com/llvm/llvm-project/issues

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #7 from Stas Sergeev  ---
Created attachment 52221
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52221=edit
test case

This is a reproducer for both problems.

$ cc -Wall -o bug -ggdb3 -fsanitize=address bug.c -O1
to see the canary overwrite problem.

$ cc -Wall -o bug -ggdb3 -fsanitize=address bug.c -O0
to see the poisoned stack after pthread_cancel()
problem.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #6 from Stas Sergeev  ---
I think the fix (of at least 1 problem here)
would be to move this line:
https://code.woboq.org/gcc/libsanitizer/asan/asan_thread.cc.html#109
upwards, before this:
https://code.woboq.org/gcc/libsanitizer/asan/asan_thread.cc.html#103
It will then unpoison stack before
playing its sigaltstack games.
But I don't know how to test that idea.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #5 from Stas Sergeev  ---
Another problem here seems to be
that pthread_cancel() doesn't unpoison
the cancelled thread's stack.
This causes dtors to run on a
randomly poisoned stack, depending
on where the cancellation happened.
That explains the "random" nature of
a crash, and the fact that pthread_cancel()
is in a test-case attached to that ticket,
and in my program as well.

So, the best diagnostic I can come up
with, is that after pthread_cancel() we
have this:
---
#0  __sanitizer::UnsetAlternateSignalStack ()
at
../../../../libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:190
#1  0x77672f0d in __asan::AsanThread::Destroy (this=0x7358e000)
at ../../../../libsanitizer/asan/asan_thread.cpp:104
#2  0x769d2c61 in __GI___nptl_deallocate_tsd ()
at nptl_deallocate_tsd.c:74
#3  __GI___nptl_deallocate_tsd () at nptl_deallocate_tsd.c:23
#4  0x769d5948 in start_thread (arg=)
at pthread_create.c:446
#5  0x76a5a640 in clone3 ()
at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
---

And its running on a stack previously
poisoned before pthread_cancel().
Then it detects the access to poisoned
area and is trying to do a stack trace.
But that fails too because the redzone
canary is overwritten.
So all we get is a crash.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #4 from Stas Sergeev  ---
Thread 3 "X ev" hit Breakpoint 4, __sanitizer::UnsetAlternateSignalStack () at
../../../../libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:190
190 void UnsetAlternateSignalStack() {
(gdb) n
194   altstack.ss_size = GetAltStackSize();  // Some sane value required on
Darwin.
(gdb) p /x $rsp
$128 = 0x7fffee0a0ce0
(gdb) p 
$129 = (stack_t *) 0x7fffee0a0d00
(gdb) p /x *(int *)0x7fffee0a0cc0  <== canary address
$130 = 0x41b58ab3
(gdb) p 0x7fffee0a0ce0-0x7fffee0a0cc0
$132 = 32

Here we can see that before a
call to GetAltStackSize(), rsp
is 32 bytes above the lowest
canary value. After the call,
there is no more canary because
32 bytes are quickly overwritten
by a call to getconf().

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

--- Comment #3 from Stas Sergeev  ---
Why does it check for a redzone
on a non-leaf function? GetAltStackSize()
calls to a glibc's getconf and that
overwrites a canary.
Maybe it shouldn't use/check the redzone
on a non-leaf function?

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2022-01-18 Thread stsp at users dot sourceforge.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

Stas Sergeev  changed:

   What|Removed |Added

 CC||stsp at users dot 
sourceforge.net

--- Comment #2 from Stas Sergeev  ---
I have the very same crash with the
multi-threaded app. The test-case from
this ticket doesn't reproduce it for
me either, but my app crashes nevertheless.
So I debugged it a bit myself.
gcc-11.2.1.

The crash happens here:
https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc#L10168
Here asan checks that sigaltstack()
didn't corrupt anything while writing
the "old setting" to "oss" ptr.
Next, some check is later fails here:
https://code.woboq.org/gcc/libsanitizer/asan/asan_thread.cc.html#340
Asan failed to find the canary value
kCurrentStackFrameMagic. The search
was done the following way: it walks
the shadow stack down, and looks for
the kAsanStackLeftRedzoneMagic to find
the bottom of redzone. Then, at the
bottom of redzone, it looks for the
canary value. I checked that the lowest
canary value is overwritten by the call
to GetAltStackSize(). It uses SIGSTKSZ
macro:
https://code.woboq.org/llvm/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp.html#170
which expands into a getconf()
call, so eats up quite a lot.

Now I am not entirely sure what conclusion
can be derived out of that. I think that
the culprit is probably here:
https://code.woboq.org/gcc/libsanitizer/asan/asan_interceptors_memintrinsics.h.html#26
They say that they expect 16 bytes of
a redzone, but it seems to be completely
exhausted with all canaries overwritten.

Does something of the above makes sense?
This is the first time I am looking into
an asan code.

[Bug sanitizer/101476] AddressSanitizer check failed, points out a (potentially) non-existing stack error

2021-07-22 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101476

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
   Last reconfirmed||2021-07-22

--- Comment #1 from Martin Liška  ---
Cannot reproduce that with

gcc version 10.3.1 20210707 [revision 048117e16c77f82598fca9af585500572d46ad73]
(SUSE Linux) 

and

gcc version 11.1.1 20210625 [revision 62bbb113ae68a7e724255e17143520735bcb9ec9]
(SUSE Linux)