[Bug sanitizer/79341] Many Asan tests fail on s390

2024-05-04 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #78 from Eric Gallager  ---
(In reply to Ilya Leoshkevich from comment #77)
> Apparently fixing the message in GCC will produce maintenance overhead [1]. 
> If that's not very important to you, I'd rather leave this message as is.
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648775.html

OK, I haven't actually seen GCC emit the message in the wild myself yet;
I only came across it while searching for bugs related to MSan...

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-04-04 Thread iii at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #77 from Ilya Leoshkevich  ---
Apparently fixing the message in GCC will produce maintenance overhead [1].  If
that's not very important to you, I'd rather leave this message as is.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648775.html

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-04-03 Thread iii at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #76 from Ilya Leoshkevich  ---
It's because the sanitizer runtime was copied from LLVM to GCC.  I will post a
patch removing the unsupported MSan and DFSan from the error message.

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-03-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egallager at gcc dot gnu.org

--- Comment #75 from Eric Gallager  ---
(In reply to Dominik Vogt from comment #25)
> Looks better, but now we get this quite often:
> 
> --
> ==23722==ERROR: Your kernel seems to be vulnerable to CVE-2016-2143.  Using
> ASan, MSan, TSan, DFSan or LSan with such kernel can and will crash your
> machine, or worse.
> --
> 
> I'll try to figure out what kernel version we need.

Why does this error message mention all of those sanitizers, when GCC only
supports some of them?

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #74 from Dominik Vogt  ---
With the pending patches/fixes, the *san testsuites are clean on s390x biarch
and s390.  :-)

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #73 from Jakub Jelinek  ---
I've filed https://reviews.llvm.org/D29992 upstream for this.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #72 from Dominik Vogt  ---
I wanted to refer to the funny pc value.  The line information is actually
correct.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #71 from Ulrich Weigand  ---
(In reply to Dominik Vogt from comment #70)
> If funny line information is the only consequence, no.  Is it safe to assume
> that libsanitizer won't crash or produce garbage because of this?

Why should line information be "funny"?  With the odd addresses (decremented by
one), line information should identify the line of the call, otherwise we'd get
the line *after* the call.  IMO identifying the call is actually better ...

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #70 from Dominik Vogt  ---
If funny line information is the only consequence, no.  Is it safe to assume
that libsanitizer won't crash or produce garbage because of this?

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #69 from Jakub Jelinek  ---
Is it really that bad? Does it really matter if the addresses printed in the
backtrace are somewhere in the call instructions, end of those call
instructions or their start?

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #68 from Dominik Vogt  ---
Okay, that fixes the test failure, but the addresses further up in the
backtrace are still bad, e.g.

#0 0x10008d2 in NullDeref
#1 0x1000759 in main
#2 0x3fffce23069 in
#3 0x10007d5 

Maybe it's not worth it to knock a workaround together when a real fix is
preferable.  It's probably acceptable to wait for an upstream fix.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #67 from Jakub Jelinek  ---
This seems to fix the testcase with -march=zEC12 for me.
The problem is that while we carefully compute it, other code then "cleverly"
overwrites it back to the pc it got from the siginfo.

--- sanitizer_common/sanitizer_unwind_linux_libcdep.cc.jj   2017-02-07
11:08:15.0 -0500
+++ sanitizer_common/sanitizer_unwind_linux_libcdep.cc  2017-02-15
10:05:13.246850984 -0500
@@ -92,6 +92,17 @@ uptr Unwind_GetIP(struct _Unwind_Context
   CHECK(res == _UVRSR_OK && "_Unwind_VRS_Get failed");
   // Clear the Thumb bit.
   return val & ~(uptr)1;
+#elif SANITIZER_LINUX && !defined(__arm__)
+  int pc_before_insn = 0;
+  uptr pc = _Unwind_GetIPInfo (ctx, &pc_before_insn);
+  /* If context is for a signal frame, the returned PC is
+     the right one to use, but StackTrace::GetPreviousInstructionPc
+     will be applied to it later unconditionally.  So adjust PC now
+     so that GetPreviousInstructionPc will return what _Unwind_GetIPInfo
+     returned.  */
+  if (pc_before_insn)
+    pc += pc - StackTrace::GetPreviousInstructionPc (pc);
+  return pc;
 #else
   return _Unwind_GetIP(ctx);
 #endif
--- asan/asan_errors.cc.jj  2017-02-07 11:08:15.0 -0500
+++ asan/asan_errors.cc 2017-02-15 10:56:56.816850984 -0500
@@ -30,7 +30,15 @@ void ErrorStackOverflow::Print() {
   Printf("%s", d.EndWarning());
   scariness.Print();
   BufferedStackTrace stack;
-  GetStackTraceWithPcBpAndContext(&stack, kStackTraceMax, pc, bp, context,
+  uptr adjusted_pc = pc;
+#if SANITIZER_LINUX && !defined(__arm__)
+  // Undo the damage StackTrace::GetPreviousInstructionPc will do to the pc.
+  // For deadly signal pc we have here is actually the pc of the faulting
+  // instruction.
+  adjusted_pc += pc - StackTrace::GetPreviousInstructionPc (pc);
+#endif
+  GetStackTraceWithPcBpAndContext(&stack, kStackTraceMax, adjusted_pc, bp,
+                                  context,
                                   common_flags()->fast_unwind_on_fatal);
   stack.Print();
   ReportErrorSummary("stack-overflow", &stack);
@@ -72,8 +80,15 @@ void ErrorDeadlySignal::Print() {
   }
   scariness.Print();
   BufferedStackTrace stack;
-  GetStackTraceWithPcBpAndContext(&stack, kStackTraceMax, pc, bp, context,
-                                  common_flags()->fast_unwind_on_fatal);
+  uptr adjusted_pc = pc;
+#if SANITIZER_LINUX && !defined(__arm__)
+  // Undo the damage StackTrace::GetPreviousInstructionPc will do to the pc.
+  // For deadly signal pc we have here is actually the pc of the faulting
+  // instruction.
+  adjusted_pc += pc - StackTrace::GetPreviousInstructionPc (pc);
+#endif
+  GetStackTraceWithPcBpAndContext(&stack, kStackTraceMax, adjusted_pc, bp,
+                                  context, common_flags()->fast_unwind_on_fatal);
   stack.Print();
   MaybeDumpInstructionBytes(pc);
   Printf("AddressSanitizer can not provide additional info.\n");

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #66 from Dominik Vogt  ---
Compiled from scratch to make sure it's not a build dependency problem, but the
tests still fail because of the odd backtrace addresses.  Can I provide some
information from single stepping in Gdb?

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #65 from Dominik Vogt  ---
That patch does not compile, and fixing the compiler error (context -> ctx)
doesn't help either.

> but I also can't reproduce the nullptr-1.c failure myself

An example command line is

 $ .../gcc/build-fixes/gcc/xgcc -B.../build-fixes/gcc/
.../gcc/testsuite/c-c++-common/asan/null-deref-1.c
-B.../build-fixes/s390x-ibm-linux-gnu/./libsanitizer/
-B.../build-fixes/s390x-ibm-linux-gnu/./libsanitizer/asan/
-L.../build-fixes/s390x-ibm-linux-gnu/./libsanitizer/asan/.libs
-fsanitize=address -g -I.../gcc/testsuite/../../libsanitizer/include
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2
-fno-omit-frame-pointer -fno-shrink-wrap -lm -m64 -o ./null-deref-1.exe
-march=zEC12

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #64 from Jakub Jelinek  ---
Perhaps the easiest hack would be for
sanitizer_common/sanitizer_unwind_linux_libcdep.cc (Unwind_GetIP) to call
_Unwind_GetIPInfo instead of _Unwind_GetIP (perhaps just on SANITIZER_LINUX or
wherever it is available), and to return pc + 1 for signal frames so that
the later - 1 subtraction undoes that.

So like (completely untested, but I also can't reproduce the nullptr-1.c
failure myself):

--- libsanitizer/sanitizer_common/sanitizer_unwind_linux_libcdep.cc.jj 
2016-11-09 15:22:41.0 +0100
+++ libsanitizer/sanitizer_common/sanitizer_unwind_linux_libcdep.cc
2017-02-15 15:26:31.658948328 +0100
@@ -92,6 +92,17 @@ uptr Unwind_GetIP(struct _Unwind_Context
   CHECK(res == _UVRSR_OK && "_Unwind_VRS_Get failed");
   // Clear the Thumb bit.
   return val & ~(uptr)1;
+#elif SANITIZER_LINUX && !defined(__arm__)
+  int pc_before_insn = 0;
+  uptr pc = _Unwind_GetIPInfo (context, &pc_before_insn);
+  /* If context is for a signal frame, the returned PC is
+     the right one to use, but StackTrace::GetPreviousInstructionPc
+     will be applied to it later unconditionally.  So adjust PC now
+     so that GetPreviousInstructionPc will return what _Unwind_GetIPInfo
+     returned.  */
+  if (pc_before_insn)
+    pc += pc - StackTrace::GetPreviousInstructionPc (pc);
+  return pc;
 #else
   return _Unwind_GetIP(ctx);
 #endif
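
The compensation idea in the patch above can be sketched in isolation. This is a minimal illustration, not libsanitizer code: `previous_instruction_pc` is a hypothetical stand-in for StackTrace::GetPreviousInstructionPc and is assumed to subtract one byte, as discussed in this thread, and `compensate_signal_pc` is a made-up name.

```c
#include <stdint.h>

/* Assumption: stand-in for StackTrace::GetPreviousInstructionPc on a
   target where it subtracts one byte from the pc.  */
uintptr_t previous_instruction_pc(uintptr_t pc)
{
    return pc - 1;
}

/* Pre-adjust a signal-frame pc so that the later, unconditional
   previous_instruction_pc() call yields the original pc again.
   Normal frames (pc_before_insn == 0) are returned unchanged.  */
uintptr_t compensate_signal_pc(uintptr_t pc, int pc_before_insn)
{
    if (pc_before_insn)
        pc += pc - previous_instruction_pc(pc);
    return pc;
}
```

With the faulting pc 0x100083a reported elsewhere in this thread, the pre-adjusted value is 0x100083b, and applying the decrement afterwards gives 0x100083a again. Writing the adjustment as `pc - previous_instruction_pc(pc)` avoids hard-coding the size of the decrement.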

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #63 from Jakub Jelinek  ---
(In reply to Jakub Jelinek from comment #61)
> It is true that libasan calls just _Unwind_GetIP rather than
> _Unwind_GetIPInfo,
> but I don't see where there is that subtraction of 1, so it shouldn't matter;

Ah, it is in StackTrace::GetPreviousInstructionPc.  Unfortunately it is used at
the point where the context is no longer available.  So, at least when using
the DWARF unwinder, it should be done differently.
This needs to be fixed upstream.  It is pure luck this happened to work on all
the other arches.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #62 from Dominik Vogt  ---
(In reply to Jakub Jelinek from comment #61)
> It is true that libasan calls just _Unwind_GetIP rather than
> _Unwind_GetIPInfo,
> but I don't see where there is that subtraction of 1, so it shouldn't matter;
> it seems to record the return address that has been given by the unwinder.

_Unwind_GetIP returns the correct addresses:

(gdb) disas
0x03fff728bb08 <+40>:   brasl   %r14,0x3fff71a6458 <_Unwind_GetIP@plt>
0x03fff728bb0e <+46>:   larl    %r10,0x3fff773e758 <_ZN11__sanitizer14PageSizeCachedE>
(gdb) b *0x03fff728bb0e
(gdb) display /x $r2
(gdb) c
...
1: /x $r2 = 0x100083a
1: /x $r2 = 0x10006c2
1: /x $r2 = 0x3fff6e2306a
1: /x $r2 = 0x100073e

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #61 from Jakub Jelinek  ---
(In reply to Florian Weimer from comment #58)
> (In reply to Dominik Vogt from comment #57)
> > libsanitizer miscalculates the Pcs in the backtrace:
> > 
> > #0 0x1000839 in NullDeref
> > #1 0x10006c1 in main
> > #2 0x3fff6e23069 in __libc_start_main
> > #3 0x100073d
> > 
> > These are all odd addresses, pointing to the last byte of the previous
> > instruction.  In case of null-deref-1.c that byte belongs to some
> > instrumentation code that is associated with line 11.
> 
> The address decrement should only happen for call instructions.  This thread
> has some background how this is supposed to work:
> 
>   https://gcc.gnu.org/ml/gcc/2016-10/msg00165.html
>   https://gcc.gnu.org/ml/gcc/2016-10/msg00170.html

??  We have the "S" .eh_frame augmentation character for that, and that is the
reason why _Unwind_GetIPInfo has been added: it tells you whether it is
something that is a call or not, and thus whether you need to subtract 1 to get
back into the instruction (it doesn't have to be start of the instruction for
unwinding purposes) or not.
It is true that libasan calls just _Unwind_GetIP rather than _Unwind_GetIPInfo,
but I don't see where there is that subtraction of 1, so it shouldn't matter;
it seems to record the return address that has been given by the unwinder.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #60 from Ulrich Weigand  ---
... well, as Florian said as well :-)

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #59 from Ulrich Weigand  ---
(In reply to Dominik Vogt from comment #57)
> libsanitizer miscalculates the Pcs in the backtrace:
> 
> #0 0x1000839 in NullDeref
> #1 0x10006c1 in main
> #2 0x3fff6e23069 in __libc_start_main
> #3 0x100073d
> 
> These are all odd addresses, pointing to the last byte of the previous
> instruction.  In case of null-deref-1.c that byte belongs to some
> instrumentation code that is associated with line 11.

Normally you should decrement the return address by one for normal frames (in
order to identify the call instruction), but you should not decrement the
return address for signal frames (since the address already identifies the
faulting instruction).

That's why there's usually a bit to distinguish signal frames from normal
frames during unwinding.  Maybe this somehow doesn't work correctly with the
libsanitizer unwinding?
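
The rule described here can be sketched in C. This is an illustration only: `frame_kind` and `lookup_pc` are hypothetical names, not libsanitizer or libgcc API, and the one-byte decrement is the convention discussed in this thread.

```c
#include <stdint.h>

/* Hypothetical frame classification, for illustration only. */
enum frame_kind { NORMAL_FRAME, SIGNAL_FRAME };

/* Return the address to use for symbol and line lookup.
   For a normal frame, pc is a return address, so step back one byte to
   land inside the call instruction.  For a signal frame, pc already
   identifies the faulting instruction and is used unchanged.  */
uintptr_t lookup_pc(uintptr_t pc, enum frame_kind kind)
{
    return kind == SIGNAL_FRAME ? pc : pc - 1;
}
```

Under this rule a return address of 0x10006c2 in main is looked up as 0x10006c1, while a faulting pc of 0x100083a in NullDeref would stay 0x100083a rather than becoming the odd 0x1000839 seen in the backtrace.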

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread fw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #58 from Florian Weimer  ---
(In reply to Dominik Vogt from comment #57)
> libsanitizer miscalculates the Pcs in the backtrace:
> 
> #0 0x1000839 in NullDeref
> #1 0x10006c1 in main
> #2 0x3fff6e23069 in __libc_start_main
> #3 0x100073d
> 
> These are all odd addresses, pointing to the last byte of the previous
> instruction.  In case of null-deref-1.c that byte belongs to some
> instrumentation code that is associated with line 11.

The address decrement should only happen for call instructions.  This thread
has some background how this is supposed to work:

  https://gcc.gnu.org/ml/gcc/2016-10/msg00165.html
  https://gcc.gnu.org/ml/gcc/2016-10/msg00170.html

Here's my attempt to clarify this for the x86-64 ABI:

  https://www.sourceware.org/ml/gnu-gabi/2016-q4/msg00012.html

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #57 from Dominik Vogt  ---
libsanitizer miscalculates the Pcs in the backtrace:

#0 0x1000839 in NullDeref
#1 0x10006c1 in main
#2 0x3fff6e23069 in __libc_start_main
#3 0x100073d

These are all odd addresses, pointing to the last byte of the previous
instruction.  In case of null-deref-1.c that byte belongs to some
instrumentation code that is associated with line 11.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #56 from Dominik Vogt  ---
null-deref-1.c fails because the test expects this message in source line 10
but gets it for line 11:

#0 0x1000853 in NullDeref .../c-c++-common/asan/null-deref-1.c:11
  

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #55 from Dominik Vogt  ---
(In reply to Dominik Vogt from comment #53)
> no fails with -m31; with -m64 null-deref-1.c fails with c and
> c++, and memcmp-1.c with c++ only.

memcmp-1.c is not reproducible.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #54 from Jakub Jelinek  ---
Author: jakub
Date: Mon Feb 13 23:09:09 2017
New Revision: 245411

URL: https://gcc.gnu.org/viewcvs?rev=245411&root=gcc&view=rev
Log:
PR sanitizer/79341
* c-c++-common/ubsan/float-cast-overflow-8.c (TEST): Make min and max
variables volatile.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-13 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #53 from Dominik Vogt  ---
(In reply to Dominik Vogt from comment #51)
> With r245382 plus the patch from comment 43, only the failure in
> null-deref-1.c is left.

Ah, not quite; no fails with -m31; with -m64 null-deref-1.c fails with c and
c++, and memcmp-1.c with c++ only.  Was any of this supposed to be fixed? 
This thread has become a bit confusing.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-13 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

Dominik Vogt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79487

--- Comment #52 from Dominik Vogt  ---
New bug report for the _Decimal32 problem:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79487

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-13 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #51 from Dominik Vogt  ---
With r245382 plus the patch from comment 43, only the failure in null-deref-1.c
is left.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #50 from Jakub Jelinek  ---
Author: jakub
Date: Sat Feb 11 18:38:11 2017
New Revision: 245361

URL: https://gcc.gnu.org/viewcvs?rev=245361&root=gcc&view=rev
Log:
PR sanitizer/79341
* g++.dg/asan/deep-stack-uaf-1.C: New test.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/asan/deep-stack-uaf-1.C

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #49 from Jakub Jelinek  ---
On the other hand, we don't turn on -fno-omit-frame-pointer or
-mno-omit-leaf-frame-pointer for -fsanitize=address on other targets either, so
perhaps this is just a documentation issue.  I'll add -mbackchain to this
testcase.  Perhaps we should just mention those flags in the -fsanitize=address
documentation in invoke.texi.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #48 from Ulrich Weigand  ---
s390(x) has -fasynchronous-unwind-tables on by default anyway, and .eh_frame
based DWARF unwinding is the only way to create stack backtraces that always
works.

However, I understood that asan deliberately doesn't want to use DWARF
unwinding for the malloc/free case since it can be slow.  That's why Marcin
actually added -mbackchain to LLVM in the first place.  (We've had -mbackchain
in GCC forever, but it has defaulted to off for a very long time.)

I don't think we should switch to *always* using backchain unwinding in asan,
since system libraries on s390 will be built without backchain.  However,
switching -mbackchain on by default when building for asan might make sense.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #47 from Jakub Jelinek  ---
Seems clang doesn't default to -mbackchain for -fsanitize=address, they just
force it on when testing:
if config.target_arch == 's390x':
  clang_asan_static_cflags.append("-mbackchain")

So, if we just want to go that route, we could add to deep-stack-uaf-1.C
// { dg-additional-options "-mbackchain" { target { s390*-*-* } } }

It is of course not very kind to users that would need to add it manually if
they want accurate backtraces for malloc/free.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #46 from Jakub Jelinek  ---
Or shall we use -mbackchain for -fsanitize=address by default and tweak the
unwinding code sanitizer_common/sanitizer_stacktrace.{cc,h} to use the
backchain?
AFAIK libsanitizer uses the .eh_frame unwinding for printing error dumps but
the fast unwinding (through frame pointer) by default when capturing backtraces
of malloc and free.  Always using .eh_frame would be done by defining
SANITIZER_CAN_FAST_UNWIND to 0 in sanitizer_common/sanitizer_stacktrace.h if
__s390__ (e.g. sparc and mips do this).  Wonder if LLVM emits backchain by
default or what.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #45 from Jakub Jelinek  ---
deep-stack-uaf*.C failure is presumably because the fast unwind (one that
doesn't use .eh_frame unwind info) isn't working properly.
But I'm afraid I don't know enough about s390{,x} to debug that.
E.g. on following testcase with -O2 -fno-omit-frame-pointer I get:
void foo (char *);

int
bar (char *p)
{
  foo (p);
  return 1;
}

int
baz (char *p)
{
  char a[64];
  foo (a);
  return 1;
}

stmg    %r11,%r15,88(%r15)
aghi    %r15,-160
lgr     %r11,%r15
brasl   %r14,foo
lg      %r4,272(%r11)
lghi    %r2,1
lmg     %r11,%r15,248(%r11)
br      %r4

for bar and

stmg    %r11,%r15,88(%r15)
aghi    %r15,-224
lgr     %r11,%r15
la      %r2,160(%r11)
brasl   %r14,foo
lg      %r4,336(%r11)
lghi    %r2,1
lmg     %r11,%r15,312(%r11)
br      %r4

for baz.  Frame pointer is $r11, stack pointer is $r15, if say in foo I ask for
frame pointer, I can easily get at $r15 from the caller (bar or baz), but how
do I get from there to the location where the outer function's $r15 is stored
at?  It is at offset 160+120 in one function and 224+120 in another (and the
stored memory value doesn't tell much, it can be always computed from the
memory location where it is stored).
So, is non-unwind info backtrace not possible on s390{,x}?  If yes, we should
disable the fast unwinding and maybe enable -fasynchronous-unwind-tables by
default on s390{,x}-linux at least when using -fsanitize=address?
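
The offsets in the two listings follow from the `stmg` in the prologue. The sketch below uses a hypothetical helper, `saved_reg_offset`, to reproduce them; it assumes only what the listings show, namely that %r11 through %r15 are stored in consecutive 8-byte slots starting at 88(%r15) before %r15 is lowered by the frame size.

```c
#include <stdint.h>

/* Offset of the save slot for register `reg` (11..15), relative to %r15
   after `aghi %r15,-frame_size`.  `stmg %r11,%r15,88(%r15)` runs before
   that adjustment, so it stores %r11 at 88, %r12 at 96, ..., %r15 at 120
   relative to the incoming value of %r15.  */
long saved_reg_offset(int reg, long frame_size)
{
    return frame_size + 88 + 8L * (reg - 11);
}
```

For bar (frame size 160) this gives 248 for %r11, 272 for %r14 and 280 for %r15; for baz (frame size 224) it gives 312, 336 and 344. The slot for the saved %r15 therefore moves with the frame size, which is exactly the information a walker without unwind info does not have.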

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #44 from Jakub Jelinek  ---
Author: jakub
Date: Fri Feb 10 23:34:49 2017
New Revision: 245350

URL: https://gcc.gnu.org/viewcvs?rev=245350&root=gcc&view=rev
Log:
PR sanitizer/79341
* configure.tgt (s390*-*-linux*): Don't disable libsanitizer on
s390-linux 31-bit.
* sanitizer_common/sanitizer_internal_defs.h: Cherry-pick upstream
r294793.
* sanitizer_common/sanitizer_common_interceptors.inc: Cherry-pick
upstream r294790.
* sanitizer_common/sanitizer_linux_s390.cc: Cherry-pick upstream
r294799.

Modified:
trunk/libsanitizer/ChangeLog
trunk/libsanitizer/configure.tgt
trunk/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
trunk/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
trunk/libsanitizer/sanitizer_common/sanitizer_linux_s390.cc

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #43 from Jakub Jelinek  ---
Ah, so, if I build with -O0, I always get the expected errors.
If I build with -O2 -mcpu=z9-109, I also get them, but with -O2 -mcpu=z10 or
-O2 -mcpu=zEC12 I don't.
Does _Decimal32 on s390{,x} behave similarly to float/double on i387, that
computations and comparisons can be performed in bigger precision than their
types?
--- gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c.jj 2015-10-29 09:14:30.0 +0100
+++ gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c    2017-02-10 18:09:47.767251774 +0100
@@ -8,7 +8,7 @@
 #define TEST(type1, type2) \
   if (type1##_MIN) \
     { \
-      type2 min = type1##_MIN; \
+      volatile type2 min = type1##_MIN; \
       type2 add = -1.0; \
       while (1) \
         { \
@@ -28,7 +28,7 @@
       volatile type1 tem3 = cvt_##type1##_##type2 (-1.0f); \
     } \
   { \
-    type2 max = type1##_MAX; \
+    volatile type2 max = type1##_MAX; \
     type2 add = 1.0; \
     while (1) \
       { \

seems to be a workaround, the test primarily cares about the actual
conversions, not how those values are reached, so it isn't against the intent
of the test.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #42 from Dominik Vogt  ---
With glibc-2.18 and the various patches, the following tests fail:

-m31:
 * deep-stack-uaf-1.C

-m64:
 * null-deref-1.c
 * deep-stack-uaf-1.C
 * overflow-vec-1.c
 * overflow-vec-2.c
 * float-cast-overflow-10.c

I.e. the same as with glibc-2.23.  At least this part of the problems is
solved.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #41 from Dominik Vogt  ---
> The first loop loops until add is -1.00E+12, at which point for the
> first time tem is -9.223373E+18 and thus different from -9.223372E+18, and
> -9.223373E+18 should not be representable in signed long.
> Do you perhaps use HW dfp rather than software emulation?

Well, just what the test driver used:

 ... -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects 
-fsanitize=float-cast-overflow -fsanitize-recover=float-cast-overflow
-DUSE_INT128 -DUSE_DFP -DBROKEN_DECIMAL_INT128  -lm   -m64 ...

When the comparison is done in main, the values "min" and "tem" have 64-Bit
precision.  The actual comparison is

  if (tem.0_1 != -9223372036854775808)

Which is true because that value doesn't fit in a _Decimal32.  The if body is
executed, and "tem" is converted to 32 bit format and stored in %f0.  Gdb says
that the converted value is exactly the same as the value of "min", and that
seems to be the cause of the test failure.

In assembly:
ste     %f2,160(%r15)   < store "tem" on stack
le      %f2,160(%r15)   < load "tem" from stack
ldetr   %f2,%f2,0       < convert "short" dfp value to "long"
cdtr    %f2,%f4         < compare with "min"
je      .L33
le      %f0,160(%r15)   < reload "tem"
brasl   %r14,cvt_sl_d32

This must look different for you.  Now, why does the test fail for me but not
for you?

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #40 from Jakub Jelinek  ---
(In reply to Dominik Vogt from comment #38)
> (And if it does generate messages, does it take the if or the else bodies? 
> For me it's the if-bodies.)

/home/jakub/gcc/obj/gcc/xgcc -B/home/jakub/gcc/obj/gcc/ fco.c
-B/home/jakub/gcc/obj/s390x-ibm-linux-gnu/./libsanitizer/
-B/home/jakub/gcc/obj/s390x-ibm-linux-gnu/32/libsanitizer/ubsan/
-L/home/jakub/gcc/obj/s390x-ibm-linux-gnu/32/libsanitizer/ubsan/.libs
-fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -Wno-psabi
-fsanitize=float-cast-overflow -fsanitize-recover=float-cast-overflow -lm -o
./fco.exe -g -m31
[jakub@devel4 testsuite]$
LD_LIBRARY_PATH=/home/jakub/gcc/obj/s390x-ibm-linux-gnu/32/libsanitizer/ubsan/.libs:/home/jakub/gcc/obj/s390x-ibm-linux-gnu/32/libgcc/32/
./fco.exe
fco.c:4:3: runtime error: value  is outside the range of representable
values of type 'long int'
fco.c:9:3: runtime error: value  is outside the range of representable
values of type 'long long int'

What do you mean by if bodies?  (-0x7fffL - 1L) is non-zero, so
obviously the else bodies aren't reached.
The first loop loops until add is -1.00E+12, at which point for the first
time tem is -9.223373E+18 and thus different from -9.223372E+18, and
-9.223373E+18 should not be representable in signed long.
Do you perhaps use HW dfp rather than software emulation?

In *.optimized dump I see:
  x.0_8 = x_7(D);
  _1 = x.0_8 u<= -9.223373E+18;
  _2 = x.0_8 u>= 9.223373E+18;
  _3 = _1 | _2;
  if (_3 != 0)
goto ; [0.00%]
  else
goto ; [0.00%]

   [0.00%]:
  _4 = VIEW_CONVERT_EXPR(x.0_8);
  _5 = (unsigned long) _4;
  __builtin___ubsan_handle_float_cast_overflow (&*.Lubsan_data0, _5);

   [0.00%]:
  _11 = (long int) x.0_8;

 [0.00%]:
  return _11;
which looks also correct to me.
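
The values quoted in this comment can be checked with plain integer arithmetic. The sketch below is a simplified model, not how GCC or the DFP runtime computes anything: it treats _Decimal32 as a 7-significant-digit decimal coefficient, rounds ties to even, and ignores the exponent range. `round_to_7_digits` and `negative_fits_in_int64` are hypothetical helper names.

```c
#include <stdint.h>

/* Round v to at most 7 significant decimal digits, ties to even.
   Models only the coefficient width of _Decimal32.  */
uint64_t round_to_7_digits(uint64_t v)
{
    uint64_t scale = 1;
    while (v / scale >= 10000000ULL)   /* more than 7 digits remain */
        scale *= 10;
    uint64_t q = v / scale;
    uint64_t r = v % scale;
    if (scale > 1) {
        uint64_t half = scale / 2;
        if (r > half || (r == half && (q & 1)))
            q++;
    }
    return q * scale;
}

/* Does the value -magnitude fit in a signed 64-bit integer?  */
int negative_fits_in_int64(uint64_t magnitude)
{
    return magnitude <= (uint64_t)INT64_MAX + 1;
}
```

In this model the magnitude of LONG_MIN, 9223372036854775808, rounds to 9223372000000000000 (that is, 9.223372E+18), which still fits. One step of 1E+12 further gives 9223373000000000000 (9.223373E+18), which does not fit, matching the bound in the *.optimized dump.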

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #39 from Jakub Jelinek  ---
For overflow-vec-*.c moved this to PR79454.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #38 from Dominik Vogt  ---
(And if it does generate messages, does it take the if or the else bodies?  For
me it's the if-bodies.)

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #37 from Jakub Jelinek  ---
The overflow-vec-1.c and vec-2.c on -m64 fail also on ppc64{,le}.
Minimum failing testcase is:
#define SCHAR_MAX __SCHAR_MAX__
#define SCHAR_MIN (-__SCHAR_MAX__ - 1)
typedef signed char VC __attribute__((vector_size (16)));
void __attribute__((noinline,noclone))
checkvc (VC i, VC j)
{
  if (__builtin_memcmp (&i, &j, sizeof (VC)))
    __builtin_abort ();
}
int
main (void)
{
  /* Check that for a vector operation, only the first element with UB is
reported.  */
  volatile VC a = (VC) { 0, SCHAR_MAX - 2, SCHAR_MAX - 2, 3, 2, 3, 4, 5,  0, 7,
 1,  2,  3, 4,  SCHAR_MAX - 13, SCHAR_MAX };
  volatile VC b = (VC) { 5, 2, 1, 5, 0, 1, 2, 7,  8, 9,
 10, 11, 6, -2, 13, 0 };
  volatile VC k = b + a;
  checkvc (k, (VC) { 5, SCHAR_MAX, SCHAR_MAX - 1, 8, 2, 4, 6, 12, 8,
16, 11, 13, 9, 2,  SCHAR_MAX,  SCHAR_MAX });
  return 0;
}
Looking into that now.
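
As a cross-check of the testcase's arithmetic, the lane sums can be computed with scalar code. This is a sketch with hypothetical names (`vec_a`, `vec_b`, `first_overflowing_lane`); it uses plain arrays instead of the vector extension and assumes nothing about the cause of the failure.

```c
#include <limits.h>
#include <stddef.h>

#define N 16

/* Operands copied from the testcase above. */
const signed char vec_a[N] = { 0, SCHAR_MAX - 2, SCHAR_MAX - 2, 3, 2, 3, 4, 5,
                               0, 7, 1, 2, 3, 4, SCHAR_MAX - 13, SCHAR_MAX };
const signed char vec_b[N] = { 5, 2, 1, 5, 0, 1, 2, 7, 8, 9,
                               10, 11, 6, -2, 13, 0 };

/* Return the index of the first lane whose sum leaves the signed char
   range, or -1 if every lane fits.  Sums are computed in int.  */
int first_overflowing_lane(const signed char *a, const signed char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int sum = a[i] + b[i];
        if (sum > SCHAR_MAX || sum < SCHAR_MIN)
            return (int)i;
    }
    return -1;
}
```

For these operands no lane overflows, and the sums equal the vector passed to checkvc (for example lane 1 is SCHAR_MAX and lane 2 is SCHAR_MAX - 1), so the expected values in the testcase are consistent with the inputs.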

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #36 from Dominik Vogt  ---
Created attachment 40711
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40711&action=edit
Reduced test for float-cast-overflow-10.c

Test for the float-cast-overflow-10.c failure.

This snippet should detect that _Decimal32 doesn't fit in a signed 64-bit
integer (either signed long or signed long long).  Test uses "-m64 -O2
-fsanitize=float-cast-overflow -fsanitize-recover=float-cast-overflow".

If you compile and execute this preprocessed file, does ubsan generate messages
or not?

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #35 from Jakub Jelinek  ---
I've filed https://reviews.llvm.org/D29824 and https://reviews.llvm.org/D29825
upstream.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #34 from Dominik Vogt  ---
(In reply to Jakub Jelinek from comment #33)
> (In reply to Dominik Vogt from comment #32)
> > On a machine with
> >  * glibc-2.23
> 
> :(; I was hoping you could test #c24 patch against glibc 2.18

I'll eventually do that, but the colleagues wanted to be nice and replaced 2.18
on the machine with 2.23, so I have to look for an alternative first.

> I can't reproduce float-cast-overflow-10.c.

I'll try your patch and take a look at float-cast-overflow-10.c.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #33 from Jakub Jelinek  ---
(In reply to Dominik Vogt from comment #32)
> On a machine with
>  * glibc-2.23

:(; I was hoping you could test #c24 patch against glibc 2.18

>  * kernel 4.4.0 + patch for the CVE
>  * CVE environment variable set to allow running the Asan tests

Yeah, that is a workaround if you have a known fixed kernel.

Most of the issues can be fixed with:

--- libsanitizer/configure.tgt.jj   2017-01-31 14:49:14.0 +0100
+++ libsanitizer/configure.tgt  2017-02-10 13:29:44.571294678 +0100
@@ -40,9 +40,6 @@ case "${target}" in
   sparc*-*-linux*)
;;
   s390*-*-linux*)
-   if test x$ac_cv_sizeof_void_p = x4; then
-   UNSUPPORTED=1
-   fi
;;
   arm*-*-linux*)
;;
--- libsanitizer/sanitizer_common/sanitizer_internal_defs.h.jj  2016-11-09 15:22:41.0 +0100
+++ libsanitizer/sanitizer_common/sanitizer_internal_defs.h 2017-02-10 13:29:28.359506264 +0100
@@ -287,7 +287,12 @@ void NORETURN CheckFailed(const char *fi
 enum LinkerInitialized { LINKER_INITIALIZED = 0 };

 #if !defined(_MSC_VER) || defined(__clang__)
-# define GET_CALLER_PC() (uptr)__builtin_return_address(0)
+# if SANITIZER_S390_31
+#  define GET_CALLER_PC() \
+  (uptr)__builtin_extract_return_addr(__builtin_return_address(0))
+# else
+#  define GET_CALLER_PC() (uptr)__builtin_return_address(0)
+# endif
 # define GET_CURRENT_FRAME() (uptr)__builtin_frame_address(0)
 inline void Trap() {
   __builtin_trap();

With this, the only FAILs I'm seeing on asan.exp and ubsan.exp are (various opt
levels):
-m64 only:
FAIL: c-c++-common/ubsan/overflow-vec-1.c
FAIL: c-c++-common/ubsan/overflow-vec-2.c
both -m64 and -m31:
FAIL: g++.dg/asan/deep-stack-uaf-1.C

I can't reproduce float-cast-overflow-10.c.
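
For context on the GET_CALLER_PC hunk above: on 31-bit s390 the value
returned by __builtin_return_address carries the addressing-mode bit in its
top bit, and __builtin_extract_return_addr is the GCC builtin that strips
such target-specific bits.  A minimal sketch of the same pattern outside
libsanitizer (caller_pc is a hypothetical helper name):

```c
#include <stdint.h>

/* Hypothetical helper, not part of libsanitizer.  Returns the address the
   current function will return to, with any target-specific bits removed.
   On targets that need no fix-up the second builtin returns its argument
   unchanged. */
__attribute__((noinline))
uintptr_t caller_pc(void)
{
    void *raw = __builtin_return_address(0);
    return (uintptr_t)__builtin_extract_return_addr(raw);
}
```

Wrapping the raw value this way should be harmless on 64-bit s390x and other
targets, which is presumably why the patch only needs the extra call under
SANITIZER_S390_31.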

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #32 from Dominik Vogt  ---
On a machine with
 * glibc-2.23
 * kernel 4.4.0 + patch for the CVE
 * CVE environment variable set to allow running the Asan tests
 * patch from comment 24 applied

=>

In addition to the FAILs you've listed here:
https://gcc.gnu.org/ml/gcc-patches/2017-01/msg01814.html

(same test failing with different options listed only once)

Running target unix/-m31 
FAIL: c-c++-common/asan/memcmp-1.c   -O0  output pattern test, is
=
FAIL: c-c++-common/asan/misalign-1.c   -O2  output pattern test, is
= 
FAIL: c-c++-common/asan/misalign-2.c   -O2  output pattern test, is
= 
FAIL: c-c++-common/asan/strlen-overflow-1.c   -O0  output pattern test, is
= 
FAIL: c-c++-common/asan/strncpy-overflow-1.c   -O0  output pattern test, is
= 
...
Running target unix/-m64 
FAIL: c-c++-common/asan/null-deref-1.c   -O2  output pattern test, is
ASAN:DEADLYSIGNAL 
...
FAIL: c-c++-common/ubsan/float-cast-overflow-10.c   -O2  output pattern test,
is c-c++-common/ubsan/float-cast-overflow-7.h:147:1: runtime error: value
 is outside the range of representable values of \
type 'signed char' 
...
=== g++ tests === 


Running target unix/-m31 
FAIL: c-c++-common/asan/memcmp-1.c   -O0  output pattern test, is
= 
FAIL: c-c++-common/asan/misalign-1.c   -O2  output pattern test, is
= 
FAIL: c-c++-common/asan/misalign-2.c   -O2  output pattern test, is
= 
FAIL: c-c++-common/asan/strlen-overflow-1.c   -O0  output pattern test, is
= 
FAIL: g++.dg/asan/deep-tail-call-1.C   -O0  output pattern test, is
= 
...
Running target unix/-m64 
FAIL: c-c++-common/ubsan/float-cast-overflow-10.c   -O2  output pattern test,
is c-c++-common/ubsan/float-cast-overflow-7.h:147:1: runtime error: value
 is outside the range of representable values of \
type 'signed char' 

--

So, actually two more problems?

1) *san is not disabled with -m31 as it should be(?)
2) ubsan/float-cast-overflow-10.c

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-09 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-02-09
     Ever confirmed|0                           |1

--- Comment #31 from Jakub Jelinek  ---
(In reply to Jakub Jelinek from comment #24)
> Created attachment 40693 [details]
> gcc7-pr79341.patch
> 
> Does the attached patch work for you?  Only tested on s390x-linux (64-bit). 
> The intent is that while __tls_get_addr_internal is intercepted, both
> __tls_get_offset and __tls_get_addr_internal interceptors actually call
> original real __tls_get_offset, so it should work both with old and new
> glibc.

I've now filed that patch upstream: https://reviews.llvm.org/D29735
Feel free to comment on it there (especially if you'll be able to test it).

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #30 from Jakub Jelinek  ---
Following patch adds the RHEL{6,7} errata kernels to the whitelist.
SUSE and/or Debian would need to add theirs if they want.

--- libsanitizer/sanitizer_common/sanitizer_linux_s390.cc   2016-11-09 15:22:41.917353799 +0100
+++ libsanitizer/sanitizer_common/sanitizer_linux_s390.cc   2017-02-08 15:36:44.335857387 +0100
@@ -134,6 +134,18 @@ static bool FixedCVE_2016_2143() {
   if (ptr[0] == '.')
 patch = internal_simple_strtoll(ptr+1, &ptr, 10);
   if (major < 3) {
+if (major == 2 && minor == 6 && patch == 32 && ptr[0] == '-'
+   && internal_strstr(ptr, ".el6")) {
+  // Check RHEL6
+  int r1 = internal_simple_strtoll(ptr+1, &ptr, 10);
+  if (r1 >= 657) // 2.6.32-657.el6 or later
+   return true;
+  if (r1 == 642 && ptr[0] == '.') {
+   int r2 = internal_simple_strtoll(ptr+1, &ptr, 10);
+   if (r2 >= 9) // 2.6.32-642.9.1.el6 or later
+ return true;
+  }
+}
 // <3.0 is bad.
 return false;
   } else if (major == 3) {
@@ -143,6 +155,18 @@ static bool FixedCVE_2016_2143() {
 // 3.12.58+ is OK.
 if (minor == 12 && patch >= 58)
   return true;
+if (minor == 10 && patch == 0 && ptr[0] == '-'
+   && internal_strstr(ptr, ".el7")) {
+  // Check RHEL7
+  int r1 = internal_simple_strtoll(ptr+1, &ptr, 10);
+  if (r1 >= 426) // 3.10.0-426.el7 or later
+   return true;
+  if (r1 == 327 && ptr[0] == '.') {
+   int r2 = internal_simple_strtoll(ptr+1, &ptr, 10);
+   if (r2 >= 27) // 3.10.0-327.27.1.el7 or later
+ return true;
+  }
+}
 // Otherwise, bad.
 return false;
   } else if (major == 4) {
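
To try the RHEL checks from the patch above against sample `uname -r`
strings without building libsanitizer, something like the following
stand-alone sketch could be used.  It is not the libsanitizer code: the
function name is hypothetical, it uses strtol instead of
internal_simple_strtoll, and it covers only the two RHEL branches with the
thresholds taken from the patch.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-alone helper.  Returns true if `release` names a
   RHEL 6 or RHEL 7 kernel that the patch above would accept. */
static bool rhel_kernel_accepted(const char *release)
{
    char *p;
    long major = strtol(release, &p, 10);
    if (*p != '.') return false;
    long minor = strtol(p + 1, &p, 10);
    if (*p != '.') return false;
    long patch = strtol(p + 1, &p, 10);
    if (*p != '-') return false;            /* distro suffix must follow */

    const char *tag;
    long full, z_base, z_min;               /* thresholds from the patch */
    if (major == 2 && minor == 6 && patch == 32) {
        tag = ".el6"; full = 657; z_base = 642; z_min = 9;
    } else if (major == 3 && minor == 10 && patch == 0) {
        tag = ".el7"; full = 426; z_base = 327; z_min = 27;
    } else {
        return false;                       /* not a RHEL 6/7 base version */
    }
    if (strstr(p, tag) == NULL) return false;

    long r1 = strtol(p + 1, &p, 10);        /* e.g. 642 in 2.6.32-642.11.1 */
    if (r1 >= full) return true;
    if (r1 == z_base && *p == '.') {
        long r2 = strtol(p + 1, &p, 10);    /* e.g. 11 in 2.6.32-642.11.1 */
        return r2 >= z_min;
    }
    return false;
}
```

Under these thresholds, "2.6.32-642.11.1.el6.s390x" and
"3.10.0-327.28.2.el7.s390x" are accepted, while "2.6.32-642.6.2.el6.s390x"
is rejected.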

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #29 from Dominik Vogt  ---
$ uname -s -r
Linux 4.2.0-20151029.0.65fcf15.5a12af1.fc20.s390xperformance

I'm quite sure we had a working kernel on that machine at some time because I
seem to remember being the first one bitten by that kernel bug.  Anyway, the
machine is very busy at the moment, so the upgrade has to wait for a couple
of days.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #28 from Jakub Jelinek  ---
The bug has been introduced in
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/s390/include/asm/mmu_context.h?id=6252d702c5311ce916caf75ed82e5c8245171c92
so I assume kernels 2.6.24 and earlier should be ok (unless some distro
backported the dynamic s390 page tables).  Though maybe kernels that old just
don't support 4-level page tables at all.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #27 from Jakub Jelinek  ---
The function says:
// 3.2.79+ is OK.
// 3.12.58+ is OK.
// Otherwise, bad.
// 4.1.21+ is OK.
// 4.4.6+ is OK.
// Otherwise, OK if 4.5+.
// Linux 5 and up are fine.
Perhaps it would be useful to check the distro kernels (RHEL, SUSE) and,
if there are fixes for this, adjust that function.
E.g. for RHEL{6,7} it seems the fixes came in
2.6.32-642.11.1.el6 for RHEL6 and 3.10.0-327.28.2.el7 for RHEL7.
The errata claims RHEL5 kernels are not affected, so it is strange that the
function doesn't accept very old kernels.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #26 from Dominik Vogt  ---
(We cannot upgrade the kernel before end of this or beginning of next week.)

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #25 from Dominik Vogt  ---
Looks better, but now we get this quite often:

--
==23722==ERROR: Your kernel seems to be vulnerable to CVE-2016-2143.  Using ASan,
MSan, TSan, DFSan or LSan with such kernel can and will crash your
machine, or worse.
--

I'll try to figure out what kernel version we need.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-08 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #24 from Jakub Jelinek  ---
Created attachment 40693
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40693&action=edit
gcc7-pr79341.patch

Does the attached patch work for you?  Only tested on s390x-linux (64-bit). 
The intent is that while __tls_get_addr_internal is intercepted, both
__tls_get_offset and __tls_get_addr_internal interceptors actually call
original real __tls_get_offset, so it should work both with old and new glibc.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #23 from Jakub Jelinek  ---
(In reply to Ulrich Weigand from comment #22)
> (In reply to Jakub Jelinek from comment #21)
> > Could libsanitizer call __tls_get_offset instead, after setting %r12 or
> > whatever else is needed for it to make work and then perhaps adjust the
> > result if needed?
> > E.g. on s390x __tls_get_offset is internally:
> > __tls_get_offset:\n\
> > la  %r2,0(%r2,%r12)\n\
> > jg  __tls_get_addr\n\
> > and in the interceptor:
> > #ifdef __s390x__
> >   "la %r2, 0(%r2,%r12)\n"
> >   "jg __interceptor___tls_get_addr_internal_protected\n"
> > #else
> > at which point the original %r2 and %r12 is lost and it is hard to call the
> > original __tls_get_offset, it might be better to pass the original %r2 and
> > %r12 values to some C function and from that compute the r2 + r12 the code
> > perhaps needs for its own thing, but then we could (again in assembly) call
> > the original __tls_get_offset again if needed.
> 
> Yes, it would appear to be safer to call __tls_get_offset instead.
> You probably do not even need the original %r12, but simply subtract
> %r12 (whatever it currently is) from %r2 before calling the original
> __tls_get_offset.  The value of %r12 is not used for anything except
> adding it to %r2.

If it is that easy, then perhaps glibc should drop __tls_get_addr_internal and
just call __tls_get_offset that way internally too?
Otherwise, if we still wrap __tls_get_addr_internal (perhaps for compatibility
reasons) and just make sure we call __tls_get_offset instead on s390{,x} with
the adjusted argument, we could support even older glibcs.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread uweigand at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #22 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #21)
> Could libsanitizer call __tls_get_offset instead, after setting %r12 or
> whatever else is needed for it to make work and then perhaps adjust the
> result if needed?
> E.g. on s390x __tls_get_offset is internally:
> __tls_get_offset:\n\
> la  %r2,0(%r2,%r12)\n\
> jg  __tls_get_addr\n\
> and in the interceptor:
> #ifdef __s390x__
>   "la %r2, 0(%r2,%r12)\n"
>   "jg __interceptor___tls_get_addr_internal_protected\n"
> #else
> at which point the original %r2 and %r12 is lost and it is hard to call the
> original __tls_get_offset, it might be better to pass the original %r2 and
> %r12 values to some C function and from that compute the r2 + r12 the code
> perhaps needs for its own thing, but then we could (again in assembly) call
> the original __tls_get_offset again if needed.

Yes, it would appear to be safer to call __tls_get_offset instead.
You probably do not even need the original %r12, but simply subtract
%r12 (whatever it currently is) from %r2 before calling the original
__tls_get_offset.  The value of %r12 is not used for anything except
adding it to %r2.
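
The arithmetic behind that suggestion can be modelled in C as follows.  This
is a toy model only: registers are passed as plain integers, and all three
functions are hypothetical stand-ins rather than glibc or libsanitizer
symbols.

```c
#include <stdint.h>

/* Stand-in for the real __tls_get_addr; the body is arbitrary and only
   needs to be deterministic for the comparison below. */
static uintptr_t model_tls_get_addr(uintptr_t ti_addr)
{
    return ti_addr ^ 0x5a5a;
}

/* Models the glibc stub quoted above:
     la %r2,0(%r2,%r12)
     jg __tls_get_addr                                            */
static uintptr_t model_tls_get_offset(uintptr_t r2, uintptr_t r12)
{
    return model_tls_get_addr(r2 + r12);
}

/* Models an interceptor that has already formed r2 + r12 and now wants to
   reach the original stub: subtracting the current %r12 first means the
   stub's own addition restores the same sum. */
static uintptr_t model_interceptor(uintptr_t sum, uintptr_t r12)
{
    return model_tls_get_offset(sum - r12, r12);
}
```

In this model, model_interceptor(r2 + r12, r12) yields the same result as
model_tls_get_offset(r2, r12) for any r2 and r12, because %r12 is only ever
used as an addend.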

> That said, if asan wants to intercept also what dlsym will internally call,
> then that will not really work.  But does libasan on other targets rely on
> dlsym calling __tls_get_addr internally in those cases?  That would be yet
> another reliance on glibc internals.

As I understand it, they do make that assumption; libsanitizer must get
involved at the point any new TLS data section is allocated.  Since this
allocation may happen as a result of a dlsym call, those cases have to
be intercepted as well.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #21 from Jakub Jelinek  ---
Could libsanitizer call __tls_get_offset instead, after setting %r12 or
whatever else is needed to make it work, and then perhaps adjust the result
if needed?
E.g. on s390x __tls_get_offset is internally:
__tls_get_offset:\n\
la  %r2,0(%r2,%r12)\n\
jg  __tls_get_addr\n\
and in the interceptor:
#ifdef __s390x__
  "la %r2, 0(%r2,%r12)\n"
  "jg __interceptor___tls_get_addr_internal_protected\n"
#else
at which point the original %r2 and %r12 are lost and it is hard to call the
original __tls_get_offset.  It might be better to pass the original %r2 and %r12
values to some C function and from that compute the r2 + r12 the code perhaps
needs for its own thing, but then we could (again in assembly) call the
original __tls_get_offset again if needed.
That said, if asan wants to intercept also what dlsym will internally call,
then that will not really work.  But does libasan on other targets rely on
dlsym calling __tls_get_addr internally in those cases?  That would be yet
another reliance on glibc internals.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread fw at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #20 from Florian Weimer  ---
(In reply to Andreas Krebbel from comment #19)

> As a debugging tool I think asan is a special case also regarding ABI
> compatibility. We probably do not want to export the internal symbol and
> make it part of the ABI because of a single user.

We have users in GCC and LLVM already, and both projects have many downstream
forks.  This means that it's going to be difficult to adjust all users as part
of a system glibc update.

GLIBC_PRIVATE is internal to glibc.  Please do not depend on these symbols. 
They can change or go away at any time.  For example, the prototype of
__libc_res_nsearch changed during a security update.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #19 from Andreas Krebbel  ---
koriakin stands for Marcin Kościelnicki who implemented LLVM sanitizer support
for z as part of a bounty. Ulrich Weigand led the discussions with him. CC'ing
Uli.

My personal opinion is that support of older Glibcs is not important in that
case. All relevant distros for z already have a Glibc with that symbol and asan
is a new feature.

As a debugging tool I think asan is a special case also regarding ABI
compatibility. We probably do not want to export the internal symbol and make
it part of the ABI because of a single user. I think adding a comment to Glibc
code making people aware that this needs to be kept in sync with libsanitizer
would be sufficient.

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #18 from Jakub Jelinek  ---
Plus one thing is interception (which still requires that the private symbol
has stable ABI), and another thing is calling that symbol even when it isn't
called in the original program (which is what must be going on here, because on
glibc 2.18 nothing is calling a symbol which doesn't exist there when not using
libsanitizer).

[Bug sanitizer/79341] Many Asan tests fail on s390

2017-02-03 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dodji at gcc dot gnu.org,
                   |                            |dvyukov at gcc dot gnu.org,
                   |                            |kcc at gcc dot gnu.org
          Component|other                       |sanitizer

--- Comment #17 from Jakub Jelinek  ---
Yeah, I know that, I believe there have been some discussions between glibc and
libsanitizer and I was hoping the libsanitizer maintainers would get it
resolved, but apparently it has not been.