[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-04 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #24 from peien luo  ---
(In reply to Dmitry Vyukov from comment #23)
> Please provide disassembly of the function that contains the PC
> (__gnu_cxx::__normal_iterator...).
> Did we fix any bugs that lead to missed __tsan_func_exit callbacks?
> 
> Before we go any deeper, I would suggest to retest with the latest gcc.
> There might have been bugs, and they may be fixed now. Even if a fix will be
> backported to 4.8 branch, you will still need to update the compiler.

I tried 4.9.4 today, and there seems to be a different error in gdb.

   0x7fe763ba774e <+0>:  push   %rbp
   0x7fe763ba774f <+1>:  mov    %rsp,%rbp
   0x7fe763ba7752 <+4>:  push   %r14
   0x7fe763ba7754 <+6>:  push   %r13
   0x7fe763ba7756 <+8>:  push   %r12
   0x7fe763ba7758 <+10>: push   %rbx
   0x7fe763ba7759 <+11>: sub    $0x1000f0,%rsp
=> 0x7fe763ba7760 <+18>: mov    %rdi,-0x1000e8(%rbp)
   0x7fe763ba7767 <+25>: mov    %rsi,-0x1000f0(%rbp)
   0x7fe763ba776e <+32>: mov    %rdx,-0x1000f8(%rbp)
   0x7fe763ba7775 <+39>: mov    %rcx,-0x100100(%rbp)
   0x7fe763ba777c <+46>: mov    %r8,-0x100108(%rbp)
   0x7fe763ba7783 <+53>: mov    %r9d,-0x10010c(%rbp)
   0x7fe763ba778a <+60>: mov    0x8(%rbp),%rax
   0x7fe763ba778e <+64>: mov    %rax,%rdi
   0x7fe763ba7791 <+67>: callq  0x7fe763871660 <__tsan_func_entry(void*)>

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-03 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #22 from peien luo  ---
The bt shows a stack depth of only 27, with no recursion. I modified the tsan
code to print what is in the shadow stack when it is about to overflow. It
looks like most of the addresses are:

__gnu_cxx::__normal_iterator > >
std::__unguarded_partition_pivot<__gnu_cxx::__normal_iterator > >,
__gnu_cxx::__ops::_Iter_comp_iter
>(__gnu_cxx::__normal_iterator
> >, __gnu_cxx::__normal_iterator > >, __gnu_cxx::__ops::_Iter_comp_iter)

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-01 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #20 from peien luo  ---
(In reply to Dmitry Vyukov from comment #18)
> Looks like shadow stack overflow.
> Do you use fibers, ucontext, longjmp, exceptions or any other non-obvious
> control flow constructs?
> Fibers and exceptions are not supported. Longjmp should work.

(gdb) p &(thr->shadow_stack[0])
$9 = (unsigned long *) 0x7f9842712080
(gdb) p thr->shadow_stack_pos 
$10 = (__sanitizer::uptr *) 0x7f9842762b68

So the shadow stack had actually grown to 330472 bytes before it crashed.
Is such a huge number abnormal?
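
For reference, the figure above is just the difference between the two
pointers printed by gdb. A minimal sketch of that arithmetic follows; the
helper names are made up for illustration, and the 8-byte slot size is an
assumption (one uptr per entry on x86_64). It needs C++11 or later.

```cpp
#include <cstdint>

// Addresses copied from the gdb session above.
constexpr std::uint64_t kShadowStackBase = 0x7f9842712080ULL;  // &(thr->shadow_stack[0])
constexpr std::uint64_t kShadowStackPos  = 0x7f9842762b68ULL;  // thr->shadow_stack_pos

// Assumption: every shadow stack slot holds one uptr, i.e. 8 bytes on x86_64.
constexpr std::uint64_t kSlotBytes = 8;

// Number of bytes between the start of the shadow stack and the current position.
constexpr std::uint64_t UsedBytes(std::uint64_t base, std::uint64_t pos) {
  return pos - base;
}

// The same distance expressed as a number of recorded function entries.
constexpr std::uint64_t UsedSlots(std::uint64_t base, std::uint64_t pos) {
  return UsedBytes(base, pos) / kSlotBytes;
}

// Checked at compile time: 0x50ae8 bytes, or 41309 entries.
static_assert(UsedBytes(kShadowStackBase, kShadowStackPos) == 330472, "byte count");
static_assert(UsedSlots(kShadowStackBase, kShadowStackPos) == 41309, "entry count");
```

Under that assumption, roughly 41 thousand function entries were recorded
without a matching exit, which is far more than the 27 frames gdb reports.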

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-30 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #19 from peien luo  ---
(In reply to Dmitry Vyukov from comment #18)
> Looks like shadow stack overflow.
> Do you use fibers, ucontext, longjmp, exceptions or any other non-obvious
> control flow constructs?
> Fibers and exceptions are not supported. Longjmp should work.

No, there is no use of ucontext or longjmp. Exceptions are not used either.

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-07 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #17 from peien luo  ---
(In reply to Dmitry Vyukov from comment #16)
> > The stack size limit in my box is 8M. I have also checked /proc/limits.
> 
> So, is increasing stack size help?
> Tsan increases stack consumption. 8MB is not that much provided that you
> have 1MB frames.
> 
> > By enabling -fstack-protector-all, in gdb it still may get segfault here 
> > (at function entry).
> 
> Stack protector will not help to detect/prevent stack overflow. It only
> prevents PC overwrite on buffer overflows.

I increased it to 80MB; that did not help.

In a case built with the 4.9.0 tsan, I came across a segfault in gdb, with the
source location and backtrace below:
23  void MutexSet::Add(u64 id, bool write, u64 epoch) {
24    // Look up existing mutex with the same id.
25    for (uptr i = 0; i < size_; i++) {
26      if (descs_[i].id == id) {
27        descs_[i].count++;
28        descs_[i].epoch = epoch;
29        return;
30      }

(gdb) p size_
$2 = 139907922088632

(gdb) bt
#0  0x7f3ed9c5f37c in __tsan::MutexSet::Add (this=this@entry=0x7f3ec6f63080, id=55166398195828824, write=write@entry=true, epoch=1680548)
    at ../../../../libsanitizer/tsan/tsan_mutexset.cc:26
#1  0x7f3ed9c555e5 in __tsan::MutexLock (thr=thr@entry=0x7f3ec6e78840, pc=pc@entry=139907918263873, addr=addr@entry=138040248895576)
    at ../../../../libsanitizer/tsan/tsan_rtl_mutex.cc:109
#2  0x7f3ed9c4f6ae in __interceptor_pthread_mutex_lock (m=0x7d8c0858)
    at ../../../../libsanitizer/tsan/tsan_interceptors.cc:811
...

Did something overflow?
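
One way to read the size_ value above: in hexadecimal it is 0x7f3ed9ff52b8,
which shares its upper bits with the code addresses in the backtrace, so it
looks more like a pointer than an element count. A minimal sketch of that
check follows. The helper SameRegion and the 24-bit shift are illustrative
choices, not part of tsan, and this does not prove where the value came from.

```cpp
#include <cstdint>

// Values copied from the gdb output above.
constexpr std::uint64_t kCorruptedSize = 139907922088632ULL;  // p size_
constexpr std::uint64_t kFrame0Pc      = 0x7f3ed9c5f37cULL;   // frame #0 address
constexpr std::uint64_t kCallerPc      = 139907918263873ULL;  // pc argument of MutexLock
constexpr std::uint64_t kThisPtr       = 0x7f3ec6f63080ULL;   // this of MutexSet::Add

// Illustrative helper: two values are "in the same region" when they agree
// on everything above the low `shift` bits (16 MiB granularity by default).
constexpr bool SameRegion(std::uint64_t a, std::uint64_t b, unsigned shift = 24) {
  return (a >> shift) == (b >> shift);
}

// The decimal value printed by gdb, rewritten in hex.
static_assert(kCorruptedSize == 0x7f3ed9ff52b8ULL, "hex form of size_");
static_assert(kCallerPc == 0x7f3ed9c4f641ULL, "hex form of pc");

// size_ lands next to the code addresses, not next to the MutexSet object.
static_assert(SameRegion(kCorruptedSize, kFrame0Pc), "near frame #0");
static_assert(SameRegion(kCorruptedSize, kCallerPc), "near caller pc");
static_assert(!SameRegion(kCorruptedSize, kThisPtr), "not near this");
```

A plausible size_ would be a small integer, so a value in the 0x7f3ed9...
range suggests the field was overwritten with an address.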

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-06 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #14 from peien luo  ---
(In reply to Dmitry Vyukov from comment #12)
> The crash in gdb looks like stack overflow (unsurprising if there are 1MB
> frames). Does increasing thread stack size or reducing frame size (there
> must something very big on the stack) help?

I tried gcc 4.9.4, 4.9.3, 4.9.2, 4.9.1, and 4.9.0 today and found that, in this
case, the problem first appears when compiling with 4.9.0.

I tried replacing the libsanitizer in 4.9.0 with the one from 4.8.5, and the
issue did not occur.

The disassembly at that function's entry differs as follows:

4.8.5:
   0x7f224dab0620 <+0>:  push   %r15
   0x7f224dab0622 <+2>:  mov    %r9d,%r15d
   0x7f224dab0625 <+5>:  push   %r14
   0x7f224dab0627 <+7>:  push   %r13
   0x7f224dab0629 <+9>:  mov    %rsi,%r13
   0x7f224dab062c <+12>: push   %r12
   0x7f224dab062e <+14>: push   %rbp
   0x7f224dab062f <+15>: mov    %rdi,%rbp
   0x7f224dab0632 <+18>: lea    0x30(%rbp),%r14
   0x7f224dab0636 <+22>: push   %rbx
   0x7f224dab0637 <+23>: sub    $0x1000f8,%rsp
   0x7f224dab063e <+30>: mov    0x100128(%rsp),%rdi
   0x7f224dab0646 <+38>: lea    0x50(%rsp),%rbx
   0x7f224dab064b <+43>: mov    %rdx,0x28(%rsp)
   0x7f224dab0650 <+48>: mov    %rcx,0x38(%rsp)
   0x7f224dab0655 <+53>: mov    %r8,0x30(%rsp)
   0x7f224dab065a <+58>: mov    %fs:0x28,%rax
   0x7f224dab0663 <+67>: mov    %rax,0x1000e8(%rsp)
   0x7f224dab066b <+75>: xor    %eax,%eax
   0x7f224dab066d <+77>: callq  0x7f224d69ae50 <__tsan_func_entry(void*)>


4.9.0:
   0x7fc63563a710 <+0>:  push   %rbp
   0x7fc63563a711 <+1>:  mov    %rsp,%rbp
   0x7fc63563a714 <+4>:  push   %r15
   0x7fc63563a716 <+6>:  push   %r14
   0x7fc63563a718 <+8>:  push   %r13
   0x7fc63563a71a <+10>: push   %r12
   0x7fc63563a71c <+12>: mov    %rdi,%r15
   0x7fc63563a71f <+15>: push   %rbx
   0x7fc63563a720 <+16>: mov    %rsi,%r13
   0x7fc63563a723 <+19>: mov    %r9d,%r14d
   0x7fc63563a726 <+22>: lea    -0x1000d0(%rbp),%rbx
   0x7fc63563a72d <+29>: sub    $0x1000e8,%rsp
=> 0x7fc63563a734 <+36>: mov    %rdi,-0x1000e8(%rbp)
   0x7fc63563a73b <+43>: mov    0x8(%rbp),%rdi
   0x7fc63563a73f <+47>: mov    %rdx,-0x1000f0(%rbp)
   0x7fc63563a746 <+54>: mov    %rcx,-0x100100(%rbp)
   0x7fc63563a74d <+61>: mov    %r8,-0x1000f8(%rbp)
   0x7fc63563a754 <+68>: mov    %fs:0x28,%rax
   0x7fc63563a75d <+77>: mov    %rax,-0x38(%rbp)
   0x7fc63563a761 <+81>: xor    %eax,%eax
   0x7fc63563a763 <+83>: callq  0x7fc63527d1e0 <__tsan_func_entry(void*)>

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-05 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #13 from peien luo  ---
(In reply to Dmitry Vyukov from comment #12)
> The crash in gdb looks like stack overflow (unsurprising if there are 1MB
> frames). Does increasing thread stack size or reducing frame size (there
> must something very big on the stack) help?

The stack size limit on my box is 8M. I have also checked /proc/limits.

Even with -fstack-protector-all enabled, it may still segfault in gdb here (at
function entry).
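
For a rough sense of scale: the function prologues quoted elsewhere in this
thread reserve about 1 MiB per call (for example `sub $0x1000e8,%rsp`), so an
8 MiB stack holds only a handful of such frames. A minimal sketch of that
arithmetic follows. It assumes "8M" means 8 MiB and ignores everything else
on the stack, including any extra stack that tsan itself consumes; the name
FramesThatFit is made up for illustration.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;

// Frame size taken from a prologue in this thread: sub $0x1000e8,%rsp.
constexpr std::uint64_t kFrameBytes = 0x1000e8ULL;

// How many frames of `frame_bytes` fit in `stack_bytes`, ignoring all other
// stack usage (a deliberately optimistic upper bound).
constexpr std::uint64_t FramesThatFit(std::uint64_t stack_bytes,
                                      std::uint64_t frame_bytes) {
  return stack_bytes / frame_bytes;
}

// At most 7 such frames fit in 8 MiB, and at most 79 in 80 MiB.
static_assert(FramesThatFit(8 * kMiB, kFrameBytes) == 7, "8 MiB stack");
static_assert(FramesThatFit(80 * kMiB, kFrameBytes) == 79, "80 MiB stack");
```

So a few nested calls of this function could exhaust an 8 MiB stack even
without recursion, which is why raising the limit is worth testing.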

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-29 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #11 from peien luo  ---
Sorry about the previous comment regarding running in gdb. The result seems to
be random:

Sometimes it runs fine.
Sometimes it gets a segfault when calling a function; gdb shows:
   0x7ff0fa19b466 <+22>: lea    -0x100060(%rbp),%rbx
   0x7ff0fa19b46d <+29>: sub    $0x1000d8,%rsp
=> 0x7ff0fa19b474 <+36>: mov    %rdi,-0x1000d8(%rbp)
   0x7ff0fa19b47b <+43>: mov    0x8(%rbp),%rdi
   0x7ff0fa19b47f <+47>: mov    %rdx,-0x1000e0(%rbp)
   0x7ff0fa19b486 <+54>: mov    %rcx,-0x1000f0(%rbp)
   0x7ff0fa19b48d <+61>: mov    %r8,-0x1000e8(%rbp)
   0x7ff0fa19b494 <+68>: callq  0x7ff0f9e12e00 <__tsan_func_entry(void*)>
   0x7ff0fa19b499 <+73>: mov    %r15,%rax
   0x7ff0fa19b49c <+76>: add    $0x30,%rax
   0x7ff0fa19b4a0 <+80>: mov    %rax,%rdi
   0x7ff0fa19b4a3 <+83>: mov    %rax,-0x1000c8(%rbp)
   0x7ff0fa19b4aa <+90>: callq  0x7ff0f9e1c7e0 <__interceptor_pthread_mutex_lock(void*)>
   0x7ff0fa19b4af <+95>: mov    %rbx,%rdi
   0x7ff0fa19b4b2 <+98>: callq  0x7ff0f9e12020 <__tsan_vptr_update(void**, void*)>

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-29 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #10 from peien luo  ---
It's CentOS 7; the kernel has been updated to 3.10.0-327.36.3.el7.x86_64, and
the problem still occurs. Some new findings:

1. With gcc 4.8.5, it runs fine for this specific case.
2. With gcc 4.9.4, it gets stuck at some point; ps shows:
$ ps -flp 13600
F S UID     PID  PPID  C PRI  NI ADDR SZ          WCHAN STIME TTY   TIME     CMD
0 D god   13600 13597 89  80   0 -    26038299885 exit  23:15 pts/9 00:15:28 ./test_metaserver
3. Under gdb it runs OK, and with 'set disable-randomization off' it runs OK as
well. (I need to check this again.)

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-11 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #8 from peien luo  ---
In another case, the process (compiled with gcc 4.9.4) got stuck. I will try a
different version of gcc. The /proc stack info is:

[god@localhost 5019]$ cat task/*/status | grep State
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  S (sleeping)
[god@localhost 5019]$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] 0x
[god@localhost 5019]$ cat stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-29 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #7 from peien luo  ---
Tried it; still got D state. Built with gcc 4.9.4.

[god@localhost 21586]$ cat stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[god@localhost 21586]$ cat status
Name:   test_metaserver
State:  D (disk sleep)
Tgid:   21586
Ngid:   0
Pid:    21586
PPid:   12499
TracerPid:  0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 1000 
VmPeak: 104153806860 kB
VmSize: 104153793252 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM:  342544 kB
VmRSS:  342544 kB
VmData: 104153254936 kB
VmStk:  1048 kB
VmExe: 18392 kB
VmLib:  5992 kB
VmPTE:  1904 kB
VmSwap: 0 kB
Threads:    8
SigQ:   0/63365
SigPnd: 
ShdPnd: 
SigBlk: 
SigIgn: 1000
SigCgt: 00018000
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 001f
Seccomp:    0
Cpus_allowed:   ,,,
Cpus_allowed_list:  0-127
Mems_allowed:  
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0001
Mems_allowed_list:  0
voluntary_ctxt_switches:    442
nonvoluntary_ctxt_switches: 9

[god@localhost 21586]$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[god@localhost 21586]$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[god@localhost ~]$ g++ -v
Using 

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #5 from peien luo  ---
(In reply to Dmitry Vyukov from comment #4)
> Unkillable processed in D state usually mean kernel bugs (and there are lots
> of them: https://github.com/google/syzkaller/wiki/Found-Bugs).
> 
> Please post results of 'cat /proc/PID/task/*/stack` and `cat
> /proc/PID/task/*/status`. Sometimes hangs happen due to secondary threads in
> the process. Maybe we will be able to figure out something from that info.
> However, I am not sure what we can do about a process hanged in D state,
> generally user must not be able to create them.

It can be killed by kill -9.

I have updated my CentOS 7 to the latest. The problem still occurs. The kernel
stack output is:

$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[god@localhost 12987]$ ls task
12987  12988  12989  12990  12991  12992  12993  12994


and status:
$ cat task/*/status
Name:   test_metaserver
State:  D (disk sleep)
Tgid:   12987
Ngid:   0
Pid:    12987
PPid:   11646
TracerPid:  0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 1000 
VmPeak: 104153197488 kB
VmSize: 104153197488 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM:  265724 kB
VmRSS:  265724 kB
VmData: 104153051792 kB
VmStk:   136 kB
VmExe: 18492 kB
VmLib:  5992 kB
VmPTE:  1288 kB
VmSwap: 0 kB
Threads:    8
SigQ:   0/63365
SigPnd: 
ShdPnd: 
SigBlk: 
SigIgn: 1000
SigCgt: 00018000
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 001f
Seccomp:    0
Cpus_allowed:   ,,,
Cpus_allowed_list:  0-127
Mems_allowed:  
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0001
Mems_allowed_list:  0
voluntary_ctxt_switches:    96
nonvoluntary_ctxt_switches: 8
Name:   test_metaserver
State:  D (disk sleep)
Tgid:   12987
Ngid:   0
Pid:    12988
PPid:   11646
TracerPid:  0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 1000 
VmPeak: 104153197488 kB
VmSize: 104153197488 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM:  265724 kB
VmRSS:  265724 kB
VmData: 104153051792 kB
VmStk:   136 kB
VmExe: 18492 kB
VmLib:  5992 kB
VmPTE:  1288 kB
VmSwap: 0 kB
Threads:    8
SigQ:   0/63365
SigPnd: 
ShdPnd: 
SigBlk: fffbfeff
SigIgn: 1000
SigCgt: 00018000
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 001f
Seccomp:    0
Cpus_allowed:   ,,,
Cpus_allowed_list:  0-127
Mems_allowed:  
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0001
Mems_allowed_list:  0
voluntary_ctxt_switches:    6
nonvoluntary_ctxt_switches: 0
Name:   test_metaserver
State:  

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #3 from peien luo  ---
The process hang can be reproduced; the kernel call trace looks like this:

Sep 16 09:38:37 localhost kernel: test_metaserver D 8803f9307300 0  4250   3896 0x0080
Sep 16 09:38:37 localhost kernel: 880424b4bcd0 0082 8803f9307300 880424b4bfd8
Sep 16 09:38:37 localhost kernel: 880424b4bfd8 880424b4bfd8 8803f9307300 8803f9307300
Sep 16 09:38:37 localhost kernel: 880421c29f40 880421c29fb8 8803faf06c40 8803f9307300
Sep 16 09:38:37 localhost kernel: Call Trace:
Sep 16 09:38:37 localhost kernel: [] schedule+0x29/0x70
Sep 16 09:38:37 localhost kernel: [] do_exit+0x1e4/0xa60
Sep 16 09:38:37 localhost kernel: [] ? update_curr+0xcc/0x150
Sep 16 09:38:37 localhost kernel: [] ? account_entity_dequeue+0xae/0xd0
Sep 16 09:38:37 localhost kernel: [] do_group_exit+0x3f/0xa0
Sep 16 09:38:37 localhost kernel: [] get_signal_to_deliver+0x1d0/0x6d0
Sep 16 09:38:37 localhost kernel: [] do_signal+0x57/0x6c0
Sep 16 09:38:37 localhost kernel: [] ? ktime_get+0x4c/0xd0
Sep 16 09:38:37 localhost kernel: [] ? hrtimer_nanosleep+0xd3/0x170
Sep 16 09:38:37 localhost kernel: [] ? hrtimer_get_res+0x50/0x50
Sep 16 09:38:37 localhost kernel: [] do_notify_resume+0x5f/0xb0
Sep 16 09:38:37 localhost kernel: [] int_signal+0x12/0x17

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #2 from peien luo  ---
(In reply to Dmitry Vyukov from comment #1)
> Hello,
> 
> Shadow stack size was increased several times, and as far as I remember we
> now have a guard page at the end. Please retest with latest gcc/clang, or
> provide a reproducer.

I moved to another box (a virtual machine) to test the new gcc 4.9.4 (the other
environment is a shared server, so I can't make many changes to it).

What I observed: without tsan, the process runs fine. With tsan turned on, it
gets completely stuck at some point (D state, cannot attach or trace). I
haven't yet figured out what causes that. Here is the /proc stack from when it
got stuck:

$ cat syscall 
35 0x7ffca05f77e0 0x7ffca05f77e0 0x0 0x8 0x7ffca05f78f0 0x7ffca05f7730 0x7ffca05f77d0 0x7f24ff6f349d

$ cat stack 
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[Bug sanitizer/77538] New: segmentation fault: thread sanitizer shadow stack overflow

2016-09-09 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

            Bug ID: 77538
           Summary: segmentation fault: thread sanitizer shadow stack overflow
           Product: gcc
           Version: 4.8.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: coollpe at hotmail dot com
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

The backtrace shows that the shadow stack overflowed and corrupted mset (size_
is overwritten with a strange value by this overflow).

[Switching to Thread 0x7fffe35fc700 (LWP 29912)]
Hardware watchpoint 6: *0x7fffe3562080

Old value = 1
New value = -150984400
__tsan::FuncEntry (thr=thr@entry=0x7fffe3477840, pc=pc@entry=140737337370928)
at ../../../../libsanitizer/tsan/tsan_rtl.cc:583
583 ../../../../libsanitizer/tsan/tsan_rtl.cc: No such file or directory.
(gdb) bt
#0  __tsan::FuncEntry (thr=thr@entry=0x7fffe3477840,
pc=pc@entry=140737337370928) at ../../../../libsanitizer/tsan/tsan_rtl.cc:583
#1  0x76f6c97d in ScopedInterceptor::ScopedInterceptor
(this=0x7fffe34766d0, thr=0x7fffe3477840, fname=,
pc=140737337370928)
at ../../../../libsanitizer/tsan/tsan_interceptors.cc:158
#2  0x76f6e0d7 in __interceptor_memcmp (s1=0x7d06000b1108,
s2=0x7d06001a85a8, n=16) at
../../../../libsanitizer/tsan/tsan_interceptors.cc:470
#3  0x77002930 in std::char_traits::compare (__s1=0x7d06000b1108
"table_empty_node", __s2=0x7d06001a85a8 "table_empty_node", __n=16)
at
/home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/char_traits.h:255
#4  0x77002e66 in std::operator== (__lhs=..., __rhs=...) at
/home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/basic_string.h:2497
#5  0x77015ed2 in std::equal_to::operator()
(this=0x7d0ec752, __x=..., __y=...)
at
/home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_function.h:208
#6  0x773accdb in __gnu_cxx::hashtable<std::pair, std::string, __gnu_cxx::hash,
std::_Select1st<std::pair >, std::equal_to,
std::allocator<armor::Table*> >::find (this=0x7d0ec750, __key=...)
at
/home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/backward/hashtable.h:539
#7  0x773a921f in __gnu_cxx::hash_map<std::string, armor::Table*,
__gnu_cxx::hash, std::equal_to,
std::allocator<armor::Table*> >::find (
this=0x7d0ec750, __key=...) at
/home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/ext/hash_map:217

gdb output:
(gdb) frame 0
#0  __tsan::FuncEntry (thr=thr@entry=0x7fffe3477840,
pc=pc@entry=140737337370928) at ../../../../libsanitizer/tsan/tsan_rtl.cc:583
583 in ../../../../libsanitizer/tsan/tsan_rtl.cc
(gdb) p thr->shadow_stack_pos
$24 = (__sanitizer::uptr *) 0x7fffe3562080
(gdb) p &thr->shadow_stack[kShadowStackSize]
$25 = (unsigned long *) 0x7fffe3562080
(gdb) p &thr->shadow_stack_pos[0]
$26 = (__sanitizer::uptr *) 0x7fffe3562080

The source code:
libsanitizer/tsan/tsan_rtl.cc:582
  thr->shadow_stack_pos[0] = pc;

The definition:
libsanitizer/tsan/tsan_rtl.h:360
  uptr shadow_stack[kShadowStackSize];
#else
  // Go uses satellite shadow stack with dynamic size.
  uptr *shadow_stack;
  uptr *shadow_stack_end;
#endif
  MutexSet mset;

So the overflow writes past the end of shadow_stack into mset and corrupts its
size_ field, which causes the segmentation fault when mset is accessed later.
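
The layout issue can be illustrated with a scaled-down model. This is a
sketch only: ThreadStateModel, MutexSetModel, CheckedPush and the size 4 are
hypothetical stand-ins, not the real tsan types or its actual fix. It shows
that the slot one past the end of a fixed array is where the next member
begins, and how a bounds check on push would leave that member intact. It
needs C++14 or later.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, scaled-down stand-ins for the tsan definitions quoted above.
constexpr std::size_t kModelShadowStackSize = 4;

struct MutexSetModel {
  std::uintptr_t size_;
};

struct ThreadStateModel {
  std::uintptr_t shadow_stack[kModelShadowStackSize];
  MutexSetModel mset;  // declared right after the array, as in the quoted header
};

// One past the last slot of shadow_stack is exactly where mset starts, so an
// unchecked write to shadow_stack[kModelShadowStackSize] would land on mset.
static_assert(offsetof(ThreadStateModel, mset) ==
                  sizeof(std::uintptr_t) * kModelShadowStackSize,
              "mset directly follows the array");

// Bounds-checked push: refuses to write once the array is full.
constexpr bool CheckedPush(ThreadStateModel& thr, std::size_t& depth,
                           std::uintptr_t pc) {
  if (depth >= kModelShadowStackSize) return false;
  thr.shadow_stack[depth++] = pc;
  return true;
}

// Fill the stack, then try one more push; mset.size_ must stay untouched.
constexpr bool OverflowIsRejected() {
  ThreadStateModel thr{};
  thr.mset.size_ = 1;
  std::size_t depth = 0;
  for (std::size_t i = 0; i < kModelShadowStackSize; ++i) {
    if (!CheckedPush(thr, depth, 0x1000 + i)) return false;
  }
  return !CheckedPush(thr, depth, 0x2000) && thr.mset.size_ == 1;
}

static_assert(OverflowIsRejected(), "extra push is rejected, mset intact");
```

This matches the gdb output above, where shadow_stack_pos is equal to the
address of shadow_stack[kShadowStackSize], i.e. one element past the end.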