https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388

            Bug ID: 71388
           Summary: [6/7 regression] wrong code, DSE removes memset in TBB
                    allocate_scheduler (causes run-time crashes)
           Product: gcc
           Version: 6.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david.abdurachmanov at gmail dot com
  Target Milestone: ---

Created attachment 38626
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38626&action=edit
pre-processed file

We found this when one of our test cases crashed in the GCC 6.1.1 port of our
software. It looks like wrong code is being generated for Intel TBB.

TBB version: tbb44_20151115oss (also affects newer versions)
First bad commit: 8a36d0ec201ef1511b372523f72a763b836107b0 or r222135
Last good commit: 8b2942f7c961ee83bb0ff605129165ecdf6ac8f6 or r222134

The code that appears to be miscompiled is in src/tbb/custom_scheduler.h:

public:
    static generic_scheduler* allocate_scheduler( market& m ) {
        void* p = NFS_Allocate(1, sizeof(scheduler_type), NULL);
        std::memset(p, 0, sizeof(scheduler_type));
        scheduler_type* s = new( p ) scheduler_type( m );
        s->assert_task_pool_valid();
        ITT_SYNC_CREATE(s, SyncType_Scheduler, SyncObj_TaskPoolSpinning);
        return s;
    }

From our developer: a class instance is being created with placement new at an
address reused from a thread which has gone away. However, at least one data
member of the newly created object is non-zero, and it is that non-zero value
which ultimately leads to the crash.
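
To make the pattern the developer describes concrete, here is a reduced sketch.
It is not extracted from TBB and has not been confirmed to reproduce the crash.
The names scheduler_like, raw_allocate, allocate_like and release_like are
hypothetical stand-ins, and plain malloc replaces NFS_Allocate. The point is
that one member is left untouched by the constructor, so its value depends
entirely on the memset that precedes the placement new.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for TBB's scheduler_type; not the real class.
struct scheduler_like {
    int   id;        // initialized by the constructor
    void* leftover;  // NOT initialized by the constructor; relies on the memset

    explicit scheduler_like(int i) : id(i) {}
};

// Hypothetical stand-in for NFS_Allocate: plain malloc, which may hand back
// recycled memory that still holds non-zero bytes.
static void* raw_allocate(std::size_t n) { return std::malloc(n); }

scheduler_like* allocate_like(int id) {
    void* p = raw_allocate(sizeof(scheduler_like));
    if (!p) return nullptr;
    // This store happens before the object's lifetime begins, so an optimizer
    // that follows the C++ lifetime rules may treat it as dead.
    std::memset(p, 0, sizeof(scheduler_like));
    return new (p) scheduler_like(id);
}

void release_like(scheduler_like* s) {
    s->~scheduler_like();
    std::free(s);
}
```

If the compiler treats the object's storage as indeterminate when the
constructor starts, the memset becomes a dead store and "leftover" keeps
whatever the recycled memory contained. GCC 6 documents the -flifetime-dse
option for this kind of optimization; whether it is the cause here is an
assumption, not something established in this report.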

The object size here is 408 bytes (0x198). In the bad revision no call to
memset is generated, and the memset does not appear to have been inlined either.

This can be cured by compiling TBB with CXXFLAGS='-fno-builtin-memset', which
forces GCC to generate a call to memset.
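
A source-level alternative to the flag, sketched under assumptions: if every
member that must start out zero is initialized by the constructor itself (here
via C++11 default member initializers), the result no longer depends on a store
made before the placement new. zeroed_scheduler, allocate_zeroed and
release_zeroed are hypothetical names, not TBB code, and whether such a change
is practical for the real scheduler_type is not something this report
establishes.

```cpp
#include <cstdlib>
#include <new>

// Hypothetical stand-in for scheduler_type; not the real TBB class.
struct zeroed_scheduler {
    int   id;
    void* pool    = nullptr;  // zeroed during construction, not by memset
    long  counter = 0;        // likewise

    explicit zeroed_scheduler(int i) : id(i) {}
};

zeroed_scheduler* allocate_zeroed(int id) {
    // malloc stands in for NFS_Allocate and may return recycled memory.
    void* p = std::malloc(sizeof(zeroed_scheduler));
    if (!p) return nullptr;
    // No memset: every member is written while the constructor runs, which is
    // inside the object's lifetime, so these stores cannot be treated as dead.
    return new (p) zeroed_scheduler(id);
}

void release_zeroed(zeroed_scheduler* s) {
    s->~zeroed_scheduler();
    std::free(s);
}
```

With this shape, allocate_zeroed(7) yields an object whose pool is null and
whose counter is zero regardless of what the recycled memory held.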

Below is the disassembly of the symbol
tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)
as built with the good and the bad GCC revision.

Note that in the bad revision we lost: rep stos %rax,%es:(%rdi)

Attaching pre-processed scheduler.ii (not a minimal test case), compiled as:
g++ -o scheduler.o -c -g -O2 -m64 -mrtm -fPIC scheduler.ii

### GOOD ###

0000000000023df0 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)>:
   23df0:       55                      push   %rbp
   23df1:       53                      push   %rbx
   23df2:       31 d2                   xor    %edx,%edx
   23df4:       48 89 fd                mov    %rdi,%rbp
   23df7:       be 98 01 00 00          mov    $0x198,%esi
   23dfc:       bf 01 00 00 00          mov    $0x1,%edi
   23e01:       48 83 ec 08             sub    $0x8,%rsp
   23e05:       e8 96 9a fe ff          callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
   23e0a:       48 8d 78 08             lea    0x8(%rax),%rdi
   23e0e:       48 89 c1                mov    %rax,%rcx
   23e11:       48 89 c3                mov    %rax,%rbx
   23e14:       48 c7 00 00 00 00 00    movq   $0x0,(%rax)
   23e1b:       48 c7 80 90 01 00 00    movq   $0x0,0x190(%rax)
   23e22:       00 00 00 00
   23e26:       31 c0                   xor    %eax,%eax
   23e28:       48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
   23e2c:       48 89 ee                mov    %rbp,%rsi
   23e2f:       48 29 f9                sub    %rdi,%rcx
   23e32:       81 c1 98 01 00 00       add    $0x198,%ecx
   23e38:       c1 e9 03                shr    $0x3,%ecx
   23e3b:       f3 48 ab                rep stos %rax,%es:(%rdi)
   23e3e:       48 89 df                mov    %rbx,%rdi
   23e41:       e8 5a e4 ff ff          callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
   23e46:       48 8d 05 13 1c 21 00    lea    0x211c13(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
   23e4d:       48 83 c0 10             add    $0x10,%rax
   23e51:       48 89 03                mov    %rax,(%rbx)
   23e54:       48 8d 05 35 2b 21 00    lea    0x212b35(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
   23e5b:       48 8b 00                mov    (%rax),%rax
   23e5e:       48 85 c0                test   %rax,%rax
   23e61:       74 1e                   je     23e81 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x91>
   23e63:       48 8d 15 fe 25 21 00    lea    0x2125fe(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
   23e6a:       48 8d 35 2f 26 21 00    lea    0x21262f(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
   23e71:       b9 02 00 00 00          mov    $0x2,%ecx
   23e76:       48 89 df                mov    %rbx,%rdi
   23e79:       48 8b 12                mov    (%rdx),%rdx
   23e7c:       48 8b 36                mov    (%rsi),%rsi
   23e7f:       ff d0                   callq  *%rax
   23e81:       48 83 c4 08             add    $0x8,%rsp
   23e85:       48 89 d8                mov    %rbx,%rax
   23e88:       5b                      pop    %rbx
   23e89:       5d                      pop    %rbp
   23e8a:       c3                      retq
   23e8b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

### BAD ###

0000000000023dd0 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)>:
   23dd0:       55                      push   %rbp
   23dd1:       53                      push   %rbx
   23dd2:       31 d2                   xor    %edx,%edx
   23dd4:       48 89 fd                mov    %rdi,%rbp
   23dd7:       be 98 01 00 00          mov    $0x198,%esi
   23ddc:       bf 01 00 00 00          mov    $0x1,%edi
   23de1:       48 83 ec 08             sub    $0x8,%rsp
   23de5:       e8 b6 9a fe ff          callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
   23dea:       48 89 ee                mov    %rbp,%rsi
   23ded:       48 89 c7                mov    %rax,%rdi
   23df0:       48 89 c3                mov    %rax,%rbx
   23df3:       e8 a8 e4 ff ff          callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
   23df8:       48 8d 05 61 1c 21 00    lea    0x211c61(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
   23dff:       48 83 c0 10             add    $0x10,%rax
   23e03:       48 89 03                mov    %rax,(%rbx)
   23e06:       48 8d 05 83 2b 21 00    lea    0x212b83(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
   23e0d:       48 8b 00                mov    (%rax),%rax
   23e10:       48 85 c0                test   %rax,%rax
   23e13:       74 1e                   je     23e33 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x63>
   23e15:       48 8d 15 4c 26 21 00    lea    0x21264c(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
   23e1c:       48 8d 35 7d 26 21 00    lea    0x21267d(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
   23e23:       b9 02 00 00 00          mov    $0x2,%ecx
   23e28:       48 89 df                mov    %rbx,%rdi
   23e2b:       48 8b 12                mov    (%rdx),%rdx
   23e2e:       48 8b 36                mov    (%rsi),%rsi
   23e31:       ff d0                   callq  *%rax
   23e33:       48 83 c4 08             add    $0x8,%rsp
   23e37:       48 89 d8                mov    %rbx,%rax
   23e3a:       5b                      pop    %rbx
   23e3b:       5d                      pop    %rbp
   23e3c:       c3                      retq
   23e3d:       0f 1f 00                nopl   (%rax)

### DIFF ###

--- 20150415.s  2016-06-02 16:01:06.000000000 +0200
+++ 20150415.bad.s      2016-06-02 16:01:54.000000000 +0200
@@ -6,31 +6,20 @@
        be 98 01 00 00          mov    $0x198,%esi
        bf 01 00 00 00          mov    $0x1,%edi
        48 83 ec 08             sub    $0x8,%rsp
-       e8 96 9a fe ff          callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
-       48 8d 78 08             lea    0x8(%rax),%rdi
-       48 89 c1                mov    %rax,%rcx
-       48 89 c3                mov    %rax,%rbx
-       48 c7 00 00 00 00 00    movq   $0x0,(%rax)
-       48 c7 80 90 01 00 00    movq   $0x0,0x190(%rax)
-       00 00 00 00
-       31 c0                   xor    %eax,%eax
-       48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
+       e8 b6 9a fe ff          callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
        48 89 ee                mov    %rbp,%rsi
-       48 29 f9                sub    %rdi,%rcx
-       81 c1 98 01 00 00       add    $0x198,%ecx
-       c1 e9 03                shr    $0x3,%ecx
-       f3 48 ab                rep stos %rax,%es:(%rdi)
-       48 89 df                mov    %rbx,%rdi
-       e8 5a e4 ff ff          callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
-       48 8d 05 13 1c 21 00    lea    0x211c13(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
+       48 89 c7                mov    %rax,%rdi
+       48 89 c3                mov    %rax,%rbx
+       e8 a8 e4 ff ff          callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
+       48 8d 05 61 1c 21 00    lea    0x211c61(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
        48 83 c0 10             add    $0x10,%rax
        48 89 03                mov    %rax,(%rbx)
-       48 8d 05 35 2b 21 00    lea    0x212b35(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
+       48 8d 05 83 2b 21 00    lea    0x212b83(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
        48 8b 00                mov    (%rax),%rax
        48 85 c0                test   %rax,%rax
-       74 1e                   je     23e81 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x91>
-       48 8d 15 fe 25 21 00    lea    0x2125fe(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
-       48 8d 35 2f 26 21 00    lea    0x21262f(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
+       74 1e                   je     23e33 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x63>
+       48 8d 15 4c 26 21 00    lea    0x21264c(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
+       48 8d 35 7d 26 21 00    lea    0x21267d(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
        b9 02 00 00 00          mov    $0x2,%ecx
        48 89 df                mov    %rbx,%rdi
        48 8b 12                mov    (%rdx),%rdx
@@ -41,4 +30,4 @@
        5b                      pop    %rbx
        5d                      pop    %rbp
        c3                      retq
-       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
+       0f 1f 00                nopl   (%rax)
