https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125781

            Bug ID: 125781
           Summary: Valgrind reports that futex() is called with bad
                    pointer for stop_callback
           Product: gcc
           Version: 17.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mac at mcrowe dot com
  Target Milestone: ---

Created attachment 64735
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=64735&action=edit
Add reproduction to test suite

Valgrind reports that futex() is called for a semaphore that has fallen out of
scope for:

--8<--
{
  std::jthread t([](std::stop_token stop_token)-> void {
    std::stop_callback scoped_callback(stop_token, []-> void {
        puts("In callback");
      });

    while (!stop_token.stop_requested())
      ;
  });
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
-->8--

almost every time _if_ I add a sleep() to
__platform_semaphore_impl::_M_release() in semaphore_base.h:

--8<--
     _GLIBCXX_ALWAYS_INLINE ptrdiff_t
     _M_release(ptrdiff_t __update) noexcept
     {
       auto __old = __atomic_impl::fetch_add(&_M_counter, __update,
                                             memory_order::release);
+      sleep(1);
       if (__old == 0 && __update > 0)
         __atomic_notify_address(&_M_counter, true, true);
       return __old;
     }
-->8--

to force the code to run in the right order.

==1627599== Syscall param futex(futex) points to uninitialised byte(s)
==1627599==    at 0x4D667B9: syscall (syscall.S:38)
==1627599==    by 0x401629: __atomic_notify_address<int> (atomic_wait.h:399)
==1627599==    by 0x401629: _M_release (semaphore_base.h:307)
==1627599==    by 0x401629: release (semaphore:73)
==1627599==    by 0x401629: std::stop_token::_Stop_state_t::_M_request_stop() [clone .isra.0] (stop_token:265)
==1627599==    by 0x4012EF: request_stop (stop_token:538)
==1627599==    by 0x4012EF: request_stop (thread:250)
==1627599==    by 0x4012EF: ~jthread (thread:178)
==1627599==    by 0x4012EF: main (race.cc:38)
==1627599==  Address 0x5a4ee50 is in a rw- anonymous segment

(I've seen the same valgrind complaint with more complex code very occasionally
without the sleep().)

It looks like the __atomic_impl::fetch_add() in _M_release() is enough for
_M_acquire() to return. Once that happens the std::stop_callback can be
destroyed, along with the binary_semaphore that it contains, so _M_counter is
no longer alive when _M_release() passes its address to
__atomic_notify_address(). (I can't say that I really understand the nuances
here, so this analysis may be wrong.)
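
The suspected ordering can be replayed single-threaded with a toy model. This
is only a sketch of the analysis above, not the libstdc++ code: ToySemaphore,
its alive flag and simulate_interleaving() are hypothetical names, and the
"destruction" is modelled by clearing a flag so that nothing here actually
touches freed memory.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical stand-in for the semaphore inside std::stop_callback.
struct ToySemaphore {
  std::atomic<int> counter{0};
  bool alive = true;  // models whether the object's lifetime has ended
};

// Replays the suspected interleaving step by step on one thread and
// returns a log of what each (imaginary) thread did.
std::vector<std::string> simulate_interleaving()
{
  std::vector<std::string> log;
  ToySemaphore sem;

  // Releasing thread, first half of release(): publish the new count.
  int old = sem.counter.fetch_add(1, std::memory_order_release);
  log.push_back("releaser: fetch_add");

  // Waiting thread: sees the count and acquires without ever sleeping.
  int expected = 1;
  if (sem.counter.compare_exchange_strong(expected, 0,
                                          std::memory_order_acquire)) {
    log.push_back("waiter: acquired");
    sem.alive = false;  // models ~stop_callback ending the semaphore's life
    log.push_back("waiter: destroyed semaphore");
  }

  // Releasing thread, second half of release(): notify on the counter.
  if (old == 0)
    log.push_back(sem.alive ? "releaser: notify (valid)"
                            : "releaser: notify (dangling)");
  return log;
}
```

In this model the last log entry is the notify step running after the waiter
has already finished with the object, which matches the address that valgrind
flags in the futex() call.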

I've attached a patch which adds the sleep along with a test for the test suite
to reproduce the problem. The only way I've found to actually run it under
valgrind is to snarf the compilation commands from the make
check-target-libstdc++-v3 log, use them to compile the file myself and then run
it with:

 LD_LIBRARY_PATH=../../../libstdc++-v3/src/.libs/ valgrind ./race.exe

Reproduced with Valgrind 3.24.0 and 3.25.1 using GCC 13.4.0 and current master
(abb7e70024df110eaca30d227b584bd90bd3e845) on Debian 13 on x86-64 (Ryzen
7950X).
