https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125781
Bug ID: 125781
Summary: Valgrind reports that futex() is called with bad
pointer for stop_callback
Product: gcc
Version: 17.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: mac at mcrowe dot com
Target Milestone: ---
Created attachment 64735
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=64735&action=edit
Add reproduction to test suite
Valgrind reports that futex() is called for a semaphore that has fallen out of
scope for:
--8<--
#include <chrono>
#include <cstdio>
#include <stop_token>
#include <thread>

int main()
{
  {
    std::jthread t([](std::stop_token stop_token) -> void {
      std::stop_callback scoped_callback(stop_token, []() -> void {
        std::puts("In callback");
      });
      while (!stop_token.stop_requested())
        ;
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  } // ~jthread requests stop and joins here
}
-->8--
almost every time _if_ I add a sleep() to
__platform_semaphore_impl::_M_release() in semaphore_base.h:
--8<--
   _GLIBCXX_ALWAYS_INLINE ptrdiff_t
   _M_release(ptrdiff_t __update) noexcept
   {
     auto __old = __atomic_impl::fetch_add(&_M_counter, __update,
                                           memory_order::release);
+    sleep(1);
     if (__old == 0 && __update > 0)
       __atomic_notify_address(&_M_counter, true, true);
     return __old;
   }
-->8--
to force the code to run in the right order.
==1627599== Syscall param futex(futex) points to uninitialised byte(s)
==1627599== at 0x4D667B9: syscall (syscall.S:38)
==1627599== by 0x401629: __atomic_notify_address<int> (atomic_wait.h:399)
==1627599== by 0x401629: _M_release (semaphore_base.h:307)
==1627599== by 0x401629: release (semaphore:73)
==1627599== by 0x401629: std::stop_token::_Stop_state_t::_M_request_stop()
[clone .isra.0] (stop_token:265)
==1627599== by 0x4012EF: request_stop (stop_token:538)
==1627599== by 0x4012EF: request_stop (thread:250)
==1627599== by 0x4012EF: ~jthread (thread:178)
==1627599== by 0x4012EF: main (race.cc:38)
==1627599== Address 0x5a4ee50 is in a rw- anonymous segment
(I've seen the same valgrind complaint with more complex code very occasionally
without the sleep().)
It looks like the __atomic_impl::fetch_add() in _M_release() is enough for
_M_acquire() to return. Once that happens the std::stop_callback is destroyed,
along with the binary_semaphore that it contains, so the lifetime of _M_counter
has already ended by the time _M_release() passes its address to
__atomic_notify_address(). (I can't say that I really understand the nuances
here, so this analysis may be wrong.)
I've attached a patch which adds the sleep along with a test for the test suite
to reproduce the problem. The only way I've found to actually run it under
valgrind is to snarf the compilation commands from the make
check-target-libstdc++-v3 log, use them to compile the file myself and then run
it with:
LD_LIBRARY_PATH=../../../libstdc++-v3/src/.libs/ valgrind ./race.exe
Reproduced with Valgrind 3.24.0 and 3.25.1 using GCC 13.4.0 and current master
(abb7e70024df110eaca30d227b584bd90bd3e845) on Debian 13 on x86-64 (Ryzen
7950X).