Ref: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636805.html,
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636804.html
I found a couple of bugs in this patch set.
#1: In atomic_wait.h, we have __wait_flags defined to include:
__do_spin = 4,
__spin_only = 8 | __do_spin, // implies __do_spin
So __spin_only is defined as two bits, which breaks when we test `__args &
__wait_flags::__spin_only` in __wait_impl(). The test evaluates true even
when we have only set __do_spin (bit 2) without bit 3, which is the
default at least on Linux and which is supposed to request a limited
number of spins before calling futex() and sleeping. You can observe this
by seeing that a thread blocked on std::counting_semaphore::acquire()
consumes 100% CPU indefinitely.
There is another instance in atomic_timed_wait.h.
Probably __spin_only should just be 8, and any configuration that wants
it should be responsible for manually setting __do_spin as well.
#2: __atomic_wait_address() does not populate __args._M_old before passing
to __wait_impl(), so it remains at its default value 0 (set by the
constructor). As a result, `__wait_impl` is effectively always waiting on
the value 0 (i.e. until `*__addr != 0`) rather than whatever value is
actually wanted. This is of course wrong, and can deadlock under the
right race condition.
To fix, we need something like `__args._M_old = __val;` inside the loop in
__atomic_wait_address(), so that we always wait on the exact value that
the predicate __pred() rejected. Again, there are similar instances in
atomic_timed_wait.h.
(In the current libstdc++ implementation, the predicate function loads the
value itself and doesn't return it, so we wait instead on a possibly
different value that was loaded somewhere else in _S_do_spin(). That is
also wrong, because the latter value might have been acceptable to the
predicate, and if so then waiting on it may deadlock. A classic TOCTOU.
This is Bug #104928, reported in March 2022 and still unfixed:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928. The patch proposed
here would fix it, though.)
--
Nate Eldredge
n...@thatsmathematics.com