Updated patch. I reverted the memory order change (and will submit that as another patch) and fixed some spelling and grammar errors.
On Wed, Feb 9, 2022 at 2:43 AM Jonathan Wakely <jwak...@redhat.com> wrote:

> On Wed, 9 Feb 2022 at 00:57, Thomas Rodgers via Libstdc++
> <libstd...@gcc.gnu.org> wrote:
> >
> > This issue was observed as a deadloack in
> > 29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
> > "laundered" (e.g. type T* does not suffice as a waitable address for the
> > platform's native waiting primitive), the address waited is that of the
> > _M_ver member of __waiter_pool_base, so several threads may wait on the
> > same address for unrelated atomic<T>'s. As noted in the PR, the
> > implementation correctly exits the wait for the thread who's data
> > changed, but not for any other threads waiting on the same address.
> >
> > As noted in the PR the __waiter::_M_do_wait_v member was correctly exiting
> > but the other waiters were not reloaded the value of _M_ver before
> > re-entering the wait.
> >
> > Moving the spin call inside the loop accomplishes this, and is
> > consistent with the predicate accepting version of __waiter::_M_do_wait.
>
> There is a change to the memory order in _S_do_spin_v which is not
> described in the commit msg or the changelog. Is that unintentional?
>
> (Aside: why do we even have _S_do_spin_v, it's called in exactly one
> place, so could just be inlined into _M_do_spin_v, couldn't it?)
From b39283d5100305e7a95d59324059de9952d3a858 Mon Sep 17 00:00:00 2001
From: Thomas Rodgers <rodg...@appliantology.com>
Date: Tue, 8 Feb 2022 16:33:36 -0800
Subject: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

This issue was observed as a deadlock in
29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
"laundered" (e.g. type T* does not suffice as a waitable address for the
platform's native waiting primitive), the address waited is that of the
_M_ver member of __waiter_pool_base, so several threads may wait on the
same address for unrelated atomic<T> objects. As noted in the PR, the
implementation correctly exits the wait for the thread whose data
changed, but not for any other threads waiting on the same address.

As noted in the PR the __waiter::_M_do_wait_v member was correctly exiting
but the other waiters were not reloading the value of _M_ver before
re-entering the wait.

Moving the spin call inside the loop accomplishes this, and is
consistent with the predicate accepting version of __waiter::_M_do_wait.

libstdc++-v3/ChangeLog:

        PR libstdc++/104442
        * include/bits/atomic_wait.h (__waiter::_M_do_wait_v): Move spin
        loop inside do loop so that threads failing the wait, reload
        _M_ver.
---
 libstdc++-v3/include/bits/atomic_wait.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index d7de0d7eb9e..6ce7f9343cf 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -388,12 +388,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
         void
         _M_do_wait_v(_Tp __old, _ValFn __vfn)
         {
-          __platform_wait_t __val;
-          if (__base_type::_M_do_spin_v(__old, __vfn, __val))
-            return;
-
           do
             {
+              __platform_wait_t __val;
+              if (__base_type::_M_do_spin_v(__old, __vfn, __val))
+                return;
               __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
             }
           while (__detail::__atomic_compare(__old, __vfn()));
-- 
2.34.1
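
For anyone following the thread without the header open, here is a rough single-threaded toy model of the two loop shapes. It is only a sketch and not the libstdc++ code: ToyPool, run_waiter and check_loop_shapes are hypothetical names, `ver` merely stands in for _M_ver, and the "wait" is a scripted stub rather than a real platform wait. All it tries to show is which snapshot of the shared counter each loop shape hands to the wait call after a wake-up caused by an unrelated notify.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the shared proxy state; `ver` plays the role
// that _M_ver plays in the patch. None of these names exist in libstdc++.
struct ToyPool {
    int ver = 0;                                // shared by unrelated waiters
    std::vector<int> expected_log;              // value handed to each wait call
    std::vector<std::function<void()>> script;  // one scripted wake-up per call
    std::size_t step = 0;

    // Toy wait primitive: records the expected value, then simulates a
    // wake-up by running the next scripted event (if any).
    void wait(int expected) {
        expected_log.push_back(expected);
        if (step < script.size())
            script[step++]();
    }
};

// Runs one waiter on `b` and returns the values it passed to wait().
std::vector<int> run_waiter(bool reload_each_iteration) {
    ToyPool pool;
    int a = 0, b = 0;      // two unrelated values sharing pool.ver
    const int old_b = 0;
    pool.script = {
        [&] { a = 1; ++pool.ver; },  // notify for `a`: wakes us, `b` unchanged
        [&] { b = 1; ++pool.ver; },  // notify for `b`: the change we wait for
    };

    int val = pool.ver;    // snapshot taken once, before the loop
    do {
        if (reload_each_iteration)
            val = pool.ver;          // post-patch shape: fresh snapshot
        pool.wait(val);
    } while (b == old_b);
    return pool.expected_log;
}

void check_loop_shapes() {
    // Pre-patch shape: the second wait still expects 0 although ver is 1.
    assert((run_waiter(false) == std::vector<int>{0, 0}));
    // Post-patch shape: the second wait expects the current value, 1.
    assert((run_waiter(true) == std::vector<int>{0, 1}));
}
```

In the model, the waiter that snapshots once re-enters the wait with a value that no longer matches the shared counter, whereas the waiter that reloads on every iteration always passes the current value. What a real platform wait does with a stale expected value depends on the target, so the model says nothing about that part.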