https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101037
Pekka S <p...@gcc-bugzilla.mail.kapsi.fi> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |p...@gcc-bugzilla.mail.kaps | |i.fi --- Comment #3 from Pekka S <p...@gcc-bugzilla.mail.kapsi.fi> --- Created attachment 52203 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52203&action=edit Ensure _M_ver and _M_cv/_M_mtx are synchronized Hi. I suspect this is not a w64-mingw32 specific error but present on all platforms that do not define _GLIBCXX_HAVE_PLATFORM_WAIT. If I have not mistaken the problem lies within __waiter_pool::_M_do_wait and that _M_ver and _M_cv/_M_mtx are not properly synchronized: stdlibc++v3/include/bits/atomic_wait.h: 254 void 255 _M_do_wait(const __platform_wait_t* __addr, __platform_wait_t __old) noexcept 256 { 257 #ifdef _GLIBCXX_HAVE_PLATFORM_WAIT 258 __platform_wait(__addr, __old); 259 #else 260 __platform_wait_t __val; 261 __atomic_load(__addr, &__val, __ATOMIC_RELAXED); 262 if (__val == __old) 263 { 264 lock_guard<mutex> __l(_M_mtx); 265 _M_cv.wait(_M_mtx); 266 } 267 #endif // __GLIBCXX_HAVE_PLATFORM_WAIT 268 } In this case _GLIBCXX_HAVE_PLATFORM_WAIT is not defined and __addr will point to __waiter_pool_base::_M_ver (I do not know why it is called ver). _M_ver is used to indicate if a thread has been notified. __old is the value of _M_ver when the conditional wait was entered at __waiter::_M_do_wait_v or __waiter::_M_do_wait. (__waiter and __waiter_pool are different classes, __waiter::_M_do_wait calls __waiter__pool::_M_do_wait unless a predicate on which to wait passes.) 310 if (_M_laundered()) 311 { 312 __atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL); 313 __all = true; 314 } Each time a thread is notified _M_ver is incremented by one. _M_laundered simply checks if _M_addr is equal to _M_ver (of __waiter_base::_M_w). This is where the problem is: A waiting thread might observe a (soon-to-be) stale value which is equal to __old. This is because between __atomic_load and _M_cv.wait the value of _M_ver might change. So, in other words, _M_cv.notify_all (note that notify_one never actually happens since __all is forced as true) might occur before _M_cv.wait is properly registered and/or actually waiting. As __waiter_bool::_M_do_wait can only observe _M_ver (e.g. not a value of atomic_flag on which the user expects the wait to happen) it must not wait based on a stale information -- so, it absolutely must not miss any notifies, or it will not be able to return to __waiter::_M_do_wait which does the predicate level checking. My suggestion is that __l(_M_mtx) shall be moved outside the `if (__val == __old)' block and _M_ver shall be only modified if the _M_mtx mutex is held. Since _M_cv.wait(_M_mtx) releases the mutex when the thread is actually waiting this should "just work". Attached patch does that and bit more: _M_laundered is moved from __waiter_base to __waiter_pool_base since it must be able to access _M_mtx. _M_enter_notify is added along with _M_enter, _M_leave and __detail::__atomic_load_value helpers. _M_mtx and _M_cv are marked mutable so that __waiter_pool::_M_do_wait / _M_do_spin can be made const. Instead of __addr it uses _M_ver (and adds _M_laundered(__addr) assertation) to clearly communicate that non-_GLIBCXX_HAVE_PLATFORM_WAIT code is specialized. Also, added const and/or noexcept where due.