I need to look at this a bit more (and not on my phone, at lunch).
Ultimately, C++26 expects to add predicate waits and returning a
‘tri-state’ result isn’t something that’s been considered or likely to be
approved.

On Mon, Dec 11, 2023 at 12:18 PM Jonathan Wakely <jwakely....@gmail.com>
wrote:

> CCing Tom's current address, as he's not @redhat.com now.
>
> On Mon, 11 Dec 2023, 19:24 Nate Eldredge, <n...@thatsmathematics.com>
> wrote:
>
>> On Mon, 11 Dec 2023, Nate Eldredge wrote:
>>
>> > To fix, we need something like `__args._M_old = __val;` inside the loop
>> in
>> > __atomic_wait_address(), so that we always wait on the exact value that
>> the
>> > predicate __pred() rejected.  Again, there are similar instances in
>> > atomic_timed_wait.h.
>>
>> Thinking through this, there's another problem.  The main loop in
>> __atomic_wait_address() would then be:
>>
>>        while (!__pred(__val))
>>          {
>>            __args._M_old = __val;
>>            __detail::__wait_impl(__wait_addr, &__args);
>>            __val = __vfn();
>>          }
>>
>> The idea being that we only call __wait_impl to wait on a value that the
>> predicate said was unacceptable.  But looking for instance at its caller
>> __atomic_semaphore::_M_acquire() in bits/semaphore_base.h, the predicate
>> passed in is _S_do_try_acquire(), whose code is:
>>
>>      _S_do_try_acquire(__detail::__platform_wait_t* __counter,
>>                        __detail::__platform_wait_t __old) noexcept
>>      {
>>        if (__old == 0)
>>          return false;
>>
>>        return __atomic_impl::compare_exchange_strong(__counter,
>>                                                      __old, __old - 1,
>>
>>  memory_order::acquire,
>>
>>  memory_order::relaxed);
>>      }
>>
>> It returns false if the value passed in was unacceptable (i.e. zero),
>> *or*
>> if it was nonzero (let's say 1) but the compare_exchange failed because
>> another thread swooped in and modified the semaphore counter.  In that
>> latter case, __atomic_wait_address() would pass 1 to __wait_impl(), which
>> is likewise bad.  If the counter is externally changed back to 1 just
>> before we call __platform_wait (that's the futex call), we would go to
>> sleep waiting on a semaphore that is already available: deadlock.
>>
>> I guess there's a couple ways to fix it.
>>
>> You could have the "predicate" callback instead return a tri-state value:
>> "all done, stop waiting" (like current true), "value passed is not
>> acceptable" (like current false), and "value was acceptable but something
>> else went wrong".  Only the second case should result in calling
>> __wait_impl().  In the third case, __atomic_wait_address() should
>> just reload the value (using __vfn()) and loop again.
>>
>> Or, make the callback __pred() a pure predicate that only tests its input
>> value for acceptability, without any other side effects.  Then have
>> __atomic_wait_address() simply return as soon as __pred(__val) returns
>> true.  It would be up to the caller to actually decrement the semaphore
>> or
>> whatever, and to call __atomic_wait_address() again if this fails.  In
>> that case, __atomic_wait_address should probably return the final value
>> that was read, so the caller can immediately do something like a
>> compare-exchange using it, and not have to do an additional load and
>> predicate test.
>>
>> Or, make __pred() a pure predicate as before, and give
>> __atomic_wait_address yet one more callback function argument, call it
>> __taker(), whose job is to acquire the semaphore etc, and have
>> __atomic_wait_address call it after __pred(__val) returns true.
>>
>> --
>> Nate Eldredge
>> n...@thatsmathematics.com
>>
>>

Reply via email to