https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122878

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---

There's another problem with our semaphore which doesn't directly affect this
benchmark.

semaphore::try_acquire_for will keep trying to acquire the semaphore in a loop
for the specified duration, but we don't reduce the duration on each iteration.
So a call to try_acquire_for(10ms) might wait for 9ms then get woken (either
because the semaphore became available, or the wait was interrupted by a
signal, or it just woke spuriously), but then fail to acquire the semaphore
(because another thread acquired it, or it wasn't available at all and we woke
spuriously), and then wait for another 10ms. This will result in a total wait
of 19ms instead of the requested 10ms. When we loop we should reduce the time
for the next wait by the elapsed duration.

    template<typename _Rep, typename _Period>
      bool
      _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime) noexcept
      {
        auto __val = _M_get_current();
        while (!_M_do_try_acquire(__val))
          if (__val == 0)
            {
              if (!std::__atomic_wait_address_for_v(&_M_counter, 0,
                                                    __ATOMIC_ACQUIRE,
                                                    __rtime, true))
                return false; // timed out
              __val = _M_get_current();
            }
        return true;
      }

This affects the _M_try_acquire_for member in __semaphore_impl and the one in
__platform_semaphore.
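
To illustrate the idea of shrinking the wait on each iteration, here is a
minimal sketch using only standard C++. It is not the libstdc++ code and not a
proposed patch: toy_semaphore is a hypothetical class built on std::mutex and
std::condition_variable rather than on __atomic_wait_address_for_v. The
deadline is computed once up front, and each wait is given only the time
remaining until that deadline, so a wakeup that fails to acquire does not
restart the full timeout.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical toy semaphore, for illustration only.
class toy_semaphore
{
  std::mutex m_;
  std::condition_variable cv_;
  int count_;

public:
  explicit toy_semaphore(int initial) : count_(initial) { }

  void release()
  {
    {
      std::lock_guard<std::mutex> lock(m_);
      ++count_;
    }
    cv_.notify_one();
  }

  template<typename Rep, typename Period>
    bool
    try_acquire_for(const std::chrono::duration<Rep, Period>& rtime)
    {
      using clock = std::chrono::steady_clock;
      // Fix the deadline once, before the first wait.
      const auto deadline = clock::now() + rtime;

      std::unique_lock<std::mutex> lock(m_);
      while (count_ == 0)
        {
          const auto now = clock::now();
          if (now >= deadline)
            return false; // timed out
          // Wait only for the time that remains, not the full rtime.
          // A spurious or "lost race" wakeup just loops and recomputes.
          cv_.wait_for(lock, deadline - now);
        }
      --count_;
      return true;
    }
};
```

With this shape, a call such as s.try_acquire_for(std::chrono::milliseconds(10))
that is woken after 9ms without acquiring waits for roughly the remaining 1ms
instead of another 10ms.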
