On 29/05/20 07:17 +0100, Mike Crowe via Libstdc++ wrote:
> > diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
> > b/libstdc++-v3/include/bits/atomic_futex.h
> > index 4375129..5f95ade 100644
> > --- a/libstdc++-v3/include/bits/atomic_futex.h
> > +++ b/libstdc++-v3/include/bits/atomic_futex.h
> > @@ -229,11 +229,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >       _M_load_when_equal_until(unsigned __val, memory_order __mo,
> >       const chrono::time_point<_Clock, _Duration>& __atime)
> >       {
> > -   const typename _Clock::time_point __c_entry = _Clock::now();
> > -   const __clock_t::time_point __s_entry = __clock_t::now();
> > -   const auto __delta = __atime - __c_entry;
> > -   const auto __s_atime = __s_entry + __delta;
> > -   return _M_load_when_equal_until(__val, __mo, __s_atime);
> > +   typename _Clock::time_point __c_entry = _Clock::now();
> > +   do {
> > +     const __clock_t::time_point __s_entry = __clock_t::now();
> > +     const auto __delta = __atime - __c_entry;
> > +     const auto __s_atime = __s_entry + __delta;
> > +     if (_M_load_when_equal_until(__val, __mo, __s_atime))
> > +       return true;

On Friday 11 September 2020 at 10:47:30 +0100, Jonathan Wakely wrote:
> I wonder if we can avoid looping if _Clock::is_steady is true i.e.
> 
>           if _GLIBCXX_CONSTEXPR (_Clock::is_steady)
>             return false
> 
> If _Clock is steady then it can't be adjusted to move backwards. I am
> not 100% sure, but I don't think a steady clock can run "slow" either.
> But I'm checking with LWG about that. We should keep the loop for now.

The C++20 final draft (p1239) claims: "C1::is_steady [is] true if t1 <= t2
is always true and the time between clock ticks is constant, otherwise
false."

which doesn't appear to mean that the clock cannot run slow. All it means
is that if the clock runs slow then it must be consistently slow. However,
IMO since "time" is not absolute and the quoted text does not say which
clock the clock ticks need to be constant with reference to, it loses all
meaning if you try to read it too carefully!

> The reason for wanting to return early is that _Clock::now() can be
> quite expensive, so avoiding the second call if it's not needed would
> be significant.

It looks like[1] clock_gettime can be expensive even with vdso, and we have
no idea how much work the custom clock's now() method is actually doing, so
I think this is worth considering.

[1] 
https://stackoverflow.com/questions/45863729/clock-gettime-might-be-very-slow-even-using-vdso

Let's suppose I have a system with an appallingly bad local clock which
means that I can't really rely on steady_clock for long timeouts. I might
invent my own good_steady_clock that consults some other hardware that has
a reliable monotonic clock. The chances are that its now() method is quite
expensive. Based on the wording in the standard quoted above, I would feel
justified in marking my clock as is_steady=true yet it would be essential
for _M_load_when_equal_until to call _Clock::now() a second time to ensure
that the call didn't return early.

However, with the wording tweaked to say "the time between clock ticks is
equal to the time between steady_clock ticks" then
good_steady_clock::is_steady would need to be false and the optimisation
would be guaranteed to be safe. With this definition the only ways a custom
clock with is_steady=true could differ from steady_clock are in its epoch
and duration type.

Looking through the rest of libstdc++, I can only see the value of
is_steady being used once, and that's for the same optimisation in
this_thread::sleep_for. It looks like slow_clock used for the tests has
is_steady = false.

So, I think that we should apply the optimisation you've described and then
perhaps try to get the wording tweaked to clarify that this is acceptable.
If you're in agreement then perhaps I could have a go at doing that?

> Related to this, I've been experimenting with a change to wait_for
> that checks __rtime <= 0 and if so, uses __clock_t::time_point::min()
> for the wait_until call, so that we don't bother to call
> __clock_t::now(). For a non-positive duration we don't need to
> determine the precise time_point in the past, because it's in the past
> anyway. Using time_point::min() is as good as __clock_t::now() - rtime
> (as long as we don't overflow anything when converting it to a struct
> timespec, or course!)

That sounds good to me.

> Using future.wait_for(0s) to poll for a future becoming ready takes a
> long time currently: https://wandbox.org/permlink/Z1arsDE7eHW9JrqJ
> Using wait_until(C::time_point::min()) is faster, because it doesn't
> call C::now().
> 
> This probably isn't important for most timed waiting functions in the
> library, because I don't see any reason to use wait_for(0s) to poll a
> mutex, condition_variable or atomic. But it's reasonable to do for
> futures.

This is presumably necessary because std::future lacks a try_wait() or
is_ready() method.

> I already added (__rtime <= __rtime.zero()) to this_thread::sleep_for
> in d1a74a287ee1a84b90e5675904dac7f945cffca1.

I suppose the only downside to this is for those with the misguided belief
that sleep_for(0s) is equivalent to sched_yield(). I don't think that
matters.

> The extra branch to check rtime <= rtime.zero() or atime < C::now() is
> probably insignificant compared to the cost of unnecessary calls to
> now() on one or more clocks.

Agreed.

Thanks.

Mike.

Reply via email to