On Mon, Mar 18, 2019 at 10:14:22AM -0700 [email protected] wrote:
> Phil Auld <[email protected]> writes:
>
> > On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote:
> >> On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote:
> >>
> >> >> I'll rework the maths in the averaged version and post v2 if that makes
> >> >> sense.
> >> >
> >> > It may have the extra timer fetch, although maybe I could rework it so
> >> > that it used the
> >> > nsstart time the first time and did not need to do it twice in a row. I
> >> > had originally
> >> > reverted the hrtimer_forward_now() to hrtimer_forward() but put that
> >> > back.
> >>
> >> Sure; but remember, simpler is often better, esp. for code that
> >> typically 'never' runs.
> >
> > I reworked it to the below. This settles a bit faster. The average is sort
> > of squishy because
> > it's 3 samples divided by 4. And if we stay in a single call after
> > updating the period the "average"
> > will be even less accurate.
> >
> > It settles at a larger value faster so produces fewer messages and none of
> > the callback supressed ones.
> > The added complexity may not be worth it, though.
> >
> > I think this or your version, either one, would work.
> >
> > What needs to happen now to get one of them to land somewhere? Should I
> > just repost one with my
> > signed-off and let you add whatever other tags? And if so do you have a
> > preference for which one?
> >
> > Also, Ben, thoughts?
>
> It would probably make sense to have it just be ++count > 4 then I
> think? But otherwise yeah, I'm fine with either.
>
That would certainly work, of course :)
I liked the 3 to catch it as soon as possible but in the end one more loop
won't likely matter,
and it may prevent more since we are more likely to have it right.
Cheers,
Phil
> >
> > Cheers,
> > Phil
> >
> > --
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index ea74d43924b2..297fd228fdb0 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4885,6 +4885,8 @@ static enum hrtimer_restart
> > sched_cfs_slack_timer(struct hrtimer *timer)
> > return HRTIMER_NORESTART;
> > }
> >
> > +extern const u64 max_cfs_quota_period;
> > +
> > static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
> > {
> > struct cfs_bandwidth *cfs_b =
> > @@ -4892,14 +4894,46 @@ static enum hrtimer_restart
> > sched_cfs_period_timer(struct hrtimer *timer)
> > unsigned long flags;
> > int overrun;
> > int idle = 0;
> > + int count = 0;
> > + u64 start, now;
> >
> > raw_spin_lock_irqsave(&cfs_b->lock, flags);
> > + now = start = ktime_to_ns(hrtimer_cb_get_time(timer));
> > for (;;) {
> > - overrun = hrtimer_forward_now(timer, cfs_b->period);
> > + overrun = hrtimer_forward(timer, now, cfs_b->period);
> > if (!overrun)
> > break;
> >
> > + if (++count > 3) {
> > + u64 new, old = ktime_to_ns(cfs_b->period);
> > +
> > + /* rough average of the time each loop is taking
> > + * really should be (n-s)/3 but this is easier for the
> > machine
> > + */
> > + new = (now - start) >> 2;
> > + if (new < old)
> > + new = old;
> > + new = (new * 147) / 128; /* ~115% */
> > + new = min(new, max_cfs_quota_period);
> > +
> > + cfs_b->period = ns_to_ktime(new);
> > +
> > + /* since max is 1s, this is limited to 1e9^2, which
> > fits in u64 */
> > + cfs_b->quota *= new;
> > + cfs_b->quota /= old;
> > +
> > + pr_warn_ratelimited(
> > + "cfs_period_timer[cpu%d]: period too short, scaling up (new
> > cfs_period_us %lld, cfs_quota_us = %lld)\n",
> > + smp_processor_id(),
> > + new/NSEC_PER_USEC,
> > + cfs_b->quota/NSEC_PER_USEC);
> > +
> > + /* reset count so we don't come right back in here */
> > + count = 0;
> > + }
> > +
> > idle = do_sched_cfs_period_timer(cfs_b, overrun, flags);
> > + now = ktime_to_ns(hrtimer_cb_get_time(timer));
> > }
> > if (idle)
> > cfs_b->period_active = 0;
--