Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

bsegall Wed, 13 Mar 2019 10:45:12 -0700

Phil Auld <[email protected]> writes:

> On Mon, Mar 11, 2019 at 04:25:36PM -0400 Phil Auld wrote:
>> On Mon, Mar 11, 2019 at 10:44:25AM -0700 [email protected] wrote:
>> > Letting it spin for 100ms and then only increasing by 6% seems extremely
>> > generous. If we went this route I'd probably say "after looping N
>> > times, set the period to time taken / N + X%" where N is like 8 or
>> > something. I think I'd probably perfer something like this to the
>> > previous "just abort and let it happen again next interrupt" one.
>> 
>> Okay. I'll try to spin something up that does this. It may be a little 
>> trickier to keep the quota proportional to the new period. I think that's 
>> important since we'll be changing the user's setting.
>> 
>> Do you mean to have it break when it hits N and recalculates the period or 
>> reset the counter and keep going?
>> 
>
> Let me know what you think of the below. It's working nicely. I like your 
> suggestion to limit it quickly based on number of loops and use that to 
> scale up. I think it is best to break out and let it fire again if needed. 
> The warning fires once, very occasionally twice, and then things are quiet.
>
> If that looks reasonable I'll do some more testing and spin it up as a real 
> patch submission.


Yeah, this looks reasonable. I should probably see how unreasonable the
other thing would be, but if your previous periods were kinda small (and
it's just that the machine crashing isn't an ok failure mode) I suppose
it's not a big deal.

>
> Cheers,
> Phil
> ---
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 310d0637fe4b..54b30adfc89e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4859,19 +4859,51 @@ static enum hrtimer_restart 
> sched_cfs_slack_timer(struct hrtimer *timer)
>       return HRTIMER_NORESTART;
>  }
>  
> +extern const u64 max_cfs_quota_period;
> +int cfs_period_autotune_loop_limit   = 8;
> +int cfs_period_autotune_cushion_pct  = 15; /* percentage added to period 
> recalculation */
> +
>  static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
>  {
>       struct cfs_bandwidth *cfs_b =
>               container_of(timer, struct cfs_bandwidth, period_timer);
> +     s64 nsstart, nsnow, new_period;
>       int overrun;
>       int idle = 0;
> +     int count = 0;
>  
>       raw_spin_lock(&cfs_b->lock);
> +     nsstart = ktime_to_ns(hrtimer_cb_get_time(timer));
>       for (;;) {
>               overrun = hrtimer_forward_now(timer, cfs_b->period);
>               if (!overrun)
>                       break;
>  
> +             if (++count > cfs_period_autotune_loop_limit) {
> +                     ktime_t old_period = ktime_to_ns(cfs_b->period);
> +
> +                     nsnow = ktime_to_ns(hrtimer_cb_get_time(timer));
> +                     new_period = (nsnow - 
> nsstart)/cfs_period_autotune_loop_limit;
> +
> +                     /*  Make sure new period will be larger than old. */
> +                     if (new_period < old_period) {
> +                             new_period = old_period;
> +                     }
> +                     new_period += (new_period *  
> cfs_period_autotune_cushion_pct) / 100;

This ordering means that it will always increase by at least 15%. This
is a bit odd but probably a good thing; I'd just change the comment to
make it clear this is deliberate.

> +
> +                     if (new_period >  max_cfs_quota_period)
> +                             new_period = max_cfs_quota_period;
> +
> +                     cfs_b->period = ns_to_ktime(new_period);
> +                     cfs_b->quota += (cfs_b->quota * ((new_period - 
> old_period) * 100)/old_period)/100;

In general it makes sense to do fixed point via 1024 or something that
can be optimized into shifts (and a larger number is better in general
for better precision).

> +                     pr_warn_ratelimited(
> +                             "cfs_period_timer[cpu%d]: period too short, 
> scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n",
> +                             smp_processor_id(), 
> cfs_b->period/NSEC_PER_USEC, cfs_b->quota/NSEC_PER_USEC);
> +
> +                     idle = 0;
> +                     break;
> +             }
> +
>               idle = do_sched_cfs_period_timer(cfs_b, overrun);
>       }
>       if (idle)

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

Reply via email to