On 2020/12/15 0:48, Peter Zijlstra wrote:
> We compute the average cost of the total scan, but then use it as a
> per-cpu scan cost when computing the scan proportion. Fix this by
> properly computing a per-cpu scan cost.
> 
> This also fixes a bug where we would terminate early (!--nr case) and
> not account that cost at all.
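
To put numbers on the total vs. per-cpu difference, here is a small
user-space sketch (not kernel code). per_cpu_scan_cost() and scan_budget()
are made-up helper names, and the budget formula is a simplification that
assumes the scan proportion is roughly span avg idle divided by avg cost:

```c
#include <stdint.h>

/* Hypothetical helper: cost per CPU visited, given the total scan time. */
static uint64_t per_cpu_scan_cost(uint64_t total_ns, unsigned int loops)
{
	/* loops starts at 1 in the patch, so the guard is only defensive. */
	return loops ? total_ns / loops : total_ns;
}

/*
 * Hypothetical, simplified scan proportion: how many CPUs we allow
 * ourselves to look at for a given span avg idle and avg cost.
 */
static uint64_t scan_budget(uint64_t span_avg_idle, uint64_t avg_cost)
{
	return avg_cost ? span_avg_idle / avg_cost : span_avg_idle;
}
```

With a scan that visited 4 CPUs in 1200ns and a span avg idle of 4800ns,
feeding the total (1200) into scan_budget() gives 4, while feeding the
per-cpu cost (300) gives 16.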

I'm a bit worried this may introduce a regression under heavy load.
The overhead of another cpu_clock() call plus the extra calculation
becomes significant when sis_scan is throttled by nr.

I wonder whether it would be a good idea not to account the scan cost at
all when sis_scan is throttled, that is, to remove the first cpu_clock()
as well. The avg scan cost would then keep the value it had when the
system was not very busy, and once the load comes down and span avg idle >
span avg cost, we would account the cost again. This should make
select_idle_cpu() a bit faster when the load is very high.
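
To make the idea concrete, here is a rough user-space sketch, not a patch.
struct llc_stats, scan_is_throttled() and account_scan() are made-up names,
update_avg() is only modelled on the kernel helper, and the throttle test is
simply the "span avg idle > span avg cost" condition described above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-LLC values consulted by SIS_PROP. */
struct llc_stats {
	uint64_t span_avg_idle;	/* span avg idle */
	uint64_t avg_scan_cost;	/* span avg cost */
};

/* Running average modelled on update_avg(): avg += (sample - avg) / 8. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/* Assumption: the scan counts as throttled unless idle exceeds cost. */
static bool scan_is_throttled(const struct llc_stats *s)
{
	return s->span_avg_idle <= s->avg_scan_cost;
}

/*
 * Account a scan only when not throttled. In the kernel the two
 * cpu_clock() reads would be skipped under the same condition, so the
 * throttled path pays for neither the clock reads nor the average.
 */
static void account_scan(struct llc_stats *s, uint64_t scan_time_ns)
{
	if (scan_is_throttled(s))
		return;	/* keep the average from the last non-busy period */

	update_avg(&s->avg_scan_cost, scan_time_ns);
}
```

While throttled, avg_scan_cost stays frozen; accounting resumes only after
span_avg_idle has risen above it again.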

Thanks,
-Aubrey
> 
> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> ---
>  kernel/sched/fair.c |   13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6144,10 +6144,10 @@ static inline int select_idle_smt(struct
>  static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
>  {
>       struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> +     int cpu, loops = 1, nr = INT_MAX;
> +     int this = smp_processor_id();
>       struct sched_domain *this_sd;
>       u64 time;
> -     int this = smp_processor_id();
> -     int cpu, nr = INT_MAX;
>  
>       this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
>       if (!this_sd)
> @@ -6175,14 +6175,19 @@ static int select_idle_cpu(struct task_s
>       }
>  
>       for_each_cpu_wrap(cpu, cpus, target) {
> -             if (!--nr)
> -                     return -1;
>               if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
>                       break;
> +
> +             if (loops >= nr) {
> +                     cpu = -1;
> +                     break;
> +             }
> +             loops++;
>       }
>  
>       if (sched_feat(SIS_PROP)) {
>               time = cpu_clock(this) - time;
> +             time = div_u64(time, loops);
>               update_avg(&this_sd->avg_scan_cost, time);
>       }
>  
> 
> 
