On Tue, Aug 22 2017 at 07:48, Vincent Guittot wrote:
> On 21 August 2017 at 17:21, Brendan Jackman <brendan.jack...@arm.com> wrote:
>> The current use of returning NULL from find_idlest_group is broken in
> [snip]
>> ---
>>  kernel/sched/fair.c | 34 +++++++++++++++++++---------------
>>  1 file changed, 19 insertions(+), 15 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 64618d768546..7cb5ed719cf9 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5382,26 +5382,29 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>>   * domain.
>>   */
>>  static struct sched_group *
>> -find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> -                 int this_cpu, int sd_flag)
>> +find_idlest_group(struct sched_domain *sd, struct task_struct *p, int sd_flag)
>>  {
>>         struct sched_group *idlest = NULL, *group = sd->groups;
>> +       struct sched_group *local_group = sd->groups;
>>         struct sched_group *most_spare_sg = NULL;
>> -       unsigned long min_runnable_load = ULONG_MAX, this_runnable_load = 0;
>> -       unsigned long min_avg_load = ULONG_MAX, this_avg_load = 0;
>> +       unsigned long min_runnable_load = ULONG_MAX, this_runnable_load = ULONG_MAX;
>> +       unsigned long min_avg_load = ULONG_MAX, this_avg_load = ULONG_MAX;
>
> Good catch for this_runnable_load, indeed there is a problem if
> local_group is not allowed.
>
>>         unsigned long most_spare = 0, this_spare = 0;
>>         int load_idx = sd->forkexec_idx;
>>         int imbalance_scale = 100 + (sd->imbalance_pct-100)/2;
>>         unsigned long imbalance = scale_load_down(NICE_0_LOAD) *
>>                                 (sd->imbalance_pct-100) / 100;
>>
>> +       if (!cpumask_intersects(sched_domain_span(sd), &p->cpus_allowed))
>> +               return NULL;
>
> You should put that test at an upper level, in the while (sd) loop,
> or even before the while (sd) loop, as there is no need to even start
> the loop in this case. There is no need to call find_idlest_group()
> if there is no CPU allowed and no group to find.
> Then, you can keep the current behavior of find_idlest_group(), which
> returns NULL if there is no better group than the local one.
True. I have to admit the reason I didn't do this is that, due to the
current "else while", doing it outside the loop would have meant another
level of indentation. But if I break out a new function as in Peter's
proposal I can avoid that (plus I probably shouldn't let code beauty
overrule warm-path performance :| ).

>> +
>>         if (sd_flag & SD_BALANCE_WAKE)
>>                 load_idx = sd->wake_idx;
>>
>>         do {
>>                 unsigned long load, avg_load, runnable_load;
>>                 unsigned long spare_cap, max_spare_cap;
>> -               int local_group;
>> +               bool group_is_local = group == local_group;
>>                 int i;
>>
>>                 /* Skip over this group if it has no CPUs allowed */
>
> [snip]
>
>> @@ -5927,8 +5927,12 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
>>                         continue;
>>                 }
>>
>> -               group = find_idlest_group(sd, p, cpu, sd_flag);
>> +               group = find_idlest_group(sd, p, sd_flag);
>>                 if (!group) {
>> +                       break;
>> +               } else if (group == sd->groups) {
>> +                       new_cpu = cpu;
>
> The "new_cpu = cpu" above should be put before the while (sd) loop,
> like in Peter's proposal.

I don't think that would work - I've responded along with Joel in the
other subthread.

>> +                       /* Now try balancing at a lower domain level of cpu */
>>                         sd = sd->child;
>>                         continue;
>>                 }
>> --
>> 2.14.1
>>