On Tue, Apr 06, 2021 at 04:31:35PM +0200, Vincent Guittot wrote:
> On Tue, 6 Apr 2021 at 06:11, Ricardo Neri
> <ricardo.neri-calde...@linux.intel.com> wrote:
> >
> > Introduce arch_sched_asym_prefer_early() so that architectures with SMT
> > can delay the decision to label a candidate busiest group as
> > group_asym_packing.
> >
> > When using asymmetric packing, high priority idle CPUs pull tasks from
> > scheduling groups with low priority CPUs. The decision on using asymmetric
> > packing for load balancing is done after collecting the statistics of a
> > candidate busiest group. However, this decision needs to consider the
> > state of the SMT siblings of dst_cpu.
> >
> > Cc: Aubrey Li <aubrey...@intel.com>
> > Cc: Ben Segall <bseg...@google.com>
> > Cc: Daniel Bristot de Oliveira <bris...@redhat.com>
> > Cc: Dietmar Eggemann <dietmar.eggem...@arm.com>
> > Cc: Joel Fernandes (Google) <j...@joelfernandes.org>
> > Cc: Mel Gorman <mgor...@suse.de>
> > Cc: Quentin Perret <qper...@google.com>
> > Cc: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
> > Cc: Steven Rostedt <rost...@goodmis.org>
> > Cc: Tim Chen <tim.c.c...@linux.intel.com>
> > Reviewed-by: Len Brown <len.br...@intel.com>
> > Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> > ---
> >  include/linux/sched/topology.h |  1 +
> >  kernel/sched/fair.c            | 11 ++++++++++-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> > index 8f0f778b7c91..663b98959305 100644
> > --- a/include/linux/sched/topology.h
> > +++ b/include/linux/sched/topology.h
> > @@ -57,6 +57,7 @@ static inline int cpu_numa_flags(void)
> >  #endif
> >
> >  extern int arch_asym_cpu_priority(int cpu);
> > +extern bool arch_sched_asym_prefer_early(int a, int b);
> >
> >  struct sched_domain_attr {
> >         int relax_domain_level;
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 4ef3fa0d5e8d..e74da853b046 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -106,6 +106,15 @@ int __weak arch_asym_cpu_priority(int cpu)
> >         return -cpu;
> >  }
> >
> > +/*
> > + * For asym packing, early check if CPUs with higher priority should be
> > + * preferred. On some architectures, more data is needed to make a decision.
> > + */
> > +bool __weak arch_sched_asym_prefer_early(int a, int b)
> > +{
> > +       return sched_asym_prefer(a, b);
> > +}
> > +
> >  /*
> >   * The margin used when comparing utilization with CPU capacity.
> >   *
> > @@ -8458,7 +8467,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >         if (!local_group && env->sd->flags & SD_ASYM_PACKING &&
> >             env->idle != CPU_NOT_IDLE &&
> >             sgs->sum_h_nr_running &&
> > -           sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
> > +           arch_sched_asym_prefer_early(env->dst_cpu, group->asym_prefer_cpu)) {
>
> If ITMT sets arch_sched_asym_prefer_early() to true, all groups will be
> set as group_asym_packing unconditionally, which is wrong. The state
> has to be set only when we want asym packing migration.
Thanks for your quick review, Vincent!

Indeed, sgs->group_asym_packing should only be set when we want asym_packing
migration. However, for ITMT we also need to know the state of the SMT
siblings of dst_cpu, if any. update_sg_lb_stats() does not know anything
about the local group. Thus, arch_sched_asym_prefer_early() intends to give
an early decision that can be revoked later, once the statistics of the
local group are known.

I also thought about checking the state of the SMT siblings from
update_sg_lb_stats(), but that would duplicate the work update_sg_lb_stats()
does when called on the local group.

The group can still be classified as group_overloaded and group_imbalanced,
as they take precedence over group_asym_packing. We would miss the
group_has_spare and group_fully_busy classifications, though. I can look
into fixing that.

Thanks and BR,
Ricardo