On 14/09/20 11:03, Vincent Guittot wrote:
> The busy_factor, which increases load balance interval when a cpu is busy,
> is set to 32 by default. This value generates some huge LB interval on
> large system like the THX2 made of 2 node x 28 cores x 4 threads.
> For such system, the interval increases from 112ms to 3584ms at MC level.
> And from 228ms to 7168ms at NUMA level.
>
> Even on smaller system, a lower busy factor has shown improvement on the
> fair distribution of the running time so let reduce it for all.
>
ISTR you mentioned taking this one step further and making
(interval * busy_factor) scale logarithmically with the number of CPUs to
avoid reaching outrageous numbers. Did you experiment with that already?

> Signed-off-by: Vincent Guittot <[email protected]>
> ---
>  kernel/sched/topology.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 1a84b778755d..a8477c9e8569 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1336,7 +1336,7 @@ sd_init(struct sched_domain_topology_level *tl,
>  	*sd = (struct sched_domain){
>  		.min_interval = sd_weight,
>  		.max_interval = 2*sd_weight,
> -		.busy_factor = 32,
> +		.busy_factor = 16,
>  		.imbalance_pct = 117,
>
>  		.cache_nice_tries = 0,
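
For reference, the numbers in the changelog follow from multiplying the base interval by busy_factor. A minimal user-space sketch of that arithmetic is below; it is not kernel code. The helper names are made up for illustration, and log_scaled_busy_interval_ms() is only one hypothetical reading of the "scale logarithmically" idea (an assumed base period times 1 + floor(log2(nr_cpus))), not something taken from the patch. Note that 2 x 28 x 4 = 224 CPUs and 224 * 32 = 7168, so the "228ms" in the changelog looks like it should read 224ms.

```c
/*
 * Busy-CPU load balance interval as described in the changelog:
 * the base interval (ms) multiplied by busy_factor.
 * Helper name is illustrative, not a kernel function.
 */
static unsigned long busy_interval_ms(unsigned long interval_ms,
				      unsigned int busy_factor)
{
	return interval_ms * busy_factor;
}

/* floor(log2(n)) for n >= 1; returns 0 for n == 0 or n == 1. */
static unsigned int ilog2_floor(unsigned long n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/*
 * HYPOTHETICAL: make the busy interval grow with log2 of the CPU count
 * instead of linearly. base_ms is an assumed tunable, not an existing
 * sched_domain field.
 */
static unsigned long log_scaled_busy_interval_ms(unsigned long base_ms,
						 unsigned long nr_cpus)
{
	return base_ms * (1 + ilog2_floor(nr_cpus));
}
```

With the MC-level weight of 112 (28 cores x 4 threads), busy_interval_ms(112, 32) gives 3584 and busy_interval_ms(112, 16) gives 1792. At NUMA level (224 CPUs) the same helper gives 7168 and 3584 respectively. The hypothetical log-scaled variant with base_ms = 16 would give 112 for 112 CPUs and 128 for 224 CPUs, which shows how much flatter such a curve would be.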

