On 21/09/20 08:24, Vincent Guittot wrote:
> Some UCs like 9 always running tasks on 8 CPUs can't be balanced and the
> load balancer currently migrates the waiting task between the CPUs in an
> almost random manner. The success of a rq pulling a task depends on the
> value of nr_balance_failed of its domains and its ability to be faster
> than others to detach it. This behavior results in an unfair distribution
> of the running time between tasks because some CPUs will run most of the
> time, if not always, the same task whereas others will share their time
> between several tasks.
>
> Instead of using nr_balance_failed as a boolean to relax the condition
> for detaching a task, the LB will use nr_balance_failed to relax the
> threshold between the task's load and the imbalance. This mechanism
> prevents the same rq or domain from always winning the load balance fight.
>
> Reviewed-by: Phil Auld <pa...@redhat.com>
> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org>

Reviewed-by: Valentin Schneider <valentin.schnei...@arm.com>
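
For readers following along without the diff, here is a rough standalone
sketch of the idea described in the changelog. It is not the patch itself:
the helper names (shift_bounded, skip_task_old, skip_task_new), the
cache_nice_tries parameter and the "halve the load per failed attempt"
rule are assumptions made for illustration, since the actual hunk is not
quoted in this mail.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: right-shift with the shift count clamped so it
 * never reaches the width of the type (which would be undefined in C).
 */
static unsigned long shift_bounded(unsigned long val, unsigned int shift)
{
	unsigned int max_shift = sizeof(val) * 8 - 1;

	return val >> (shift < max_shift ? shift : max_shift);
}

/*
 * Assumed shape of the old check: nr_balance_failed acts as a boolean.
 * Once it exceeds a fixed limit, the load test is dropped entirely.
 * Returns true when the task should be skipped (not detached).
 */
static bool skip_task_old(unsigned long load, unsigned long imbalance,
			  unsigned int nr_balance_failed,
			  unsigned int cache_nice_tries)
{
	return load / 2 > imbalance && nr_balance_failed <= cache_nice_tries;
}

/*
 * Assumed shape of the new check: every failed balance attempt halves
 * the load that is compared against the imbalance, so the threshold is
 * relaxed progressively instead of being switched off at once.
 */
static bool skip_task_new(unsigned long load, unsigned long imbalance,
			  unsigned int nr_balance_failed)
{
	return shift_bounded(load, nr_balance_failed) > imbalance;
}
```

With a task load of 1024 and an imbalance of 100, skip_task_new() keeps
refusing the task for the first attempts and only lets it go once enough
failures have accumulated. That gradual relaxation is what should stop a
single rq or domain from winning every time.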