From: Wanpeng Li <[email protected]> The commit: c5afb6a87f2 ("sched/fair: Fix nohz.next_balance update")
intends to update nohz.next_balance in two steps. 1) The ILB CPU utilizes next_balance variable in nohz_idle_balance() to gather the shortest next balance of other idle CPUs before updating nohz.next_balance. 2) The ILB CPU updates the nohz.next_balance according to its own next_balance after load balance on behalf of other idle CPUs. However, there is a mess which breaks the original intention of the first step, every idle CPUs update nohz.next_balance during ILB CPU on behalf of them to do load balance, and then the ILB CPU utilizes next_balance variable in nohz_idle_balance() to gather the shortest next balance of other idle CPUs before updating nohz.next_balance. This patch fixes it by don't update nohz.next_balance for other idle CPUs when ILB CPU on behalf of them to do load balance. Cc: Vincent Guittot <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> CC: Ingo Molnar <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> --- kernel/sched/fair.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 274c747..83948a4 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8750,7 +8750,8 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle) * balance for itself and we need to update the * nohz.next_balance accordingly. */ - if ((idle == CPU_IDLE) && time_after(nohz.next_balance, rq->next_balance)) + if ((idle == CPU_IDLE) && time_after(nohz.next_balance, rq->next_balance) && + !test_bit(NOHZ_BALANCE_KICK, nohz_flags(this_rq()->cpu))) nohz.next_balance = rq->next_balance; #endif } -- 2.7.4

