On Tue, 2014-05-20 at 14:09 -0700, Jason Low wrote:
> Hi Tim, Rik
>
> Yes, that makes sense that we want to balance if they are equal. We
> may also consider using "if (time_after_eq(jiffies,
> rq->next_balance)".
>
> Reviewed-by: Jason Low <jason.l...@hp.com>

Jason & Rik,

Thanks for reviewing the patch. I've updated the patch below as
suggested.

Tim

---
From: Tim Chen <tim.c.c...@linux.intel.com>
Subject: [PATCH v2] sched: Reduce the rate of needless idle load balancing

The current no_hz idle load balancer does load balancing for *all* idle
cpus, even when the next balance for a particular idle cpu is still some
time in the future. This results in a much higher load balancing rate
than necessary. The patch changes the behavior so that idle load
balancing is done on behalf of an idle cpu only when that cpu is due
for load balancing.

On SGI's systems with over 3000 cores, the cpu responsible for idle
balancing got overwhelmed with idle balancing and introduced a lot of
OS noise to workloads. This patch fixes the issue.

Acked-by: Russ Anderson <r...@sgi.com>
Reviewed-by: Rik van Riel <r...@redhat.com>
Reviewed-by: Jason Low <jason.l...@hp.com>
Signed-off-by: Tim Chen <tim.c.c...@linux.intel.com>
---
 kernel/sched/fair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9b4c4f3..b826c3a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 		rq = cpu_rq(balance_cpu);
 
-		raw_spin_lock_irq(&rq->lock);
-		update_rq_clock(rq);
-		update_idle_cpu_load(rq);
-		raw_spin_unlock_irq(&rq->lock);
-
-		rebalance_domains(rq, CPU_IDLE);
+		/*
+		 * If time for next balance is due,
+		 * do the balance.
+		 */
+		if (time_after_eq(jiffies, rq->next_balance)) {
+			raw_spin_lock_irq(&rq->lock);
+			update_rq_clock(rq);
+			update_idle_cpu_load(rq);
+			raw_spin_unlock_irq(&rq->lock);
+			rebalance_domains(rq, CPU_IDLE);
+		}
 
 		if (time_after(this_rq->next_balance, rq->next_balance))
 			this_rq->next_balance = rq->next_balance;
-- 
1.7.11.7
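
For anyone following the time_after_eq() change, here is a minimal
userspace sketch of the wraparound-safe "is it due yet?" comparison the
patch relies on. It is not the kernel code: the sketch_ names and the
next_balance[] array are hypothetical stand-ins for jiffies and each idle
rq->next_balance, and the signed cast assumes the usual two's-complement
behavior that the kernel itself depends on.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in modelled on the kernel's time_after_eq().
 * The sketch_ prefix marks it as hypothetical, not the kernel macro.
 * The unsigned subtraction wraps modulo 2^N; reading the result as
 * signed tells us whether a is at or past b, provided the two values
 * are less than half the counter range apart.
 */
static int sketch_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Count how many idle CPUs would actually be balanced at time `now`
 * under the patched check. next_balance[] is a hypothetical array
 * standing in for each idle CPU's rq->next_balance.
 */
static size_t sketch_count_due(unsigned long now,
			       const unsigned long *next_balance, size_t n)
{
	size_t i, due = 0;

	assert(next_balance != NULL);
	for (i = 0; i < n; i++)
		if (sketch_time_after_eq(now, next_balance[i]))
			due++;
	return due;
}
```

With next_balance values of 90, 100, 101 and 250 and now == 100, only
the first two CPUs count as due. The CPU whose next_balance equals the
current time is included, which is the "balance if they are equal" case
discussed in the review.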