On Fri, 2012-09-07 at 09:20 +0800, Fengguang Wu wrote:
> FYI, the bisect result is
>
> commit 554cecaf733623b327eef9652b65965eb1081b81
> Author: Diwakar Tundlam <dtund...@nvidia.com>
> Date:   Wed Mar 7 14:44:26 2012 -0800
>
>     sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
>
>     The 'next_balance' field of 'nohz' idle balancer must be initialized
>     to jiffies. Since jiffies is initialized to negative 300 seconds the
>     'nohz' idle balancer does not run for the first 300s (5mins) after
>     bootup. If no new processes are spawed or no idle cycles happen, the
>     load on the cpus will remain unbalanced for that duration.
>
>     Signed-off-by: Diwakar Tundlam <dtund...@nvidia.com>
>     Signed-off-by: Peter Zijlstra <a.p.zijls...@chello.nl>
>     Link: http://lkml.kernel.org/r/1dd7bfedd3147247b1355befefe4665237994f3...@hqmail04.nvidia.com
>     Signed-off-by: Ingo Molnar <mi...@elte.hu>

Oh fun.. does the below 'fix' it?

The thing I'm thinking of is a tick happening right after we set
jiffies but before the zalloc (specifically the memset(0)) is complete.
Since we've already registered the softirq we can end up in the
load-balancer and see a completely weird idle mask.

Hmm?

---
 kernel/sched/fair.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1ca4fe4..ac57bb6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5346,13 +5346,12 @@ void print_cfs_stats(struct seq_file *m, int cpu)
 __init void init_sched_fair_class(void)
 {
 #ifdef CONFIG_SMP
-	open_softirq(SCHED_SOFTIRQ, run_rebalance_domains);
-
 #ifdef CONFIG_NO_HZ
 	nohz.next_balance = jiffies;
 	zalloc_cpumask_var(&nohz.idle_cpus_mask, GFP_NOWAIT);
 	cpu_notifier(sched_ilb_notifier, 0);
 #endif
-#endif /* SMP */
 
+	open_softirq(SCHED_SOFTIRQ, run_rebalance_domains);
+#endif /* SMP */
 }
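
To make the ordering argument concrete, here is a toy user-space model of
it. This is only a sketch and not kernel code: every toy_* name is
hypothetical, and the mask_ready flag is an assumption added purely so the
effect is observable (in the real case the handler would simply read
whatever happens to be in the mask). The "tick" is injected by hand at the
point where the race window is suspected to be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the 'nohz' state; not kernel code. */
struct toy_nohz {
	unsigned long idle_mask;    /* models nohz.idle_cpus_mask */
	unsigned long next_balance; /* models nohz.next_balance */
	bool mask_ready;            /* assumption: set once the mask is zeroed */
};

typedef int (*toy_handler_t)(const struct toy_nohz *);

/* NULL until the handler is "opened", like a softirq nobody registered. */
static toy_handler_t toy_handler;

/* Models the balancer: 0 if it saw an initialized mask, -1 if it did not. */
static int toy_rebalance(const struct toy_nohz *n)
{
	return n->mask_ready ? 0 : -1;
}

/* Models a tick: returns 1 when no handler is registered, so nothing runs. */
static int toy_tick(const struct toy_nohz *n)
{
	assert(n != NULL);
	return toy_handler ? toy_handler(n) : 1;
}

/* Put the model back into its pre-init state, with garbage in the mask. */
static void toy_reset(struct toy_nohz *n)
{
	toy_handler = NULL;
	n->idle_mask = ~0UL;
	n->next_balance = 0;
	n->mask_ready = false;
}

/* Old order: register the handler first, then set up the state. */
static int toy_init_handler_first(struct toy_nohz *n, unsigned long now)
{
	int seen;

	toy_handler = toy_rebalance;
	n->next_balance = now;
	seen = toy_tick(n);         /* tick lands before the mask is zeroed */
	n->idle_mask = 0;
	n->mask_ready = true;
	return seen;
}

/* Patched order: set up the state first, register the handler last. */
static int toy_init_handler_last(struct toy_nohz *n, unsigned long now)
{
	int seen;

	n->next_balance = now;
	seen = toy_tick(n);         /* same tick, but no handler exists yet */
	n->idle_mask = 0;
	n->mask_ready = true;
	toy_handler = toy_rebalance;
	return seen;
}
```

In this model toy_init_handler_first() returns -1 (the handler ran against
an unzeroed mask), while toy_init_handler_last() returns 1 (the tick found
no handler), and any later toy_tick() returns 0.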