On Fri, May 16, 2014 at 11:35:30AM +0200, Peter Zijlstra wrote:
> On Fri, May 16, 2014 at 11:50:42AM +0800, Lai Jiangshan wrote:
> > After debugging, I found the hotplug-in cpu is active but !online in
> > this case. The problem was introduced by 5fbd036b.
> >
> > Some code assumes that any cpu in cpu_active_mask is also online, but
> > 5fbd036b breaks this assumption, so the corresponding code with this
> > assumption should be changed too.
>
> Good find, and yes it does that.
>
> > The following patch is just a workaround. After it is applied, the above
> > WARNING is gone, but I can't hit the wq problem that you found.
>
> Seeing how the entirety of hotplug is basically duct tape and twigs, the
> below isn't that bad.

I made that, are you okay with that?

---
Subject: sched: Fix hotplug vs set_cpus_allowed_ptr()
From: Lai Jiangshan <la...@cn.fujitsu.com>
Date: Fri, 16 May 2014 11:50:42 +0800

Lai found that:

  WARNING: CPU: 1 PID: 13 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x2d/0x4b()
  ...
  migration_cpu_stop+0x1d/0x22

was caused by set_cpus_allowed_ptr() assuming that cpu_active_mask is
always a sub-set of cpu_online_mask.

This isn't true since 5fbd036b552f ("sched: Cleanup cpu_active madness").

So set active and online at the same time to avoid this particular
problem.

Fixes: 5fbd036b552f ("sched: Cleanup cpu_active madness")
Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Link: http://lkml.kernel.org/r/53758b12.8060...@cn.fujitsu.com
---
 kernel/cpu.c        |    6 ++++--
 kernel/sched/core.c |    1 -
 2 files changed, 4 insertions(+), 3 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -726,10 +726,12 @@ void set_cpu_present(unsigned int cpu, b
 
 void set_cpu_online(unsigned int cpu, bool online)
 {
-	if (online)
+	if (online) {
 		cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits));
-	else
+		cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits));
+	} else {
 		cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits));
+	}
 }
 
 void set_cpu_active(unsigned int cpu, bool active)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5126,7 +5126,6 @@ static int sched_cpu_active(struct notif
 				      unsigned long action, void *hcpu)
 {
 	switch (action & ~CPU_TASKS_FROZEN) {
-	case CPU_STARTING:
 	case CPU_DOWN_FAILED:
 		set_cpu_active((long)hcpu, true);
 		return NOTIFY_OK;