On Mon, 26 Dec 2016, Borislav Petkov wrote:
> On Mon, Dec 26, 2016 at 07:21:44PM +0100, Thomas Gleixner wrote:
> > Is there any interesting error message before the BUG hits? I'll try
> > to reproduce on an AMD box tomorrow.
> 
> Hmm, so lemme see if I see it correctly:
> 
> threshold_create_bank() does kobject_create_and_add(name, &dev->kobj);
> and that dev thing is
> 
>       struct device *dev = per_cpu(mce_device, cpu);
> 
> BUT(!), those mce_device per-CPU things get initialized in
> 
> mce_cpu_online()
> |-> mce_device_create(cpu);
> 
> With a CONFIG_HOTPLUG_CPU=n .config that doesn't happen, right?
> 
> Oh, and I see what could've changed that:
> 
>   8c0eeac819c8 ("x86/mcheck: Move CPU_ONLINE and CPU_DOWN_PREPARE to hotplug state machine")
> 
> And before that, we did call mce_device_create(cpu) in
> mcheck_init_device() which is a device initcall and not dependent on CPU
> hotplug.
> 
> And frankly, flipping back to the for_each_online_cpu(i) is yucky as
> hell but I don't see any other/better solution besides pulling up
> mce_device_create() into mcheck_init_device()...

The hotplug callbacks are invoked even with CONFIG_HOTPLUG_CPU=n. So that's
not the problem. I can reproduce it. Will post info once I understand it.
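
To illustrate the first point, here is a minimal userspace model. It is not
kernel code, and every name in it is hypothetical. It assumes only that
registering a startup callback with the state machine runs that callback for
each CPU which is already online, whether or not CPUs can come and go later.

```c
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins; this is a userspace model, not kernel code. */
static bool cpu_online[NR_CPUS] = { true, true, false, false };
static int device_created[NR_CPUS];

typedef int (*startup_fn)(unsigned int cpu);

/* Models a startup callback in the spirit of mce_cpu_online(). */
static int model_cpu_online(unsigned int cpu)
{
	device_created[cpu] = 1;
	return 0;
}

/*
 * Models registration with the hotplug state machine: the startup
 * callback runs right away for every CPU that is already online.
 * Nothing here depends on CPUs being able to come or go later.
 */
static int model_setup_state(startup_fn startup)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		int ret;

		if (!cpu_online[cpu])
			continue;

		ret = startup(cpu);
		if (ret)
			return ret;
	}
	return 0;
}
```

In this model, calling model_setup_state(model_cpu_online) sets
device_created[] for CPUs 0 and 1 only, which are the ones marked online.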

Thanks,

        tglx
