Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-24 Borislav Petkov
On Sun, Jun 22, 2014 at 10:25:50AM -0700, Boris Ostrovsky wrote: > You can add > > Tested-by: Boris Ostrovsky > > if you prefer to go with that version. I still think it's not 100% > reliable (because of what I said above) but at least it fixes the > current breakage. Thanks. The thing is, I don

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-22 Boris Ostrovsky
b...@alien8.de wrote: > On Fri, Jun 20, 2014 at 10:04:37PM -0400, Boris Ostrovsky wrote: > > I'll try it later but this doesn't look sufficient to me: we might not > > reach this point if subsys_system_register() or zalloc_cpumask_var() > > fail. > > If those fail, I'd say we have a mu

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-21 Borislav Petkov
On Fri, Jun 20, 2014 at 10:04:37PM -0400, Boris Ostrovsky wrote: > I'll try it later but this doesn't look sufficient to me: we might not > reach this point if subsys_system_register() or zalloc_cpumask_var() > fail. If those fail, I'd say we have a much bigger problem than undeleted timers. > We

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
On 06/20/2014 05:11 PM, Borislav Petkov wrote: > On Fri, Jun 20, 2014 at 04:43:37PM -0400, Boris Ostrovsky wrote: > > We are getting CPU_ONLINE notifier for ASPs during boot: > Bah, that's craptastic. Hmm, ok, let's try this instead: I'll try it later but this doesn't look sufficient to me: we might

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Borislav Petkov
On Fri, Jun 20, 2014 at 04:43:37PM -0400, Boris Ostrovsky wrote: > We are getting CPU_ONLINE notifier for ASPs during boot: Bah, that's craptastic. Hmm, ok, let's try this instead: -- diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index bb92f38153b2..9a79c8dbd8e8

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
On 06/20/2014 04:29 PM, Borislav Petkov wrote: > On Fri, Jun 20, 2014 at 04:16:50PM -0400, Boris Ostrovsky wrote: > > Sorry, mce_device_create(). > > We can't call it in the notifier until mcheck_init_device() has been > > successfully executed (we need subsys_system_register(&mce_subsys)). I don't know whet

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Borislav Petkov
On Fri, Jun 20, 2014 at 04:16:50PM -0400, Boris Ostrovsky wrote: > Sorry, mce_device_create(). > > We can't call it in the notifier until mcheck_init_device() has been > successfully executed (we need subsys_system_register(&mce_subsys)). I don't > know whether we can call subsys_system_register()

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
On 06/20/2014 04:03 PM, Borislav Petkov wrote: > On Fri, Jun 20, 2014 at 03:39:34PM -0400, Boris Ostrovsky wrote: > > What about mce_device_add()? > What is a mce_device_add()? There's no such function. Sorry, mce_device_create(). We can't call it in the notifier until mcheck_init_device() has bee

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Borislav Petkov
On Fri, Jun 20, 2014 at 03:39:34PM -0400, Boris Ostrovsky wrote: > What about mce_device_add()? What is a mce_device_add()? There's no such function. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
On 06/20/2014 01:52 PM, Borislav Petkov wrote: > On Fri, Jun 20, 2014 at 12:16:39PM -0400, Boris Ostrovsky wrote: > > But I think you still need to do the dance in the notifier to make > > sure you are not trying to add/remove device if mcheck_init_device() > > had failed earlier. > mce_device_remove should be

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Borislav Petkov
On Fri, Jun 20, 2014 at 12:16:39PM -0400, Boris Ostrovsky wrote: > But I think you still need to do the dance in the notifier to make > sure you are not trying to add/remove device if mcheck_init_device() > had failed earlier. mce_device_remove should be smart enough. Hint: mce_device_initialized

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
On 06/20/2014 11:58 AM, Borislav Petkov wrote: > On Fri, Jun 20, 2014 at 11:41:27AM -0400, Boris Ostrovsky wrote: > > Only in the sense that on Xen misc_register() often fails. But any > > failure on baremetal will result in the same behavior. > Ok, thanks for explaining the details. > > Right. And I think w

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Borislav Petkov
On Fri, Jun 20, 2014 at 11:41:27AM -0400, Boris Ostrovsky wrote: > Only in the sense that on Xen misc_register() often fails. But any > failure on baremetal will result in the same behavior. Ok, thanks for explaining the details. > Right. And I think we shouldn't because we leave undeleted timers

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
On 06/20/2014 11:23 AM, Borislav Petkov wrote: > On Fri, Jun 20, 2014 at 10:28:13AM -0400, Boris Ostrovsky wrote: > > Commit 9c15a24b038f4d8da93a2bc2554731f8953a7c17 (x86/mce: Improve > > mcheck_init_device() error handling) unregisters (or never registers) > > MCE's hotplug notifier if an error is encountere

Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Borislav Petkov
On Fri, Jun 20, 2014 at 10:28:13AM -0400, Boris Ostrovsky wrote: > Commit 9c15a24b038f4d8da93a2bc2554731f8953a7c17 (x86/mce: Improve > mcheck_init_device() error handling) unregisters (or never registers) > MCE's hotplug notifier if an error is encountered. Well, mcheck_init_device() did encounter

[PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path

2014-06-20 Boris Ostrovsky
Commit 9c15a24b038f4d8da93a2bc2554731f8953a7c17 (x86/mce: Improve mcheck_init_device() error handling) unregisters (or never registers) MCE's hotplug notifier if an error is encountered. Since unplugging a CPU would normally result in the notifier deleting MCE timer we are now left with the timer