Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-09 Thread Steffen Persvold
On 4/9/2013 12:24 PM, Borislav Petkov wrote: On Tue, Apr 09, 2013 at 11:45:44AM +0200, Steffen Persvold wrote: Hmm, yes of course. This of course breaks on our slave servers when the shared mechanism doesn't work properly (i.e. NB not visible). Then all cores get individual kobjects and there ca

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-09 Thread Borislav Petkov
On Tue, Apr 09, 2013 at 11:45:44AM +0200, Steffen Persvold wrote: > Hmm, yes of course. This of course breaks on our slave servers when > the shared mechanism doesn't work properly (i.e. NB not visible). Then > all cores get individual kobjects and there can be discrepancies > between what the hard

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-09 Thread Steffen Persvold
On 4/9/2013 11:38 AM, Borislav Petkov wrote: On Tue, Apr 09, 2013 at 11:25:16AM +0200, Steffen Persvold wrote: Why not let all cores just create their individual kobject and skip this "shared" nb->bank4 concept? Any disadvantage to that (apart from the obvious storage bloat)? Well, bank4 is

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-09 Thread Borislav Petkov
On Tue, Apr 09, 2013 at 11:25:16AM +0200, Steffen Persvold wrote: > Why not let all cores just create their individual kobject and skip > this "shared" nb->bank4 concept? Any disadvantage to that (apart from > the obvious storage bloat)? Well, bank4 is shared across cores on the northbridge in *

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-09 Thread Steffen Persvold
On 4/4/2013 9:07 PM, Borislav Petkov wrote: On Thu, Apr 04, 2013 at 08:05:46PM +0200, Steffen Persvold wrote: It made more sense (to me) to skip the creation of MC4 altogether if you can't find the matching northbridge since you can't reliably do the dec_and_test() reference counting on the sh

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Steffen Persvold
On 4/4/2013 9:07 PM, Borislav Petkov wrote: On Thu, Apr 04, 2013 at 08:05:46PM +0200, Steffen Persvold wrote: It made more sense (to me) to skip the creation of MC4 altogether if you can't find the matching northbridge since you can't reliably do the dec_and_test() reference counting on the sh

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Steffen Persvold
On 4/4/2013 6:13 PM, Borislav Petkov wrote: > On Thu, Apr 04, 2013 at 11:52:00PM +0800, Daniel J Blueman wrote: >> On platforms where all Northbridges may not be visible (due to routing, e.g. on >> NumaConnect systems), prevent oopsing due to stale pointer access when >> offlining cores. >> >> Signed

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Borislav Petkov
On Thu, Apr 04, 2013 at 08:05:46PM +0200, Steffen Persvold wrote: > It made more sense (to me) to skip the creation of MC4 altogether > if you can't find the matching northbridge since you can't reliably > do the dec_and_test() reference counting on the shared bank when you > don't have the commo
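The reference-counting concern quoted above can be illustrated with a small userspace sketch. This is not the code from mce_amd.c: `node_stub`, `shared_bank`, `bank4_get()`, `bank4_put()` and the plain-int `dec_and_test()` helper are hypothetical names invented for the example, and the sketch ignores locking. It only shows why a shared bank needs a common per-node object to hang the counter on, and why skipping creation is the simple choice when that object cannot be found.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the kernel source. */
struct shared_bank {
	int refs;                   /* number of online cores using the bank */
};

struct node_stub {
	struct shared_bank *bank4;  /* one shared bank per visible northbridge */
};

/* Plain-int analogue of a decrement-and-test helper: true when count hits 0. */
static bool dec_and_test(int *v)
{
	return --(*v) == 0;
}

/* Core comes online: reuse the node's bank, or create it for the first core.
 * With no visible node there is nowhere to keep the count, so skip creation. */
static struct shared_bank *bank4_get(struct node_stub *node)
{
	if (!node)
		return NULL;
	if (node->bank4) {
		node->bank4->refs++;
		return node->bank4;
	}
	node->bank4 = calloc(1, sizeof(*node->bank4));
	if (node->bank4)
		node->bank4->refs = 1;
	return node->bank4;
}

/* Core goes offline: the last user frees the bank and clears the pointer,
 * so no stale pointer is left behind for a later lookup. */
static void bank4_put(struct node_stub *node)
{
	if (!node || !node->bank4)
		return;
	if (dec_and_test(&node->bank4->refs)) {
		free(node->bank4);
		node->bank4 = NULL;
	}
}
```

In this sketch a caller with no node simply gets `NULL` back from `bank4_get()` and never participates in the count, which mirrors the "skip creation" option discussed in the thread rather than the per-core-kobject alternative.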

Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Borislav Petkov
On Thu, Apr 04, 2013 at 11:52:00PM +0800, Daniel J Blueman wrote: > On platforms where all Northbridges may not be visible (due to routing, e.g. on > NumaConnect systems), prevent oopsing due to stale pointer access when > offlining cores. > > Signed-off-by: Steffen Persvold > Signed-off-by: Daniel

RE: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Luck, Tony
+ if (WARN_ON_ONCE(!nb))
+         goto out;
+

WARN_ON_ONCE() will drop a stack trace to the console - is that going to be useful?

If you want a message perhaps:

  if (!nb) {
          printk_once("something interesting about not having a
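As a rough illustration of the print-once alternative suggested here, the following userspace sketch emits a single plain message (no stack trace) the first time the missing-northbridge path is hit, then stays quiet. Everything in it is an assumption made for the example: `nb_stub`, `bank4_setup()`, the `messages_printed` counter and the message text are hypothetical, and `fprintf()` stands in for the kernel's logging.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the northbridge descriptor. */
struct nb_stub {
	int id;
};

/* Counts emitted notices so the once-only behaviour can be observed. */
static int messages_printed;

/* Returns 0 on success, -1 when no northbridge descriptor is available. */
static int bank4_setup(const struct nb_stub *nb)
{
	static bool warned;  /* once-only flag, in the spirit of printk_once() */

	if (!nb) {
		if (!warned) {
			warned = true;
			messages_printed++;
			/* Illustrative text only, not the message from the thread. */
			fprintf(stderr, "northbridge not visible, skipping shared bank\n");
		}
		return -1;
	}
	return 0;
}
```

Calling `bank4_setup(NULL)` repeatedly still returns -1 each time, but only the first call prints, which is the behavioural difference from an unconditional message; the difference from a warn-style macro is that no backtrace is produced.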

[PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Daniel J Blueman
On platforms where all Northbridges may not be visible (due to routing, e.g. on NumaConnect systems), prevent oopsing due to stale pointer access when offlining cores. Signed-off-by: Steffen Persvold Signed-off-by: Daniel J Blueman --- arch/x86/kernel/cpu/mcheck/mce_amd.c | 11 ++- 1 f