On Thu, Jul 16, 2015 at 06:16:50PM -0700, Andy Lutomirski wrote:
> > From: Ashok Raj <ashok....@intel.com>
> >
> > kexec could boot a kernel that could be legacy with no knowledge of
> > LMCE. Hence we should make sure we clear LMCE optin before kexec reboot.
> >
> 
> What happens if an offline-but-not-unplugged CPU gets an MCE?  Or does
> this code also clear CR4.MCE?

kexec doesn't go through the cpu_offline() path; it sends an IPI to all
threads before letting the BSP jump to the new kernel.

In this patch, we only turned off the LMCE opt-in. CR4.MCE isn't touched.
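
A minimal sketch of what "turn off the LMCE opt-in, leave CR4.MCE alone"
amounts to at the bit level. It works on plain integers rather than real
MSRs so it can run in user space. The bit positions (LMCE_EN as bit 0 of
IA32_MCG_EXT_CTL, MCE as bit 6 of CR4) follow the Intel SDM; the struct and
function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Bit positions per the Intel SDM. */
#define MCG_EXT_CTL_LMCE_EN (1ULL << 0)  /* IA32_MCG_EXT_CTL: LMCE opt-in */
#define X86_CR4_MCE         (1ULL << 6)  /* CR4: machine-check enable */

/* Hypothetical stand-in for the per-thread state involved. */
struct cpu_mce_state {
    uint64_t mcg_ext_ctl;
    uint64_t cr4;
};

/* Clear only the LMCE opt-in; CR4.MCE is deliberately left untouched. */
static void clear_lmce_optin(struct cpu_mce_state *s)
{
    s->mcg_ext_ctl &= ~MCG_EXT_CTL_LMCE_EN;
}
```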

If an offline-but-not-unplugged CPU gets an MCE, it's usually fatal and will
be broadcast to all CPUs in the system.

Turning off CR4.MCE would not be good: any thread that receives an MCE while
it has CR4.MCE=0 would cause the whole system to reset.
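
The consequence of CR4.MCE=0 can be modelled as a simple decision, shown
here as an illustrative sketch only. The enum and function names are
hypothetical; the behaviour modelled (a machine check arriving with
CR4.MCE clear puts the processor into shutdown, which platforms typically
turn into a reset) is the architectural behaviour described in the Intel
SDM.

```c
#include <stdint.h>

#define X86_CR4_MCE (1ULL << 6)  /* CR4: machine-check enable */

/* Hypothetical outcomes for a machine check arriving on one thread. */
enum mce_outcome {
    MCE_RUNS_HANDLER,   /* #MC is delivered to the OS handler */
    MCE_SHUTDOWN_RESET  /* processor shuts down; system resets */
};

/* Decide what happens when a thread with the given CR4 takes an MCE. */
static enum mce_outcome mce_outcome_for(uint64_t cr4)
{
    return (cr4 & X86_CR4_MCE) ? MCE_RUNS_HANDLER : MCE_SHUTDOWN_RESET;
}
```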

There are other MCE bugs in the offline path that I'm working on; I'll send
a patch update for them.

For example, one such bug is that during CPU_DOWN_PREPARE, mce_disable_cpu()
turns off MCx_CTL().

Machine check banks in the uncore are visible to all logical CPUs, so we
should not clear them. Today, offlining a single CPU disables MCE generation
for the uncore banks as well. I have the fixes brewing in test and should
release them in a couple of weeks or so.

During cpu_offline() we can only clear banks that are strictly thread-local.
We don't have such banks today (though they are coming). Most banks are
either core-scoped or socket-scoped.
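
The scoping rule above can be sketched as a small filter: only banks private
to the thread going offline are candidates for having their control
register cleared. This is an illustration under stated assumptions, not the
pending patch. The scope enum, the function names, and the idea of a
per-bank scope table are all hypothetical, and the kernel does not expose
bank scope this way today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of a machine check bank's visibility. */
enum mc_bank_scope {
    MC_SCOPE_THREAD,  /* private to one logical CPU */
    MC_SCOPE_CORE,    /* shared by sibling threads of a core */
    MC_SCOPE_SOCKET   /* uncore: visible to every logical CPU */
};

/* A bank may be disabled on offline only if no other CPU shares it. */
static bool may_clear_bank_on_offline(enum mc_bank_scope scope)
{
    return scope == MC_SCOPE_THREAD;
}

/*
 * Build a bitmask (bit i = bank i) of the banks that are safe to clear.
 * Assumes nbanks <= 64.
 */
static uint64_t clearable_banks(const enum mc_bank_scope *scopes,
                                unsigned int nbanks)
{
    uint64_t mask = 0;
    unsigned int i;

    for (i = 0; i < nbanks; i++)
        if (may_clear_bank_on_offline(scopes[i]))
            mask |= 1ULL << i;
    return mask;
}
```

With only core-scoped and socket-scoped banks, as is the case today, the
resulting mask is empty and nothing would be cleared on offline.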

Cheers,
Ashok