On Mon, May 28, 2018 at 10:49:23PM +0200, Borislav Petkov wrote:
> On Fri, May 25, 2018 at 02:41:55PM -0700, Tony Luck wrote:
> > @@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
> >  			no_way_out = worst >= MCE_PANIC_SEVERITY;
> >  	} else {
> >  		/*
> > -		 * Local MCE skipped calling mce_reign()
> > -		 * If we found a fatal error, we need to panic here.
> > +		 * If there was a fatal machine check we should have
> > +		 * already called mce_panic earlier in this function.
> > +		 * Since we re-read the banks, we might have found
> > +		 * something new. Check again to see if we found a
> > +		 * fatal error. We call "mce_severity()" again to
> > +		 * make sure we have the right "msg".
> >  		 */
> > -		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> > -			mce_panic("Machine check from unknown source",
> > -				  NULL, NULL);
> > +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> > +			severity = mce_severity(&m, cfg->tolerant, &msg, true);
> > +			mce_panic("Local fatal machine check!", &m, msg);

If this doesn't affect mcelog parsing, would it make sense to change this
from "fatal" to "unrecoverable"? "Fatal" typically screams PCC=1 on x86,
but some of these cases are software recoverable; it is just that the
kernel isn't able to perform the recovery.

>
> Haha, this would still make you look at the code to remember was it
> "fatal local" or "local fatal" the second one. Yeah, there's the "!" but
> still.
>
> How about:
>
>	"Fatal local machine check after banks scan"
>
> or so.
>
> Btw, the code in do_machine_check() has become one helluva spaghetti
> mess. It could use some clean up a bit... :)
>
> --
> Regards/Gruss,
>     Boris.
>
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> --