On Tue, Apr 04, 2017 at 12:24:31PM -0500, Yazen Ghannam wrote: > From: Yazen Ghannam <yazen.ghan...@amd.com> > > We have support for the new SMCA MCA_DE{STAT,ADDR} registers in Linux. So > we've used these registers in place of MCA_{STATUS,ADDR} on SMCA systems. > However, the guidance for current implementations of SMCA is to continue > using MCA_{STATUS,ADDR} and to use MCA_DE{STAT,ADDR} only if a Deferred > error was not found in the former registers. This also means we shouldn't > clear MCA_CONFIG[LogDeferredInMcaStat].
Does that mean we should trust what BIOS has set it to? > > Redo the AMD Deferred error interrupt handler to follow the guidance for > current SMCA systems. Also, don't break after finding the first error. > > Rework __log_error() for clarity. > > Don't clear MCA_CONFIG[LogDeferredInMcaStat] during AMD mcheck init. > > Signed-off-by: Yazen Ghannam <yazen.ghan...@amd.com> > --- ... > @@ -832,25 +821,29 @@ asmlinkage __visible void __irq_entry > smp_trace_deferred_error_interrupt(void) > exiting_ack_irq(); > } > > +static inline bool is_deferred_error(u64 status) > +{ > + return ((status & MCI_STATUS_VAL) && (status & MCI_STATUS_DEFERRED)); > +} > + > /* APIC interrupt handler for deferred errors */ > static void amd_deferred_error_interrupt(void) > { > unsigned int bank; > - u32 msr_status; > u64 status; > > for (bank = 0; bank < mca_cfg.banks; ++bank) { > - msr_status = (mce_flags.smca) ? MSR_AMD64_SMCA_MCx_DESTAT(bank) > - : msr_ops.status(bank); > + rdmsrl(msr_ops.status(bank), status); > > - rdmsrl(msr_status, status); > + if (is_deferred_error(status)) { > + __log_error_deferred(bank, false); > > - if (!(status & MCI_STATUS_VAL) || > - !(status & MCI_STATUS_DEFERRED)) > - continue; > + } else if (mce_flags.smca) { > + rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), status); > > - __log_error(bank, true, false, 0); > - break; > + if (is_deferred_error(status)) > + __log_error_deferred(bank, true); > + } The diff is very hard to parse, lemme paste the whole resulting function after the patch: > /* APIC interrupt handler for deferred errors */ > static void amd_deferred_error_interrupt(void) > { > unsigned int bank; > u64 status; > > for (bank = 0; bank < mca_cfg.banks; ++bank) { > rdmsrl(msr_ops.status(bank), status); > > if (is_deferred_error(status)) { > __log_error_deferred(bank, false); > > } else if (mce_flags.smca) { > rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), status); > > if (is_deferred_error(status)) > __log_error_deferred(bank, true); So this doesn't sound like the guidance: if we should fallback to the DE* versions only if a deferred error wasn't found in the usual regs, the logic should be: static void __log_error_deferred(unsigned int bank) { u32 msr_stat = msr_ops.status(bank); u32 msr_addr = msr_ops.addr(bank); /* * __log_error() needs to return true to say it has logged successfully or false * if it hasn't. */ if (__log_error(bank, msr_stat, msr_addr, 0)) goto out; if (!mce_flags.smca) return; /* * Couldn't log a deferred error from the usual regs, fallback to DE* * variants. */ msr_stat = MSR_AMD64_SMCA_MCx_DESTAT(bank); msr_addr = MSR_AMD64_SMCA_MCx_DEADDR(bank); __log_error(bank, msr_stat, msr_addr, 0); out: /* We should still clear MCA_DESTAT even if we used MCA_STATUS. */ if (mce_flags.smca) wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0); } and this way you don't need to wiggle around that use_smca_destat thing. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.