> -----Original Message----- > From: linux-edac-ow...@vger.kernel.org <linux-edac- > ow...@vger.kernel.org> On Behalf Of Borislav Petkov > Sent: Monday, March 26, 2018 3:35 PM > To: Ghannam, Yazen <yazen.ghan...@amd.com> > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; > tony.l...@intel.com; x...@kernel.org > Subject: Re: [PATCH 2/2] x86/MCE: Always save MCA_{ADDR,MISC,SYND} > register contents > > On Mon, Mar 26, 2018 at 02:15:26PM -0500, Yazen Ghannam wrote: > > From: Yazen Ghannam <yazen.ghan...@amd.com> > > > > The Intel SDM and AMD APM both state that the contents of the > MCA_ADDR > > register should be saved if MCA_STATUS[ADDRV] is set. The same applies > > to MCA_MISC and MCA_SYND (on SMCA systems) and their respective valid > > bits. > > > > However, the Fam17h Processor Programming Reference states > > "Error handlers should save the values in MCA_ADDR, MCA_MISC0, and > > MCA_SYND even if MCA_STATUS[AddrV], MCA_STATUS[MiscV], and > > MCA_STATUS[SyndV] are zero." > > Well, then you can't remove valid bit checks for older families. This > sounds like F17h only. > > If so, it better be abstracted away cleanly and not changing the generic > code. >
Sure, I can do that. But I didn't think it was necessary because it doesn't hurt to read the registers whether or not the valid bits are set. > > > > This is to ensure that all MCA state information is collected even if > > software cannot act upon it (because the valid bits are cleared). > > > > So always save the auxiliary MCA register contents even if the valid > > bits are cleared. This should not affect error processing because > > software should still check the valid bits before using the register > > contents for error processing. > > > > Also, print MCA_{ADDR,MISC,SYND} even if their valid bits are not set. > > Printing from EDAC/mce_amd is included here since we want to do this on > > AMD systems. > > > > Signed-off-by: Yazen Ghannam <yazen.ghan...@amd.com> > > --- > > arch/x86/kernel/cpu/mcheck/mce.c | 23 +++++++---------------- > > arch/x86/kernel/cpu/mcheck/mce_amd.c | 10 +++------- > > drivers/edac/mce_amd.c | 10 +++------- > > 3 files changed, 13 insertions(+), 30 deletions(-) > > > > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c > b/arch/x86/kernel/cpu/mcheck/mce.c > > index 42cf2880d0ed..a556e1cadfbc 100644 > > --- a/arch/x86/kernel/cpu/mcheck/mce.c > > +++ b/arch/x86/kernel/cpu/mcheck/mce.c > > @@ -248,19 +248,14 @@ static void __print_mce(struct mce *m) > > } > > > > pr_emerg(HW_ERR "TSC %llx ", m->tsc); > > - if (m->addr) > > - pr_cont("ADDR %llx ", m->addr); > > - if (m->misc) > > - pr_cont("MISC %llx ", m->misc); > > + pr_cont("ADDR %016llx ", m->addr); > > + pr_cont("MISC %016llx\n", m->misc); > > You simply can't do this - this is generic code, not AMD only. > I can change this if you'd like. I just thought it would be simpler to make the change here since we're just printing the values. Thanks, Yazen