> -----Original Message----- > From: Borislav Petkov [mailto:b...@alien8.de] > Sent: Wednesday, March 22, 2017 5:13 PM > To: Ghannam, Yazen <yazen.ghan...@amd.com> > Cc: linux-e...@vger.kernel.org; Tony Luck <tony.l...@intel.com>; > x...@kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH v2 4/4] x86/mce: Add AMD SMCA support to SRAO > notifier > > On Mon, Mar 20, 2017 at 03:26:54PM -0500, Yazen Ghannam wrote: > > From: Yazen Ghannam <yazen.ghan...@amd.com> > > > > Deferred errors on AMD systems may get an Action Optional severity > > with the goal of being handled by the SRAO notifier block. However, > > the process of determining if an address is usable is different > > between Intel and AMD. So define vendor-specific functions for this. > > > > Also, check for the AO severity before determining if an address is > > usable to possibly save some cycles. > > > > Signed-off-by: Yazen Ghannam <yazen.ghan...@amd.com> > > --- > > Link: > > http://lkml.kernel.org/r/1486760120-60944-3-git-send-email-Yazen.Ghann > > a...@amd.com > > > > v1->v2: > > - New in v2. Based on v1 patch 3. > > - Update SRAO notifier block to handle errors from SMCA systems. > > > > arch/x86/kernel/cpu/mcheck/mce.c | 52 > > ++++++++++++++++++++++++++++++---------- > > 1 file changed, 40 insertions(+), 12 deletions(-) > > > > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c > > b/arch/x86/kernel/cpu/mcheck/mce.c > > index 5e365a2..1a2669d 100644 > > --- a/arch/x86/kernel/cpu/mcheck/mce.c > > +++ b/arch/x86/kernel/cpu/mcheck/mce.c > > @@ -547,19 +547,49 @@ static void mce_report_event(struct pt_regs > *regs) > > * be somewhat complicated (e.g. segment offset would require an > instruction > > * parser). So only support physical addresses up to page granuality for > now. > > */ > > -static int mce_usable_address(struct mce *m) > > +static int mce_usable_address_intel(struct mce *m, unsigned long > > +*pfn) > > So this function is basically saying whether the address is usable but then > it is > also returning it if so. > > And then it is using an I/O argument. Yuck. > > So it sounds to me like this functionality needs redesign: something like > have a > get_usable_address() function (the "mce_" prefix is not really needed as it is > static) which returns an invalid value when it determines that it doesn't > have a > usable address and the address itself if it succeeds. >
Okay, I'll redo it. > > { > > - if (!(m->status & MCI_STATUS_MISCV) || !(m->status & > MCI_STATUS_ADDRV)) > > + if (!(m->status & MCI_STATUS_MISCV)) > > return 0; > > - > > - /* Checks after this one are Intel-specific: */ > > - if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) > > - return 1; > > - > > if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT) > > return 0; > > if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS) > > return 0; > > + > > + *pfn = m->addr >> PAGE_SHIFT; > > + return 1; > > +} > > + > > +/* Only support this on SMCA systems and errors logged from a UMC. */ > > +static int mce_usable_address_amd(struct mce *m, unsigned long *pfn) > > +{ > > + u8 umc; > > + u16 nid = cpu_to_node(m->extcpu); > > + u64 addr; > > + > > + if (!mce_flags.smca) > > + return 0; > > So on !SMCA systems there'll be no usable address ever! Even with > MCI_STATUS_ADDRV set. > > Please *test* your stuff on all affected hardware before submitting. > I was thinking of this for use with the SRAO notifier. But since this can be used in other places then the SMCA check should be grouped with the code below. > > + > > + umc = find_umc_channel(m); > > + > > + if (umc < 0 || umc_normaddr_to_sysaddr(m->addr, nid, umc, &addr)) > > + return 0; > > + > > + *pfn = addr >> PAGE_SHIFT; > > + return 1; > > +} > > + > > +static int mce_usable_address(struct mce *m, unsigned long *pfn) { > > + if (!(m->status & MCI_STATUS_ADDRV)) > > + return 0; > > What happened to the MCI_STATUS_MISCV bit check? > It was moved into the Intel function. It's not necessary on AMD. > > + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) > > + return mce_usable_address_intel(m, pfn); > > + > > + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) > > + return mce_usable_address_amd(m, pfn); > > + > > return 1; > > We definitely don't want to say that the address is usable on a third vendor. > It > would be most likely a lie even if we never reach this code on a third vendor. > Okay. Thanks, Yazen