> -----Original Message-----
> From: Borislav Petkov [mailto:b...@alien8.de]
> Sent: Wednesday, March 22, 2017 5:13 PM
> To: Ghannam, Yazen <yazen.ghan...@amd.com>
> Cc: linux-e...@vger.kernel.org; Tony Luck <tony.l...@intel.com>;
> x...@kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2 4/4] x86/mce: Add AMD SMCA support to SRAO
> notifier
> 
> On Mon, Mar 20, 2017 at 03:26:54PM -0500, Yazen Ghannam wrote:
> > From: Yazen Ghannam <yazen.ghan...@amd.com>
> >
> > Deferred errors on AMD systems may get an Action Optional severity
> > with the goal of being handled by the SRAO notifier block. However,
> > the process of determining if an address is usable is different
> > between Intel and AMD. So define vendor-specific functions for this.
> >
> > Also, check for the AO severity before determining if an address is
> > usable to possibly save some cycles.
> >
> > Signed-off-by: Yazen Ghannam <yazen.ghan...@amd.com>
> > ---
> > Link: http://lkml.kernel.org/r/1486760120-60944-3-git-send-email-Yazen.Ghanna...@amd.com
> >
> > v1->v2:
> > - New in v2. Based on v1 patch 3.
> > - Update SRAO notifier block to handle errors from SMCA systems.
> >
> >  arch/x86/kernel/cpu/mcheck/mce.c | 52 ++++++++++++++++++++++++++++++----------
> >  1 file changed, 40 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> > index 5e365a2..1a2669d 100644
> > --- a/arch/x86/kernel/cpu/mcheck/mce.c
> > +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> > @@ -547,19 +547,49 @@ static void mce_report_event(struct pt_regs *regs)
> >   * be somewhat complicated (e.g. segment offset would require an instruction
> >   * parser). So only support physical addresses up to page granuality for now.
> >   */
> > -static int mce_usable_address(struct mce *m)
> > +static int mce_usable_address_intel(struct mce *m, unsigned long *pfn)
> 
> So this function is basically saying whether the address is usable but
> then it is also returning it if so.
> 
> And then it is using an I/O argument. Yuck.
> 
> So it sounds to me like this functionality needs redesign: something like
> have a get_usable_address() function (the "mce_" prefix is not really
> needed as it is static) which returns an invalid value when it determines
> that it doesn't have a usable address and the address itself if it
> succeeds.
> 

Okay, I'll redo it.
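
For the record, here's a rough userspace mock of the shape you're
describing, so we agree on the interface before I respin. Completely
untested against real hardware; struct mce, the status bits and the
MCi_MISC field helpers are stubbed out here just so it compiles
standalone, and INVALID_PADDR is my placeholder name for the sentinel:

```c
/*
 * Userspace sketch of get_usable_address(): return the physical address
 * itself on success, or a sentinel when there is no usable address,
 * instead of an I/O argument. All types/macros below are stand-ins for
 * the real definitions in mce.h.
 */
#include <stdint.h>

#define INVALID_PADDR	((uint64_t)-1)	/* placeholder sentinel */

#define PAGE_SHIFT	12
#define STATUS_MISCV	(1ULL << 59)	/* mock of MCI_STATUS_MISCV */
#define STATUS_ADDRV	(1ULL << 58)	/* mock of MCI_STATUS_ADDRV */
#define ADDR_MODE_PHYS	2		/* mock of MCI_MISC_ADDR_PHYS */

/* Stand-in for struct mce; only the fields used here. */
struct mce_stub {
	uint64_t status;
	uint64_t misc;
	uint64_t addr;
};

/* Stubs for the MCI_MISC_ADDR_LSB()/MCI_MISC_ADDR_MODE() accessors. */
static unsigned int misc_addr_lsb(uint64_t misc)  { return misc & 0x3f; }
static unsigned int misc_addr_mode(uint64_t misc) { return (misc >> 6) & 0x7; }

/* Returns the usable physical address, or INVALID_PADDR if there is none. */
static uint64_t get_usable_address(struct mce_stub *m)
{
	if (!(m->status & STATUS_MISCV) || !(m->status & STATUS_ADDRV))
		return INVALID_PADDR;
	if (misc_addr_lsb(m->misc) > PAGE_SHIFT)
		return INVALID_PADDR;
	if (misc_addr_mode(m->misc) != ADDR_MODE_PHYS)
		return INVALID_PADDR;

	return m->addr;
}
```

The caller then just compares against the sentinel instead of passing a
pfn pointer down.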

> >  {
> > -   if (!(m->status & MCI_STATUS_MISCV) || !(m->status & MCI_STATUS_ADDRV))
> > +   if (!(m->status & MCI_STATUS_MISCV))
> >             return 0;
> > -
> > -   /* Checks after this one are Intel-specific: */
> > -   if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> > -           return 1;
> > -
> >     if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT)
> >             return 0;
> >     if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS)
> >             return 0;
> > +
> > +   *pfn = m->addr >> PAGE_SHIFT;
> > +   return 1;
> > +}
> > +
> > +/* Only support this on SMCA systems and errors logged from a UMC. */
> > +static int mce_usable_address_amd(struct mce *m, unsigned long *pfn)
> > +{
> > +   u8 umc;
> > +   u16 nid = cpu_to_node(m->extcpu);
> > +   u64 addr;
> > +
> > +   if (!mce_flags.smca)
> > +           return 0;
> 
> So on !SMCA systems there'll be no usable address ever! Even with
> MCI_STATUS_ADDRV set.
> 
> Please *test* your stuff on all affected hardware before submitting.
> 

I was thinking of this for use with the SRAO notifier. But since this can
be used in other places, the SMCA check should be grouped with the code
below.

> > +
> > +   umc = find_umc_channel(m);
> > +
> > +   if (umc < 0 || umc_normaddr_to_sysaddr(m->addr, nid, umc, &addr))
> > +           return 0;
> > +
> > +   *pfn = addr >> PAGE_SHIFT;
> > +   return 1;
> > +}
> > +
> > +static int mce_usable_address(struct mce *m, unsigned long *pfn)
> > +{
> > +   if (!(m->status & MCI_STATUS_ADDRV))
> > +           return 0;
> 
> What happened to the MCI_STATUS_MISCV bit check?
> 

It was moved into the Intel function. It's not necessary on AMD.

> > +   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> > +           return mce_usable_address_intel(m, pfn);
> > +
> > +   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
> > +           return mce_usable_address_amd(m, pfn);
> > +
> >     return 1;
> 
We definitely don't want to say that the address is usable on a third
vendor. It would be most likely a lie even if we never reach this code on
a third vendor.
> 

Okay.
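
So the dispatch would default to "no usable address" for anything that
isn't Intel or AMD. Something like this, in the same userspace mock style
as before (untested sketch, all names and types are stand-ins for the
real mce.c definitions, and the vendor helpers are stubbed to just pass
the address through):

```c
/*
 * Sketch of the vendor dispatch with the agreed default: an unknown
 * vendor never reports a usable address. The real helpers do the
 * MISCV/MCi_MISC checks (Intel) and the UMC normalized-to-system
 * address translation (AMD).
 */
#include <stdint.h>

#define INVALID_PADDR	((uint64_t)-1)	/* placeholder sentinel */
#define STATUS_ADDRV	(1ULL << 58)	/* mock of MCI_STATUS_ADDRV */

/* Stand-in for boot_cpu_data.x86_vendor. */
enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Stand-in for struct mce; only the fields used here. */
struct mce_stub {
	uint64_t status;
	uint64_t addr;
	enum vendor vendor;
};

/* Stubbed vendor helpers: pretend both accept the raw address as-is. */
static uint64_t usable_address_intel(struct mce_stub *m) { return m->addr; }
static uint64_t usable_address_amd(struct mce_stub *m)   { return m->addr; }

static uint64_t get_usable_address(struct mce_stub *m)
{
	if (!(m->status & STATUS_ADDRV))
		return INVALID_PADDR;

	switch (m->vendor) {
	case VENDOR_INTEL:
		return usable_address_intel(m);
	case VENDOR_AMD:
		return usable_address_amd(m);
	default:
		/* Never claim a usable address for a third vendor. */
		return INVALID_PADDR;
	}
}
```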

Thanks,
Yazen
