Re: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-21 Thread Borislav Petkov
On Thu, Aug 16, 2018 at 07:00:43PM +, Ghannam, Yazen wrote:
> So I think we should keep the NULL pointer checks for now to keep this fix
> small. I can make a new patch following your suggestion above.
> 
> We can change the code so that we create the data structures during the
> earlier init process, but I think this will be a much bigger change. This 
> could
> fall under the idea of decoupling the handling code from sysfs.

Ok, then pls add the simpler fixes at the beginning of the series so
that they can go to stable and have the more involved changes follow
them.

Thx.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


RE: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-16 Thread Ghannam, Yazen
> -Original Message-
> From: linux-edac-ow...@vger.kernel.org 
> On Behalf Of Ghannam, Yazen
> Sent: Thursday, August 9, 2018 1:18 PM
> To: Borislav Petkov 
> Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> tony.l...@intel.com; x...@kernel.org
> Subject: RE: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR
> interrupt handler
> 
> > -Original Message-
> > From: Borislav Petkov 
> > Sent: Thursday, August 9, 2018 11:16 AM
> > To: Ghannam, Yazen 
> > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> > tony.l...@intel.com; x...@kernel.org
> > Subject: Re: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR
> > interrupt handler
> >
> > On Thu, Aug 09, 2018 at 09:08:33AM -0500, Yazen Ghannam wrote:
> > > From: Yazen Ghannam 
> > >
> > > If threshold_init_device() fails then per_cpu(threshold_banks) will be
> > > deallocated. The thresholding interrupt handler will still be active, so
> >
> > So fix the code so that *that* doesn't happen instead of adding checks
> > to the interrupt handler.
> >
> > I.e.,
> >
> > if (err) {
> > mce_threshold_vector = default_threshold_interrupt;
> > return err;
> > }
> >
> 
> Okay. I'll make that change.
> 

I don't think this is enough. We have a gap between when the interrupt
handler is set up during boot in __mcheck_cpu_init_vendor() and when all
the data structures are created during threshold_init_device().

So I think we should keep the NULL pointer checks for now to keep this fix
small. I can make a new patch following your suggestion above.

We can change the code so that we create the data structures during the
earlier init process, but I think this will be a much bigger change. This could
fall under the idea of decoupling the handling code from sysfs.

Thanks,
Yazen


RE: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-09 Thread Ghannam, Yazen
> -Original Message-
> From: Borislav Petkov 
> Sent: Thursday, August 9, 2018 11:16 AM
> To: Ghannam, Yazen 
> Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> tony.l...@intel.com; x...@kernel.org
> Subject: Re: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR
> interrupt handler
> 
> On Thu, Aug 09, 2018 at 09:08:33AM -0500, Yazen Ghannam wrote:
> > From: Yazen Ghannam 
> >
> > If threshold_init_device() fails then per_cpu(threshold_banks) will be
> > deallocated. The thresholding interrupt handler will still be active, so
> 
> So fix the code so that *that* doesn't happen instead of adding checks
> to the interrupt handler.
> 
> I.e.,
> 
>   if (err) {
>   mce_threshold_vector = default_threshold_interrupt;
>   return err;
>   }
> 

Okay. I'll make that change.

Thanks,
Yazen


Re: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-09 Thread Borislav Petkov
On Thu, Aug 09, 2018 at 09:08:33AM -0500, Yazen Ghannam wrote:
> From: Yazen Ghannam 
> 
> If threshold_init_device() fails then per_cpu(threshold_banks) will be
> deallocated. The thresholding interrupt handler will still be active, so

So fix the code so that *that* doesn't happen instead of adding checks
to the interrupt handler.

I.e.,

if (err) {
mce_threshold_vector = default_threshold_interrupt;
return err;
}

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


[PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-09 Thread Yazen Ghannam
From: Yazen Ghannam 

If threshold_init_device() fails then per_cpu(threshold_banks) will be
deallocated. The thresholding interrupt handler will still be active, so
it's possible to get a NULL pointer dereference if a THR interrupt
happens and any of the structures are NULL.

Exit the handler if per_cpu(threshold_banks) is NULL and skip NULL
banks. MCA error information will still be in the registers. The
information will be logged during polling or in another MCA exception or
interrupt handler.

Fixes: 17ef4af0ec0f ("x86/mce/AMD: Use saved threshold block info in interrupt 
handler")
Cc:  # 4.13.x
Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c 
b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index dd33c357548f..2dbf34250bbf 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -934,13 +934,21 @@ static void log_and_reset_block(struct threshold_block 
*block)
 static void amd_threshold_interrupt(void)
 {
struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
+   struct threshold_bank *th_bank = NULL;
unsigned int bank, cpu = smp_processor_id();
 
+   if (!per_cpu(threshold_banks, cpu))
+   return;
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
 
-   first_block = per_cpu(threshold_banks, cpu)[bank]->blocks;
+   th_bank = per_cpu(threshold_banks, cpu)[bank];
+   if (!th_bank)
+   continue;
+
+   first_block = th_bank->blocks;
if (!first_block)
continue;
 
-- 
2.17.1