> -----Original Message-----
> From: linux-edac-ow...@vger.kernel.org <linux-edac-ow...@vger.kernel.org>
> On Behalf Of Ghannam, Yazen
> Sent: Thursday, August 9, 2018 1:18 PM
> To: Borislav Petkov <b...@alien8.de>
> Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> tony.l...@intel.com; x...@kernel.org
> Subject: RE: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR
> interrupt handler
>
> > -----Original Message-----
> > From: Borislav Petkov <b...@alien8.de>
> > Sent: Thursday, August 9, 2018 11:16 AM
> > To: Ghannam, Yazen <yazen.ghan...@amd.com>
> > Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> > tony.l...@intel.com; x...@kernel.org
> > Subject: Re: [PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR
> > interrupt handler
> >
> > On Thu, Aug 09, 2018 at 09:08:33AM -0500, Yazen Ghannam wrote:
> > > From: Yazen Ghannam <yazen.ghan...@amd.com>
> > >
> > > If threshold_init_device() fails then per_cpu(threshold_banks) will be
> > > deallocated. The thresholding interrupt handler will still be active, so
> >
> > So fix the code so that *that* doesn't happen instead of adding checks
> > to the interrupt handler.
> >
> > I.e.,
> >
> > 	if (err) {
> > 		mce_threshold_vector = default_threshold_interrupt;
> > 		return err;
> > 	}
> >
>
> Okay. I'll make that change.
>

I don't think this is enough. We have a gap between when the interrupt
handler is set up during boot in __mcheck_cpu_init_vendor() and when all
the data structures are created during threshold_init_device().

So I think we should keep the NULL pointer checks for now to keep this
fix small. I can make a new patch following your suggestion above.

We can change the code so that we create the data structures during the
earlier init process, but I think this will be a much bigger change. This
could fall under the idea of decoupling the handling code from sysfs.

Thanks,
Yazen
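
To illustrate the ordering problem described above, here is a rough
userspace sketch, not kernel code. It assumes a single global pointer
standing in for per_cpu(threshold_banks) and a function pointer standing
in for mce_threshold_vector; names such as early_init(), late_init(),
struct bank and NR_BANKS are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdlib.h>

#define NR_BANKS 4			/* hypothetical bank count */

struct bank { int errors; };		/* stand-in for a threshold bank */

static struct bank **banks;		/* stand-in for per_cpu(threshold_banks) */
static int handled, skipped;		/* counters so the behavior is observable */

/* Stand-in for default_threshold_interrupt(): touches no bank data. */
static void default_interrupt(void)
{
	skipped++;
}

/* Handler with NULL checks: safe to run before or after late_init(). */
static void threshold_interrupt(void)
{
	int i;

	if (!banks) {			/* data structures not created (yet) */
		skipped++;
		return;
	}
	for (i = 0; i < NR_BANKS; i++) {
		if (!banks[i])
			continue;
		banks[i]->errors++;
		handled++;
	}
}

static void (*threshold_vector)(void) = default_interrupt;

/* Early boot: the handler goes live before any bank data exists. */
static void early_init(void)
{
	threshold_vector = threshold_interrupt;
}

/* Later init: create the data; on failure, free it and restore the default. */
static int late_init(int simulate_failure)
{
	struct bank **b = calloc(NR_BANKS, sizeof(*b));
	int i;

	if (!b)
		goto fail;
	for (i = 0; i < NR_BANKS; i++) {
		b[i] = calloc(1, sizeof(*b[i]));
		if (!b[i])
			goto fail;
	}
	if (simulate_failure)
		goto fail;
	banks = b;
	return 0;

fail:
	if (b) {
		for (i = 0; i < NR_BANKS; i++)
			free(b[i]);	/* free(NULL) is a no-op */
		free(b);
	}
	banks = NULL;
	threshold_vector = default_interrupt;
	return -1;
}
```

Restoring the default handler in the failure path covers the case where
late_init() fails, but an interrupt arriving between early_init() and
late_init() still reaches threshold_interrupt() with banks == NULL, which
is the window the NULL checks are meant to cover.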