> -----Original Message-----
> From: Borislav Petkov <b...@alien8.de>
> Sent: Friday, May 17, 2019 3:02 PM
> To: Ghannam, Yazen <yazen.ghan...@amd.com>
> Cc: Luck, Tony <tony.l...@intel.com>; linux-e...@vger.kernel.org;
> linux-kernel@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in
> hardware
>
>
> On Fri, May 17, 2019 at 07:49:10PM +0000, Ghannam, Yazen wrote:
> > > @@ -1569,7 +1575,13 @@ static void __mcheck_cpu_init_clear_banks(void)
> > >
> > >  		if (!b->init)
> > >  			continue;
> > > +
> > > +		/* Check if any bits are implemented in h/w */
> > >  		wrmsrl(msr_ops.ctl(i), b->ctl);
> > > +		rdmsrl(msr_ops.ctl(i), msrval);
> > > +
> > > +		b->init = !!msrval;
> > > +
> > Just a minor nit, but can we group the comment, RDMSR, and check
> > together? The WRMSR is part of normal operation and isn't tied to the
> > check.
>
> Of course it is - that's the "throw all 1s at it" part :)
>

I did a bit more testing and I noticed that writing "0" disables a bank with
no way to reenable it.

For example:
1) Read bank10.
	a) Succeeds; returns "fffffffffffffff".
2) Write "0" to bank10.
	a) Succeeds; hardware register is set to "0".
	b) Hardware register is checked, and b->init=0.
3) Read bank10.
	a) Fails, because b->init=0.
4) Write non-zero value to bank10 to reenable it.
	a) Fails, because b->init=0.
5) Reboot needed to reset bank.

Is that okay?

Thanks,
Yazen