On Thu, Oct 17, 2013 at 05:37:22PM +0530, Naveen N. Rao wrote:
> That's me raising both my hands :)
:-)
> If you feel so strongly about it. "Corrected Error" is an oxymoron.
> It's really just the hardware notifying us.
Yeah, but we can't write
"We just corrected a single-bit flip in DIMM array
On 10/16/2013 12:53 AM, Borislav Petkov wrote:
> On Wed, Oct 16, 2013 at 12:40:40AM +0530, Naveen N. Rao wrote:
> > +2 ;)
> You're counting for 2 people, huh?
That's me raising both my hands :)
> :-)
> > While at it, I wonder if we're better off calling these "Hardware
> > events" rather than "Hardware e
On Wed, Oct 16, 2013 at 12:40:40AM +0530, Naveen N. Rao wrote:
> +2 ;)
You're counting for 2 people, huh?
:-)
> While at it, I wonder if we're better off calling these "Hardware
> events" rather than "Hardware errors".
Oh, please no. That's that euphemistic lying which serves no one. And
here's
On 2013/10/15 09:15AM, Tony Luck wrote:
> On Tue, Oct 15, 2013 at 2:28 AM, Borislav Petkov wrote:
> > We can even add a hint for the user like:
> >
> > "Above errors have been corrected by the hardware and require no
> > further action."
> >
> > Btw, this is valid for both dmesg and trace
On Tue, Oct 15, 2013 at 2:28 AM, Borislav Petkov wrote:
> We can even add a hint for the user like:
>
> "Above errors have been corrected by the hardware and require no
> further action."
>
> Btw, this is valid for both dmesg and trace event output.
>
> Because from my experience so far p
On Tue, Oct 15, 2013 at 12:07:31AM -0400, Chen Gong wrote:
> Some errors have multiple sub sections like below:
>
> [ 1442.070522] {2}[Hardware Error]: Hardware error from APEI Generic Hardware
> Error Source: 0
> [ 1442.070528] {2}[Hardware Error]: event severity: corrected
> [ 1442.070531] {2}[
On Mon, Oct 14, 2013 at 12:55:33PM +0200, Borislav Petkov wrote:
> Date: Mon, 14 Oct 2013 12:55:33 +0200
> From: Borislav Petkov
> To: Chen Gong
> Cc: tony.l...@intel.com, linux-kernel@vger.kernel.org,
> linux-a...@vger.kernel.org
> Subject: Re: Extended H/W error log driver
On Mon, Oct 14, 2013 at 02:49:40AM -0400, Chen Gong wrote:
> On Fri, Oct 11, 2013 at 10:04:27AM +0200, Borislav Petkov wrote:
> > > [56005.786154] {4}Hardware error detected on CPU0
> > > [56005.786159] {4}event severity: corrected
> > > [56005.786162] {4}sub_event[0], severity: corrected
> >
> >
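For readers who want to post-process the dmesg excerpts quoted in this thread, here is a minimal Python sketch that splits one such line into its parts. The line layout (a bracketed timestamp, a record number in braces, an optional "[Hardware Error]: " prefix, then the message) is inferred only from the excerpts quoted here and is an assumption, not a documented format. The names LINE_RE and parse_line are hypothetical and do not come from the patch series.

```python
import re

# Assumed layout, inferred from the excerpts quoted in this thread:
#   [<timestamp>] {<record number>}<optional "[Hardware Error]: ">message
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"      # kernel timestamp, e.g. [56005.786159]
    r"\{(?P<seq>\d+)\}"                  # record number, e.g. {4}
    r"(?:\[Hardware Error\]:\s*)?"       # prefix seen in some of the excerpts
    r"(?P<msg>.*)$"                      # remainder of the line
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    msg = m.group("msg").strip()
    record = {
        "timestamp": float(m.group("ts")),
        "seq": int(m.group("seq")),
        "message": msg,
    }
    # Lines such as "event severity: corrected" carry a key/value pair;
    # lines without ": " are kept as a plain message.
    key, sep, value = msg.partition(": ")
    if sep:
        record["key"] = key
        record["value"] = value
    return record


if __name__ == "__main__":
    print(parse_line("[56005.786159] {4}event severity: corrected"))
```

Grouping the parsed records by their record number would then collect all lines that belong to one reported event, assuming the number in braces identifies the event as it appears to in the excerpts.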
On Fri, Oct 11, 2013 at 10:04:27AM +0200, Borislav Petkov wrote:
> Date: Fri, 11 Oct 2013 10:04:27 +0200
> From: Borislav Petkov
> To: "Chen, Gong"
> Cc: tony.l...@intel.com, linux-kernel@vger.kernel.org,
> linux-a...@vger.kernel.org
> Subject: Re: Extended H/W e
On Fri, Oct 11, 2013 at 02:54:13PM +, Luck, Tony wrote:
> It's such a simple goal - I can't believe it took this long to get
> here :-)
Right, I'd guess some standards body needed to be persuaded :-)
> > Btw, what's "Memriser1"?
>
> Each memory controller on this machine routes to a plug-in
>> [56005.785981] {3}physical_address: 0x000851fe
>> [56005.786027] {3}DIMM location: Memriser1 CHANNEL A DIMM 0
>
> Very good guys, I've been waiting for years for this to be possible,
> good job! :-)
It's such a simple goal - I can't believe it took this long to get here :-)
> Btw, what
On Fri, Oct 11, 2013 at 02:32:38AM -0400, Chen, Gong wrote:
> [56005.785917] {3}Hardware error detected on CPU0
> [56005.785959] {3}event severity: corrected
> [56005.785975] {3}sub_event[0], severity: corrected
> [56005.785977] {3}section_type: memory error
> [56005.785981] {3}physical_address: 0x
On Fri, 2013-10-11 at 02:32 -0400, Chen, Gong wrote:
> This patch series adds an enhanced MCA event logging driver provided by Intel.
[]
> dmesg output:
>
> [56005.785917] {3}Hardware error detected on CPU0
> [56005.785959] {3}event severity: corrected
> [56005.785975] {3}sub_event[0], severity: c