RE: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-10 Thread Luck, Tony
> Considering how this situation is supposed to almost never happen and
> how we're actually interested in the first line of the whole splat I
> pasted, how much output comes after it, doesn't really matter. All it
> matters is that the machine stops any further progress (as much as we
> can do

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-10 Thread Borislav Petkov
On Thu, Sep 10, 2020 at 11:42:06AM -0700, Luck, Tony wrote:
> With only one call site the rIP isn't super helpful at the moment. But
> once you start selling those "MSR or die" T-shirts everyone will want
> to use this :-)
:-)))
> Do we need the stack trace twice? Once from your fixup
>

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-10 Thread Luck, Tony
On Thu, Sep 10, 2020 at 08:29:19PM +0200, Borislav Petkov wrote:
> Ok, with all those changes, I don't think the following nice and juicy
> message can be overlooked:
>
> [ 32.267830] mce: MSR access error: RDMSR from 0x1234 at rIP:
> 0x8102ed62 (mce_rdmsrl+0x12/0x50)
With only one

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-10 Thread Borislav Petkov
Ok, with all those changes, I don't think the following nice and juicy
message can be overlooked:
[ 32.267830] mce: MSR access error: RDMSR from 0x1234 at rIP: 0x8102ed62 (mce_rdmsrl+0x12/0x50)
[ 32.267838] Call Trace:
[ 32.267838] <#MC>
[ 32.267838] do_machine_check+0xbd/0x9f0

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-09 Thread Borislav Petkov
On Wed, Sep 09, 2020 at 11:20:51AM -0700, Luck, Tony wrote:
> Do we think there will be other places where we want this
> MSR-or-die behaviour?
MSR-or-die - I like that. That belongs on a T-shirt. :-)
> If there are, then most of this belongs elsewhere from
> arch/x86/kernel/cpu/mce/core.c

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-09 Thread Luck, Tony
On Wed, Sep 09, 2020 at 01:30:22PM +0200, Borislav Petkov wrote:
> I guess something as straightforward as this:
Do we think there will be other places where we want this MSR-or-die
behaviour? If there are, then most of this belongs elsewhere from
arch/x86/kernel/cpu/mce/core.c
> ---
> diff

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-09 Thread Borislav Petkov
I guess something as straightforward as this:
---
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 0ba24dfffdb2..9893caaf2696 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -373,10 +373,27 @@ static int msr_to_offset(u32 msr)

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-08 Thread Borislav Petkov
On Tue, Sep 08, 2020 at 03:07:05PM +, Luck, Tony wrote:
> We can even get a nice diagnostic message since the handler
> has access to "regs". It can print which MSR (regs->cx) and
> where it happened (regs->ip).
>
> Which sounds like you might want a specific ex_handler_rdmsr
> function
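
The diagnostic described in the quoted text (print the MSR from regs->cx and the faulting address from regs->ip) can be sketched in user space roughly as follows. This is only an illustration, not the handler from the patch: struct fake_regs, format_msr_error() and format_msr_error_example() are hypothetical names, and the message text simply mimics the log line quoted elsewhere in this thread. A real handler would run in the kernel against struct pt_regs and then panic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two register fields named in the thread. */
struct fake_regs {
	uint64_t cx;	/* RCX; RDMSR takes the MSR index from its low 32 bits */
	uint64_t ip;	/* address of the instruction that faulted */
};

/* Format the diagnostic into buf and return snprintf()'s result. */
static int format_msr_error(char *buf, size_t len, const struct fake_regs *regs)
{
	return snprintf(buf, len,
			"mce: MSR access error: RDMSR from 0x%x at rIP: 0x%llx",
			(unsigned int)regs->cx, (unsigned long long)regs->ip);
}

/* Usage example: rebuild the first line of the splat quoted in the thread. */
static void format_msr_error_example(void)
{
	struct fake_regs regs = { .cx = 0x1234, .ip = 0x8102ed62 };
	char buf[128];

	format_msr_error(buf, sizeof(buf), &regs);
	assert(strcmp(buf, "mce: MSR access error: RDMSR from 0x1234 at rIP: 0x8102ed62") == 0);
}
```

Keeping the formatting separate from the "stop the machine" step is just a convenience for the sketch; it makes the message easy to check without faulting on anything.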

RE: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-08 Thread Luck, Tony
> Ok, so I think this is what Andy meant last night and PeterZ just
> suggested it too:
>
> We do a:
>
>     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_panic)
>
> which panics straight in the #GP handler and avoids the IRET.
We can even get a nice diagnostic message since the handler has access to

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-08 Thread Borislav Petkov
On Tue, Sep 08, 2020 at 11:46:50AM +0200, Borislav Petkov wrote:
> So, Andy suggested we do a simple .fixup so that when the RDMSR fails,
> in the fixup we panic directly.
Ok, so I think this is what Andy meant last night and PeterZ just
suggested it too:
We do a:

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-08 Thread Borislav Petkov
On Mon, Sep 07, 2020 at 01:06:22PM -0700, Luck, Tony wrote:
> Digging into the history it seems that this rdmsrl_safe() was added for
> a possible bug on a pentiumIII back in 2009 that was eventually closed
> as "unreproducible".
So this is the $ 10^6 question so far: if I can assume that those

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-07 Thread Borislav Petkov
On Mon, Sep 07, 2020 at 01:16:43PM -0700, Andy Lutomirski wrote:
> > +	asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr));
>
> I don't like this. Plain rdmsrl() will at least print a nice error if it
> fails.
I think you read my commit message too quickly :) The point is to
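
For readers unfamiliar with the quoted asm line: RDMSR takes the MSR index in ECX and returns the 64-bit value split across EDX (high half) and EAX (low half), so the two halves have to be glued back together. A minimal user-space sketch of that last step, assuming nothing about how the kernel's EAX_EDX_RET() macro is actually defined; msr_combine() and msr_combine_example() are hypothetical names used only here:

```c
#include <assert.h>
#include <stdint.h>

/* Join the EAX (low) and EDX (high) halves into one 64-bit MSR value. */
static uint64_t msr_combine(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

/* Usage example with made-up register contents. */
static void msr_combine_example(void)
{
	assert(msr_combine(0xdeadbeefu, 0x1u) == 0x1deadbeefULL);
}
```

The cast before the shift matters: shifting a 32-bit value left by 32 would be undefined in C.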

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-07 Thread Andy Lutomirski
On Sun, Sep 6, 2020 at 2:21 PM Borislav Petkov wrote:
>
> Hi,
>
> Ingo and I talked about this thing this morning and tglx has had it on
> his to-fix list too so here's a first attempt at it.
>
> Below is just a brain dump of what we talked about so let's start with
> it and see where it would

Re: [RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-07 Thread Luck, Tony
On Sun, Sep 06, 2020 at 11:21:30PM +0200, Borislav Petkov wrote:
> Hi,
>
> Ingo and I talked about this thing this morning and tglx has had it on
> his to-fix list too so here's a first attempt at it.
>
> Below is just a brain dump of what we talked about so let's start with
> it and see where

[RFC PATCH] x86/mce: Make mce_rdmsrl() do a plain RDMSR only

2020-09-06 Thread Borislav Petkov
Hi,
Ingo and I talked about this thing this morning and tglx has had it on
his to-fix list too so here's a first attempt at it.
Below is just a brain dump of what we talked about so let's start with
it and see where it would take us.
Thx.
---
From: Borislav Petkov
... without any exception