Re: [PATCH v1 2/2] arm64: handle NOTIFY_SEI notification by the APEI driver

2018-06-15 Thread James Morse
Hi gengdongjiu,

On 01/06/18 08:21, gengdongjiu wrote:
> On 2018/6/1 0:51, James Morse wrote:
>> On 31/05/18 12:01, Mark Rutland wrote:
>>> In do_serror() we already handle nmi_{enter,exit}(), so there's no need
>>> for that here.
>>
>> Even better: nmi_enter() has a BUG_ON(in_nmi()).
> 
> There are two places call the arm64_is_fatal_ras_serror():
> 1. do_serror()
> 2. kvm_handle_guest_serror()
> 
> Yes, the do_serror() already handle nmi_{enter,exit}(), so the 
> arm64_is_fatal_ras_serror() does not need to handle it again.

> For the kvm_handle_guest_serror(), it does not handle the nmi_{enter,exit}(),
> so should we add nmi_{enter,exit}() in  kvm_handle_guest_serror() before 
> calling
> arm64_is_fatal_ras_serror()?

kvm_handle_guest_serror() never interrupts anything, so its not an NMI, this is
why the KVM code doesn't do this today. We can take another SError from there
without issue.

But once we call APEI, it needs to prevent re-entrance, so SError has to be
masked. If we're using the estatus-queue (which we should), then we should
always call the notification helper in the same context. If do_serror() is
in_nmi(), then we should fake it up in the appropriate helper.

Given the latest review comments from Borislav, the in_nmi() weirdness in APEI
may all disappear, in which case all that matters is we call APEI with SError
masked.


> > For the NOTIFY_SEA, I do not know why handle_guest_sea() does not handle
> the nmi_{enter,exit}(). James, does we miss it in handle_guest_sea()?

The exception went to the hypervisor, it only interrupted the guest. KVM returns
to process context, and KVM then goes on to call handle_guest_sea().

nmi_enter() is to tell things like printk() that we may have interrupted its
IRQ-masked regions, and it can't take any locks to do the work. We know this is
never true for KVM at this point.


>>> TBH, I don't understand why do_sea() does that conditionally today.
>>> Unless there's some constraint I'm missing, 
>>
>> APEI uses a different fixmap entry and locks when in_nmi(). This was because 
>> we
>> may interrupt the irq-masked region in APEI that was using the regular 
>> memory.
>> (e.g. the 'polled' notification, or something backed by an interrupt.) But,
>> Borislav has spotted other things in here that are broken[0]. I'm working on
>> rolling all that into 'v5' of the in_nmi() rework stuff.
>>
>> We currently get away with this on arm because 'SEA' is the only NMI-like 
>> thing,
>> and it occurs synchronously. The problem cases are all also cases where the
>> kernel text is corrupt, which we can't possibly hope to handle.
>>
>> For NOTIFY_SDEI and NOTIFY_SEI this is the wrong pattern as these are
>> asynchronous. do_serror() has already done most of the work for NOTIFY_SEI, 
>> but
>> we need to use the estatus queue in APEI, which is currently x86 only.

> I think this patch can based on you 'v5' of the in_nmi() rework stuff

I think NOTIFY_SEI needs the estatus-queue, so yes. But the KVM changes mean
this will need doing as a separate series.


>>> I think it would make more
>>> sense to do that regardless of whether the interrupted context had
>>> interrupts enabled. James -- does that make sense to you?
>>>
>>> If you update the prior patch with a stub for !CONFIG_ACPI_APEI_SEI, you
>>> can simplify all of the above additions down to:
>>>
>>> if (!ghes_notify_sei())
>>> return;
>>>
>>> ... which would look a lot nicer.
>>
>> The code that calls ghes_notify_sei() may need calling by KVM too, but its
>> default action to an 'unclaimed' SError will be different.
>> Because of the race between memory_failure() and return-to-userspace, we may
>> need to kick the irq work queue (if we can), as we return from do_serror(). 
>> [1]
>> and [2] provide an example for NOTIFY_SEA. SDEI does this by returning to the
>> kernel through the IRQ handler, (which handles the KVM case too).

> I can kick the IRQ work queue as you do for the NOTIFY_SEA and NOTIFY_SDEI.

(NOTIFY_SDEI has its own tricks: it returns to the kernel through the IRQ VBAR
entry)

I think this needs something like the apei_claim_sea() in that series, probably
with some __ version called from do_serror() to avoid the BUG_ON(in_nmi()).


>> I think this series is unsafe until we can use the estatus queue in APEI. Its
>> also missing the handling for an SError interrupting a KVM guest.
> 
> how about this series is based on your patches that uses the estatus queue in 
> APEI to make it safe?
> 
> when an SError interrupting a KVM guest, it will trap to hypervisor, 
> hypervisor will call
> below software stack to handle it:
> kvm_handle_guest_serror()->arm64_is_fatal_ras_serror()->ghes_notify_sei()
> 
> so it already handles the case that an SError interrupting a KVM guest.

It already calls it at some point... this was called out in the cover-letter
(and maybe commit messages) of the series that added that code: We will need to
do more once this is used as a notification from firmware.

That seri

Re: [PATCH v1 2/2] arm64: handle NOTIFY_SEI notification by the APEI driver

2018-06-01 Thread gengdongjiu
Hi Mark, James,
   Thanks the review.

On 2018/6/1 0:51, James Morse wrote:
> Hi Mark, Dongjiu Geng,
> 
> On 31/05/18 12:01, Mark Rutland wrote:
>> In do_serror() we already handle nmi_{enter,exit}(), so there's no need
>> for that here.
> 
> Even better: nmi_enter() has a BUG_ON(in_nmi()).

There are two places call the arm64_is_fatal_ras_serror():
1. do_serror()
2. kvm_handle_guest_serror()

Yes, the do_serror() already handle nmi_{enter,exit}(), so the 
arm64_is_fatal_ras_serror() does not need to handle it again.
For the kvm_handle_guest_serror(), it does not handle the nmi_{enter,exit}(), 
so should we add nmi_{enter,exit}() in  kvm_handle_guest_serror() before 
calling arm64_is_fatal_ras_serror()?

For the NOTIFY_SEA, I do not know why handle_guest_sea() does not handle the 
nmi_{enter,exit}(). James, does we miss it in handle_guest_sea()?

> 
> 
>> TBH, I don't understand why do_sea() does that conditionally today.
>> Unless there's some constraint I'm missing, 
> 
> APEI uses a different fixmap entry and locks when in_nmi(). This was because 
> we
> may interrupt the irq-masked region in APEI that was using the regular memory.
> (e.g. the 'polled' notification, or something backed by an interrupt.) But,
> Borislav has spotted other things in here that are broken[0]. I'm working on
> rolling all that into 'v5' of the in_nmi() rework stuff.
> 
> We currently get away with this on arm because 'SEA' is the only NMI-like 
> thing,
> and it occurs synchronously. The problem cases are all also cases where the
> kernel text is corrupt, which we can't possibly hope to handle.
> 
> For NOTIFY_SDEI and NOTIFY_SEI this is the wrong pattern as these are
> asynchronous. do_serror() has already done most of the work for NOTIFY_SEI, 
> but
> we need to use the estatus queue in APEI, which is currently x86 only.
I think this patch can based on you 'v5' of the in_nmi() rework stuff

> 
> 
>> I think it would make more
>> sense to do that regardless of whether the interrupted context had
>> interrupts enabled. James -- does that make sense to you?
>>
>> If you update the prior patch with a stub for !CONFIG_ACPI_APEI_SEI, you
>> can simplify all of the above additions down to:
>>
>>  if (!ghes_notify_sei())
>>  return;
>>
>> ... which would look a lot nicer.
> 
> The code that calls ghes_notify_sei() may need calling by KVM too, but its
> default action to an 'unclaimed' SError will be different.
> Because of the race between memory_failure() and return-to-userspace, we may
> need to kick the irq work queue (if we can), as we return from do_serror(). 
> [1]
> and [2] provide an example for NOTIFY_SEA. SDEI does this by returning to the
> kernel through the IRQ handler, (which handles the KVM case too).
I can kick the IRQ work queue as you do for the NOTIFY_SEA and NOTIFY_SDEI.

> 
> 
> I think this series is unsafe until we can use the estatus queue in APEI. Its
> also missing the handling for an SError interrupting a KVM guest.

how about this series is based on your patches that uses the estatus queue in 
APEI to make it safe?

when an SError interrupting a KVM guest, it will trap to hypervisor, hypervisor 
will call
below software stack to handle it:
kvm_handle_guest_serror()->arm64_is_fatal_ras_serror()->ghes_notify_sei()

so it already handles the case that an SError interrupting a KVM guest.

> 
> 
> Thanks,
> 
> James
> 
> [0] https://www.spinics.net/lists/arm-kernel/msg653332.html
> [1] https://www.spinics.net/lists/arm-kernel/msg649237.html
> [2] https://www.spinics.net/lists/arm-kernel/msg649239.html
> 
> .
> 



Re: [PATCH v1 2/2] arm64: handle NOTIFY_SEI notification by the APEI driver

2018-05-31 Thread James Morse
Hi Mark, Dongjiu Geng,

On 31/05/18 12:01, Mark Rutland wrote:
> In do_serror() we already handle nmi_{enter,exit}(), so there's no need
> for that here.

Even better: nmi_enter() has a BUG_ON(in_nmi()).


> TBH, I don't understand why do_sea() does that conditionally today.
> Unless there's some constraint I'm missing, 

APEI uses a different fixmap entry and locks when in_nmi(). This was because we
may interrupt the irq-masked region in APEI that was using the regular memory.
(e.g. the 'polled' notification, or something backed by an interrupt.) But,
Borislav has spotted other things in here that are broken[0]. I'm working on
rolling all that into 'v5' of the in_nmi() rework stuff.

We currently get away with this on arm because 'SEA' is the only NMI-like thing,
and it occurs synchronously. The problem cases are all also cases where the
kernel text is corrupt, which we can't possibly hope to handle.

For NOTIFY_SDEI and NOTIFY_SEI this is the wrong pattern as these are
asynchronous. do_serror() has already done most of the work for NOTIFY_SEI, but
we need to use the estatus queue in APEI, which is currently x86 only.


> I think it would make more
> sense to do that regardless of whether the interrupted context had
> interrupts enabled. James -- does that make sense to you?
> 
> If you update the prior patch with a stub for !CONFIG_ACPI_APEI_SEI, you
> can simplify all of the above additions down to:
> 
>   if (!ghes_notify_sei())
>   return;
> 
> ... which would look a lot nicer.

The code that calls ghes_notify_sei() may need calling by KVM too, but its
default action to an 'unclaimed' SError will be different.
Because of the race between memory_failure() and return-to-userspace, we may
need to kick the irq work queue (if we can), as we return from do_serror(). [1]
and [2] provide an example for NOTIFY_SEA. SDEI does this by returning to the
kernel through the IRQ handler, (which handles the KVM case too).


I think this series is unsafe until we can use the estatus queue in APEI. Its
also missing the handling for an SError interrupting a KVM guest.


Thanks,

James

[0] https://www.spinics.net/lists/arm-kernel/msg653332.html
[1] https://www.spinics.net/lists/arm-kernel/msg649237.html
[2] https://www.spinics.net/lists/arm-kernel/msg649239.html


Re: [PATCH v1 2/2] arm64: handle NOTIFY_SEI notification by the APEI driver

2018-05-31 Thread Mark Rutland
On Thu, May 31, 2018 at 08:41:46PM +0800, Dongjiu Geng wrote:
> When kernel or KVM gets the NOTIFY_SEI notification, it firstly
> calls the APEI driver to handle this notification.
> 
> Signed-off-by: Dongjiu Geng 
> ---
>  arch/arm64/kernel/traps.c | 15 +++
>  1 file changed, 15 insertions(+)
> ---
> change since https://www.spinics.net/lists/kvm/msg168919.html
> 
> 1. Remove the handle_guest_sei() helper
> 
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 8bbdc17..676e40c 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -50,6 +50,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Nit: please place newline before the new include, since it comes from a
different directory (and we do so in fault.c).

>  static const char *handler[]= {
>   "Synchronous Abort",
> @@ -701,6 +702,20 @@ void __noreturn arm64_serror_panic(struct pt_regs *regs, 
> u32 esr)
>  bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr)
>  {
>   u32 aet = arm64_ras_serror_get_severity(esr);
> + int ret = -ENOENT;
> +
> + if (IS_ENABLED(CONFIG_ACPI_APEI_SEI)) {
> + if (interrupts_enabled(regs))
> + nmi_enter();
> +
> + ret = ghes_notify_sei();
> +
> + if (interrupts_enabled(regs))
> + nmi_exit();
> +
> + if (!ret)
> + return false;
> + }

In do_serror() we already handle nmi_{enter,exit}(), so there's no need
for that here.

TBH, I don't understand why do_sea() does that conditionally today.
Unless there's some constraint I'm missing, I think it would make more
sense to do that regardless of whether the interrupted context had
interrupts enabled. James -- does that make sense to you?

If you update the prior patch with a stub for !CONFIG_ACPI_APEI_SEI, you
can simplify all of the above additions down to:

if (!ghes_notify_sei())
return;

... which would look a lot nicer.

Thanks,
Mark.