ppc: Handle NMI guest exit

David Gibson Sun, 08 Oct 2017 16:59:29 -0700

On Sun, Oct 08, 2017 at 02:29:26PM +0530, Aravinda Prasad wrote:
> 
> 
> On Wednesday 04 October 2017 06:59 AM, David Gibson wrote:
> > On Thu, Sep 28, 2017 at 04:08:10PM +0530, Aravinda Prasad wrote:
> >> Memory errors such as bit flips that cannot be corrected
> >> by hardware are passed on to the kernel for handling.
> >> If the memory address in error belongs to a guest, then
> >> the guest kernel is responsible for taking suitable action.
> >> Patch [1] enhances KVM to exit the guest with the exit reason
> >> set to KVM_EXIT_NMI in such cases.
> >>
> >> This patch handles the KVM_EXIT_NMI exit. If the guest OS
> >> has registered a machine check handling routine by
> >> calling "ibm,nmi-register", then the exit handler builds
> >> the error log and invokes the registered routine; otherwise
> >> it invokes the handler at 0x200.
> >>
> >> Note that FWNMI handles synchronous machine check exceptions
> >> triggered by the hardware, so we do not extend
> >> such support to the "nmi" command available in the QEMU
> >> monitor. The "nmi" command from the monitor will therefore
> >> always go through the 0x200 vector.
> >>
> >> [1] https://www.spinics.net/lists/kvm-ppc/msg12637.html
> >>    (e20bbd3d and related commits)
> > 
> > What does happen on KVM if an asynchronous machine check exception
> > occurs while in the guest?  Or under PowerVM for that matter.
> 
> AFAIK asynchronous errors take a different path in KVM, as they can
> happen in a different process context.


Well, obviously, I'm wondering what impact it will have on the guest,
one way or another.
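
For reference, the dispatch rule from the patch description quoted above can be sketched roughly as follows. This is only an illustration: FwnmiState, MceDispatch and mce_dispatch() are hypothetical names, not QEMU's actual code, and the from_monitor flag is an assumption used to model the monitor "nmi" command always taking the 0x200 vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not QEMU's actual code. */
#define MCE_DEFAULT_VECTOR 0x200u /* vector named in the patch description */

typedef struct {
    bool     fwnmi_registered;  /* guest has called "ibm,nmi-register" */
    uint64_t guest_mce_handler; /* handler address the guest registered */
} FwnmiState;

typedef struct {
    uint64_t entry;     /* address the vCPU should resume at */
    bool     build_log; /* an error log must be built before resuming */
} MceDispatch;

/*
 * Decide where a machine check is delivered.  A monitor-triggered "nmi"
 * always uses the default vector; a hardware-triggered one uses the
 * registered handler when there is one.
 */
static MceDispatch mce_dispatch(const FwnmiState *s, bool from_monitor)
{
    MceDispatch d = { MCE_DEFAULT_VECTOR, false };

    assert(s != NULL);
    if (!from_monitor && s->fwnmi_registered) {
        d.entry = s->guest_mce_handler;
        d.build_log = true;
    }
    return d;
}
```

The sketch only covers the synchronous case; it says nothing about what the guest sees for asynchronous errors, which is the question above.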

[snip]
> >> +ssize_t spapr_get_rtas_size(void)
> >> +{
> >> +    return RTAS_ERRLOG_OFFSET + sizeof(struct rtas_event_log_mce);
> > 
> > Erm.. because of the definition of rtas_event_log_mce, this only
> > allows for 1 byte of extended log buffer.  That doesn't seem right.
> 
> This is directly taken from the kernel's RTAS log (struct rtas_error_log
> in arch/powerpc/include/asm/rtas.h). I am not sure why they use a 1-byte
> extended log buffer.

I think you'd better find out, then.
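
To illustrate the concern: in older C code a one-element trailing array is often just a placeholder for variable-length data, and sizeof() of such a struct does not cover the real payload. Whether that is the intent of the kernel struct is not confirmed in this thread. The layout below (struct demo_errlog, demo_errlog_size()) is hypothetical and is not the real rtas_event_log_mce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only. */
struct demo_errlog {
    uint8_t       summary[4];          /* fixed header bytes */
    uint32_t      extended_log_length; /* length of the data in buffer[] */
    unsigned char buffer[1];           /* placeholder for variable-length data */
};

/*
 * Bytes needed for the fixed header plus ext_len bytes of extended log.
 * sizeof(struct demo_errlog) would only account for one payload byte
 * (plus padding), which is the problem raised above.
 */
static size_t demo_errlog_size(size_t ext_len)
{
    assert(ext_len <= SIZE_MAX - offsetof(struct demo_errlog, buffer));
    return offsetof(struct demo_errlog, buffer) + ext_len;
}
```

Under this assumption, a size computed as an offset plus sizeof() of the struct would reserve room for the header only, so the reserved size has to come from the maximum extended log length instead.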

[snip]
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index 28b6e2e..a75e9cf 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -556,6 +556,9 @@ target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
> >>  #define DIAGNOSTICS_RUN_MODE_IMMEDIATE 2
> >>  #define DIAGNOSTICS_RUN_MODE_PERIODIC  3
> >>  
> >> +/* Offset from rtas-base where error log is placed */
> >> +#define RTAS_ERRLOG_OFFSET       0x200
> > 
> > Is there any particular rationale for this offset?  Our actual RTAS
> > code is 20 bytes, much smaller than this.
> 
> Just to ensure some space in case the RTAS code needs to be extended in
> the future.

Hm, but IIUC, we control both sides here.  qemu puts the log into the
RTAS buffer at a particular offset, and qemu tells the guest where to
find it at a particular offset within the RTAS buffer.

So, if we need to extend the RTAS code (unlikely) we can increase our
offset, and the guest will be none the wiser.
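
A minimal sketch of that argument, assuming the guest is only ever handed the resulting address and never hard-codes the offset. errlog_addr() and errlog_offset_ok() are hypothetical helpers; the 0x200 value and the 20-byte code size are the figures mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RTAS_ERRLOG_OFFSET 0x200u /* value proposed in the patch */

/* Address of the error log, given where the RTAS blob was placed. */
static uint64_t errlog_addr(uint64_t rtas_base, uint64_t errlog_offset)
{
    assert(errlog_offset <= UINT64_MAX - rtas_base); /* no wrap-around */
    return rtas_base + errlog_offset;
}

/* The log must not overlap the RTAS code at the start of the blob. */
static bool errlog_offset_ok(size_t rtas_code_size, size_t errlog_offset)
{
    return errlog_offset >= rtas_code_size;
}
```

With 20 bytes of RTAS code, any offset of 20 or more passes the check, and the guest just receives a different address if the offset changes later.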

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [Qemu-devel] [PATCH v5 4/6] target/ppc: Handle NMI guest exit
