ppc: Handle NMI guest exit

David Gibson Mon, 25 Mar 2019 18:17:01 -0700

On Mon, Mar 25, 2019 at 01:31:12PM +0530, Aravinda Prasad wrote:
> 
> 
> On Monday 25 March 2019 11:52 AM, David Gibson wrote:
> > On Fri, Mar 22, 2019 at 12:03:58PM +0530, Aravinda Prasad wrote:
> >> Memory error such as bit flips that cannot be corrected
> >> by hardware are passed on to the kernel for handling.
> >> If the memory address in error belongs to guest then
> >> the guest kernel is responsible for taking suitable action.
> >> Patch [1] enhances KVM to exit guest with exit reason
> >> set to KVM_EXIT_NMI in such cases. This patch handles
> >> KVM_EXIT_NMI exit.
> >>
> >> [1] https://www.spinics.net/lists/kvm-ppc/msg12637.html
> >>     (e20bbd3d and related commits)
> >>
> >> Signed-off-by: Aravinda Prasad <aravi...@linux.vnet.ibm.com>
> >> ---
> >>  hw/ppc/spapr_events.c  |   22 ++++++++++++++++++++++
> >>  include/hw/ppc/spapr.h |    1 +
> >>  target/ppc/kvm.c       |   16 ++++++++++++++++
> >>  target/ppc/kvm_ppc.h   |    2 ++
> >>  4 files changed, 41 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> >> index ae0f093..e7a24ad 100644
> >> --- a/hw/ppc/spapr_events.c
> >> +++ b/hw/ppc/spapr_events.c
> >> @@ -620,6 +620,28 @@ void 
> >> spapr_hotplug_req_remove_by_count_indexed(SpaprDrcType drc_type,
> >>                              RTAS_LOG_V6_HP_ACTION_REMOVE, drc_type, 
> >> &drc_id);
> >>  }
> >>  
> >> +void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
> >> +{
> >> +    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >> +
> >> +    while (spapr->mc_status != -1) {
> >> +        /*
> >> +         * Check whether the same CPU got machine check error
> >> +         * while still handling the mc error (i.e., before
> >> +         * that CPU called "ibm,nmi-interlock"
> >> +         */
> >> +        if (spapr->mc_status == cpu->vcpu_id) {
> >> +            qemu_system_guest_panicked(NULL);
> >> +        }
> >> +        qemu_cond_wait_iothread(&spapr->mc_delivery_cond);
> >> +        /* If the system is reset meanwhile, then just return */
> >> +        if (spapr->mc_reset) {
> > 
> > I don't really see what this accomplishes.  IIUC mc_reset is true from
> > reset time until nmi-register is called.  Which means you could just
> > check for guest_mnachine_check_addre being unset - in which case don't
> > you need to fallback to the old machine check behaviour anyway?
> 
> We can check for guest_machine_check_addr being unset instead of mc_reset.
> 
> Do we need any kind of memory barriers to ensure that this thread reads
> the updated guest_machine_check_addr/mc_reset?


IIUC these variables are all protected by the global qemu mutex, so
that should include the necessary memory barriers already.

> We don't have to fallback to the old machine check behavior, because
> guest_machine_check_addr is unset only during system reset.

Yes... which means that between reset and re-registering, we should be
using the old machine check behaviour, yes?  Which is exactly this
situation.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v7 3/6] target/ppc: Handle NMI guest exit

Reply via email to