On 2014-06-17 16:23:58 Tue, Paul Mackerras wrote:
> On Wed, Jun 11, 2014 at 02:18:21PM +0530, Mahesh J Salgaonkar wrote:
> > From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
> > 
> > Currently we forward MCEs to guest which have been recovered by guest.
> > And for unhandled errors we do not deliver the MCE to guest. It looks like
> > with no support of FWNMI in qemu, guest just panics whenever we deliver the
> > recovered MCEs to guest. Also, the existing code used to return to host for
> > unhandled errors which was causing guest to hang with soft lockups inside
> > guest and makes it difficult to recover guest instance.
> > 
> > This patch now forwards all fatal MCEs to guest causing guest to
> > crash/panic. And, for recovered errors we just go back to normal
> > functioning of guest instead of returning to host.
> 
> ... having corrupted possibly live values that the guest had in SRR0/1.
> 
> Ideally the guest should have cleared MSR[RI] before putting values in
> SRR0/1, so perhaps you could check that and return to the guest
> without giving it a machine check if MSR[RI] is set.  But if MSR[RI]
> is clear, the guest is unfixably corrupted because the machine check
> overwrote SRR0/1, and the only thing we can do, in the absence of
> FWNMI support, is give the guest a machine check interrupt and let it
> crash.

Yes, agreed. I have a patch (below) ready for this. I will test and verify it
and send it out soon.
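
To make the intended decision easier to review, here is a rough C sketch of
the logic the real-mode handler is meant to follow. It is only an
illustration, not the actual code: the names mce_decide, mce_action and the
"handled" flag are hypothetical, and MSR_RI is assumed to be the 0x2 mask
used by the kernel's reg.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: MSR[RI] is bit 1 (mask 0x2), matching MSR_RI in the kernel's reg.h. */
#define MSR_RI 0x2ULL

/* Hypothetical names, for illustration only. */
enum mce_action {
    MCE_RESUME_GUEST,      /* go back to normal guest execution */
    MCE_DELIVER_TO_GUEST,  /* inject a machine check interrupt into the guest */
};

/*
 * guest_msr: MSR the guest was running with when the machine check hit.
 * handled:   true if the host's real-mode handler recovered the error.
 */
enum mce_action mce_decide(uint64_t guest_msr, bool handled)
{
    /* RI clear: SRR0/1 may have held live guest values, so they are lost. */
    if ((guest_msr & MSR_RI) == 0)
        return MCE_DELIVER_TO_GUEST;

    /* RI set and the error was recovered: resume the guest. */
    if (handled)
        return MCE_RESUME_GUEST;

    /* RI set but the error was not recovered: let the guest see the MCE. */
    return MCE_DELIVER_TO_GUEST;
}

/* Small usage example covering the four input combinations. */
void mce_decide_examples(void)
{
    assert(mce_decide(MSR_RI, true) == MCE_RESUME_GUEST);
    assert(mce_decide(MSR_RI, false) == MCE_DELIVER_TO_GUEST);
    assert(mce_decide(0, true) == MCE_DELIVER_TO_GUEST);
    assert(mce_decide(0, false) == MCE_DELIVER_TO_GUEST);
}
```

The only case that resumes the guest is RI set together with a recovered
error. Every other combination ends in a machine check being delivered.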

Thanks,
-Mahesh.

-------------
Deliver machine check with MSR(RI=0) to guest as MCE

From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>


---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 868347e..c9c56ee 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2257,7 +2257,6 @@ machine_check_realmode:
        mr      r3, r9          /* get vcpu pointer */
        bl      kvmppc_realmode_machine_check
        nop
-       cmpdi   r3, 0           /* Did we handle MCE ? */
        ld      r9, HSTATE_KVM_VCPU(r13)
        li      r12, BOOK3S_INTERRUPT_MACHINE_CHECK
        /*
@@ -2270,13 +2269,18 @@ machine_check_realmode:
         * The old code used to return to host for unhandled errors which
         * was causing guest to hang with soft lockups inside guest and
         * makes it difficult to recover guest instance.
+        *
+        * if we receive machine check with MSR(RI=0) then deliver it to
+        * guest as machine check causing guest to crash.
         */
-       ld      r10, VCPU_PC(r9)
        ld      r11, VCPU_MSR(r9)
+       andi.   r10, r11, MSR_RI        /* check for unrecoverable exception */
+       beq     1f                      /* Deliver a machine check to guest */
+       ld      r10, VCPU_PC(r9)
+       cmpdi   r3, 0           /* Did we handle MCE ? */
        bne     2f      /* Continue guest execution. */
        /* If not, deliver a machine check.  SRR0/1 are already set */
-       li      r10, BOOK3S_INTERRUPT_MACHINE_CHECK
-       ld      r11, VCPU_MSR(r9)
+1:     li      r10, BOOK3S_INTERRUPT_MACHINE_CHECK
        bl      kvmppc_msr_interrupt
 2:     b       fast_interrupt_c_return
 
