When passing through SRIOV VFs to guest, we possibly encounter EEH error on PF. In this case, the VF PEs are put into frozen state. The error could be reported to guest before it's captured by the host. That means the guest could attempt to recover errors on VFs before host gets chance to recover errors on PFs. The VFs won't be recovered successfully.
This enforces the recovery order for above case: the recovery on child PE in guest is hold until the recovery on parent PE in host is completed. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Russell Currey <[email protected]> --- arch/powerpc/kernel/eeh.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index fd9c782..5591f05 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel/eeh.c @@ -1541,6 +1541,17 @@ int eeh_pe_get_state(struct eeh_pe *pe) if (!eeh_ops || !eeh_ops->get_state) return -ENOENT; + /* + * If the parent PE is owned by the host kernel and is undergoing + * error recovery, we should return the PE state as temporarily + * unavailable so that the error recovery on the guest is suspended + * until the recovery completes on the host. + */ + if (pe->parent && + !(pe->state & EEH_PE_REMOVED) && + (pe->parent->state & (EEH_PE_ISOLATED | EEH_PE_RECOVERING))) + return EEH_PE_STATE_UNAVAIL; + result = eeh_ops->get_state(pe, NULL); rst_active = !!(result & EEH_STATE_RESET_ACTIVE); dma_en = !!(result & EEH_STATE_DMA_ENABLED); -- 2.1.0 _______________________________________________ Linuxppc-dev mailing list [email protected] https://lists.ozlabs.org/listinfo/linuxppc-dev
