On Tue, 6 Oct 2020 19:03:28 +0200 Cédric Le Goater <c...@kaod.org> wrote:
> On 10/6/20 6:58 PM, Greg Kurz wrote: > > On Mon, 5 Oct 2020 18:51:44 +0200 > > Cédric Le Goater <c...@kaod.org> wrote: > > > >> StoreEOI on POWER9 CPUs is racy because load-after-store ordering is > >> not enforced. > >> > >> Signed-off-by: Cédric Le Goater <c...@kaod.org> > >> --- > >> hw/ppc/spapr_caps.c | 9 +++++++++ > >> 1 file changed, 9 insertions(+) > >> > >> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c > >> index b0a9d0227db2..9251badbdc27 100644 > >> --- a/hw/ppc/spapr_caps.c > >> +++ b/hw/ppc/spapr_caps.c > >> @@ -549,6 +549,15 @@ static void cap_storeeoi_apply(SpaprMachineState > >> *spapr, uint8_t val, > >> error_setg(errp, "StoreEOI not supported by KVM"); > >> return; > >> } > >> + > >> + /* > >> + * load-after-store ordering is not enforced on POWER9 CPUs > >> + * and StoreEOI can be racy. > >> + */ > >> + if (!ppc_type_check_compat(machine->cpu_type, > >> CPU_POWERPC_LOGICAL_3_10, > >> + 0, spapr->max_compat_pvr)) { > >> + warn_report("StoreEOI on a POWER9 CPU is unsafe on KVM."); > > The error message should mention XIVE KVM device actually since this only depends on kernel-irqchip. > > It all boils down to what "unsafe" really means here... if the outcome is > > "very likely hang the guest" as soon as it starts doing I/O, shouldn't > > we error out instead ? What is the motivation to use StoreEOI if the > > processor doesn't really support it ? > > We use it in the lab on P9. We have never seen it failed even under stress. > But there is a possible race in the logic. > > C. Thinking again. P9 boston systems on the field only use emulated XIVE and we certainly want to be able to migrate to P9 systems that use in-kernel XIVE and vice-versa. So, even if StoreEOI is technically supported by the emulated XIVE, maybe we should disallow it anyway ?