On 11/20/20 6:46 PM, Greg Kurz wrote: > A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2 > RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU > ids, which happen to be numerically equal in general, but are really > different entities that can diverge in some setups. When this happens, > we end up misconfiguring XIVE in a way that is almost fatal for the > guest. > > The confusion comes from XICS which has historically assumed equality > between interrupt server numbers and vCPU ids, as mentionned in a > comment back from 2011 in the linux kernel icp_native_init_one_node() > function: > > /* This code does the theorically broken assumption that the interrupt > * server numbers are the same as the hard CPU numbers. > * This happens to be the case so far but we are playing with fire... > * should be fixed one of these days. -BenH. > */ > > This assumption crept into QEMU through the "ibm,interrupt-server-ranges" > property of the "interrupt-controller" node in the DT. This property > contains ranges of consecutive vCPU ids defined as (first id, # of ids). > In the case of QEMU, we define a single range starting from 0 up to the > highest vCPU id, as returned by spapr_max_server_number(). This has > always been associated to the "nr_servers" wording when naming variables > or function arguments. When XIVE got added, we introduced an sPAPR IRQ > abstraction to be able to control several interrupt controller backends. > The sPAPR IRQ base class provides a dt() handler used to populate the > "interrupt-controller" node in the DT. This handler takes an "nr_server" > argument inherited from XICS and we ended up using it to populate the > "ibm,xive-lisn-ranges" property used with XIVE, which has completely > different semantics. It contain ranges of interrupt numbers that the > guest can use for any purpose. Since one obvious purpose is IPI, its > first range should just be able to accomodate all possible vCPUs.
To clarify, PAPR says it is a requirement : "The first range will contain at least one per possible thread in the partition." The regression showed that we were not initializing correctly this range in QEMU/KVM. I an not even sure it had the correct size either since we were anyhow initializing 4096 IPIs. > In the case of QEMU, we define a single range starting from 0 up > to "nr_server" but we should rather size it to the number of vCPUs > actually (ie. smp.max_cpus). ah. And so spapr_max_server_number(spapr) is crap ? This is starting to be complex to follow :/ > This series aims at getting rid of the "nr_server" argument in the > sPAPR IC handlers. Since both the highest possible vCPU id and the > maximum number of vCPUs are invariants for XICS and XIVE respectively, What XIVE cares about is the number of possible IPIs and the number of vCPUs since we deduced from that the number of event queue descriptors, which is another XIVE structure. > let's make them device properties to be configured by the machine when > it creates the interrupt controllers and use them where needed. > > This doesn't cause any visible change to setups using the default > VSMT machine settings. This changes "ibm,xive-lisn-ranges" for > setups that mess with VSMT, but this is acceptable since linux > only allocates one interrupt per vCPU and the higher part of the > range was never used. This range is only used for vCPUs. C. > [1] > https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807 > > Greg Kurz (8): > spapr/xive: Turn some sanity checks into assertions > spapr/xive: Introduce spapr_xive_nr_ends() > spapr/xive: Add "nr-servers" property > spapr/xive: Add "nr-ipis" property > spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() > spapr/xics: Add "nr-servers" property > spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation > spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation > > include/hw/ppc/spapr.h | 4 +-- > include/hw/ppc/spapr_irq.h | 9 ++--- > include/hw/ppc/spapr_xive.h | 25 ++++++++++++- > include/hw/ppc/xics_spapr.h | 23 +++++++++--- > hw/intc/spapr_xive.c | 72 ++++++++++++++++++++++--------------- > hw/intc/spapr_xive_kvm.c | 4 +-- > hw/intc/xics_kvm.c | 4 +-- > hw/intc/xics_spapr.c | 45 ++++++++++++++--------- > hw/ppc/spapr.c | 7 ++-- > hw/ppc/spapr_irq.c | 27 +++++++------- > 10 files changed, 141 insertions(+), 79 deletions(-) >