On 11/30/20 5:52 PM, Greg Kurz wrote:
> A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> ids, which happen to be numerically equal in general, but are really
> different entities that can diverge in some setups. This was causing
> QEMU to misconfigure XIVE and to crash the guest.
>
> The confusion comes from XICS actually. Interrupt presenters in XICS
> are identified by a "server number" which is a 1:1 mapping to vCPU
> ids. The range of these "server numbers" is exposed to the guest in
> the "ibm,interrupt-server-ranges" property. A xics_max_server_number()
> helper was introduced at some point to compute the upper limit of the
> range. When XIVE was added, commit 1a518e7693c9 renamed the helper to
> spapr_max_server_number(). It ended up being used to size a bunch of
> things in XIVE that are per-vCPU, such as internal END tables or
> IPI ranges presented to the guest. The problem is that the maximum
> "server number" can be much higher (up to 8 times) than the actual
> number of vCPUs when the VSMT mode doesn't match the number of threads
> per core in the guest:
>
>     DIV_ROUND_UP(ms->smp.max_cpus * spapr->vsmt, ms->smp.threads);
>
> Since QEMU 4.2, the default behavior is to set spapr->vsmt to
> ms->smp.threads. Setups with custom VSMT settings will configure XIVE
> to use more HW resources than needed. This is a bit unfortunate but
> not extremely harmful,
Indeed. The default use case (without vsmt) has no impact since it
does not fragment the XIVE VP space more than needed.

> unless maybe if a lot of guests are running on the host.

We can run 4K (-2) KVM guests today on a P9 system. To reach the
internal limits, each would need to have 32 vCPUs. It's possible with
a lot of RAM but it's not a common scenario.

C.

> The sizing of the IPI range is more problematic though
> as it eventually led to [1].
>
> This series first does some renaming to make it clear when we're
> dealing with vCPU ids. It then fixes the machine code to pass
> smp.max_cpus to XIVE where appropriate. Since these changes are
> guest/migration visible, a machine property is added to keep the
> existing behavior for older machine types. The series is thus based
> on Connie's recent patch that introduces compat machines for
> QEMU 6.0.
>
> Based-on: 20201109173928.1001764-1-coh...@redhat.com
>
> Note that we still use spapr_max_vcpu_ids() when activating the
> in-kernel irqchip because this is what both XICS-on-XIVE and XIVE
> KVM devices expect.
>
> [1] https://bugs.launchpad.net/qemu/+bug/1900241
>
> v2: - comments on v1 highlighted that problems mostly come from
>       spapr_max_server_number() which got misused over the years.
>       Updated the cover letter accordingly.
>     - completely new approach. Instead of messing with device properties,
>       pass the appropriate values to the IC backend handlers.
>     - rename a few things using the "max_vcpu_ids" wording instead of
>       "nr_servers" and "max_server_number"
>
> Greg Kurz (3):
>   spapr: Improve naming of some vCPU id related items
>   spapr/xive: Fix size of END table and number of claimed IPIs
>   spapr/xive: Fix the "ibm,xive-lisn-ranges" property
>
>  include/hw/ppc/spapr.h      |  3 ++-
>  include/hw/ppc/spapr_irq.h  | 12 ++++++------
>  include/hw/ppc/spapr_xive.h |  2 +-
>  include/hw/ppc/xics_spapr.h |  2 +-
>  hw/intc/spapr_xive.c        |  9 +++++----
>  hw/intc/spapr_xive_kvm.c    |  4 ++--
>  hw/intc/xics_kvm.c          |  4 ++--
>  hw/intc/xics_spapr.c        | 11 ++++++-----
>  hw/ppc/spapr.c              | 12 ++++++++----
>  hw/ppc/spapr_irq.c          | 34 ++++++++++++++++++++++++----------
>
>  10 files changed, 57 insertions(+), 36 deletions(-)
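As a rough illustration of the over-sizing described in the quoted
cover letter, here is a minimal standalone sketch. The -smp values
(max_cpus=8, threads=1) and the vsmt=8 setting are hypothetical
examples, not taken from this thread; the computation mirrors the
DIV_ROUND_UP() line quoted above:

/*
 * Standalone sketch, not QEMU code: shows the quoted computation
 * covering 8 times more vCPU ids than there are vCPUs when a
 * custom vsmt=8 is used while the guest runs with threads=1.
 */
#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
    unsigned max_cpus = 8, threads = 1;

    /* default since QEMU 4.2: vsmt follows smp.threads */
    unsigned vsmt = threads;
    printf("default vsmt: %u vCPU ids for %u vCPUs\n",
           DIV_ROUND_UP(max_cpus * vsmt, threads), max_cpus);

    /* same guest, but with a custom vsmt=8 setting */
    vsmt = 8;
    printf("vsmt=8:       %u vCPU ids for %u vCPUs\n",
           DIV_ROUND_UP(max_cpus * vsmt, threads), max_cpus);

    return 0;
}

Compiled and run, this prints 8 vCPU ids for the default case and 64
for the vsmt=8 case, i.e. the up-to-8x over-sizing of per-vCPU XIVE
resources (END tables, IPI ranges) that the cover letter describes.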