On 11/30/20 5:52 PM, Greg Kurz wrote:
> A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> ids, which happen to be numerically equal in general, but are really
> different entities that can diverge in some setups. This was causing
> QEMU to misconfigure XIVE and to crash the guest.
> 
> The confusion comes from XICS actually. Interrupt presenters in XICS
> are identified by a "server number", which maps 1:1 to vCPU ids. The
> range of these "server numbers" is exposed to the guest in
> the "ibm,interrupt-server-ranges" property. A xics_max_server_number()
> helper was introduced at some point to compute the upper limit of the
> range. When XIVE was added, commit 1a518e7693c9 renamed the helper to
> spapr_max_server_number(). It ended up being used to size a bunch of
> things in XIVE that are per-vCPU, such as internal END tables or
> IPI ranges presented to the guest. The problem is that the maximum
> "server number" can be much higher (up to 8 times) than the actual
> number of vCPUs when the VSMT mode doesn't match the number of threads
> per core in the guest:
> 
>     DIV_ROUND_UP(ms->smp.max_cpus * spapr->vsmt, ms->smp.threads);
> 
> Since QEMU 4.2, the default behavior is to set spapr->vsmt to
> ms->smp.threads. Setups with custom VSMT settings will configure XIVE
> to use more HW resources than needed. This is a bit unfortunate but
> not extremely harmful, 

Indeed. The default use case (without a custom VSMT setting) has no
impact since it does not fragment the XIVE VP space more than needed.
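
For illustration, here is a minimal standalone sketch (not QEMU code;
DIV_ROUND_UP() is redefined locally with the usual semantics, and the
-smp/VSMT values are hypothetical) of how far the "server number"
upper bound quoted above can drift from the actual vCPU count when a
custom VSMT setting is used:

  #include <stdio.h>

  /* Same rounding semantics as QEMU's DIV_ROUND_UP() macro */
  #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

  int main(void)
  {
      unsigned max_cpus = 8;  /* e.g. -smp maxcpus=8 */
      unsigned threads  = 1;  /* 1 thread per guest core */
      unsigned vsmt     = 8;  /* custom VSMT=8 setting */

      /* Formula quoted above from spapr_max_server_number() */
      unsigned max_server = DIV_ROUND_UP(max_cpus * vsmt, threads);

      /* Prints 64, i.e. 8 times the 8 vCPUs actually present */
      printf("max server number = %u (max vCPUs = %u)\n",
             max_server, max_cpus);
      return 0;
  }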

> unless perhaps a lot of guests are running on the host.

We can run 4K (-2) KVM guests today on a P9 system. To reach the
internal limits, each would need to have 32 vCPUs. That's possible with
a lot of RAM, but it's not a common scenario.

C.


> The sizing of the IPI range is more problematic though
> as it eventually led to [1].
> 
> This series first does some renaming to make it clear when we're
> dealing with vCPU ids. It then fixes the machine code to pass
> smp.max_cpus to XIVE where appropriate. Since these changes are
> guest/migration visible, a machine property is added to keep the
> existing behavior for older machine types. The series is thus based
> on Connie's recent patch that introduces compat machines for
> QEMU 6.0.
> 
> Based-on: 20201109173928.1001764-1-coh...@redhat.com
> 
> Note that we still use spapr_max_vcpu_ids() when activating the
> in-kernel irqchip because this is what both XICS-on-XIVE and XIVE
> KVM devices expect.
> 
> [1] https://bugs.launchpad.net/qemu/+bug/1900241
> 
> v2: - comments on v1 highlighted that problems mostly come from
>       spapr_max_server_number() which got misused over the years.
>       Updated the cover letter accordingly.
>     - completely new approach. Instead of messing with device properties,
>       pass the appropriate values to the IC backend handlers.
>     - rename a few things using the "max_vcpu_ids" wording instead of
>       "nr_servers" and "max_server_number"
> 
> Greg Kurz (3):
>   spapr: Improve naming of some vCPU id related items
>   spapr/xive: Fix size of END table and number of claimed IPIs
>   spapr/xive: Fix the "ibm,xive-lisn-ranges" property
> 
>  include/hw/ppc/spapr.h      |  3 ++-
>  include/hw/ppc/spapr_irq.h  | 12 ++++++------
>  include/hw/ppc/spapr_xive.h |  2 +-
>  include/hw/ppc/xics_spapr.h |  2 +-
>  hw/intc/spapr_xive.c        |  9 +++++----
>  hw/intc/spapr_xive_kvm.c    |  4 ++--
>  hw/intc/xics_kvm.c          |  4 ++--
>  hw/intc/xics_spapr.c        | 11 ++++++-----
>  hw/ppc/spapr.c              | 12 ++++++++----
>  hw/ppc/spapr_irq.c          | 34 ++++++++++++++++++++++++----------
>  10 files changed, 57 insertions(+), 36 deletions(-)
> 

