Hi Vaibhav,

Thanks for reviewing the patch. Please find my reponse inline.

On 2026/06/19 04:41 PM, Vaibhav Jain wrote:
> Amit Machhiwal <[email protected]> writes:
> 
> > Add infrastructure to query the host CPU compatibility mode via the
> > KVM_PPC_GET_COMPAT_CAPS ioctl. This allows QEMU to determine if the
> > host is running in a compatibility mode (e.g., a Power11 processor
> > operating in Power10 compatibility mode).
> >
> > The kvmppc_get_compat_caps() function queries KVM for host compatibility
> > capabilities with size field validation for ABI compatibility between
> > QEMU and the kernel. The kvm_ppc_host_compat_pvr() function derives the
> > effective PVR based on the compatibility mode reported by KVM.
> >
> > Additionally, cas_check_pvr() in hw/ppc/spapr_hcall.c is updated to
> > prevent fallback to raw mode when the host is running in compatibility
> > mode. This ensures that nested guests cannot exceed the host's
> > compatibility level.
> >
> > If the capability is not supported or the query fails, the functions
> > return 0, allowing fallback to existing behavior.
> >
> > Signed-off-by: Amit Machhiwal <[email protected]>
> > ---
> >  hw/ppc/spapr_hcall.c | 14 ++++++++++
> >  target/ppc/kvm.c     | 65 ++++++++++++++++++++++++++++++++++++++++++++
> >  target/ppc/kvm_ppc.h | 16 +++++++++++
> >  3 files changed, 95 insertions(+)
> >
> > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > index 60ba215e8611..68c29fbe141b 100644
> > --- a/hw/ppc/spapr_hcall.c
> > +++ b/hw/ppc/spapr_hcall.c
> > @@ -1135,6 +1135,7 @@ static uint32_t cas_check_pvr(PowerPCCPU *cpu, 
> > uint32_t max_compat,
> >  {
> >      bool explicit_match = false; /* Matched the CPU's real PVR */
> >      uint32_t best_compat = 0;
> > +    uint32_t compat_host_pvr = 0;
> >      int i;
> >  
> >      /*
> > @@ -1162,6 +1163,19 @@ static uint32_t cas_check_pvr(PowerPCCPU *cpu, 
> > uint32_t max_compat,
> >          }
> >      }
> >  
> > +    if (explicit_match) {
> > +        compat_host_pvr = kvm_ppc_host_compat_pvr();
> > +        /*
> > +         * If the host is booted in a compatibility mode, do not try 
> > booting in
> > +         * the raw mode as it may allow KVM guests to boot with a higher 
> > CPU
> > +         * version compared to what host was booted with; which should not 
> > be
> > +         * allowed.
> > +         */
> > +        if (compat_host_pvr) {
> > +            explicit_match = false;
> > +        }
> > +    }
> > +
> >      *raw_mode_supported = explicit_match;
> >  
> >      /* Parsing finished */
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index b94c2997a07f..9e5006e0c2cd 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -2602,6 +2602,71 @@ bool kvmppc_supports_ail_3(void)
> >      return cap_ail_mode_3;
> >  }
> >  
> > +#if defined(TARGET_PPC64)
> This define would be implicit for this file. I think you should target
> these functions for CONFIG_KVM. Otherwise you will have multiple
> definitions of kvm_ppc_host_compat_pvr()

No, this is not implicit. I had fixed a ppc32 build breakage for ppc32
in v2 of this series with these guards (the same is also documented in
the cover letter).

I believe explicit check for CONFIG_KVM is redundant as target/ppc/kvm.c
is only compiled when CONFIG_KVM=y (please take a look at
target/ppc/meson.build:54):

  [...]
  ppc_system_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'))

> 
> > +static target_ulong kvmppc_get_compat_caps(void)
> > +{
> > +    struct kvm_ppc_compat_caps host_compat;
> > +    target_ulong host_caps;
> > +    int ret;
> > +
> > +    if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_COMPAT_CAPS)) {
> > +        return 0;
> > +    }
> > +
> > +    /* Initialize the structure with size field for forward compatibility 
> > */
> > +    memset(&host_compat, 0, sizeof(host_compat));
> > +    host_compat.size = sizeof(host_compat);
> > +
> > +    ret = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_COMPAT_CAPS, &host_compat);
> > +    if (ret < 0) {
> > +        fprintf(stderr, "KVM: failed to get host capabilities\n");
> Should use error_report instead of fprintf(stderr,..

Good catch and my bad. I intended to replace these with error_report()
after my testing but missed to do so. I'll fix this in the next version.

> > +        return 0;
> > +    }
> > +
> > +    /*
> > +     * Validate the returned size matches our structure size.
> > +     * The kernel validates that userspace provides sufficient size before
> > +     * the ioctl, and returns its own structure size. A mismatch indicates
> > +     * a version incompatibility between QEMU and the kernel.
> > +     */
> > +    if (host_compat.size != sizeof(host_compat)) 
> This error handling differs from the kernel code which can indeed return
> host_compat.size < sizeof(host_compat).

You are right. I Will unify this check.

> > +        fprintf(stderr, "KVM: host_compat size mismatch (expected %zu, got 
> > %lu)\n",
> > +                sizeof(host_compat), host_compat.size);
> Error message can be more user friendly. Also your should use
> error_report instead of fprintf(stderr,..

Will fix this and think about a better error message.

> > +        return 0;
> > +    }
> 
> > +
> > +    host_caps = host_compat.compat_capabilities;
> > +    return host_caps;
> Minor: Why not just return host_compat.compat_capabilities instead of
> this extra store

Yeah, removing this extra store makes sense. I'll fix this.

> > +}
> > +
> Minor: Minor comment about this function would be nice to have

Sure, will do.

> > +uint32_t kvm_ppc_host_compat_pvr(void)
> > +{
> > +    uint32_t compat_host_pvr = 0;
> > +    int cap_idx = 0;
> > +    target_ulong host_caps = kvmppc_get_compat_caps();
> > +
> > +    host_caps = host_caps & KVM_PPC_COMPAT_BITMASK;
> > +    if (host_caps) {
> > +        cap_idx = 63 - __builtin_ctzll(host_caps);
> you probably need to use clz64() instead of this.

I agree that __builtin_ctzll() should be replaced with QEMU's portable
function for consistency and safety, but clz64() would be incorrect
here. The code needs to find the lowest set bit (rightmost in PowerPC
MSB-0 numbering), not the highest.

For example, with host_caps = 0x3000000000000000 which advertises the
support for Power10 and Power11 via bit 2 and bit 3 respectively:

- Using clz64(): clz64(0x3000000000000000) = 2 (wrong - gives bit 2)
- Using ctz64(): 63 - ctz64(0x3000000000000000) = 63 - 60 = 3 (correct)

The correct fix is to use ctz64() instead:

    cap_idx = 63 - ctz64(host_caps);

This maintains the logic while using QEMU's portable bit manipulation
function from qemu/host-utils.h, which also handles the zero-input case
safely (though we already check for non-zero host_caps).

> 
> > +        switch (cap_idx) {
> > +        case KVM_PPC_COMPAT_CAP_P9_IDX:
> > +            compat_host_pvr = CPU_POWERPC_POWER9_DD22;
> > +            break;
> > +        case KVM_PPC_COMPAT_CAP_P10_IDX:
> > +            compat_host_pvr = CPU_POWERPC_POWER10_DD20;
> > +            break;
> > +        case KVM_PPC_COMPAT_CAP_P11_IDX:
> > +            compat_host_pvr = CPU_POWERPC_POWER11_DD20;
> > +            break;
> > +        default:
> > +            break;
> > +        }
> > +    }
> > +
> > +    return compat_host_pvr;
> > +}
> > +#endif /* TARGET_PPC64 */
> > +
> >  PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void)
> >  {
> >      uint32_t host_pvr = mfpvr();
> > diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> > index 742881231e16..5970e7383740 100644
> > --- a/target/ppc/kvm_ppc.h
> > +++ b/target/ppc/kvm_ppc.h
> > @@ -81,6 +81,17 @@ bool kvmppc_supports_ail_3(void);
> >  int kvmppc_enable_hwrng(void);
> >  int kvmppc_put_books_sregs(PowerPCCPU *cpu);
> >  PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
> > +
> > +/*
> > + * Bit position indices for KVM_PPC_COMPAT_CAP_* capabilities.
> > + * These represent the result of (63 - bit_number) for each capability bit.
> > +#define KVM_PPC_COMPAT_CAP_P9_IDX   1  /* 63 - 62 */
> > +#define KVM_PPC_COMPAT_CAP_P10_IDX  2  /* 63 - 61 */
> > +#define KVM_PPC_COMPAT_CAP_P11_IDX  3  /* 63 - 60 */
> Can these be derived from KVM_PPC_COMPAT_CAP_POWERXX
> Also if these defines are only used in kvm_ppc_host_compat_pvr() then
> should be defined near to it.

Deriving these would require using ctz64() in the preprocessor, which
isn't possible since it's a function, not a macro. I tested this:

    #define KVM_PPC_COMPAT_CAP_P9_IDX (63 - ctz64(KVM_PPC_COMPAT_CAP_POWER9))

And got a compilation error:
    error: case label does not reduce to an integer constant

This is because switch/case statements require compile-time constants, 
but ctz64() is a runtime function call.

Explicit bitmap index defines are already used elsewhere in the codebase. 
For example, H_GUEST_CAP_XX_MODE_BMAP in hw/ppc/spapr_nested.h.

Also, I prefer keeping these in the header as they represent the KVM
compatibility capability bitmap layout and could be useful for other
code in the future for any other KVM operations that need to interpret
these capabilities. But I don't really have any strong feelings about
this. Please let me know your views.

Thanks,
Amit

> 
> > +
> > +uint32_t kvm_ppc_host_compat_pvr(void);
> >  void kvmppc_check_papr_resize_hpt(Error **errp);
> >  int kvmppc_resize_hpt_prepare(PowerPCCPU *cpu, target_ulong flags, int 
> > shift);
> >  int kvmppc_resize_hpt_commit(PowerPCCPU *cpu, target_ulong flags, int 
> > shift);
> > @@ -440,6 +451,11 @@ static inline PowerPCCPUClass 
> > *kvm_ppc_get_host_cpu_class(void)
> >      return NULL;
> >  }
> >  
> > +static inline uint32_t kvm_ppc_host_compat_pvr(void)
> > +{
> > +    return 0;
> > +}
> > +
> >  static inline void kvmppc_check_papr_resize_hpt(Error **errp)
> >  {
> >  }
> > -- 
> > 2.50.1 (Apple Git-155)
> >
> 
> -- 
> Cheers
> ~ Vaibhav

Reply via email to