Amit Machhiwal <[email protected]> writes:

> On IBM POWER systems, newer processor generations can operate in
> compatibility modes corresponding to earlier generations. This becomes
> relevant for nested virtualization, where nested KVM guests may need to
> run with a specific processor compatibility level.
>
> Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> logical partition (L1) booted in Power10 compatibility mode, the guest
> fails to boot while setting 'arch_compat'. This happens because the CPU
> class is derived from the hardware PVR (via mfspr()), which reflects the
> physical processor generation (Power11), rather than the effective
> compatibility mode (Power10).
>
> As a result, userspace may request a Power11 arch_compat for the L2
> guest. However, the L1 partition, running in Power10 compatibility, has
> only negotiated support up to Power10 with the Power Hypervisor (L0).
> When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
> hypervisor rejects the request, leading to a late guest boot failure:
>
>   KVM-NESTEDv2: couldn't set guest wide elements
>   [..KVM reg dump..]
>
> This situation should be detected earlier and rejected by KVM. Without
> proper validation, if userspace ignores the error, the guest may continue
> to boot in Power11 raw mode on a Power10 compatibility host, which should
> not be allowed.
>
> Introduce a validation mechanism that detects unsupported arch_compat
> values early in the guest initialization path. When an unsupported
> arch_compat is requested (e.g., Power11 on a Power10 compatibility mode
> host), kvmppc_set_arch_compat() uses cpu_has_feature(CPU_FTR_P11_PVR) to
> detect the mismatch and sets arch_compat to PVR_ARCH_INVALID (0xffffffff).
> This sentinel value is architecturally safe: PAPR specifies that valid
> logical PVR values must have 0x0f as the first byte, ensuring 0xffffffff
> lies permanently outside the specification-defined range. Setting this
> value triggers kvmppc_sanity_check() to mark the vCPU as invalid by
> setting vcpu->arch.sane to false. On the next vCPU run, kvmppc_vcpu_run_hv()
> checks this flag and returns -EINVAL, preventing the guest from running
> with an invalid processor compatibility configuration.
>
> With this, when a Power11 arch_compat is requested on a Power10
> compatibility mode host, the guest fails early during boot with:
>
>   error: kvm run failed Invalid argument
>
> This provides a much clearer failure mode compared to the previous
> behavior where the guest could boot in Power11 raw mode (if userspace
> ignored the error) or fail late during H_GUEST_SET_STATE.
>
> Suggested-by: Vaibhav Jain <[email protected]>
> Reviewed-by: Vaibhav Jain <[email protected]>
> Tested-by: Anushree Mathur <[email protected]>
> Acked-by: Gautam Menghani <[email protected]>
> Cc: [email protected] # v6.13+
> Signed-off-by: Amit Machhiwal <[email protected]>
> ---
> Testing: Both Anushree and I have tested the below scenarios:
> 1. P11 guest on P11 host - Works
> 2. P10 compat guest on P11 host - Works
> 3. P11 guest on compat-P10 host - Correctly fails with "Invalid argument"
> 4. P10 guest on compat-P10 host - Works
>
Thanks for incorporating all the changes and adding the test result matrix
in the changelog. The changes look good, feel free to add:

Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
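
For anyone following the thread, the validation flow described in the changelog can be modelled in plain userspace C roughly as below. This is only a sketch under stated assumptions, not the code in the patch: struct vcpu_model and every function here are hypothetical stand-ins for the kernel's vcpu state, kvmppc_set_arch_compat(), kvmppc_sanity_check() and kvmppc_vcpu_run_hv(); the host_has_p11_pvr flag stands in for cpu_has_feature(CPU_FTR_P11_PVR); and the logical PVR constants other than PVR_ARCH_INVALID are quoted from memory of asm/reg.h and should be checked against the tree.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical PVR values (assumed to match asm/reg.h; please verify). */
#define PVR_ARCH_31       0x0f000006u /* Power10 compatibility level */
#define PVR_ARCH_31_P11   0x0f000007u /* Power11 compatibility level */
/* Sentinel named in the changelog. */
#define PVR_ARCH_INVALID  0xffffffffu

/* Hypothetical stand-in for the relevant vcpu->arch fields. */
struct vcpu_model {
	uint32_t arch_compat;
	bool sane;
};

/* PAPR rule cited in the changelog: a valid logical PVR starts with 0x0f. */
bool is_logical_pvr(uint32_t pvr)
{
	return (pvr >> 24) == 0x0f;
}

/* Models the sanity check: the sentinel marks the vCPU as not runnable. */
void sanity_check(struct vcpu_model *v)
{
	v->sane = (v->arch_compat != PVR_ARCH_INVALID);
}

/*
 * Models the early validation: a Power11 request on a host without the
 * Power11 PVR feature is replaced by the sentinel and reported as an error.
 */
int set_arch_compat(struct vcpu_model *v, uint32_t requested,
		    bool host_has_p11_pvr)
{
	if (requested == PVR_ARCH_31_P11 && !host_has_p11_pvr)
		v->arch_compat = PVR_ARCH_INVALID;
	else
		v->arch_compat = requested;

	sanity_check(v);
	return v->sane ? 0 : -EINVAL;
}

/* Models the run path: refuses to run a vCPU that failed the sanity check. */
int vcpu_run(const struct vcpu_model *v)
{
	return v->sane ? 0 : -EINVAL;
}
```

The point of the second check in vcpu_run() is the case called out in the changelog: even if userspace ignores the error from setting arch_compat, the vCPU still cannot run, which corresponds to scenario 3 in the test matrix failing with "Invalid argument".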
