Hi Amit,

Amit Machhiwal <[email protected]> writes:

> On POWER systems, newer processor generations can operate in compatibility
> modes corresponding to earlier generations (e.g., a Power11 system running
> in Power10 compatibility mode). In such cases, the effective CPU level
> exposed to guests differs from the physical processor generation.
>
> This creates a problem for nested virtualization. When booting a nested KVM
> guest (L2) inside a host KVM guest (L1) running in a compatibility mode,
> userspace (e.g., QEMU) may derive the CPU model from the raw hardware PVR
> and attempt to configure the nested guest accordingly. However, the L1
> partition is constrained by the compatibility level negotiated with the
> hypervisor (L0), and requests exceeding that level are rejected, leading to
> guest boot failures such as:
>
>   KVM-NESTEDv2: couldn't set guest wide elements
>
> This series addresses the issue in two steps:
>
>   1. Detect and reject invalid compatibility requests early in KVM to avoid
>      late failures.
>
>   2. Provide a mechanism for userspace to query the effective CPU
>      compatibility modes supported by the host, so it can select an
>      appropriate CPU model for nested guests.
>

Do we really need to add a uapi change for this? Tools like QEMU can read
the device tree info of the host, can't they?

> To achieve this, the series introduces a new KVM capability and ioctl
> (KVM_CAP_PPC_COMPAT_CAPS / KVM_PPC_GET_COMPAT_CAPS) that expose the
> compatibility modes supported by the host.
>
> The implementation supports both:
>
>   - PowerVM (nested API v2), where compatibility information is obtained
>     via the H_GUEST_GET_CAPABILITIES hypercall.
>   - PowerNV (nested API v1), where compatibility is derived from the device
>     tree ("cpu-version") representing the effective processor compatibility
>     level.

See there you go, for PowerNV if this info is provided in the device tree,
then QEMU could as well just read that info, no? ... yup,
kvmppc_read_int_dt() can do that I guess.

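To illustrate the kind of userspace read I have in mind, here is a rough
sketch. It is not QEMU's actual code: read_dt_u32() is a hypothetical helper
written for this mail, and I am assuming the property is a single 32-bit
big-endian cell reachable under something like
/proc/device-tree/cpus/<cpu-node>/cpu-version (the exact node path is an
assumption and would need checking on a real system).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one 32-bit device-tree property from a file.
 * Device-tree cells are stored big-endian, so the value is assembled byte
 * by byte rather than relying on host endianness.
 * Returns 0 on success, -1 if the file is missing or shorter than 4 bytes.
 */
static int read_dt_u32(const char *path, uint32_t *out)
{
    unsigned char buf[4];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (!f)
        return -1;

    n = fread(buf, 1, sizeof(buf), f);
    fclose(f);
    if (n != sizeof(buf))
        return -1;

    *out = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    return 0;
}
```

A caller would pass the full property path and treat a -1 return as "property
not available", falling back to whatever it does today.
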
So, my request is, can we look into this to see if there is a possible
alternative? Maybe we already have a mechanism which QEMU could use to get
this info?

btw - I haven't given a full read of the patch series, but reading the
cover letter, I felt we should at least explain there why a uapi change is
really needed here, and why the existing alternatives can't work for us.

-ritesh

>
> This allows userspace (e.g., QEMU) to select a CPU model consistent with
> the host compatibility mode, avoiding mismatches and enabling successful
> nested guest boot.
>
