On Tue, May 26, 2026 at 17:17:10 +0200, Denis V. Lunev wrote:
> qemuDomainMakeCPUMigratable() strips features marked added='yes' (in
> src/cpu_map/x86_*.xml) from the migration cookie when the source CPU
> was specified as host-model. The intent was libvirt-protocol compat
> with older destinations; the cost is guest CPU compat, paid silently
> on every migration.
Sigh, yeah there's still (at least) one thing missing around vmx
migration...
> Every Intel x86 CPU model from Westmere through Sapphire Rapids
> carries 60+ added='yes' features, including
> vmx-exit-load-perf-global-ctrl and vmx-entry-load-perf-global-ctrl
> that control the LOAD_IA32_PERF_GLOBAL_CTRL allowed-1 bits of
> MSR_IA32_VMX_{EXIT,ENTRY}_CTLS. A host-model live migration on any
> of these models drops those features from the destination's qemu
> argv.
Yeah, unless those vmx features are explicitly specified in the domain
XML, libvirt adds them according to what QEMU enabled based on the -cpu
command line and then removes them during migration. The assumption is
that if QEMU automatically added them on the source host, it will also
add them on the destination host.
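To illustrate the behavior described above, here is a rough sketch in
Python. It is a simplified model, not libvirt code: the function name,
the use of plain sets, and the example feature lists are all assumptions
made up for illustration.

```python
# Hypothetical model of the "drop added features" step; not libvirt code.

def make_cpu_migratable(current, original, added_features):
    """Return the feature set that would be sent to the destination.

    current        -- features in the live CPU definition, including
                      those libvirt added after asking QEMU
    original       -- features the user explicitly listed in domain XML
    added_features -- features marked added='yes' in the CPU map
    """
    return {
        f for f in current
        # Keep a feature unless it is an "added" one that the user
        # never asked for explicitly.
        if f not in added_features or f in original
    }


added = {"vmx-exit-load-perf-global-ctrl",
         "vmx-entry-load-perf-global-ctrl"}
live = {"aes", "vmx-exit-load-perf-global-ctrl"}

# Not explicitly requested: the vmx-* feature is dropped.
implicit = make_cpu_migratable(live, {"aes"}, added)

# Explicitly requested in domain XML: the vmx-* feature is kept.
explicit = make_cpu_migratable(live, set(live), added)
```

In this model the destination only learns about a vmx-* feature when the
user asked for it, which is where the assumption about QEMU re-adding
the rest comes in.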
> Modern qemu gates the nested VMX capability MSRs on the explicit -cpu
> list, so the guest's MSR view shifts.
What exactly does "modern qemu" mean? Do you have an exact version? Anyway,
this sounds like a QEMU bug to me. If a combination of -cpu command line
and a machine type enabled the features on the source, the same
combination should enable them on the destination as well. The machine
type does not change during migration and the treatment of vmx features
shouldn't change either.
> Drop the strip. If a destination libvirt does not know a feature in
> the cookie, its parser rejects the migration with a precise
> unknown-feature error: operators can upgrade or narrow the source
> CPU definition. Either is visible; the status quo is not.
The problem is old libvirt was not tracking vmx features at all and thus
any domain started on new libvirt would fail to migrate to an older
libvirt. If a user explicitly required a vmx feature in domain XML, old
libvirt would correctly refuse incoming migration because of an unknown
CPU feature. But here it's the new libvirt that suddenly recognizes
features QEMU always enabled and adds them to the domain XML. We always
deal with such situations by dropping the automatically added XML
elements for migration compatibility.
> This effectively reverts 14d3517410 ("qemu: domain: Drop added
> features from migratable CPU") together with its follow-up
> aae8a5774b ("qemu: Drop vmx-* from migratable CPU model only when
> origCPU is set"), and removes the now-unused origCPU plumbing in
> qemuDomainMakeCPUMigratable() and its callers.
Unfortunately this is not the correct solution. We don't have a policy
for backward migration compatibility with old libvirt so we should keep
the compatibility as long as the domain XML does not contain anything
unknown to the old libvirt.
That said, we're missing code that would transfer all the removed vmx
flags in a migration cookie to make sure new libvirt can see the exact
CPU definition and thus can explicitly request all the vmx features the
source QEMU added.
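A rough sketch of the idea in Python, as a simplified model only. The
function names, the set-based representation and the
"understands_cookie" flag are hypothetical and do not correspond to any
existing libvirt API or cookie element.

```python
# Hypothetical model of carrying the dropped features in the cookie.

def split_for_migration(current, original, added_features):
    """Split the live CPU features into two parts.

    Returns (xml_features, removed), where xml_features is what goes
    into the migratable domain XML and removed is what would travel
    separately in the migration cookie.
    """
    xml_features = {
        f for f in current
        if f not in added_features or f in original
    }
    return xml_features, current - xml_features


def destination_features(xml_features, removed, understands_cookie):
    """Features the destination would explicitly request from QEMU."""
    if understands_cookie:
        # New libvirt: restore the exact CPU definition of the source.
        return xml_features | removed
    # Old libvirt: ignores the extra data and relies on QEMU defaults.
    return set(xml_features)


added = {"vmx-exit-load-perf-global-ctrl"}
live = {"aes", "vmx-exit-load-perf-global-ctrl"}
xml_features, removed = split_for_migration(live, {"aes"}, added)

new_dst = destination_features(xml_features, removed, True)
old_dst = destination_features(xml_features, removed, False)
```

The domain XML stays free of anything an old libvirt would not know,
while a new destination can still reconstruct the full feature list.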
Jirka