On Wed, Jul 03, 2024 at 10:49:12PM +0800, Wei Wang wrote:
> When enforce_cpuid is set to false, the guest is launched with a filtered
> set of features, meaning that unsupported features by the host are removed
> from the guest's vCPU model. This could cause issues for live migration.
> For example, a guest on the source is running with features A and B. If
> the destination host does not support feature B, the stub guest can still
> be launched on the destination with feature A only if enforce_cpuid=false.
> Live migration can start in this case, though it may fail later when the
> states of feature B are put to the destination side. This failure occurs
> in the late stage (i.e., stop&copy phase) of the migration flow, where the
> source guest has already been paused. Tests show that in such cases the
> source guest does not recover, and the destination is unable to resume to
> run.
> 
> Make "enfore_cpuid=true" a hard requirement for a guest to be migratable,
> and change the default value of "enforce_cpuid" to true, making the guest
> vCPUs migratable by default. If the destination stub guest has inconsistent
> CPUIDs (i.e., destination host cannot support the features defined by the
> guest's vCPU model), it fails to boot (with enfore_cpuid=true by default),
> thereby preventing migration from occuring. If enfore_cpuid=false is
> explicitly added for the guest, the guest is deemed as non-migratable
> (via the migration blocker), so the above issue won't occur as the guest
> won't be migrated.
> 
> Tested-by: Lei Wang <lei4.w...@intel.com>
> Signed-off-by: Wei Wang <wei.w.w...@intel.com>

[Copy Jiri and Dan for libvirt-side implications]

> ---
>  target/i386/cpu.c     |  2 +-
>  target/i386/kvm/kvm.c | 25 +++++++++++++++----------
>  2 files changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 4c2e6f3a71..7db4fe4ead 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -8258,7 +8258,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_UINT32("hv-version-id-snumber", X86CPU, hyperv_ver_id_sn, 0),
>  
>      DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
> -    DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
> +    DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, true),

I assume in many cases people can still properly migrate when the hosts are
similar or identical, so maybe we at least want the old machine types keep
working (by introducing a machine compat property)?

>      DEFINE_PROP_BOOL("x-force-features", X86CPU, force_features, false),
>      DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
>      DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 0),
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index dd8b0f3313..aee717c1cf 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1741,7 +1741,7 @@ static int hyperv_init_vcpu(X86CPU *cpu)
>      return 0;
>  }
>  
> -static Error *invtsc_mig_blocker;
> +static Error *cpu_mig_blocker;
>  
>  #define KVM_MAX_CPUID_ENTRIES  100
>  
> @@ -2012,6 +2012,15 @@ full:
>      abort();
>  }
>  
> +static bool kvm_vcpu_need_block_migration(X86CPU *cpu)
> +{
> +    CPUX86State *env = &cpu->env;
> +
> +    return !cpu->enforce_cpuid ||
> +           (!env->user_tsc_khz && (env->features[FEAT_8000_0007_EDX] &
> +                                   CPUID_APM_INVTSC));
> +}

Nit: maybe it's nice this returns a "const char*" with detailed reasons to
be put into the error_setg(), so it dumps the same as before for the invtsc
blocker.

Thanks,

> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2248,18 +2257,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
>          has_msr_mcg_ext_ctl = has_msr_feature_control = true;
>      }
>  
> -    if (!env->user_tsc_khz) {
> -        if ((env->features[FEAT_8000_0007_EDX] & CPUID_APM_INVTSC) &&
> -            invtsc_mig_blocker == NULL) {
> -            error_setg(&invtsc_mig_blocker,
> -                       "State blocked by non-migratable CPU device"
> -                       " (invtsc flag)");
> -            r = migrate_add_blocker(&invtsc_mig_blocker, &local_err);
> +    if (!cpu_mig_blocker &&  kvm_vcpu_need_block_migration(cpu)) {
> +            error_setg(&cpu_mig_blocker,
> +                       "State blocked by non-migratable CPU device");
> +            r = migrate_add_blocker(&cpu_mig_blocker, &local_err);
>              if (r < 0) {
>                  error_report_err(local_err);
>                  return r;
>              }
> -        }
>      }
>  
>      if (cpu->vmware_cpuid_freq
> @@ -2312,7 +2317,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      return 0;
>  
>   fail:
> -    migrate_del_blocker(&invtsc_mig_blocker);
> +    migrate_del_blocker(&cpu_mig_blocker);
>  
>      return r;
>  }
> -- 
> 2.27.0
> 
> 

-- 
Peter Xu


Reply via email to