On 22.07.25 12:27, Xiaoyao Li wrote:
> On 7/22/2025 5:21 PM, Mathias Krause wrote:
>> IMHO, there is no valid reason for still having the patching in place as
>> the .text of non-ancient kernel's  will be write-protected, making
>> patching attempts fail. And, as they fail with a #PF instead of #UD, the
>> guest cannot even handle them appropriately, as there was no memory
>> write attempt from its point of view. Therefore the default should be to
>> disable it, IMO. This won't prevent guests making use of the wrong
>> instruction from trapping, but, at least, now they'll get the correct
>> exception vector and can handle it appropriately.
> 
> But you don't accout for the case that guest kernel is built without
> CONFIG_STRICT_KERNEL_RWX enabled, or without CONFIG_DEBUG_RO_DATA, or
> for whatever reason the guest's text is not readonly, and the VM needs
> to be migrated among different vendors (Intel <-> AMD).
> 
> Before this patch, the above usecase works well. But with this patch,
> the guest will gets #UD after migrated to different vendors.

I strongly doubt that use case works well, probably doesn't work at all
right now. There's so much more to handle on the kernel side for such a
thing to actually "work". Think of mitigations like RETPOLINE, RETHUNK,
UNRET, SRSO, GDS,... and all the vendor-specific MSRs that need to be
switched over. No way this "just works". Not patching VMCALL to VMMCALL
or vice versa is the least of your problems in such a scenario.

> I heard from some small CSPs that they do want to the ability to live
> migrate VMs among Intel and AMD host.

Well, they'll have a hard time trying to get it to work, sorry.

>> [...]
>> 
>> Ignoring pre-v2.6.25 kernels for a moment, we can assume that KVM will
>> do the patching. So the lack of KVM_X86_QUIRK_FIX_HYPERCALL_INSN but
>> having 'hypercall_patching_enabled == false' indicates that the user
>> wants to disable it but QEMU cannot do so, because KVM lacks the
>> extension to do so. This, IMO, legitimizes the warn_report("kvm:
>> disabling hypercall patching not supported") -- as it's not supported.
> 
> The minimum supported kernel version is 4.5 (see commit f180e367fce4).
> So pre-v2.6.25 kernels is not the case.

Perfect!

> Surely we can list all the cases of different versions of KVM starting
> from v4.5 and draw the conclusion that the semantics of "valid_quirks &
> KVM_X86_QUIRK_FIX_HYPERCALL_INSN == 0" means KVM enables the hypercall
> patching quirk but don't provide the interface for userspace to disable
> it. So the code logic is correct.

Thanks for confirming!

> My statement of "I think it requires a new cap in KVM to return the
> enabled quirks" is more for generic consideration. i.e., QEMU can know
> whether a quirk is enabled or not without analysing the detailed history
> of KVM.

Well, all quirks are enabled by default, only explicitly disabled ones
are, well, disabled. And that requires actively doing so via
kvm_vm_enable_cap(KVM_CAP_DISABLE_QUIRKS[2]). Surely, QEMU can track
these calls on its own if it really wants to.

> Of course, it's more of the requirement on KVM to provide new interface
> and Current QEMU can do nothing on it. That is, current implementation
> of this PATCH is OK to me.

Ok, I'll send a v2 incorporating the member name change to
'hypercall_patching'.

>>> [...]
>>> I think return 0 here is to avoid the warn_report() in the caller. But
>>> for the correct semantics, we need to return -1 to indicate that it
>>> fails to disable the hypercall patching?
>>
>> No, returning 0 here is very much on purpose, as you noticed, to avoid
>> the warn_report() in the caller. The already issued warn_report() is the
>> correct one for this case.
> 
> We can use @Error to pass the error log instead of the trick on return
> value.
> 
> e.g.,
> 
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -3228,17 +3228,24 @@ static int kvm_vm_enable_energy_msrs(KVMState *s)
>      return 0;
>  }
> 
> -static int kvm_vm_disable_hypercall_patching(KVMState *s)
> +static int kvm_vm_disable_hypercall_patching(KVMState *s, Error **errp)
>  {
>      int valid_quirks = kvm_vm_check_extension(s, KVM_CAP_DISABLE_QUIRKS2);
> +    int ret = -1;
> 
>      if (valid_quirks & KVM_X86_QUIRK_FIX_HYPERCALL_INSN) {
> -        return kvm_vm_enable_cap(s, KVM_CAP_DISABLE_QUIRKS2, 0,
> -                                 KVM_X86_QUIRK_FIX_HYPERCALL_INSN);
> +        ret = kvm_vm_enable_cap(s, KVM_CAP_DISABLE_QUIRKS2, 0,
> +                                KVM_X86_QUIRK_FIX_HYPERCALL_INSN);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "kvm: failed to disable "
> +                             "hypercall patching quirk: %s",
> +                             strerror(-ret));
> +        }
> +    } else {
> +        error_setg(errp, "kvm: disabling hypercall patching not
> supported");
>      }
> 
> -    warn_report("kvm: disabling hypercall patching not supported");
> -    return 0;
> +    return ret;
>  }
> 
>  int kvm_arch_init(MachineState *ms, KVMState *s)
> @@ -3381,8 +3388,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      }
> 
>      if (s->hypercall_patching_enabled == false) {
> -        if (kvm_vm_disable_hypercall_patching(s)) {
> -            warn_report("kvm: failed to disable hypercall patching
> quirk");
> +        if (kvm_vm_disable_hypercall_patching(s, &local_err)) {
> +            error_report_err(local_err);
>          }
>      }
> 

Ohh, neat. I'll integrate that as well. Thanks!

Thanks,
Mathias

Reply via email to