from:"Vitaly Kuznetsov"

Re: [PATCH RESEND v3 0/3] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2024-03-25 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> Changes since 'RESEND v2':
> - Included 'docs/system: Add recommendations to Hyper-V enlightenments doc'
>   in the set as it also requires a "RESEND")

Ping)

>
> Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
> used. While 'hv-passthrough' is a debug only feature, this significantly
> limits its usefulness. While debugging the problem, I found that there are
> two loosely connected issues:
> - 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
> - 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.
>
> Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
> turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.
>
> Vitaly Kuznetsov (3):
>   i386: Fix conditional CONFIG_SYNDBG enablement
>   i386: Exclude 'hv-syndbg' from 'hv-passthrough'
>   docs/system: Add recommendations to Hyper-V enlightenments doc
>
>  docs/system/i386/hyperv.rst | 43 +
>  target/i386/cpu.c   |  2 ++
>  target/i386/kvm/kvm.c   | 18 ++--
>  3 files changed, 53 insertions(+), 10 deletions(-)

-- 
Vitaly

Re: [PATCH RESEND v3 3/3] docs/system: Add recommendations to Hyper-V enlightenments doc

2024-03-07 Thread Vitaly Kuznetsov

Zhao Liu  writes:

> Hi Vitaly,
>
> On Tue, Mar 05, 2024 at 05:42:04PM +0100, Vitaly Kuznetsov wrote:
>> Date: Tue,  5 Mar 2024 17:42:04 +0100
>> From: Vitaly Kuznetsov 
>> Subject: [PATCH RESEND v3 3/3] docs/system: Add recommendations to Hyper-V
>>  enlightenments doc
>> 
>> While hyperv.rst already has all currently implemented Hyper-V
>> enlightenments documented, it may be unclear what is the recommended set to
>> achieve the best result. Add the corresponding section to the doc.
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>>  docs/system/i386/hyperv.rst | 30 ++
>>  1 file changed, 30 insertions(+)
>> 
>> diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
>> index 009947e39141..1c1de77feb65 100644
>> --- a/docs/system/i386/hyperv.rst
>> +++ b/docs/system/i386/hyperv.rst
>> @@ -283,6 +283,36 @@ Supplementary features
>>    feature alters this behavior and only allows the guest to use exposed Hyper-V
>>    enlightenments.
>>  
>> +Recommendations
>> +---------------
>
> This guide is very helpful!
>
>> +To achieve the best performance of Windows and Hyper-V guests and unless there
>> +are any specific requirements (e.g. migration to older QEMU/KVM versions,
>> +emulating specific Hyper-V version, ...), it is recommended to enable all
>> +currently implemented Hyper-V enlightenments with the following exceptions:
>> +
>> +- ``hv-syndbg``, ``hv-passthrough``, ``hv-enforce-cpuid`` should not be enabled
>> +  in production configurations as these are debugging/development features.
>> +- ``hv-reset`` can be avoided as modern Hyper-V versions don't expose it.
>
> Does "Hyper-V versions" mean the Hyper-V guest version or the version of
> Microsoft's Hyper-V hypervisor?
> It would be better to distinguish between the Hyper-V guest and the
> Hyper-V hypervisor.
>
> And it would be better to have a clear version number.

This is about the Hyper-V version that QEMU/KVM emulates, not about the
guest's Hyper-V version. To be honest, I'm not sure which was the last
version of Hyper-V to expose HV_SYSTEM_RESET_RECOMMENDED. I don't have
anything older than WS2016 around now, and the bit is not there. If I'm
not mistaken, it was already missing in 2012R2. I would appreciate it if
anyone has more precise historical info to add here.

>
>> +- ``hv-evmcs`` can (and should) be enabled on Intel CPUs only. While the feature
>> +  is only used in nested configurations (Hyper-V, WSL2), enabling it for regular
>> +  Windows guests should not have any negative effects.
>> +- ``hv-no-nonarch-coresharing`` must only be enabled if vCPUs are properly pinned
>> +  so no non-architectural core sharing is possible.
>> +- ``hv-vendor-id``, ``hv-version-id-build``, ``hv-version-id-major``,
>> +  ``hv-version-id-minor``, ``hv-version-id-spack``, ``hv-version-id-sbranch``,
>> +  ``hv-version-id-snumber`` can be left unchanged, guests are not supposed to
>> +  behave differently when different Hyper-V version is presented to them.
>> +- ``hv-crash`` must only be enabled if the crash information is consumed via
>> +  QAPI by higher levels of the virtualization stack. Enabling this feature
>> +  effectively prevents Windows from creating dumps upon crashes.
>> +- ``hv-reenlightenment`` can only be used on hardware which supports TSC
>> +  scaling or when guest migration is not needed.
>> +- ``hv-spinlocks`` should be set to e.g. 0xfff when host CPUs are overcommitted
>> +  (meaning there are other scheduled tasks or guests) and can be left unchanged
>> +  from the default value (0xffffffff) otherwise.
>> +- ``hv-avic``/``hv-apicv`` should not be enabled if the hardware does not
>> +  support APIC virtualization (Intel APICv, AMD AVIC).
>>
>
> It's also better to add blank lines between paragraphs above.

Np, if I am to re-send this I'll add these (hope it's not an acceptance
blocker, we can always do a follow-up).

>
> BTW, may I ask another Windows question? I understand that Windows such
> as Windows 10 and later is already a virtualized architecture with
> built-in Hyper-V to run the root partition.
>
> So is it true that booting Windows VM via KVM + QEMU is running Windows
> Guest in L2? Or what is the relationship between Hyper-V within Windows
> and Hyper-V enlightenments with QEMU + KVM?

Hyper-V is a role you can enable in various Windows versions, both
server and client. When enabled, you get a hypervisor (which is called
'Microsoft Hypervisor' as I was told) and your Windows becomes the root
partition (similar to Xen Dom0). In case you run this on KVM, Windows
becomes L2. Hyper-V enlightenments provided by KVM/QEMU are consumed by
the hypervisor then.

Note: Hyper-V role is optional, in many cases Windows guests run without
it (no Hyper-V VMs, no WSL2, ...) and thus consume KVM's Hyper-V
enlightenments directly, no nested virt involved.

-- 
Vitaly

[PATCH RESEND v3 2/3] i386: Exclude 'hv-syndbg' from 'hv-passthrough'

2024-03-05 Thread Vitaly Kuznetsov

Windows with Hyper-V role enabled doesn't boot with 'hv-passthrough' when
no debugger is configured, this significantly limits the usefulness of the
feature as there's no support for subtracting Hyper-V features from CPU
flags at this moment (e.g. "-cpu host,hv-passthrough,-hv-syndbg" does not
work). While this is also theoretically fixable, 'hv-syndbg' is likely
very special and unneeded in the default set. Genuine Hyper-V doesn't seem
to enable it either.

Introduce 'skip_passthrough' flag to 'kvm_hyperv_properties' and use it as
one-off to skip 'hv-syndbg' when enabling features in 'hv-passthrough'
mode. Note, "-cpu host,hv-passthrough,hv-syndbg" can still be used if
needed.

As both 'hv-passthrough' and 'hv-syndbg' are debug features, the change
should not have any effect on production environments.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/system/i386/hyperv.rst | 13 +
 target/i386/kvm/kvm.c   |  7 +--
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
index 2505dc4c86e0..009947e39141 100644
--- a/docs/system/i386/hyperv.rst
+++ b/docs/system/i386/hyperv.rst
@@ -262,14 +262,19 @@ Supplementary features
 ``hv-passthrough``
   In some cases (e.g. during development) it may make sense to use QEMU in
   'pass-through' mode and give Windows guests all enlightenments currently
-  supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
-  flag.
+  supported by KVM.
 
   Note: ``hv-passthrough`` flag only enables enlightenments which are known to QEMU
   (have corresponding 'hv-' flag) and copies ``hv-spinlocks`` and ``hv-vendor-id``
   values from KVM to QEMU. ``hv-passthrough`` overrides all other 'hv-' settings on
-  the command line. Also, enabling this flag effectively prevents migration as the
-  list of enabled enlightenments may differ between target and destination hosts.
+  the command line.
+
+  Note: ``hv-passthrough`` does not enable ``hv-syndbg`` which can prevent certain
+  Windows guests from booting when used without proper configuration. If needed,
+  ``hv-syndbg`` can be enabled additionally.
+
+  Note: ``hv-passthrough`` effectively prevents migration as the list of enabled
+  enlightenments may differ between target and destination hosts.
 
 ``hv-enforce-cpuid``
   By default, KVM allows the guest to use all currently supported Hyper-V
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f067e35d35b1..f01d19ad2d51 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -823,6 +823,7 @@ static struct {
         uint32_t bits;
     } flags[2];
     uint64_t dependencies;
+    bool skip_passthrough;
 } kvm_hyperv_properties[] = {
     [HYPERV_FEAT_RELAXED] = {
         .desc = "relaxed timing (hv-relaxed)",
@@ -951,7 +952,8 @@ static struct {
             {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_FEATURE_DEBUG_MSRS_AVAILABLE}
         },
-        .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
+        .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED),
+        .skip_passthrough = true,
     },
     [HYPERV_FEAT_MSR_BITMAP] = {
         .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
@@ -1360,7 +1362,8 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
          * hv_build_cpuid_leaf() uses this info to build guest CPUIDs.
          */
         for (feat = 0; feat < ARRAY_SIZE(kvm_hyperv_properties); feat++) {
-            if (hyperv_feature_supported(cs, feat)) {
+            if (hyperv_feature_supported(cs, feat) &&
+                !kvm_hyperv_properties[feat].skip_passthrough) {
                 cpu->hyperv_features |= BIT(feat);
             }
         }
-- 
2.43.2
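
For readers who want to see the effect of 'skip_passthrough' in isolation, here is a minimal standalone sketch of the pass-through expansion loop from patch 2/3. It is not QEMU code: the 'toy_' names, the three-entry table and the 'host_supported' field (standing in for hyperv_feature_supported()) are made up for illustration, and it assumes that features requested explicitly on the command line are already present in the mask before the loop runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical feature numbers, loosely modelled on HYPERV_FEAT_*. */
enum { TOY_FEAT_RELAXED, TOY_FEAT_SYNDBG, TOY_FEAT_MSR_BITMAP };

static const struct {
    const char *desc;
    bool host_supported;   /* stand-in for hyperv_feature_supported() */
    bool skip_passthrough; /* never enabled implicitly by pass-through */
} toy_properties[] = {
    [TOY_FEAT_RELAXED]    = { .desc = "hv-relaxed",     .host_supported = true },
    [TOY_FEAT_SYNDBG]     = { .desc = "hv-syndbg",      .host_supported = true,
                              .skip_passthrough = true },
    [TOY_FEAT_MSR_BITMAP] = { .desc = "hv-emsr-bitmap", .host_supported = true },
};

/*
 * Start from the explicitly requested features (assumption of this sketch)
 * and add every host-supported feature that is not marked skip_passthrough.
 */
static uint64_t toy_expand_passthrough(uint64_t explicit_features)
{
    uint64_t features = explicit_features;

    for (size_t feat = 0; feat < ARRAY_SIZE(toy_properties); feat++) {
        if (toy_properties[feat].host_supported &&
            !toy_properties[feat].skip_passthrough) {
            features |= BIT(feat);
        }
    }
    return features;
}
```

With an empty explicit mask the debugger feature stays off even though the toy host supports it; passing BIT(TOY_FEAT_SYNDBG) models "-cpu host,hv-passthrough,hv-syndbg" and keeps it on.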

[PATCH RESEND v3 0/3] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2024-03-05 Thread Vitaly Kuznetsov

Changes since 'RESEND v2':
- Included 'docs/system: Add recommendations to Hyper-V enlightenments doc'
  in the set as it also requires a "RESEND")

Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
used. While 'hv-passthrough' is a debug only feature, this significantly
limits its usefulness. While debugging the problem, I found that there are
two loosely connected issues:
- 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
- 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.

Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.

Vitaly Kuznetsov (3):
  i386: Fix conditional CONFIG_SYNDBG enablement
  i386: Exclude 'hv-syndbg' from 'hv-passthrough'
  docs/system: Add recommendations to Hyper-V enlightenments doc

 docs/system/i386/hyperv.rst | 43 +
 target/i386/cpu.c   |  2 ++
 target/i386/kvm/kvm.c   | 18 ++--
 3 files changed, 53 insertions(+), 10 deletions(-)

-- 
2.43.2

[PATCH RESEND v3 3/3] docs/system: Add recommendations to Hyper-V enlightenments doc

2024-03-05 Thread Vitaly Kuznetsov

While hyperv.rst already has all currently implemented Hyper-V
enlightenments documented, it may be unclear what is the recommended set to
achieve the best result. Add the corresponding section to the doc.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/system/i386/hyperv.rst | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
index 009947e39141..1c1de77feb65 100644
--- a/docs/system/i386/hyperv.rst
+++ b/docs/system/i386/hyperv.rst
@@ -283,6 +283,36 @@ Supplementary features
   feature alters this behavior and only allows the guest to use exposed Hyper-V
   enlightenments.
 
+Recommendations
+---------------
+
+To achieve the best performance of Windows and Hyper-V guests and unless there
+are any specific requirements (e.g. migration to older QEMU/KVM versions,
+emulating specific Hyper-V version, ...), it is recommended to enable all
+currently implemented Hyper-V enlightenments with the following exceptions:
+
+- ``hv-syndbg``, ``hv-passthrough``, ``hv-enforce-cpuid`` should not be enabled
+  in production configurations as these are debugging/development features.
+- ``hv-reset`` can be avoided as modern Hyper-V versions don't expose it.
+- ``hv-evmcs`` can (and should) be enabled on Intel CPUs only. While the feature
+  is only used in nested configurations (Hyper-V, WSL2), enabling it for regular
+  Windows guests should not have any negative effects.
+- ``hv-no-nonarch-coresharing`` must only be enabled if vCPUs are properly pinned
+  so no non-architectural core sharing is possible.
+- ``hv-vendor-id``, ``hv-version-id-build``, ``hv-version-id-major``,
+  ``hv-version-id-minor``, ``hv-version-id-spack``, ``hv-version-id-sbranch``,
+  ``hv-version-id-snumber`` can be left unchanged, guests are not supposed to
+  behave differently when different Hyper-V version is presented to them.
+- ``hv-crash`` must only be enabled if the crash information is consumed via
+  QAPI by higher levels of the virtualization stack. Enabling this feature
+  effectively prevents Windows from creating dumps upon crashes.
+- ``hv-reenlightenment`` can only be used on hardware which supports TSC
+  scaling or when guest migration is not needed.
+- ``hv-spinlocks`` should be set to e.g. 0xfff when host CPUs are overcommitted
+  (meaning there are other scheduled tasks or guests) and can be left unchanged
+  from the default value (0xffffffff) otherwise.
+- ``hv-avic``/``hv-apicv`` should not be enabled if the hardware does not
+  support APIC virtualization (Intel APICv, AMD AVIC).
 
 Useful links
 ------------
-- 
2.43.2

[PATCH RESEND v3 1/3] i386: Fix conditional CONFIG_SYNDBG enablement

2024-03-05 Thread Vitaly Kuznetsov

Putting HYPERV_FEAT_SYNDBG entry under "#ifdef CONFIG_SYNDBG" in
'kvm_hyperv_properties' array is wrong: as HYPERV_FEAT_SYNDBG is not
the highest feature number, the result is an empty (zeroed) entry in
the array (and not a skipped entry!). hyperv_feature_supported() is
designed to check that all CPUID bits are set but for a zeroed
feature in 'kvm_hyperv_properties' it returns 'true' so QEMU considers
HYPERV_FEAT_SYNDBG as always supported, regardless of whether KVM host
actually supports it.

To fix the issue, leave HYPERV_FEAT_SYNDBG's definition in
'kvm_hyperv_properties' array, there's nothing wrong in having it defined
even when 'CONFIG_SYNDBG' is not set. Instead, put "hv-syndbg" CPU property
under '#ifdef CONFIG_SYNDBG' to alter the existing behavior when the flag
is silently skipped in !CONFIG_SYNDBG builds.

Leave an 'assert' sentinel in hyperv_feature_supported() making sure there
are no 'holes' or improperly defined features in 'kvm_hyperv_properties'.

Fixes: d8701185f40c ("hw: hyperv: Initial commit for Synthetic Debugging device")
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c |  2 ++
 target/i386/kvm/kvm.c | 11 +++
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2666ef380891..64ce7c4c8242 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7866,8 +7866,10 @@ static Property x86_cpu_properties[] = {
                       HYPERV_FEAT_TLBFLUSH_DIRECT, 0),
     DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
                             hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
+#ifdef CONFIG_SYNDBG
     DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
                       HYPERV_FEAT_SYNDBG, 0),
+#endif
     DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
     DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 42970ab046fa..f067e35d35b1 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -945,7 +945,6 @@ static struct {
              .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
         }
     },
-#ifdef CONFIG_SYNDBG
     [HYPERV_FEAT_SYNDBG] = {
         .desc = "Enable synthetic kernel debugger channel (hv-syndbg)",
         .flags = {
@@ -954,7 +953,6 @@ static struct {
         },
         .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
     },
-#endif
     [HYPERV_FEAT_MSR_BITMAP] = {
         .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
         .flags = {
@@ -1206,6 +1204,13 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
     uint32_t func, bits;
     int i, reg;
 
+    /*
+     * kvm_hyperv_properties needs to define at least one CPUID flag which
+     * must be used to detect the feature, it's hard to say whether it is
+     * supported or not otherwise.
+     */
+    assert(kvm_hyperv_properties[feature].flags[0].func);
+
     for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
 
         func = kvm_hyperv_properties[feature].flags[i].func;
@@ -3388,13 +3393,11 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
                 kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
                                   env->msr_hv_tsc_emulation_status);
             }
-#ifdef CONFIG_SYNDBG
             if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNDBG) &&
                 has_msr_hv_syndbg_options) {
                 kvm_msr_entry_add(cpu, HV_X64_MSR_SYNDBG_OPTIONS,
                                   hyperv_syndbg_query_options());
             }
-#endif
         }
         if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) {
             kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE,
-- 
2.43.2
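
The "zeroed entry, not a skipped entry" problem described in patch 1/3 can be reproduced outside QEMU. The sketch below is only an illustration under stated assumptions: the 'toy_' names, the three-entry table and toy_host_bits() (which pretends the host reports only bit 0) are hypothetical, and the support check merely mirrors the "all listed CPUID bits must be set" logic the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical feature numbers; TOY_FEAT_DBG plays the role of HYPERV_FEAT_SYNDBG. */
enum { TOY_FEAT_FIRST, TOY_FEAT_DBG, TOY_FEAT_LAST };

static const struct {
    const char *desc;
    struct {
        uint32_t func;   /* CPUID leaf, 0 means "slot unused" */
        uint32_t bits;   /* bits that must all be set in that leaf */
    } flags[2];
} toy_properties[] = {
    [TOY_FEAT_FIRST] = {
        .desc = "first",
        .flags = { { .func = 0x40000003, .bits = 0x1 } },
    },
    /* [TOY_FEAT_DBG] is left out on purpose, as if an #ifdef had removed it. */
    [TOY_FEAT_LAST] = {
        .desc = "last",
        .flags = { { .func = 0x40000003, .bits = 0x4 } },
    },
};

/* Stand-in for querying the host: only bit 0 is reported for any leaf. */
static uint32_t toy_host_bits(uint32_t func)
{
    (void)func;
    return 0x1;
}

/* Every listed bit must be present; unused (zero) slots are skipped. */
static bool toy_feature_supported(int feat)
{
    for (size_t i = 0; i < ARRAY_SIZE(toy_properties[feat].flags); i++) {
        uint32_t func = toy_properties[feat].flags[i].func;
        uint32_t bits = toy_properties[feat].flags[i].bits;

        if (!func) {
            continue;
        }
        if ((toy_host_bits(func) & bits) != bits) {
            return false;
        }
    }
    return true; /* a fully zeroed entry always ends up here */
}

/* The sentinel from the patch, written as a predicate instead of assert(). */
static bool toy_feature_defined(int feat)
{
    return toy_properties[feat].flags[0].func != 0;
}
```

Because TOY_FEAT_DBG is not the highest index, the array still has three elements and the omitted one is zero-initialized, so toy_feature_supported(TOY_FEAT_DBG) reports 'true' without consulting the host at all. The first-slot check is what distinguishes such a hole from a real entry.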

Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()

2024-01-16 Thread Vitaly Kuznetsov

As I'm the addressee of the ping for some reason ... :-)

the fix looks good to me but I'm not sure about all the consequences of
moving kvm_put_vcpu_events() to an earlier stage. Max, Paolo, please
take a look!

Eiichi Tsukata  writes:

> Ping.
>
>> On Nov 8, 2023, at 10:12, Eiichi Tsukata  wrote:
>> 
>> Hi all, I'd appreciate any comments or feedback on the patch.
>> 
>> Thanks,
>> Eiichi
>> 
>>> On Nov 1, 2023, at 23:04, Vitaly Kuznetsov  wrote:
>>> 
>>> Eiichi Tsukata  writes:
>>> 
>>>> FYI: The EINVAL in vmx_set_nested_state() is caused by the following 
>>>> condition:
>>>> * vcpu->arch.hflags == 0
>>>> * kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON
>>> 
>>> This is a weird state indeed,
>>> 
>>> 'vcpu->arch.hflags == 0' means we're not in SMM and not in guest mode
>>> but kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON is a
>>> reflection of vmx->nested.smm.vmxon (see
>>> vmx_get_nested_state()). vmx->nested.smm.vmxon gets set (conditionally)
>>> in vmx_enter_smm() and gets cleared in vmx_leave_smm() which means the
>>> vCPU must be in SMM to have it set.
>>> 
>>> In case the vCPU is in SMM upon migration, HF_SMM_MASK must be set from
>>> kvm_vcpu_ioctl_x86_set_vcpu_events() -> kvm_smm_changed() but QEMU's
>>> kvm_arch_put_registers() calls kvm_put_nested_state() _before_
>>> kvm_put_vcpu_events(). This can explain "vcpu->arch.hflags == 0".
>>> 
>>> Paolo, Max, any idea how this is supposed to work?
>>> 
>>> -- 
>>> Vitaly
>>> 
>> 
>

-- 
Vitaly

[PATCH] docs/system: Add recommendations to Hyper-V enlightenments doc

2023-11-15 Thread Vitaly Kuznetsov

While hyperv.rst already has all currently implemented Hyper-V
enlightenments documented, it may be unclear what is the recommended set to
achieve the best result. Add the corresponding section to the doc.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/system/i386/hyperv.rst | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
index 2505dc4c86e0..1c7c4a3981ea 100644
--- a/docs/system/i386/hyperv.rst
+++ b/docs/system/i386/hyperv.rst
@@ -278,6 +278,36 @@ Supplementary features
   feature alters this behavior and only allows the guest to use exposed Hyper-V
   enlightenments.
 
+Recommendations
+---------------
+
+To achieve the best performance of Windows and Hyper-V guests and unless there
+are any specific requirements (e.g. migration to older QEMU/KVM versions,
+emulating specific Hyper-V version, ...), it is recommended to enable all
+currently implemented Hyper-V enlightenments with the following exceptions:
+
+- ``hv-syndbg``, ``hv-passthrough``, ``hv-enforce-cpuid`` should not be enabled
+  in production configurations as these are debugging/development features.
+- ``hv-reset`` can be avoided as modern Hyper-V versions don't expose it.
+- ``hv-evmcs`` can (and should) be enabled on Intel CPUs only. While the feature
+  is only used in nested configurations (Hyper-V, WSL2), enabling it for regular
+  Windows guests should not have any negative effects.
+- ``hv-no-nonarch-coresharing`` must only be enabled if vCPUs are properly pinned
+  so no non-architectural core sharing is possible.
+- ``hv-vendor-id``, ``hv-version-id-build``, ``hv-version-id-major``,
+  ``hv-version-id-minor``, ``hv-version-id-spack``, ``hv-version-id-sbranch``,
+  ``hv-version-id-snumber`` can be left unchanged, guests are not supposed to
+  behave differently when different Hyper-V version is presented to them.
+- ``hv-crash`` must only be enabled if the crash information is consumed via
+  QAPI by higher levels of the virtualization stack. Enabling this feature
+  effectively prevents Windows from creating dumps upon crashes.
+- ``hv-reenlightenment`` can only be used on hardware which supports TSC
+  scaling or when guest migration is not needed.
+- ``hv-spinlocks`` should be set to e.g. 0xfff when host CPUs are overcommitted
+  (meaning there are other scheduled tasks or guests) and can be left unchanged
+  from the default value (0xffffffff) otherwise.
+- ``hv-avic``/``hv-apicv`` should not be enabled if the hardware does not
+  support APIC virtualization (Intel APICv, AMD AVIC).
 
 Useful links
 ------------
-- 
2.41.0

[PATCH RESEND v2 2/2] i386: Exclude 'hv-syndbg' from 'hv-passthrough'

2023-11-15 Thread Vitaly Kuznetsov

Windows with Hyper-V role enabled doesn't boot with 'hv-passthrough' when
no debugger is configured, this significantly limits the usefulness of the
feature as there's no support for subtracting Hyper-V features from CPU
flags at this moment (e.g. "-cpu host,hv-passthrough,-hv-syndbg" does not
work). While this is also theoretically fixable, 'hv-syndbg' is likely
very special and unneeded in the default set. Genuine Hyper-V doesn't seem
to enable it either.

Introduce 'skip_passthrough' flag to 'kvm_hyperv_properties' and use it as
one-off to skip 'hv-syndbg' when enabling features in 'hv-passthrough'
mode. Note, "-cpu host,hv-passthrough,hv-syndbg" can still be used if
needed.

As both 'hv-passthrough' and 'hv-syndbg' are debug features, the change
should not have any effect on production environments.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/system/i386/hyperv.rst | 13 +
 target/i386/kvm/kvm.c   |  7 +--
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
index 2505dc4c86e0..009947e39141 100644
--- a/docs/system/i386/hyperv.rst
+++ b/docs/system/i386/hyperv.rst
@@ -262,14 +262,19 @@ Supplementary features
 ``hv-passthrough``
   In some cases (e.g. during development) it may make sense to use QEMU in
   'pass-through' mode and give Windows guests all enlightenments currently
-  supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
-  flag.
+  supported by KVM.
 
   Note: ``hv-passthrough`` flag only enables enlightenments which are known to QEMU
   (have corresponding 'hv-' flag) and copies ``hv-spinlocks`` and ``hv-vendor-id``
   values from KVM to QEMU. ``hv-passthrough`` overrides all other 'hv-' settings on
-  the command line. Also, enabling this flag effectively prevents migration as the
-  list of enabled enlightenments may differ between target and destination hosts.
+  the command line.
+
+  Note: ``hv-passthrough`` does not enable ``hv-syndbg`` which can prevent certain
+  Windows guests from booting when used without proper configuration. If needed,
+  ``hv-syndbg`` can be enabled additionally.
+
+  Note: ``hv-passthrough`` effectively prevents migration as the list of enabled
+  enlightenments may differ between target and destination hosts.
 
 ``hv-enforce-cpuid``
   By default, KVM allows the guest to use all currently supported Hyper-V
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 2fcb1f6673d8..0c745562b667 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -823,6 +823,7 @@ static struct {
         uint32_t bits;
     } flags[2];
     uint64_t dependencies;
+    bool skip_passthrough;
 } kvm_hyperv_properties[] = {
     [HYPERV_FEAT_RELAXED] = {
         .desc = "relaxed timing (hv-relaxed)",
@@ -951,7 +952,8 @@ static struct {
             {.func = HV_CPUID_FEATURES, .reg = R_EDX,
              .bits = HV_FEATURE_DEBUG_MSRS_AVAILABLE}
         },
-        .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
+        .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED),
+        .skip_passthrough = true,
     },
     [HYPERV_FEAT_MSR_BITMAP] = {
         .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
@@ -1360,7 +1362,8 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
          * hv_build_cpuid_leaf() uses this info to build guest CPUIDs.
          */
         for (feat = 0; feat < ARRAY_SIZE(kvm_hyperv_properties); feat++) {
-            if (hyperv_feature_supported(cs, feat)) {
+            if (hyperv_feature_supported(cs, feat) &&
+                !kvm_hyperv_properties[feat].skip_passthrough) {
                 cpu->hyperv_features |= BIT(feat);
             }
         }
-- 
2.41.0

[PATCH RESEND v2 0/2] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2023-11-15 Thread Vitaly Kuznetsov

Changes since v1/v1 RESEND:
- No changes.

Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
used. While 'hv-passthrough' is a debug only feature, this significantly
limits its usefulness. While debugging the problem, I found that there are
two loosely connected issues:
- 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
- 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.

Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.

Vitaly Kuznetsov (2):
  i386: Fix conditional CONFIG_SYNDBG enablement
  i386: Exclude 'hv-syndbg' from 'hv-passthrough'

 docs/system/i386/hyperv.rst | 13 +
 target/i386/cpu.c   |  2 ++
 target/i386/kvm/kvm.c   | 18 --
 3 files changed, 23 insertions(+), 10 deletions(-)

-- 
2.41.0

[PATCH RESEND v2 1/2] i386: Fix conditional CONFIG_SYNDBG enablement

2023-11-15 Thread Vitaly Kuznetsov

Putting HYPERV_FEAT_SYNDBG entry under "#ifdef CONFIG_SYNDBG" in
'kvm_hyperv_properties' array is wrong: as HYPERV_FEAT_SYNDBG is not
the highest feature number, the result is an empty (zeroed) entry in
the array (and not a skipped entry!). hyperv_feature_supported() is
designed to check that all CPUID bits are set but for a zeroed
feature in 'kvm_hyperv_properties' it returns 'true' so QEMU considers
HYPERV_FEAT_SYNDBG as always supported, regardless of whether KVM host
actually supports it.

To fix the issue, leave HYPERV_FEAT_SYNDBG's definition in
'kvm_hyperv_properties' array, there's nothing wrong in having it defined
even when 'CONFIG_SYNDBG' is not set. Instead, put "hv-syndbg" CPU property
under '#ifdef CONFIG_SYNDBG' to alter the existing behavior when the flag
is silently skipped in !CONFIG_SYNDBG builds.

Leave an 'assert' sentinel in hyperv_feature_supported() making sure there
are no 'holes' or improperly defined features in 'kvm_hyperv_properties'.

Fixes: d8701185f40c ("hw: hyperv: Initial commit for Synthetic Debugging device")
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c |  2 ++
 target/i386/kvm/kvm.c | 11 +++
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 358d9c0a655a..f5fac3744173 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7842,8 +7842,10 @@ static Property x86_cpu_properties[] = {
                       HYPERV_FEAT_TLBFLUSH_DIRECT, 0),
     DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
                             hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
+#ifdef CONFIG_SYNDBG
     DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
                       HYPERV_FEAT_SYNDBG, 0),
+#endif
     DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
     DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 11b8177eff21..2fcb1f6673d8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -945,7 +945,6 @@ static struct {
              .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
         }
     },
-#ifdef CONFIG_SYNDBG
     [HYPERV_FEAT_SYNDBG] = {
         .desc = "Enable synthetic kernel debugger channel (hv-syndbg)",
         .flags = {
@@ -954,7 +953,6 @@ static struct {
         },
         .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
     },
-#endif
     [HYPERV_FEAT_MSR_BITMAP] = {
         .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
         .flags = {
@@ -1206,6 +1204,13 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
     uint32_t func, bits;
     int i, reg;
 
+    /*
+     * kvm_hyperv_properties needs to define at least one CPUID flag which
+     * must be used to detect the feature, it's hard to say whether it is
+     * supported or not otherwise.
+     */
+    assert(kvm_hyperv_properties[feature].flags[0].func);
+
     for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
 
         func = kvm_hyperv_properties[feature].flags[i].func;
@@ -3391,13 +3396,11 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
                 kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
                                   env->msr_hv_tsc_emulation_status);
             }
-#ifdef CONFIG_SYNDBG
             if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNDBG) &&
                 has_msr_hv_syndbg_options) {
                 kvm_msr_entry_add(cpu, HV_X64_MSR_SYNDBG_OPTIONS,
                                   hyperv_syndbg_query_options());
             }
-#endif
         }
         if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) {
             kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE,
-- 
2.41.0

Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()

2023-11-01 Thread Vitaly Kuznetsov

Eiichi Tsukata  writes:

> FYI: The EINVAL in vmx_set_nested_state() is caused by the following 
> condition:
> * vcpu->arch.hflags == 0
> * kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON

This is a weird state indeed,

'vcpu->arch.hflags == 0' means we're not in SMM and not in guest mode
but kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON is a
reflection of vmx->nested.smm.vmxon (see
vmx_get_nested_state()). vmx->nested.smm.vmxon gets set (conditionally)
in vmx_enter_smm() and gets cleared in vmx_leave_smm() which means the
vCPU must be in SMM to have it set.

In case the vCPU is in SMM upon migration, HF_SMM_MASK must be set from
kvm_vcpu_ioctl_x86_set_vcpu_events() -> kvm_smm_changed() but QEMU's
kvm_arch_put_registers() calls kvm_put_nested_state() _before_
kvm_put_vcpu_events(). This can explain "vcpu->arch.hflags == 0".

Paolo, Max, any idea how this is supposed to work?

-- 
Vitaly
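
To make the ordering argument above concrete, here is a toy model of the restore sequence. It is a sketch, not kernel or QEMU code: all 'toy_' names and the flag values are made up for illustration, and the only part taken from the thread is the shape of the check in vmx_set_nested_state() (in SMM, guest-mode/run-pending must be clear; outside SMM, the SMM flags must be zero).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real constants come from the KVM headers. */
#define TOY_HF_SMM              0x1u
#define TOY_NESTED_GUEST_MODE   0x1u
#define TOY_NESTED_RUN_PENDING  0x2u
#define TOY_NESTED_SMM_VMXON    0x2u

struct toy_vcpu {
    uint32_t hflags;
};

struct toy_nested_state {
    uint32_t flags;
    uint32_t smm_flags;
};

/* Same condition as the check quoted from vmx_set_nested_state(). */
static int toy_set_nested_state(const struct toy_vcpu *vcpu,
                                const struct toy_nested_state *state)
{
    bool in_smm = (vcpu->hflags & TOY_HF_SMM) != 0;
    uint32_t bad = in_smm
        ? (state->flags & (TOY_NESTED_GUEST_MODE | TOY_NESTED_RUN_PENDING))
        : state->smm_flags;

    return bad ? -EINVAL : 0;
}

/* Models the "set vcpu events" step, which is what marks the vCPU as in SMM. */
static void toy_set_vcpu_events(struct toy_vcpu *vcpu, bool smm)
{
    if (smm) {
        vcpu->hflags |= TOY_HF_SMM;
    } else {
        vcpu->hflags &= ~TOY_HF_SMM;
    }
}

/* Restore a vCPU that was migrated while in SMM with vmxon recorded. */
static int toy_restore(bool events_first)
{
    struct toy_vcpu vcpu = { .hflags = 0 };
    const struct toy_nested_state state = {
        .flags = 0,
        .smm_flags = TOY_NESTED_SMM_VMXON,
    };
    int ret;

    if (events_first) {
        toy_set_vcpu_events(&vcpu, true);
    }
    ret = toy_set_nested_state(&vcpu, &state);
    if (!events_first) {
        toy_set_vcpu_events(&vcpu, true);
    }
    return ret;
}
```

In this model, restoring the nested state first fails with -EINVAL because hflags is still 0 when the check runs, while restoring the vCPU events first lets the same state pass.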

Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()

2023-10-26 Thread Vitaly Kuznetsov

Cc'ing Max :-) At first glance the condition in vmx_set_nested_state()
is correct so I guess we either have a stale
KVM_STATE_NESTED_RUN_PENDING when in SMM or stale smm.flags when outside
of it...

Philippe Mathieu-Daudé  writes:

> Cc'ing Vitaly.
>
> On 26/10/23 07:49, Eiichi Tsukata wrote:
>> Hi all,
>> 
>> Here are additional details on the issue.
>> 
>> We've found this issue when testing Windows Virtual Secure Mode (VSM) VMs.
>> We sometimes saw live migration failures of VSM-enabled VMs. It turned
>> out that the issue happens during live migration when VMs change boot related
>> EFI variables (ex: BootOrder, Boot0001).
>> After some debugging, I've found the race I mentioned in the commit message.
>> 
>> Symptom
>> =======
>> 
>> When it happens with the latest QEMU, which has commit
>> https://github.com/qemu/qemu/commit/7191f24c7fcfbc1216d09
>> QEMU shows the following error message on the destination.
>> 
>>qemu-system-x86_64: Failed to put registers after init: Invalid argument
>> 
>> If it happens with an older QEMU which doesn't have the commit, then we see
>> a CPU dump like this:
>> 
>>KVM internal error. Suberror: 3
>>extra data[0]: 0x8b0e
>>extra data[1]: 0x0031
>>extra data[2]: 0x0683
>>extra data[3]: 0x7f809000
>>extra data[4]: 0x0026
>>RAX= RBX= RCX= 
>> RDX=0f61
>>RSI= RDI= RBP= 
>> RSP=
>>R8 = R9 = R10= 
>> R11=
>>R12= R13= R14= 
>> R15=
>>RIP=fff0 RFL=00010002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
>>ES =0020   00c09300 DPL=0 DS   [-WA]
>>CS =0038   00a09b00 DPL=0 CS64 [-RA]
>>SS =0020   00c09300 DPL=0 DS   [-WA]
>>DS =0020   00c09300 DPL=0 DS   [-WA]
>>FS =0020   00c09300 DPL=0 DS   [-WA]
>>GS =0020   00c09300 DPL=0 DS   [-WA]
>>LDT=   00c0
>>TR =0040 7f7df050 00068fff 00808b00 DPL=0 TSS64-busy
>>GDT= 7f7df000 004f
>>IDT= 7f836000 01ff
>>CR0=80010033 CR2=fff0 CR3=7f809000 CR4=0668
>>DR0= DR1= DR2= 
>> DR3=DR6=0ff0 DR7=0400
>>EFER=0d00
>>Code=?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??  ?? 
>> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? 
>> ?? ?? ??
>> 
>> In the above dump, CR3 is pointing to SMRAM region though SMM=0.
>> 
>> Repro
>> =====
>> 
>> Repro step is pretty simple.
>> 
>> * Run SMM enabled Linux guest with secure boot enabled OVMF.
>> * Run the following script in the guest.
>> 
>>/usr/libexec/qemu-kvm &
>>while true
>>do
>>  efibootmgr -n 1
>>done
>> 
>> * Do live migration
>> 
>> In my environment, live migration fails in 20% of attempts.
>> 
>> VMX specific
>> ============
>> 
>> This issue is VMX specific and SVM is not affected as the validation
>> in svm_set_nested_state() is a bit different from the VMX one.
>> 
>> VMX:
>> 
>>    static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>>                                    struct kvm_nested_state __user *user_kvm_nested_state,
>>                                    struct kvm_nested_state *kvm_state)
>>    {
>>    ..
>>        /*
>>         * SMM temporarily disables VMX, so we cannot be in guest mode,
>>         * nor can VMLAUNCH/VMRESUME be pending.  Outside SMM, SMM flags
>>         * must be zero.
>>         */
>>        if (is_smm(vcpu) ?
>>            (kvm_state->flags &
>>             (KVM_STATE_NESTED_GUEST_MODE | KVM_STATE_NESTED_RUN_PENDING))
>>            : kvm_state->hdr.vmx.smm.flags)
>>            return -EINVAL;
>>    ..
>> 
>> SVM:
>> 
>>    static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>>                                    struct kvm_nested_state __user *user_kvm_nested_state,
>>                                    struct kvm_nested_state *kvm_state)
>>    {
>>    ..
>>        /* SMM temporarily disables SVM, so we cannot be in guest mode.  */
>>        if (is_smm(vcpu) && (kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE))
>>            return -EINVAL;
>>    ..
>> 
>> Thanks,
>> 
>> Eiichi
>> 
>>> On Oct 26, 2023, at 14:42, Eiichi Tsukata  
>>> wrote:
>>>
>>> kvm_put_vcpu_events() needs to be called before kvm_put_nested_state()
>>> because the vCPU's hflags are checked by KVM's vmx_set_nested_state()
>>> validation. Otherwise kvm_put_nested_state() can fail with -EINVAL when
>>> a vCPU is in VMX operation and enters SMM mode. This

[PATCH RESEND 1/2] i386: Fix conditional CONFIG_SYNDBG enablement

2023-09-22 Thread Vitaly Kuznetsov

Putting HYPERV_FEAT_SYNDBG entry under "#ifdef CONFIG_SYNDBG" in
'kvm_hyperv_properties' array is wrong: as HYPERV_FEAT_SYNDBG is not
the highest feature number, the result is an empty (zeroed) entry in
the array (and not a skipped entry!). hyperv_feature_supported() is
designed to check that all CPUID bits are set but for a zeroed
feature in 'kvm_hyperv_properties' it returns 'true' so QEMU considers
HYPERV_FEAT_SYNDBG as always supported, regardless of whether KVM host
actually supports it.
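
The effect can be reproduced outside QEMU with a small stand-alone sketch. Everything here is illustrative: the feature names, the 'props' table and host_cpuid() are made up for the example, and FEAT_B merely plays the role of HYPERV_FEAT_SYNDBG with its initializer compiled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature numbers; FEAT_B is the entry that gets compiled out. */
enum { FEAT_A, FEAT_B, FEAT_C };

struct feat_prop {
    const char *desc;
    struct {
        uint32_t func;
        uint32_t bits;
    } flags[2];
};

/*
 * FEAT_B has no initializer here. Because FEAT_C has a higher index, the
 * FEAT_B slot still exists in the array and is zero-initialized.
 */
static const struct feat_prop props[] = {
    [FEAT_A] = { .desc = "feature A",
                 .flags = { { .func = 0x40000003, .bits = 1u << 0 } } },
    [FEAT_C] = { .desc = "feature C",
                 .flags = { { .func = 0x40000003, .bits = 1u << 2 } } },
};

/* Number of slots in the table, including the zeroed one. */
size_t props_len(void)
{
    return sizeof(props) / sizeof(props[0]);
}

/* Stand-in for the host CPUID data: only bit 0 of the leaf is set. */
static uint32_t host_cpuid(uint32_t func)
{
    return func == 0x40000003 ? 1u << 0 : 0;
}

/* Without a sentinel: a zeroed entry has nothing to check, so it passes. */
bool feature_supported_unchecked(int feat)
{
    size_t n = sizeof(props[feat].flags) / sizeof(props[feat].flags[0]);

    for (size_t i = 0; i < n; i++) {
        uint32_t func = props[feat].flags[i].func;
        uint32_t bits = props[feat].flags[i].bits;

        if (!func) {
            continue;
        }
        if ((host_cpuid(func) & bits) != bits) {
            return false;
        }
    }
    return true;
}

/* With a sentinel: an entry without any CPUID flag is a programming error. */
bool feature_supported_checked(int feat)
{
    assert(props[feat].flags[0].func);
    return feature_supported_unchecked(feat);
}
```

In this sketch feature_supported_unchecked(FEAT_B) returns true even though the host reports nothing for it, while feature_supported_checked(FEAT_B) would trip the assert instead.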

To fix the issue, leave HYPERV_FEAT_SYNDBG's definition in the
'kvm_hyperv_properties' array; there's nothing wrong with having it defined
even when 'CONFIG_SYNDBG' is not set. Instead, put the "hv-syndbg" CPU property
under '#ifdef CONFIG_SYNDBG' to alter the existing behavior where the flag
is silently skipped in !CONFIG_SYNDBG builds.

Leave an 'assert' sentinel in hyperv_feature_supported() making sure there
are no 'holes' or improperly defined features in 'kvm_hyperv_properties'.

Fixes: d8701185f40c ("hw: hyperv: Initial commit for Synthetic Debugging device")
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c |  2 ++
 target/i386/kvm/kvm.c | 11 +++
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2589c8e9294a..01c7e8414408 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7840,8 +7840,10 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_TLBFLUSH_DIRECT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
+#ifdef CONFIG_SYNDBG
 DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
   HYPERV_FEAT_SYNDBG, 0),
+#endif
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
 DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index af101fcdf6ff..51b381a2fbbc 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -993,7 +993,6 @@ static struct {
  .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
 }
 },
-#ifdef CONFIG_SYNDBG
 [HYPERV_FEAT_SYNDBG] = {
 .desc = "Enable synthetic kernel debugger channel (hv-syndbg)",
 .flags = {
@@ -1002,7 +1001,6 @@ static struct {
 },
 .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
 },
-#endif
 [HYPERV_FEAT_MSR_BITMAP] = {
 .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
 .flags = {
@@ -1254,6 +1252,13 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
 uint32_t func, bits;
 int i, reg;
 
+/*
+ * kvm_hyperv_properties needs to define at least one CPUID flag which
+ * must be used to detect the feature, it's hard to say whether it is
+ * supported or not otherwise.
+ */
+assert(kvm_hyperv_properties[feature].flags[0].func);
+
 for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
 
 func = kvm_hyperv_properties[feature].flags[i].func;
@@ -3483,13 +3488,11 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
 kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
   env->msr_hv_tsc_emulation_status);
 }
-#ifdef CONFIG_SYNDBG
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNDBG) &&
 has_msr_hv_syndbg_options) {
 kvm_msr_entry_add(cpu, HV_X64_MSR_SYNDBG_OPTIONS,
   hyperv_syndbg_query_options());
 }
-#endif
 }
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) {
 kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE,
-- 
2.41.0

[PATCH RESEND 0/2] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2023-09-22 Thread Vitaly Kuznetsov

Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
used. While 'hv-passthrough' is a debug-only feature, this significantly
limits its usefulness. While debugging the problem, I found that there are
two loosely connected issues:
- 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
- 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.

Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.

Vitaly Kuznetsov (2):
  i386: Fix conditional CONFIG_SYNDBG enablement
  i386: Exclude 'hv-syndbg' from 'hv-passthrough'

 docs/system/i386/hyperv.rst | 13 +
 target/i386/cpu.c   |  2 ++
 target/i386/kvm/kvm.c   | 18 --
 3 files changed, 23 insertions(+), 10 deletions(-)

-- 
2.41.0

[PATCH RESEND 2/2] i386: Exclude 'hv-syndbg' from 'hv-passthrough'

2023-09-22 Thread Vitaly Kuznetsov

Windows with the Hyper-V role enabled doesn't boot with 'hv-passthrough' when
no debugger is configured. This significantly limits the usefulness of the
feature as there's no support for subtracting Hyper-V features from CPU
flags at this moment (e.g. "-cpu host,hv-passthrough,-hv-syndbg" does not
work). While this is also theoretically fixable, 'hv-syndbg' is likely
very special and unneeded in the default set. Genuine Hyper-V doesn't seem
to enable it either.

Introduce a 'skip_passthrough' flag in 'kvm_hyperv_properties' and use it as
a one-off to skip 'hv-syndbg' when enabling features in 'hv-passthrough'
mode. Note, "-cpu host,hv-passthrough,hv-syndbg" can still be used if
needed.
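
The intended behavior can be sketched in isolation as follows. This is a simplified model, not the QEMU code: the feature table, the 'host_supported' field (standing in for the CPUID-based support check) and expand_features() are all hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature numbers for the sketch. */
enum { FEAT_RELAXED, FEAT_SYNIC, FEAT_SYNDBG, FEAT_COUNT };

struct feat_prop {
    const char *name;
    bool host_supported;    /* stand-in for the CPUID-based support check */
    bool skip_passthrough;  /* never enabled implicitly by pass-through */
};

static const struct feat_prop props[FEAT_COUNT] = {
    [FEAT_RELAXED] = { .name = "hv-relaxed", .host_supported = true },
    [FEAT_SYNIC]   = { .name = "hv-synic",   .host_supported = true },
    [FEAT_SYNDBG]  = { .name = "hv-syndbg",  .host_supported = true,
                       .skip_passthrough = true },
};

/*
 * Build the enabled-feature bitmap. 'requested' holds features the user
 * asked for explicitly; pass-through adds every supported feature that is
 * not marked skip_passthrough on top of that.
 */
uint64_t expand_features(bool passthrough, uint64_t requested)
{
    uint64_t features = requested;

    assert(FEAT_COUNT <= 64);  /* the bitmap must fit in 64 bits */

    if (passthrough) {
        for (int feat = 0; feat < FEAT_COUNT; feat++) {
            if (props[feat].host_supported &&
                !props[feat].skip_passthrough) {
                features |= UINT64_C(1) << feat;
            }
        }
    }
    return features;
}
```

With this model, pass-through alone yields hv-relaxed and hv-synic, while an explicit request for hv-syndbg survives because the loop only ever adds bits to what was requested.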

As both 'hv-passthrough' and 'hv-syndbg' are debug features, the change
should not have any effect on production environments.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/system/i386/hyperv.rst | 13 +
 target/i386/kvm/kvm.c   |  7 +--
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
index 2505dc4c86e0..009947e39141 100644
--- a/docs/system/i386/hyperv.rst
+++ b/docs/system/i386/hyperv.rst
@@ -262,14 +262,19 @@ Supplementary features
 ``hv-passthrough``
   In some cases (e.g. during development) it may make sense to use QEMU in
   'pass-through' mode and give Windows guests all enlightenments currently
-  supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
-  flag.
+  supported by KVM.
 
   Note: ``hv-passthrough`` flag only enables enlightenments which are known to QEMU
   (have corresponding 'hv-' flag) and copies ``hv-spinlocks`` and ``hv-vendor-id``
   values from KVM to QEMU. ``hv-passthrough`` overrides all other 'hv-' settings on
-  the command line. Also, enabling this flag effectively prevents migration as the
-  list of enabled enlightenments may differ between target and destination hosts.
+  the command line.
+
+  Note: ``hv-passthrough`` does not enable ``hv-syndbg`` which can prevent certain
+  Windows guests from booting when used without proper configuration. If needed,
+  ``hv-syndbg`` can be enabled additionally.
+
+  Note: ``hv-passthrough`` effectively prevents migration as the list of enabled
+  enlightenments may differ between target and destination hosts.
 
 ``hv-enforce-cpuid``
   By default, KVM allows the guest to use all currently supported Hyper-V
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 51b381a2fbbc..cfb24ba87df5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -871,6 +871,7 @@ static struct {
 uint32_t bits;
 } flags[2];
 uint64_t dependencies;
+bool skip_passthrough;
 } kvm_hyperv_properties[] = {
 [HYPERV_FEAT_RELAXED] = {
 .desc = "relaxed timing (hv-relaxed)",
@@ -999,7 +1000,8 @@ static struct {
 {.func = HV_CPUID_FEATURES, .reg = R_EDX,
  .bits = HV_FEATURE_DEBUG_MSRS_AVAILABLE}
 },
-.dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
+.dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED),
+.skip_passthrough = true,
 },
 [HYPERV_FEAT_MSR_BITMAP] = {
 .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
@@ -1408,7 +1410,8 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
  * hv_build_cpuid_leaf() uses this info to build guest CPUIDs.
  */
 for (feat = 0; feat < ARRAY_SIZE(kvm_hyperv_properties); feat++) {
-if (hyperv_feature_supported(cs, feat)) {
+if (hyperv_feature_supported(cs, feat) &&
+!kvm_hyperv_properties[feat].skip_passthrough) {
 cpu->hyperv_features |= BIT(feat);
 }
 }
-- 
2.41.0

Re: [PATCH 0/2] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2023-09-22 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> Vitaly Kuznetsov  writes:
>
>> Vitaly Kuznetsov  writes:
>>
>>> Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
>>> used. While 'hv-passthrough' is a debug-only feature, this significantly
>>> limits its usefulness. While debugging the problem, I found that there are
>>> two loosely connected issues:
>>> - 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
>>> - 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.
>>>
>>> Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
>>> turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.
>>>
>>> Vitaly Kuznetsov (2):
>>>   i386: Fix conditional CONFIG_SYNDBG enablement
>>>   i386: Exclude 'hv-syndbg' from 'hv-passthrough'
>>>
>>>  docs/system/i386/hyperv.rst | 13 +
>>>  target/i386/cpu.c   |  2 ++
>>>  target/i386/kvm/kvm.c   | 18 --
>>>  3 files changed, 23 insertions(+), 10 deletions(-)
>
> Monthly ping)

Turns out these patches were never merged and honestly I forgot about
them myself. Will resend shortly.

-- 
Vitaly

Re: [PATCH 0/2] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2023-07-28 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> Vitaly Kuznetsov  writes:
>
>> Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
>> used. While 'hv-passthrough' is a debug-only feature, this significantly
>> limits its usefulness. While debugging the problem, I found that there are
>> two loosely connected issues:
>> - 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
>> - 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.
>>
>> Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
>> turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.
>>
>> Vitaly Kuznetsov (2):
>>   i386: Fix conditional CONFIG_SYNDBG enablement
>>   i386: Exclude 'hv-syndbg' from 'hv-passthrough'
>>
>>  docs/system/i386/hyperv.rst | 13 +
>>  target/i386/cpu.c   |  2 ++
>>  target/i386/kvm/kvm.c   | 18 --
>>  3 files changed, 23 insertions(+), 10 deletions(-)

Monthly ping)

-- 
Vitaly

Re: [PATCH 0/2] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2023-06-27 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
> used. While 'hv-passthrough' is a debug-only feature, this significantly
> limits its usefulness. While debugging the problem, I found that there are
> two loosely connected issues:
> - 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
> - 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.
>
> Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
> turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.
>
> Vitaly Kuznetsov (2):
>   i386: Fix conditional CONFIG_SYNDBG enablement
>   i386: Exclude 'hv-syndbg' from 'hv-passthrough'
>
>  docs/system/i386/hyperv.rst | 13 +
>  target/i386/cpu.c   |  2 ++
>  target/i386/kvm/kvm.c   | 18 --
>  3 files changed, 23 insertions(+), 10 deletions(-)

Ping)

-- 
Vitaly

[PATCH 2/2] i386: Exclude 'hv-syndbg' from 'hv-passthrough'

2023-06-12 Thread Vitaly Kuznetsov

Windows with the Hyper-V role enabled doesn't boot with 'hv-passthrough' when
no debugger is configured. This significantly limits the usefulness of the
feature as there's no support for subtracting Hyper-V features from CPU
flags at this moment (e.g. "-cpu host,hv-passthrough,-hv-syndbg" does not
work). While this is also theoretically fixable, 'hv-syndbg' is likely
very special and unneeded in the default set. Genuine Hyper-V doesn't seem
to enable it either.

Introduce a 'skip_passthrough' flag in 'kvm_hyperv_properties' and use it as
a one-off to skip 'hv-syndbg' when enabling features in 'hv-passthrough'
mode. Note, "-cpu host,hv-passthrough,hv-syndbg" can still be used if
needed.

As both 'hv-passthrough' and 'hv-syndbg' are debug features, the change
should not have any effect on production environments.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/system/i386/hyperv.rst | 13 +
 target/i386/kvm/kvm.c   |  7 +--
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/system/i386/hyperv.rst b/docs/system/i386/hyperv.rst
index 2505dc4c86e0..009947e39141 100644
--- a/docs/system/i386/hyperv.rst
+++ b/docs/system/i386/hyperv.rst
@@ -262,14 +262,19 @@ Supplementary features
 ``hv-passthrough``
   In some cases (e.g. during development) it may make sense to use QEMU in
   'pass-through' mode and give Windows guests all enlightenments currently
-  supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
-  flag.
+  supported by KVM.
 
   Note: ``hv-passthrough`` flag only enables enlightenments which are known to QEMU
   (have corresponding 'hv-' flag) and copies ``hv-spinlocks`` and ``hv-vendor-id``
   values from KVM to QEMU. ``hv-passthrough`` overrides all other 'hv-' settings on
-  the command line. Also, enabling this flag effectively prevents migration as the
-  list of enabled enlightenments may differ between target and destination hosts.
+  the command line.
+
+  Note: ``hv-passthrough`` does not enable ``hv-syndbg`` which can prevent certain
+  Windows guests from booting when used without proper configuration. If needed,
+  ``hv-syndbg`` can be enabled additionally.
+
+  Note: ``hv-passthrough`` effectively prevents migration as the list of enabled
+  enlightenments may differ between target and destination hosts.
 
 ``hv-enforce-cpuid``
   By default, KVM allows the guest to use all currently supported Hyper-V
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 88c75f58f0a6..fbaaacf9877c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -867,6 +867,7 @@ static struct {
 uint32_t bits;
 } flags[2];
 uint64_t dependencies;
+bool skip_passthrough;
 } kvm_hyperv_properties[] = {
 [HYPERV_FEAT_RELAXED] = {
 .desc = "relaxed timing (hv-relaxed)",
@@ -995,7 +996,8 @@ static struct {
 {.func = HV_CPUID_FEATURES, .reg = R_EDX,
  .bits = HV_FEATURE_DEBUG_MSRS_AVAILABLE}
 },
-.dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
+.dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED),
+.skip_passthrough = true,
 },
 [HYPERV_FEAT_MSR_BITMAP] = {
 .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
@@ -1404,7 +1406,8 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
  * hv_build_cpuid_leaf() uses this info to build guest CPUIDs.
  */
 for (feat = 0; feat < ARRAY_SIZE(kvm_hyperv_properties); feat++) {
-if (hyperv_feature_supported(cs, feat)) {
+if (hyperv_feature_supported(cs, feat) &&
+!kvm_hyperv_properties[feat].skip_passthrough) {
 cpu->hyperv_features |= BIT(feat);
 }
 }
-- 
2.40.1

[PATCH 0/2] i386: Fix Hyper-V Gen1 guests stuck on boot with 'hv-passthrough'

2023-06-12 Thread Vitaly Kuznetsov

Hyper-V Gen1 guests are getting stuck on boot when 'hv-passthrough' is
used. While 'hv-passthrough' is a debug-only feature, this significantly
limits its usefulness. While debugging the problem, I found that there are
two loosely connected issues:
- 'hv-passthrough' enables 'hv-syndbg' and this is undesired.
- 'hv-syndbg's support by KVM is detected incorrectly when !CONFIG_SYNDBG.

Fix both issues: exclude 'hv-syndbg' from 'hv-passthrough' and don't allow
turning on 'hv-syndbg' for !CONFIG_SYNDBG builds.

Vitaly Kuznetsov (2):
  i386: Fix conditional CONFIG_SYNDBG enablement
  i386: Exclude 'hv-syndbg' from 'hv-passthrough'

 docs/system/i386/hyperv.rst | 13 +
 target/i386/cpu.c   |  2 ++
 target/i386/kvm/kvm.c   | 18 --
 3 files changed, 23 insertions(+), 10 deletions(-)

-- 
2.40.1

[PATCH 1/2] i386: Fix conditional CONFIG_SYNDBG enablement

2023-06-12 Thread Vitaly Kuznetsov

Putting HYPERV_FEAT_SYNDBG entry under "#ifdef CONFIG_SYNDBG" in
'kvm_hyperv_properties' array is wrong: as HYPERV_FEAT_SYNDBG is not
the highest feature number, the result is an empty (zeroed) entry in
the array (and not a skipped entry!). hyperv_feature_supported() is
designed to check that all CPUID bits are set but for a zeroed
feature in 'kvm_hyperv_properties' it returns 'true' so QEMU considers
HYPERV_FEAT_SYNDBG as always supported, regardless of whether KVM host
actually supports it.

To fix the issue, leave HYPERV_FEAT_SYNDBG's definition in the
'kvm_hyperv_properties' array; there's nothing wrong with having it defined
even when 'CONFIG_SYNDBG' is not set. Instead, put the "hv-syndbg" CPU property
under '#ifdef CONFIG_SYNDBG' to alter the existing behavior where the flag
is silently skipped in !CONFIG_SYNDBG builds.

Leave an 'assert' sentinel in hyperv_feature_supported() making sure there
are no 'holes' or improperly defined features in 'kvm_hyperv_properties'.

Fixes: d8701185f40c ("hw: hyperv: Initial commit for Synthetic Debugging device")
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c |  2 ++
 target/i386/kvm/kvm.c | 11 +++
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1242bd541a53..caa207849e9a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7564,8 +7564,10 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_TLBFLUSH_DIRECT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
+#ifdef CONFIG_SYNDBG
 DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
   HYPERV_FEAT_SYNDBG, 0),
+#endif
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
 DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index de531842f6b1..88c75f58f0a6 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -989,7 +989,6 @@ static struct {
  .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
 }
 },
-#ifdef CONFIG_SYNDBG
 [HYPERV_FEAT_SYNDBG] = {
 .desc = "Enable synthetic kernel debugger channel (hv-syndbg)",
 .flags = {
@@ -998,7 +997,6 @@ static struct {
 },
 .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
 },
-#endif
 [HYPERV_FEAT_MSR_BITMAP] = {
 .desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
 .flags = {
@@ -1250,6 +1248,13 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
 uint32_t func, bits;
 int i, reg;
 
+/*
+ * kvm_hyperv_properties needs to define at least one CPUID flag which
+ * must be used to detect the feature, it's hard to say whether it is
+ * supported or not otherwise.
+ */
+assert(kvm_hyperv_properties[feature].flags[0].func);
+
 for (i = 0; i < ARRAY_SIZE(kvm_hyperv_properties[feature].flags); i++) {
 
 func = kvm_hyperv_properties[feature].flags[i].func;
@@ -3474,13 +3479,11 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
 kvm_msr_entry_add(cpu, HV_X64_MSR_TSC_EMULATION_STATUS,
   env->msr_hv_tsc_emulation_status);
 }
-#ifdef CONFIG_SYNDBG
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNDBG) &&
 has_msr_hv_syndbg_options) {
 kvm_msr_entry_add(cpu, HV_X64_MSR_SYNDBG_OPTIONS,
   hyperv_syndbg_query_options());
 }
-#endif
 }
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) {
 kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE,
-- 
2.40.1

Re: Expose support for HyperV features via QMP

2023-02-09 Thread Vitaly Kuznetsov

Alex Bennée  writes:

> "manish.mishra"  writes:
>
>> Hi Everyone,
>>
>> Checking if there is any feedback on this.
>
> I've expanded the CC list to some relevant maintainers and people who
> have touched that code in case this was missed.
>
>> Thanks
>>
>> Manish Mishra
>>
>> On 31/01/23 8:17 pm, manish.mishra wrote:
>>
>>  Hi Everyone,
>>  I hope everyone is doing great. We wanted to check why we do not expose
>>  support for HyperV features in Qemu similar to what we do for normal CPU
>>  features via query-cpu-defs or cpu-model-expansion QMP commands. This
>>  support is required for live migration with HyperV features as hyperv
>>  passthrough is not an option. If users knew which features are supported
>>  by the source and the destination, the VM could be started with the
>>  intersection of features supported by both.
>>  If there is no specific reason for not doing this, does it make sense to
>>  add a new QMP command which exposes support (internally also validating
>>  with KVM or the KVM_GET_SUPPORTED_HV_CPUID ioctl) for HyperV features?
>>  Apologies in advance if I misunderstood something.
>>

Thanks for Ccing me. 

Hyper-V features should appear in QMP since

commit 071ce4b03becf9e2df6b758fde9609be8ddf56f1
Author: Vitaly Kuznetsov 
Date:   Tue Jun 8 14:08:13 2021 +0200

i386: expand Hyper-V features during CPU feature expansion time

also, the support for Hyper-V feature discovery was just added to
libvirt:

903ea9370d qemu_capabilities: Report Hyper-V Enlightenments in domcapabilities
10f4784864 qemu_capabilities: Query for Hyper-V Enlightenments
ff8731680b qemuMonitorJSONGetCPUModelExpansion: Introduce @hv_passthrough argument
7c12eb2397 qemuMonitorJSONMakeCPUModel: Introduce @hv_passthrough argument
7c1ecfd512 domain_capabilities: Expose Hyper-V Enlightenments
179e45d237 virDomainCapsEnumFormat: Retrun void
a7789d9324 virDomainCapsEnumFormat: Switch to virXMLFormatElement()

in case this is not enough, could you please elaborate on the use-case
you have in mind?

-- 
Vitaly

Re: [PATCH] target/i386/cpu: disable PERFCORE for AMD when cpu.pmu is off

2022-10-31 Thread Vitaly Kuznetsov

Liang Yan  writes:

> With cpu.pmu=off, perfctr_core could still be seen in an AMD guest cpuid.
> By further digging, I found cpu.perfctr_core did the trick. However,
> considering the 'enable_pmu' in KVM could work on both Intel and AMD,
> we may add AMD PMU control under 'enable_pmu' in QEMU too.
>
> This change will override the property 'perfctr_core' and change the AMD PMU
> to off by default.
>
> Signed-off-by: Liang Yan 
> ---
>  target/i386/cpu.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 22b681ca37..edf5413c90 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5706,6 +5706,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  *ecx |= 1 << 1;/* CmpLegacy bit */
>  }
>  }
> +
> +if (!cpu->enable_pmu) {
> +*ecx &= ~CPUID_EXT3_PERFCORE;
> +}
>  break;
>  case 0x80000002:
>  case 0x80000003:

I may be missing something but my first impression is that this will
make the CPUID_EXT3_PERFCORE bit disappear when a !enable_pmu VM is migrated
from an old QEMU (pre-patch) to a new one. If so, then additional
precautions should be taken against that (e.g. tying the change to
CPU/machine model versions).

-- 
Vitaly

Re: [PATCH] i386: Fix KVM_CAP_ADJUST_CLOCK capability check

2022-10-07 Thread Vitaly Kuznetsov

Paolo Bonzini  writes:

> Hi, a similar patch is now in.
>

Indeed,

commit c4ef867f2949bf2a2ae18a4e27cf1a34bbc8aecb
Author: Ray Zhang 
Date:   Thu Sep 22 18:05:23 2022 +0800

target/i386/kvm: fix kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed

solves the problem as well.

-- 
Vitaly

Re: [PATCH] i386: Fix KVM_CAP_ADJUST_CLOCK capability check

2022-10-07 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> Vitaly Kuznetsov  writes:
>
>> KVM commit c68dc1b577ea ("KVM: x86: Report host tsc and realtime values in
>> KVM_GET_CLOCK") broke migration of certain workloads, e.g. Win11 + WSL2
>> guest reboots immediately after migration. KVM, however, is not to
>> blame this time. When KVM_CAP_ADJUST_CLOCK capability is checked, the
>> result is all supported flags (which the above mentioned KVM commit
>> enhanced) but kvm_has_adjust_clock_stable() wants it to be
>> KVM_CLOCK_TSC_STABLE precisely. The result is that 'clock_is_reliable'
>> is not set in vmstate and the saved clock reading is discarded in
>> kvmclock_vm_state_change().
>>
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>>  target/i386/kvm/kvm.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index a1fd1f53791d..c33192a87dcb 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -157,7 +157,7 @@ bool kvm_has_adjust_clock_stable(void)
>>  {
>>  int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK);
>>  
>> -return (ret == KVM_CLOCK_TSC_STABLE);
>> +return ret & KVM_CLOCK_TSC_STABLE;
>>  }
>>  
>>  bool kvm_has_adjust_clock(void)
>
> Ping) This issue seems to introduce major migration issues with KVM >= v5.16

Ping)

-- 
Vitaly

Re: [PATCH] i386: Fix KVM_CAP_ADJUST_CLOCK capability check

2022-09-27 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> KVM commit c68dc1b577ea ("KVM: x86: Report host tsc and realtime values in
> KVM_GET_CLOCK") broke migration of certain workloads, e.g. Win11 + WSL2
> guest reboots immediately after migration. KVM, however, is not to
> blame this time. When KVM_CAP_ADJUST_CLOCK capability is checked, the
> result is all supported flags (which the above mentioned KVM commit
> enhanced) but kvm_has_adjust_clock_stable() wants it to be
> KVM_CLOCK_TSC_STABLE precisely. The result is that 'clock_is_reliable'
> is not set in vmstate and the saved clock reading is discarded in
> kvmclock_vm_state_change().
>
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  target/i386/kvm/kvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index a1fd1f53791d..c33192a87dcb 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -157,7 +157,7 @@ bool kvm_has_adjust_clock_stable(void)
>  {
>  int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK);
>  
> -return (ret == KVM_CLOCK_TSC_STABLE);
> +return ret & KVM_CLOCK_TSC_STABLE;
>  }
>  
>  bool kvm_has_adjust_clock(void)

Ping) This issue seems to introduce major migration issues with KVM >= v5.16

-- 
Vitaly

[PATCH] i386: Fix KVM_CAP_ADJUST_CLOCK capability check

2022-09-20 Thread Vitaly Kuznetsov

KVM commit c68dc1b577ea ("KVM: x86: Report host tsc and realtime values in
KVM_GET_CLOCK") broke migration of certain workloads, e.g. Win11 + WSL2
guest reboots immediately after migration. KVM, however, is not to
blame this time. When KVM_CAP_ADJUST_CLOCK capability is checked, the
result is all supported flags (which the above mentioned KVM commit
enhanced) but kvm_has_adjust_clock_stable() wants it to be
KVM_CLOCK_TSC_STABLE precisely. The result is that 'clock_is_reliable'
is not set in vmstate and the saved clock reading is discarded in
kvmclock_vm_state_change().
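
The difference between the two checks can be shown with a stand-alone sketch. The flag values below are local copies believed to match the kernel UAPI header and should be treated as illustrative; the two helper functions are hypothetical and only model the old and new return expressions.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies for illustration; the real definitions live in linux/kvm.h. */
#define KVM_CLOCK_TSC_STABLE 2
#define KVM_CLOCK_REALTIME   (1 << 2)
#define KVM_CLOCK_HOST_TSC   (1 << 3)

/* Old check: only true when TSC_STABLE is the sole flag reported. */
bool clock_stable_exact(int ret)
{
    return ret == KVM_CLOCK_TSC_STABLE;
}

/* New check: true whenever the TSC_STABLE bit is present. */
bool clock_stable_masked(int ret)
{
    return (ret & KVM_CLOCK_TSC_STABLE) != 0;
}

/* Self-check with an older and a newer capability value. */
void clock_self_check(void)
{
    int old_kvm = KVM_CLOCK_TSC_STABLE;
    int new_kvm = KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME |
                  KVM_CLOCK_HOST_TSC;

    assert(clock_stable_exact(old_kvm) && clock_stable_masked(old_kvm));
    assert(!clock_stable_exact(new_kvm));  /* the regression */
    assert(clock_stable_masked(new_kvm));  /* the fix */
}
```

Testing the bit rather than the whole value keeps the check valid if further flags are added to the capability in the future.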

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index a1fd1f53791d..c33192a87dcb 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -157,7 +157,7 @@ bool kvm_has_adjust_clock_stable(void)
 {
 int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK);
 
-return (ret == KVM_CLOCK_TSC_STABLE);
+return ret & KVM_CLOCK_TSC_STABLE;
 }
 
 bool kvm_has_adjust_clock(void)
-- 
2.37.3

[PATCH v1 1/2] i386: reset KVM nested state upon CPU reset

2022-08-18 Thread Vitaly Kuznetsov

Make sure env->nested_state is cleaned up when a vCPU is reset. It may
be stale after an incoming migration, in which case kvm_arch_put_registers()
may end up failing or putting the vCPU in a weird state.

Reviewed-by: Maxim Levitsky 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 37 +++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f148a6d52fa4..4f8dacc1d4b5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1695,6 +1695,30 @@ static void kvm_init_xsave(CPUX86State *env)
env->xsave_buf_len);
 }
 
+static void kvm_init_nested_state(CPUX86State *env)
+{
+struct kvm_vmx_nested_state_hdr *vmx_hdr;
+uint32_t size;
+
+if (!env->nested_state) {
+return;
+}
+
+size = env->nested_state->size;
+
+memset(env->nested_state, 0, size);
+env->nested_state->size = size;
+
+if (cpu_has_vmx(env)) {
+env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
+vmx_hdr = &env->nested_state->hdr.vmx;
+vmx_hdr->vmxon_pa = -1ull;
+vmx_hdr->vmcs12_pa = -1ull;
+} else if (cpu_has_svm(env)) {
+env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
+}
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
 struct {
@@ -2122,19 +2146,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
 assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
 
 if (cpu_has_vmx(env) || cpu_has_svm(env)) {
-struct kvm_vmx_nested_state_hdr *vmx_hdr;
-
 env->nested_state = g_malloc0(max_nested_state_len);
 env->nested_state->size = max_nested_state_len;
 
-if (cpu_has_vmx(env)) {
-env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
-vmx_hdr = &env->nested_state->hdr.vmx;
-vmx_hdr->vmxon_pa = -1ull;
-vmx_hdr->vmcs12_pa = -1ull;
-} else {
-env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
-}
+kvm_init_nested_state(env);
 }
 }
 
@@ -2199,6 +2214,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
 /* enabled by default */
 env->poll_control_msr = 1;
 
+kvm_init_nested_state(env);
+
 sev_es_set_reset_vector(CPU(cpu));
 }
 
-- 
2.37.1

[PATCH v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation

2022-08-18 Thread Vitaly Kuznetsov

Changes since RFC:
- Call kvm_put_msr_feature_control() before kvm_put_sregs2() to achieve
 the same result [Paolo].
- Add Maxim's R-b to PATCH1.

It was discovered that Windows 11 with WSL2 (Hyper-V) enabled guests fail
to reboot when QEMU's 'system_reset' command is issued. The problem appears
to be that KVM_SET_SREGS2 fails because zeroed CR4 register value doesn't
pass vmx_is_valid_cr4() check in KVM as certain bits can't be zero while in
VMX root operation (post-VMXON). kvm_arch_put_registers() does call 
kvm_put_nested_state() which is supposed to kick vCPU out of VMX root
operation, however, it only does so after kvm_put_sregs2() and there's
a good reason for that: 'real' nested state requires e.g. EFER.SVME to
be set. 

The root cause of the issue seems to be that QEMU is doing quite a lot
to forcefully reset a vCPU as KVM doesn't export kvm_vcpu_reset() (or,
rather, its superset) yet. While all the numerous existing APIs for
setting a vCPU state work fine for a newly created vCPU, using them for
vCPU reset is a mess caused by various dependencies between different
components of the state (VMX, SMM, MSRs, XCRs, CPUIDs, ...). It would've
been possible to allow setting 'inconsistent' state and only validate it
upon VCPU_RUN from the very beginning but that ship has long sailed for
KVM. A new, dedicated API for vCPU reset is likely the way to go.

Resolve the immediate issue by setting MSR_IA32_FEATURE_CONTROL before
kvm_put_sregs2() (and kvm_put_nested_state()), this ensures vCPU gets
kicked out of VMX root operation.
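
To illustrate why the ordering matters, here is a toy model. It is not
QEMU or KVM code: every toy_* name is hypothetical, CR4.VMXE merely stands
in for the 'certain bits' that cannot be zero post-VMXON, and the rule that
a zero host write to the feature control MSR leaves VMX operation is taken
from the description above rather than from the KVM sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_VMXE (1u << 13)     /* architectural CR4.VMXE bit */

/* Toy vCPU: only the state that matters for the ordering problem. */
struct toy_vcpu {
    bool vmx_root;              /* true once the guest has executed VMXON */
    uint64_t cr4;
    uint64_t feature_control;
};

/* Models KVM_SET_SREGS2: CR4.VMXE cannot be cleared in VMX root operation. */
static int toy_put_sregs(struct toy_vcpu *v, uint64_t cr4)
{
    if (v->vmx_root && !(cr4 & CR4_VMXE)) {
        return -1;              /* rejected, like a failed CR4 validity check */
    }
    v->cr4 = cr4;
    return 0;
}

/* Models a host write of the feature control MSR: zero leaves VMX operation. */
static int toy_put_msr_feature_control(struct toy_vcpu *v, uint64_t val)
{
    v->feature_control = val;
    if (val == 0) {
        v->vmx_root = false;
    }
    return 0;
}

/* Reset with the MSR written either before (new) or after (old) the sregs. */
static int toy_reset(struct toy_vcpu *v, bool msr_first)
{
    int ret;

    if (msr_first) {
        ret = toy_put_msr_feature_control(v, 0);
        if (ret < 0) {
            return ret;
        }
    }

    ret = toy_put_sregs(v, 0);
    if (ret < 0) {
        return ret;
    }

    if (!msr_first) {
        ret = toy_put_msr_feature_control(v, 0);
    }
    return ret;
}
```

With a vCPU that starts in VMX root operation, toy_reset(&v, false) fails
at the sregs step and leaves the vCPU stuck, while toy_reset(&v, true)
succeeds and ends with CR4 zeroed.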

Vitaly Kuznetsov (2):
  i386: reset KVM nested state upon CPU reset
  i386: do kvm_put_msr_feature_control() first thing when vCPU is reset

 target/i386/kvm/kvm.c | 54 +++
 1 file changed, 39 insertions(+), 15 deletions(-)

-- 
2.37.1

[PATCH v1 2/2] i386: do kvm_put_msr_feature_control() first thing when vCPU is reset

2022-08-18 Thread Vitaly Kuznetsov

kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when
it is in VMX root operation. Do kvm_put_msr_feature_control() before
kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also
seems logical to do kvm_put_msr_feature_control() before
kvm_put_nested_state() and not after it, especially when 'real' nested
state is set.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4f8dacc1d4b5..a1fd1f53791d 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4529,6 +4529,18 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
 assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
+/*
+ * Put MSR_IA32_FEATURE_CONTROL first, this ensures the VM gets out of VMX
+ * root operation upon vCPU reset. kvm_put_msr_feature_control() should also
+ * precede kvm_put_nested_state() when 'real' nested state is set.
+ */
+if (level >= KVM_PUT_RESET_STATE) {
+ret = kvm_put_msr_feature_control(x86_cpu);
+if (ret < 0) {
+return ret;
+}
+}
+
 /* must be before kvm_put_nested_state so that EFER.SVME is set */
 ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
 if (ret < 0) {
@@ -4540,11 +4552,6 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 if (ret < 0) {
 return ret;
 }
-
-ret = kvm_put_msr_feature_control(x86_cpu);
-if (ret < 0) {
-return ret;
-}
 }
 
 if (level == KVM_PUT_FULL_STATE) {
-- 
2.37.1

Re: [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset

2022-08-10 Thread Vitaly Kuznetsov

Maxim Levitsky  writes:

> On Wed, 2022-08-10 at 16:00 +0200, Vitaly Kuznetsov wrote:
>> Setting nested state upon migration needs to happen after kvm_put_sregs2()
>> to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
>> when vCPU is in VMX root operation, certain CR bits are locked and
>> kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
>> vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
>> kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
>> that vCPU is *not* in VMX root operation.
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>>  target/i386/kvm/kvm.c | 20 ++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>> 
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 4f8dacc1d4b5..73e3880fa57b 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>>  
>>  assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
>>  
>> -    /* must be before kvm_put_nested_state so that EFER.SVME is set */
>> +    /*
>> + * When resetting a vCPU, make sure to reset nested state first to
>> + * e.g. clear VMXON state and unlock certain CR4 bits.
>> + */
>> +    if (level == KVM_PUT_RESET_STATE) {
>> +    ret = kvm_put_nested_state(x86_cpu);
>> +    if (ret < 0) {
>> +    return ret;
>> +    }
>
> I should have mentioned this, I actually already debugged the same issue while
> trying to reproduce the smm int window bug.
> 100% my fault.
>
> I also share the same feeling that this might be yet another 'whack a mole' and
> break somewhere else, but overall it does make sense.

This certainly *is* a 'whack a mole' and I'm sure there are other cases
where one of the calls in kvm_arch_put_registers() fails. We need to work on
what's missing so we can expose kvm_vcpu_reset() to VMMs.

>
>
> Reviewed-by: Maxim Levitsky 
>

Thanks!

-- 
Vitaly

[PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset

2022-08-10 Thread Vitaly Kuznetsov

Make sure env->nested_state is cleaned up when a vCPU is reset. It may
be stale after an incoming migration, in which case
kvm_arch_put_registers() may end up failing or putting the vCPU in a
weird state.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 37 +++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f148a6d52fa4..4f8dacc1d4b5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1695,6 +1695,30 @@ static void kvm_init_xsave(CPUX86State *env)
env->xsave_buf_len);
 }
 
+static void kvm_init_nested_state(CPUX86State *env)
+{
+struct kvm_vmx_nested_state_hdr *vmx_hdr;
+uint32_t size;
+
+if (!env->nested_state) {
+return;
+}
+
+size = env->nested_state->size;
+
+memset(env->nested_state, 0, size);
+env->nested_state->size = size;
+
+if (cpu_has_vmx(env)) {
+env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
+vmx_hdr = &env->nested_state->hdr.vmx;
+vmx_hdr->vmxon_pa = -1ull;
+vmx_hdr->vmcs12_pa = -1ull;
+} else if (cpu_has_svm(env)) {
+env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
+}
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
 struct {
@@ -2122,19 +2146,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
 assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
 
 if (cpu_has_vmx(env) || cpu_has_svm(env)) {
-struct kvm_vmx_nested_state_hdr *vmx_hdr;
-
 env->nested_state = g_malloc0(max_nested_state_len);
 env->nested_state->size = max_nested_state_len;
 
-if (cpu_has_vmx(env)) {
-env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
-vmx_hdr = &env->nested_state->hdr.vmx;
-vmx_hdr->vmxon_pa = -1ull;
-vmx_hdr->vmcs12_pa = -1ull;
-} else {
-env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
-}
+kvm_init_nested_state(env);
 }
 }
 
@@ -2199,6 +2214,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
 /* enabled by default */
 env->poll_control_msr = 1;
 
+kvm_init_nested_state(env);
+
 sev_es_set_reset_vector(CPU(cpu));
 }
 
-- 
2.37.1
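
The reset pattern used by kvm_init_nested_state() above (remember the
buffer size, zero everything, restore the size, then set format-specific
defaults) can be tried in isolation. The sketch below is a stand-alone
approximation: struct toy_nested_state and the TOY_FORMAT_* values are
hypothetical stand-ins and do not reproduce the layout of the real
struct kvm_nested_state.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_FORMAT_VMX = 0, TOY_FORMAT_SVM = 1 };

/* Hypothetical stand-in for a variable-size nested state buffer. */
struct toy_nested_state {
    uint16_t flags;
    uint16_t format;
    uint32_t size;              /* total buffer size, header included */
    uint64_t vmxon_pa;
    uint64_t vmcs12_pa;
    uint8_t data[];             /* variable-length payload */
};

/* Allocate a zeroed buffer of 'size' bytes and record its size. */
static struct toy_nested_state *toy_nested_state_new(uint32_t size)
{
    struct toy_nested_state *s;

    if (size < sizeof(*s)) {
        return NULL;
    }
    s = calloc(1, size);
    if (s) {
        s->size = size;
    }
    return s;
}

/* Wipe stale contents but keep the allocation and its recorded size. */
static void toy_nested_state_reset(struct toy_nested_state *s,
                                   int has_vmx, int has_svm)
{
    uint32_t size;

    if (!s) {
        return;
    }

    size = s->size;
    memset(s, 0, size);
    s->size = size;

    if (has_vmx) {
        s->format = TOY_FORMAT_VMX;
        s->vmxon_pa = UINT64_MAX;   /* same value as -1ull: "not set" */
        s->vmcs12_pa = UINT64_MAX;
    } else if (has_svm) {
        s->format = TOY_FORMAT_SVM;
    }
}
```

Saving the size before the memset() is the important detail: the size
lives inside the buffer being cleared, so it would otherwise be lost.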

[PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset

2022-08-10 Thread Vitaly Kuznetsov

Setting nested state upon migration needs to happen after kvm_put_sregs2()
to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
when vCPU is in VMX root operation, certain CR bits are locked and
kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
that vCPU is *not* in VMX root operation.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4f8dacc1d4b5..73e3880fa57b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
 assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
-/* must be before kvm_put_nested_state so that EFER.SVME is set */
+/*
+ * When resetting a vCPU, make sure to reset nested state first to
+ * e.g. clear VMXON state and unlock certain CR4 bits.
+ */
+if (level == KVM_PUT_RESET_STATE) {
+ret = kvm_put_nested_state(x86_cpu);
+if (ret < 0) {
+return ret;
+}
+}
+
 ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
 if (ret < 0) {
 return ret;
 }
 
-if (level >= KVM_PUT_RESET_STATE) {
+/*
+ * When putting full CPU state, kvm_put_nested_state() must happen after
+ * kvm_put_sregs{,2} so that e.g. EFER.SVME is already set.
+ */
+if (level == KVM_PUT_FULL_STATE) {
 ret = kvm_put_nested_state(x86_cpu);
 if (ret < 0) {
 return ret;
 }
+}
 
+if (level >= KVM_PUT_RESET_STATE) {
 ret = kvm_put_msr_feature_control(x86_cpu);
 if (ret < 0) {
 return ret;
-- 
2.37.1

[PATCH RFC v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation

2022-08-10 Thread Vitaly Kuznetsov

It was discovered that Windows 11 with WSL2 (Hyper-V) enabled guests fail
to reboot when QEMU's 'system_reset' command is issued. The problem appears
to be that KVM_SET_SREGS2 fails because zeroed CR4 register value doesn't
pass vmx_is_valid_cr4() check in KVM as certain bits can't be zero while in
VMX root operation (post-VMXON). kvm_arch_put_registers() does call 
kvm_put_nested_state() which is supposed to kick vCPU out of VMX root
operation, however, it only does so after kvm_put_sregs2() and there's
a good reason for that: 'real' nested state requires e.g. EFER.SVME to
be set. While swapping kvm_put_sregs2()/kvm_put_nested_state() order
in kvm_arch_put_registers() can't be done in KVM_PUT_FULL_STATE case,
doing it in KVM_PUT_RESET_STATE seems like a reasonable band aid.

The root cause of the issue seems to be that QEMU is doing quite a lot
to forcefully reset a vCPU as KVM doesn't export kvm_vcpu_reset() (or,
rather, its superset) yet. While all the numerous existing APIs for
setting a vCPU state work fine for a newly created vCPU, using them for
vCPU reset is a mess caused by various dependencies between different
components of the state (VMX, SMM, MSRs, XCRs, CPUIDs, ...). It would've
been possible to allow setting 'inconsistent' state and only validate it
upon VCPU_RUN from the very beginning but that ship has long sailed for
KVM. A new, dedicated API for vCPU reset is likely the way to go.

RFC part: the immediate issue could've probably been solved in KVM too
by dropping the vmx_is_valid_cr4() check from __set_sregs2() and hoping that
someone will check for the resulting inconsistency later. I don't quite
like this option so I didn't explore it in depth.

Vitaly Kuznetsov (2):
  i386: reset KVM nested state upon CPU reset
  i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is
reset

 target/i386/kvm/kvm.c | 57 ++-
 1 file changed, 45 insertions(+), 12 deletions(-)

-- 
2.37.1

[PATCH v4 6/6] i386: docs: Convert hyperv.txt to rST

2022-05-25 Thread Vitaly Kuznetsov

rSTify docs/hyperv.txt and link it from docs/system/target-i386.rst.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt | 303 
 docs/system/i386/hyperv.rst | 288 ++
 docs/system/target-i386.rst |   1 +
 3 files changed, 289 insertions(+), 303 deletions(-)
 delete mode 100644 docs/hyperv.txt
 create mode 100644 docs/system/i386/hyperv.rst

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
deleted file mode 100644
index 14a7f449ead9..
--- a/docs/hyperv.txt
+++ /dev/null
@@ -1,303 +0,0 @@
-Hyper-V Enlightenments
-==
-
-
-1. Description
-===
-In some cases when implementing a hardware interface in software is slow, KVM
-implements its own paravirtualized interfaces. This works well for Linux as
-guest support for such features is added simultaneously with the feature itself.
-It may, however, be hard-to-impossible to add support for these interfaces to
-proprietary OSes, namely, Microsoft Windows.
-
-KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
-make Windows and Hyper-V guests think they're running on top of a Hyper-V
-compatible hypervisor and use Hyper-V specific features.
-
-
-2. Setup
-=
-No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
-QEMU, individual enlightenments can be enabled through CPU flags, e.g:
-
-  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
-
-Sometimes there are dependencies between enlightenments, QEMU is supposed to
-check that the supplied configuration is sane.
-
-When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
-identification (CPUID 0x40000000..0x4000000A) to Hyper-V. KVM identification
-and features are kept in leaves 0x40000100..0x40000101.
-
-
-3. Existing enlightenments
-===
-
-3.1. hv-relaxed
-
-This feature tells guest OS to disable watchdog timeouts as it is running on a
-hypervisor. It is known that some Windows versions will do this even when they
-see 'hypervisor' CPU flag.
-
-3.2. hv-vapic
-==
-Provides so-called VP Assist page MSR to guest allowing it to work with APIC
-more efficiently. In particular, this enlightenment allows paravirtualized
-(exit-less) EOI processing.
-
-3.3. hv-spinlocks=xxx
-==
-Enables paravirtualized spinlocks. The parameter indicates how many times
-spinlock acquisition should be attempted before indicating the situation to the
-hypervisor. A special value 0xffffffff indicates "never notify".
-
-3.4. hv-vpindex
-
-Provides HV_X64_MSR_VP_INDEX (0x40000002) MSR to the guest which has Virtual
-processor index information. This enlightenment makes sense in conjunction with
-hv-synic, hv-stimer and other enlightenments which require the guest to know its
-Virtual Processor indices (e.g. when VP index needs to be passed in a
-hypercall).
-
-3.5. hv-runtime
-
-Provides HV_X64_MSR_VP_RUNTIME (0x40000010) MSR to the guest. The MSR keeps the
-virtual processor run time in 100ns units. This gives guest operating system an
-idea of how much time was 'stolen' from it (when the virtual CPU was preempted
-to perform some other work).
-
-3.6. hv-crash
-==
-Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 (0x40000100..0x40000105) and
-HV_X64_MSR_CRASH_CTL (0x40000105) MSRs to the guest. These MSRs are written to
-by the guest when it crashes, HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 MSRs
-contain additional crash information. This information is outputted in QEMU log
-and through QAPI.
-Note: unlike under genuine Hyper-V, write to HV_X64_MSR_CRASH_CTL causes guest
-to shutdown. This effectively blocks crash dump generation by Windows.
-
-3.7. hv-time
-=
-Enables two Hyper-V-specific clocksources available to the guest: MSR-based
-Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x40000020) and Reference TSC
-page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x40000021). Both clocksources
-are per-guest, Reference TSC page clocksource allows for exit-less time stamp
-readings. Using this enlightenment leads to significant speedup of all timestamp
-related operations.
-
-3.8. hv-synic
-==
-Enables Hyper-V Synthetic interrupt controller - an extension of a local APIC.
-When enabled, this enlightenment provides additional communication facilities
-to the guest: SynIC messages and Events. This is a pre-requisite for
-implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
-is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
-HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x40000080..0x40000084) and
-HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x40000090..0x4000009F)
-
-Requires: hv-vpindex
-
-3.9. hv-stimer
-===
-Enables Hyper-V synthetic timers. There are four synthetic timers per virtual
-CPU controll

[PATCH v4 5/6] i386: Hyper-V Direct TLB flush hypercall

2022-05-25 Thread Vitaly Kuznetsov

Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's
TLB flush hypercalls handling. With the correct setup, L2's TLB flush
hypercalls can be handled by L0 directly, without the need to exit to
L1.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 11 +++
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  1 +
 target/i386/kvm/hyperv-proto.h |  1 +
 target/i386/kvm/kvm.c  |  8 
 5 files changed, 23 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 4b132b1c941a..14a7f449ead9 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -262,6 +262,17 @@ Allow for extended GVA ranges to be passed to Hyper-V TLB flush hypercalls
 
 Requires: hv-tlbflush
 
+3.25. hv-tlbflush-direct
+=
+The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
+enabled, it allows L0 (KVM) to directly handle TLB flush hypercalls from L2
+guest without the need to exit to L1 (Hyper-V) hypervisor. While the feature is
+supported for both VMX (Intel) and SVM (AMD), the VMX implementation requires
+Enlightened VMCS ('hv-evmcs') feature to also be enabled.
+
+Requires: hv-vapic
+Recommended: hv-evmcs (Intel)
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a5331e6140fc..dfbf5a65f92f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6966,6 +6966,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_XMM_INPUT, 0),
 DEFINE_PROP_BIT64("hv-tlbflush-ext", X86CPU, hyperv_features,
   HYPERV_FEAT_TLBFLUSH_EXT, 0),
+DEFINE_PROP_BIT64("hv-tlbflush-direct", X86CPU, hyperv_features,
+  HYPERV_FEAT_TLBFLUSH_DIRECT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5ff48257e513..82004b65b944 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1109,6 +1109,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_MSR_BITMAP  17
 #define HYPERV_FEAT_XMM_INPUT   18
 #define HYPERV_FEAT_TLBFLUSH_EXT19
+#define HYPERV_FEAT_TLBFLUSH_DIRECT 20
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0xFFFFFFFF
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index c7854ed6d306..464fbf09e35a 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -90,6 +90,7 @@
 /*
  * HV_CPUID_NESTED_FEATURES.EAX bits
  */
+#define HV_NESTED_DIRECT_FLUSH  (1u << 17)
 #define HV_NESTED_MSR_BITMAP(1u << 19)
 
 /*
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7bd1b4396e8e..8b58bfd0fd4a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -995,6 +995,14 @@ static struct {
 },
 .dependencies = BIT(HYPERV_FEAT_TLBFLUSH)
 },
+[HYPERV_FEAT_TLBFLUSH_DIRECT] = {
+.desc = "direct TLB flush (hv-tlbflush-direct)",
+.flags = {
+{.func = HV_CPUID_NESTED_FEATURES, .reg = R_EAX,
+ .bits = HV_NESTED_DIRECT_FLUSH}
+},
+.dependencies = BIT(HYPERV_FEAT_VAPIC)
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.3
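
The '.dependencies = BIT(...)' entries added by this series lend themselves
to a simple bitmask check. The following is a minimal sketch, not the
routine QEMU actually uses: the FEAT_* names and first_unmet_dependency()
are hypothetical, the indices 19 and 20 come from the cpu.h hunks in this
series, and the indices for hv-vapic and hv-tlbflush are assumptions made
for the sake of the example.

```c
#include <stdint.h>

#define BIT(n) (1ULL << (n))

enum {
    FEAT_VAPIC = 1,             /* assumed index, not shown in this series */
    FEAT_TLBFLUSH = 11,         /* assumed index, not shown in this series */
    FEAT_TLBFLUSH_EXT = 19,     /* from the hv-tlbflush-ext patch */
    FEAT_TLBFLUSH_DIRECT = 20,  /* from the hv-tlbflush-direct patch */
    FEAT_MAX = 21,
};

/* Per-feature dependency masks; features without entries depend on nothing. */
static const uint64_t feat_deps[FEAT_MAX] = {
    [FEAT_TLBFLUSH_EXT] = BIT(FEAT_TLBFLUSH),
    [FEAT_TLBFLUSH_DIRECT] = BIT(FEAT_VAPIC),
};

/*
 * Return the index of the first enabled feature whose dependencies are not
 * all enabled, or -1 when the configuration is consistent.
 */
static int first_unmet_dependency(uint64_t enabled)
{
    for (int f = 0; f < FEAT_MAX; f++) {
        if ((enabled & BIT(f)) && (feat_deps[f] & ~enabled)) {
            return f;
        }
    }
    return -1;
}
```

Enabling only hv-tlbflush-direct makes the check return
FEAT_TLBFLUSH_DIRECT; adding hv-vapic to the mask makes it return -1.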

[PATCH v4 3/6] i386: Hyper-V XMM fast hypercall input feature

2022-05-25 Thread Vitaly Kuznetsov

The Hyper-V specification allows parameters for certain hypercalls to be
passed in XMM registers ("XMM Fast Hypercall Input"). When the feature is
in use, hypercalls are processed faster as KVM can avoid reading the
guest's memory.

KVM supports the feature since v5.14.

Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to
comply with KVM.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 6 ++
 target/i386/cpu.c  | 2 ++
 target/i386/cpu.h  | 1 +
 target/i386/kvm/hyperv-proto.h | 2 +-
 target/i386/kvm/kvm.c  | 7 +++
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 5d85569b9941..af1b10c0b3d1 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -249,6 +249,12 @@ Enlightened VMCS ('hv-evmcs') feature to also be enabled.
 
 Recommended: hv-evmcs (Intel)
 
+3.23. hv-xmm-input
+===
+Hyper-V specification allows to pass parameters for certain hypercalls using XMM
+registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows
+for faster hypercalls processing as KVM can avoid reading guest's memory.
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5aabf0c12e8d..cb86c11f71d4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6962,6 +6962,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_AVIC, 0),
 DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
   HYPERV_FEAT_MSR_BITMAP, 0),
+DEFINE_PROP_BIT64("hv-xmm-input", X86CPU, hyperv_features,
+  HYPERV_FEAT_XMM_INPUT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c7882857366d..37e95535843b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1107,6 +1107,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_AVIC15
 #define HYPERV_FEAT_SYNDBG  16
 #define HYPERV_FEAT_MSR_BITMAP  17
+#define HYPERV_FEAT_XMM_INPUT   18
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0xFFFFFFFF
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index cea18dbc0e23..f5f16474fa25 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -54,7 +54,7 @@
 #define HV_GUEST_DEBUGGING_AVAILABLE(1u << 1)
 #define HV_PERF_MONITOR_AVAILABLE   (1u << 2)
 #define HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE   (1u << 3)
-#define HV_HYPERCALL_PARAMS_XMM_AVAILABLE   (1u << 4)
+#define HV_HYPERCALL_XMM_INPUT_AVAILABLE(1u << 4)
 #define HV_GUEST_IDLE_STATE_AVAILABLE   (1u << 5)
 #define HV_FREQUENCY_MSRS_AVAILABLE (1u << 8)
 #define HV_GUEST_CRASH_MSR_AVAILABLE(1u << 10)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 82d1f0275c42..96d6c50ad5d9 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -980,6 +980,13 @@ static struct {
  .bits = HV_NESTED_MSR_BITMAP}
 }
 },
+[HYPERV_FEAT_XMM_INPUT] = {
+.desc = "XMM fast hypercall input (hv-xmm-input)",
+.flags = {
+{.func = HV_CPUID_FEATURES, .reg = R_EDX,
+ .bits = HV_HYPERCALL_XMM_INPUT_AVAILABLE}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.3

[PATCH v4 0/6] i386: Enable newly introduced KVM Hyper-V enlightenments

2022-05-25 Thread Vitaly Kuznetsov

Changes since v3:
- Rebase, resolve merge conflict with 73d24074078a ("hyperv: Add support to
  process syndbg commands")
- Include "i386: docs:  Convert hyperv.txt to rST" patch which was previously
  posted separately.

Original description:

This series enables four new KVM Hyper-V enlightenments:

'XMM fast hypercall input feature' is supported by KVM since v5.14,
it allows for faster Hyper-V hypercall processing.

'Enlightened MSR-Bitmap' is a new nested-specific enlightenment that speeds
up L2 vmexits by avoiding unnecessary updates to L2 MSR-Bitmap. KVM support
for the feature on Intel CPUs is in v5.17 and in v5.18 for AMD CPUs.

'Extended GVA ranges for TLB flush hypercalls' indicates that extended GVA
ranges are allowed to be passed to Hyper-V TLB flush hypercalls.

'Direct TLB flush hypercall' feature allows L0 (KVM) to directly handle
L2's TLB flush hypercalls without the need to exit to L1 (Hyper-V).

The last two features are not merged in KVM yet:
https://lore.kernel.org/kvm/20220525090133.1264239-1-vkuzn...@redhat.com/
however, there's no direct dependency on the kernel part as thanks to
KVM_GET_SUPPORTED_HV_CPUID no new capabilities are introduced.

Vitaly Kuznetsov (6):
  i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES
  i386: Hyper-V Enlightened MSR bitmap feature
  i386: Hyper-V XMM fast hypercall input feature
  i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls
  i386: Hyper-V Direct TLB flush hypercall
  i386: docs:  Convert hyperv.txt to rST

 docs/hyperv.txt| 270 ---
 docs/system/i386/hyperv.rst| 288 +
 docs/system/target-i386.rst|   1 +
 target/i386/cpu.c  |   8 +
 target/i386/cpu.h  |   5 +-
 target/i386/kvm/hyperv-proto.h |   9 +-
 target/i386/kvm/kvm.c  |  55 +--
 7 files changed, 354 insertions(+), 282 deletions(-)
 delete mode 100644 docs/hyperv.txt
 create mode 100644 docs/system/i386/hyperv.rst

-- 
2.35.3

[PATCH v4 4/6] i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls

2022-05-25 Thread Vitaly Kuznetsov

KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs
per hypercall) since the implementation of Hyper-V PV TLB flush feature
(Linux-4.18) as regardless of the request, full TLB flush was always
performed. "Extended GVA ranges for TLB flush hypercalls" feature bit
wasn't exposed then. Now, as KVM gains support for fine-grained TLB
flush handling, exposing this feature starts making sense.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 7 +++
 target/i386/cpu.c  | 2 ++
 target/i386/cpu.h  | 1 +
 target/i386/kvm/hyperv-proto.h | 1 +
 target/i386/kvm/kvm.c  | 8 
 5 files changed, 19 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index af1b10c0b3d1..4b132b1c941a 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -255,6 +255,13 @@ Hyper-V specification allows to pass parameters for certain hypercalls using XMM
 registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows
 for faster hypercalls processing as KVM can avoid reading guest's memory.
 
+3.24. hv-tlbflush-ext
+=
+Allow for extended GVA ranges to be passed to Hyper-V TLB flush hypercalls
+(HvFlushVirtualAddressList/HvFlushVirtualAddressListEx).
+
+Requires: hv-tlbflush
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index cb86c11f71d4..a5331e6140fc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6964,6 +6964,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_MSR_BITMAP, 0),
 DEFINE_PROP_BIT64("hv-xmm-input", X86CPU, hyperv_features,
   HYPERV_FEAT_XMM_INPUT, 0),
+DEFINE_PROP_BIT64("hv-tlbflush-ext", X86CPU, hyperv_features,
+  HYPERV_FEAT_TLBFLUSH_EXT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 37e95535843b..5ff48257e513 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1108,6 +1108,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_SYNDBG  16
 #define HYPERV_FEAT_MSR_BITMAP  17
 #define HYPERV_FEAT_XMM_INPUT   18
+#define HYPERV_FEAT_TLBFLUSH_EXT19
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0xFFFFFFFF
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index f5f16474fa25..c7854ed6d306 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -59,6 +59,7 @@
 #define HV_FREQUENCY_MSRS_AVAILABLE (1u << 8)
 #define HV_GUEST_CRASH_MSR_AVAILABLE(1u << 10)
 #define HV_FEATURE_DEBUG_MSRS_AVAILABLE (1u << 11)
+#define HV_EXT_GVA_RANGES_FLUSH_AVAILABLE   (1u << 14)
 #define HV_STIMER_DIRECT_MODE_AVAILABLE (1u << 19)
 
 /*
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 96d6c50ad5d9..7bd1b4396e8e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -987,6 +987,14 @@ static struct {
  .bits = HV_HYPERCALL_XMM_INPUT_AVAILABLE}
 }
 },
+[HYPERV_FEAT_TLBFLUSH_EXT] = {
+.desc = "Extended gva ranges for TLB flush hypercalls (hv-tlbflush-ext)",
+.flags = {
+{.func = HV_CPUID_FEATURES, .reg = R_EDX,
+ .bits = HV_EXT_GVA_RANGES_FLUSH_AVAILABLE}
+},
+.dependencies = BIT(HYPERV_FEAT_TLBFLUSH)
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.3

[PATCH v4 1/6] i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES

2022-05-25 Thread Vitaly Kuznetsov

Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently
as it was only used to encode the supported eVMCS version range. In fact,
there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In
preparation for adding these features, move HV_CPUID_NESTED_FEATURES leaf
handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.h |  1 -
 target/i386/kvm/kvm.c | 25 +++--
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 0d528ac58f32..2e918daf6bef 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1804,7 +1804,6 @@ struct ArchCPU {
 uint32_t hyperv_vendor_id[3];
 uint32_t hyperv_interface_id[4];
 uint32_t hyperv_limits[3];
-uint32_t hyperv_nested[4];
 bool hyperv_enforce_cpuid;
 uint32_t hyperv_ver_id_build;
 uint16_t hyperv_ver_id_major;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index a9ee8eebd76f..93bfefa4a79e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -831,6 +831,8 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
 || env->user_tsc_khz;
 }
 
+#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
+
 static struct {
 const char *desc;
 struct {
@@ -1254,6 +1256,13 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 }
 }
 
+/* HV_CPUID_NESTED_FEATURES.EAX also encodes the supported eVMCS range */
+if (func == HV_CPUID_NESTED_FEATURES && reg == R_EAX) {
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+r |= DEFAULT_EVMCS_VERSION;
+}
+}
+
 return r;
 }
 
@@ -1384,11 +1393,11 @@ static int hyperv_fill_cpuids(CPUState *cs,
 struct kvm_cpuid_entry2 *c;
 uint32_t signature[3];
 uint32_t cpuid_i = 0, max_cpuid_leaf = 0;
+uint32_t nested_eax =
+hv_build_cpuid_leaf(cs, HV_CPUID_NESTED_FEATURES, R_EAX);
 
-max_cpuid_leaf = HV_CPUID_IMPLEMENT_LIMITS;
-if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
-max_cpuid_leaf = MAX(max_cpuid_leaf, HV_CPUID_NESTED_FEATURES);
-}
+max_cpuid_leaf = nested_eax ? HV_CPUID_NESTED_FEATURES :
+HV_CPUID_IMPLEMENT_LIMITS;
 
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNDBG)) {
 max_cpuid_leaf =
@@ -1461,7 +1470,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->ecx = cpu->hyperv_limits[1];
 c->edx = cpu->hyperv_limits[2];
 
-if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+if (nested_eax) {
 uint32_t function;
 
 /* Create zeroed 0x40000006..0x40000009 leaves */
@@ -1473,7 +1482,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 
 c = &cpuid_ent[cpuid_i++];
 c->function = HV_CPUID_NESTED_FEATURES;
-c->eax = cpu->hyperv_nested[0];
+c->eax = nested_eax;
 }
 
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNDBG)) {
@@ -1522,8 +1531,6 @@ static bool evmcs_version_supported(uint16_t evmcs_version,
 (max_version <= max_supported_version);
 }
 
-#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
-
 static int hyperv_init_vcpu(X86CPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -1620,8 +1627,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
  supported_evmcs_version >> 8);
 return -ENOTSUP;
 }
-
-cpu->hyperv_nested[0] = evmcs_version;
 }
 
 if (cpu->hyperv_enforce_cpuid) {
-- 
2.35.3

Re: [PATCH v3 0/5] i386: Enable newly introduced KVM Hyper-V enlightenments

2022-05-25 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> Paolo Bonzini  writes:
>
>>> This series enables four new KVM Hyper-V enlightenments [...]
>>>
>>> docs/hyperv.txt| 34 ++
>>
>> Queued, thanks.  
>
> Thanks!
>

It seems these patches didn't make it upstream yet but there's a
(small) conflict with

commit 73d24074078a2cefb5305047e3bf50b73daa3f98
Author: Jon Doron 
Date:   Wed Feb 16 12:24:59 2022 +0200

hyperv: Add support to process syndbg commands

which did.

>> Would you please convert hyperv.txt to rST in docs/system/i386?
>
> Sure, it's on my TODO list.

I've sent it out some time ago:
https://lore.kernel.org/qemu-devel/20220503144906.3618426-1-vkuzn...@redhat.com/

but it also conflicts with 73d24074078a now because of 'hv-syndbg'. I'm
going to send out 'v4' including the conversion to rst to (hopefully)
facilitate acceptance.

-- 
Vitaly

[PATCH v4 2/6] i386: Hyper-V Enlightened MSR bitmap feature

2022-05-25 Thread Vitaly Kuznetsov

The newly introduced enlightenment allows L0 (KVM) and L1 (Hyper-V)
hypervisors to collaborate to avoid unnecessary updates to L2
MSR-Bitmap upon vmexits.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 9 +
 target/i386/cpu.c  | 2 ++
 target/i386/cpu.h  | 1 +
 target/i386/kvm/hyperv-proto.h | 5 +
 target/i386/kvm/kvm.c  | 7 +++
 5 files changed, 24 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 33588a03961f..5d85569b9941 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -239,6 +239,15 @@ This enlightenment requires a VMBus device (-device vmbus-bridge,irq=15)
 and the follow enlightenments to work:
 hv-relaxed,hv_time,hv-vapic,hv-vpindex,hv-synic,hv-runtime,hv-stimer
 
+3.22. hv-emsr-bitmap
+=
+The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
+enabled, it allows L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to
+avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. While the protocol is
+supported for both VMX (Intel) and SVM (AMD), the VMX implementation requires
+Enlightened VMCS ('hv-evmcs') feature to also be enabled.
+
+Recommended: hv-evmcs (Intel)
 
 4. Supplementary features
 =
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 35c3475e6c90..5aabf0c12e8d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6960,6 +6960,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_STIMER_DIRECT, 0),
 DEFINE_PROP_BIT64("hv-avic", X86CPU, hyperv_features,
   HYPERV_FEAT_AVIC, 0),
+DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
+  HYPERV_FEAT_MSR_BITMAP, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BIT64("hv-syndbg", X86CPU, hyperv_features,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 2e918daf6bef..c7882857366d 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1106,6 +1106,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_STIMER_DIRECT   14
 #define HYPERV_FEAT_AVIC15
 #define HYPERV_FEAT_SYNDBG  16
+#define HYPERV_FEAT_MSR_BITMAP  17
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index e40e59411c83..cea18dbc0e23 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -86,6 +86,11 @@
  */
 #define HV_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING(1u << 1)
 
+/*
+ * HV_CPUID_NESTED_FEATURES.EAX bits
+ */
+#define HV_NESTED_MSR_BITMAP(1u << 19)
+
 /*
  * Basic virtualized MSRs
  */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 93bfefa4a79e..82d1f0275c42 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -973,6 +973,13 @@ static struct {
 .dependencies = BIT(HYPERV_FEAT_SYNIC) | BIT(HYPERV_FEAT_RELAXED)
 },
 #endif
+[HYPERV_FEAT_MSR_BITMAP] = {
+.desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
+.flags = {
+{.func = HV_CPUID_NESTED_FEATURES, .reg = R_EAX,
+ .bits = HV_NESTED_MSR_BITMAP}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.3

Re: [PATCH] vmxcap: add tertiary execution controls

2022-05-12 Thread Vitaly Kuznetsov

Paolo Bonzini  writes:

> Signed-off-by: Paolo Bonzini 
> ---
>  scripts/kvm/vmxcap | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/scripts/kvm/vmxcap b/scripts/kvm/vmxcap
> index f140040104..ce27f5e635 100755
> --- a/scripts/kvm/vmxcap
> +++ b/scripts/kvm/vmxcap
> @@ -23,6 +23,7 @@ MSR_IA32_VMX_TRUE_PROCBASED_CTLS = 0x48E
>  MSR_IA32_VMX_TRUE_EXIT_CTLS = 0x48F
>  MSR_IA32_VMX_TRUE_ENTRY_CTLS = 0x490
>  MSR_IA32_VMX_VMFUNC = 0x491
> +MSR_IA32_VMX_PROCBASED_CTLS3 = 0x492
>  
>  class msr(object):
>  def __init__(self):
> @@ -71,6 +72,13 @@ class Control(object):
>  s = 'yes'
>  print('  %-40s %s' % (self.bits[bit], s))
>  
> +# All 64 bits in the tertiary controls MSR are allowed-1
> +class Allowed1Control(Control):
> +def read2(self, nr):
> +m = msr()
> +val = m.read(nr, 0)
> +return (0, val)
> +
>  class Misc(object):
>  def __init__(self, name, bits, msr):
>  self.name = name
> @@ -135,6 +143,7 @@ controls = [
>  12: 'RDTSC exiting',
>  15: 'CR3-load exiting',
>  16: 'CR3-store exiting',
> +17: 'Activate tertiary controls',
>  19: 'CR8-load exiting',
>  20: 'CR8-store exiting',
>  21: 'Use TPR shadow',
> @@ -186,6 +195,14 @@ controls = [
>  cap_msr = MSR_IA32_VMX_PROCBASED_CTLS2,
>  ),
>  
> +Allowed1Control(
> +name = 'tertiary processor-based controls',
> +bits = {
> +4: 'Enable IPI virtualization'
> +},
> +cap_msr = MSR_IA32_VMX_PROCBASED_CTLS3,
> +),
> +
>  Control(
>  name = 'VM-Exit controls',
>  bits = {

Not sure which particular CPUs are going to implement this (it would be
nice to add this info to the blurb) but this matches the Intel doc
(https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html)
and the "IPI virtualization support for VM" series for KVM, so

Reviewed-by: Vitaly Kuznetsov 

-- 
Vitaly
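
For readers who do not have the vmxcap script at hand, the difference between
a regular capability MSR and the new tertiary-controls MSR can be sketched in
C. This is only an illustration of what the quoted patch comment says ("All
64 bits in the tertiary controls MSR are allowed-1"); the struct and function
names below are made up for the example and do not exist in QEMU or KVM.

```c
#include <stdint.h>

/* Hypothetical container for a decoded VMX capability MSR. */
struct vmx_ctl_caps {
    uint64_t allowed0;  /* bit set: the control is reported in the allowed-0 word */
    uint64_t allowed1;  /* bit set: the control may be set to 1 */
};

/* Classic layout: low 32 bits are the allowed-0 word, high 32 bits allowed-1. */
static struct vmx_ctl_caps decode_split(uint64_t msr)
{
    struct vmx_ctl_caps c = { msr & 0xffffffffu, msr >> 32 };
    return c;
}

/* Tertiary controls: all 64 bits are allowed-1, there is no allowed-0 half. */
static struct vmx_ctl_caps decode_allowed1(uint64_t msr)
{
    struct vmx_ctl_caps c = { 0, msr };
    return c;
}

/* Non-zero when control 'bit' can be enabled according to the decoded MSR. */
static int ctl_supported(struct vmx_ctl_caps c, unsigned bit)
{
    return (int)((c.allowed1 >> bit) & 1);
}
```

With this sketch, bit 4 ('Enable IPI virtualization' in the patch) is looked
up directly in the raw 64-bit value for the tertiary MSR, while for the older
MSRs the same bit number is looked up in the upper 32-bit half.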

[PATCH] i386: docs: Convert hyperv.txt to rST

2022-05-03 Thread Vitaly Kuznetsov

rSTify docs/hyperv.txt and link it from docs/system/target-i386.rst.

Signed-off-by: Vitaly Kuznetsov 
---
- The patch is supposed to be applied on top of "[PATCH v3 0/5] i386:
Enable newly introduced KVM Hyper-V enlightenments".
---
 docs/hyperv.txt | 289 
 docs/system/i386/hyperv.rst | 275 ++
 docs/system/target-i386.rst |   1 +
 3 files changed, 276 insertions(+), 289 deletions(-)
 delete mode 100644 docs/hyperv.txt
 create mode 100644 docs/system/i386/hyperv.rst

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
deleted file mode 100644
index 9553e5c03c6b..
--- a/docs/hyperv.txt
+++ /dev/null
@@ -1,289 +0,0 @@
-Hyper-V Enlightenments
-==
-
-
-1. Description
-===
-In some cases when implementing a hardware interface in software is slow, KVM
-implements its own paravirtualized interfaces. This works well for Linux as
-guest support for such features is added simultaneously with the feature itself.
-It may, however, be hard-to-impossible to add support for these interfaces to
-proprietary OSes, namely, Microsoft Windows.
-
-KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
-make Windows and Hyper-V guests think they're running on top of a Hyper-V
-compatible hypervisor and use Hyper-V specific features.
-
-
-2. Setup
-=
-No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
-QEMU, individual enlightenments can be enabled through CPU flags, e.g:
-
-  qemu-system-x86_64 --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...
-
-Sometimes there are dependencies between enlightenments, QEMU is supposed to
-check that the supplied configuration is sane.
-
-When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
-identification (CPUID 0x4000..0x400A) to Hyper-V. KVM identification
-and features are kept in leaves 0x4100..0x4101.
-
-
-3. Existing enlightenments
-===
-
-3.1. hv-relaxed
-
-This feature tells guest OS to disable watchdog timeouts as it is running on a
-hypervisor. It is known that some Windows versions will do this even when they
-see 'hypervisor' CPU flag.
-
-3.2. hv-vapic
-==
-Provides so-called VP Assist page MSR to guest allowing it to work with APIC
-more efficiently. In particular, this enlightenment allows paravirtualized
-(exit-less) EOI processing.
-
-3.3. hv-spinlocks=xxx
-==
-Enables paravirtualized spinlocks. The parameter indicates how many times
-spinlock acquisition should be attempted before indicating the situation to the
-hypervisor. A special value 0x indicates "never notify".
-
-3.4. hv-vpindex
-
-Provides HV_X64_MSR_VP_INDEX (0x4002) MSR to the guest which has Virtual
-processor index information. This enlightenment makes sense in conjunction with
-hv-synic, hv-stimer and other enlightenments which require the guest to know its
-Virtual Processor indices (e.g. when VP index needs to be passed in a
-hypercall).
-
-3.5. hv-runtime
-
-Provides HV_X64_MSR_VP_RUNTIME (0x4010) MSR to the guest. The MSR keeps the
-virtual processor run time in 100ns units. This gives guest operating system an
-idea of how much time was 'stolen' from it (when the virtual CPU was preempted
-to perform some other work).
-
-3.6. hv-crash
-==
-Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 (0x4100..0x4105) and
-HV_X64_MSR_CRASH_CTL (0x4105) MSRs to the guest. These MSRs are written to
-by the guest when it crashes, HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P5 MSRs
-contain additional crash information. This information is outputted in QEMU log
-and through QAPI.
-Note: unlike under genuine Hyper-V, write to HV_X64_MSR_CRASH_CTL causes guest
-to shutdown. This effectively blocks crash dump generation by Windows.
-
-3.7. hv-time
-=
-Enables two Hyper-V-specific clocksources available to the guest: MSR-based
-Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x4020) and Reference TSC
-page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x4021). Both clocksources
-are per-guest, Reference TSC page clocksource allows for exit-less time stamp
-readings. Using this enlightenment leads to significant speedup of all timestamp
-related operations.
-
-3.8. hv-synic
-==
-Enables Hyper-V Synthetic interrupt controller - an extension of a local APIC.
-When enabled, this enlightenment provides additional communication facilities
-to the guest: SynIC messages and Events. This is a pre-requisite for
-implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
-is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
-HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x4080..0x4084) and
-HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x4090..0x409F)
-
-Requires: hv-vpind

Re: [PATCH v3 0/5] i386: Enable newly introduced KVM Hyper-V enlightenments

2022-04-29 Thread Vitaly Kuznetsov

Paolo Bonzini  writes:

>> This series enables four new KVM Hyper-V enlightenments [...]
>>
>> docs/hyperv.txt| 34 ++
>
> Queued, thanks.  

Thanks!

> Would you please convert hyperv.txt to rST in docs/system/i386?

Sure, it's on my TODO list.

-- 
Vitaly

[PATCH v3 4/5] i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls

2022-04-19 Thread Vitaly Kuznetsov

KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs
per hypercall) since the implementation of Hyper-V PV TLB flush feature
(Linux-4.18) as regardless of the request, full TLB flush was always
performed. "Extended GVA ranges for TLB flush hypercalls" feature bit
wasn't exposed then. Now, as KVM gains support for fine-grained TLB
flush handling, exposing this feature starts making sense.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 7 +++
 target/i386/cpu.c  | 2 ++
 target/i386/cpu.h  | 1 +
 target/i386/kvm/hyperv-proto.h | 1 +
 target/i386/kvm/kvm.c  | 8 
 5 files changed, 19 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 857268d37d61..acc411eb84cf 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -241,6 +241,13 @@ Hyper-V specification allows to pass parameters for certain hypercalls using XMM
 registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows
 for faster hypercalls processing as KVM can avoid reading guest's memory.
 
+3.23. hv-tlbflush-ext
+=
+Allow for extended GVA ranges to be passed to Hyper-V TLB flush hypercalls
+(HvFlushVirtualAddressList/HvFlushVirtualAddressListEx).
+
+Requires: hv-tlbflush
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c4be8ffe7988..f80db9a403bd 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6929,6 +6929,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_MSR_BITMAP, 0),
 DEFINE_PROP_BIT64("hv-xmm-input", X86CPU, hyperv_features,
   HYPERV_FEAT_XMM_INPUT, 0),
+DEFINE_PROP_BIT64("hv-tlbflush-ext", X86CPU, hyperv_features,
+  HYPERV_FEAT_TLBFLUSH_EXT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index ea561e18f934..ec96b0e7a4cb 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1086,6 +1086,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_AVIC15
 #define HYPERV_FEAT_MSR_BITMAP  16
 #define HYPERV_FEAT_XMM_INPUT   17
+#define HYPERV_FEAT_TLBFLUSH_EXT18
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 74d91adb7a16..b3f42ab92051 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -55,6 +55,7 @@
 #define HV_GUEST_IDLE_STATE_AVAILABLE   (1u << 5)
 #define HV_FREQUENCY_MSRS_AVAILABLE (1u << 8)
 #define HV_GUEST_CRASH_MSR_AVAILABLE(1u << 10)
+#define HV_EXT_GVA_RANGES_FLUSH_AVAILABLE   (1u << 14)
 #define HV_STIMER_DIRECT_MODE_AVAILABLE (1u << 19)
 
 /*
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7f752ef4376a..8a71de07f3c7 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -980,6 +980,14 @@ static struct {
  .bits = HV_HYPERCALL_XMM_INPUT_AVAILABLE}
 }
 },
+[HYPERV_FEAT_TLBFLUSH_EXT] = {
+.desc = "Extended gva ranges for TLB flush hypercalls (hv-tlbflush-ext)",
+.flags = {
+{.func = HV_CPUID_FEATURES, .reg = R_EDX,
+ .bits = HV_EXT_GVA_RANGES_FLUSH_AVAILABLE}
+},
+.dependencies = BIT(HYPERV_FEAT_TLBFLUSH)
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.1
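
As background for the "up to 4095 additional GFNs" remark in the commit
message above, here is a small C sketch of how one GVA list entry could be
packed. It rests on an assumption taken from my reading of the TLFS and the
Linux guest code, not from this patch: the low 12 bits of an entry hold the
number of additional pages and the remaining bits hold the page-aligned
address. All names are hypothetical.

```c
#include <stdint.h>

#define HV_PAGE_SHIFT       12
#define HV_PAGE_MASK        (~((1ull << HV_PAGE_SHIFT) - 1))
#define HV_MAX_EXTRA_PAGES  4095u   /* largest value that fits in 12 bits */

/* Pack a base address plus a count of additional pages into one entry. */
static uint64_t hv_gva_entry(uint64_t gva, uint32_t extra_pages)
{
    if (extra_pages > HV_MAX_EXTRA_PAGES) {
        extra_pages = HV_MAX_EXTRA_PAGES;   /* clamp instead of overflowing */
    }
    return (gva & HV_PAGE_MASK) | extra_pages;
}

/* Page-aligned start address encoded in the entry. */
static uint64_t hv_gva_entry_base(uint64_t entry)
{
    return entry & HV_PAGE_MASK;
}

/* Total number of pages covered: the base page plus the additional ones. */
static uint32_t hv_gva_entry_pages(uint64_t entry)
{
    return (uint32_t)(entry & ~HV_PAGE_MASK) + 1;
}
```

Under that assumption a single entry covers between 1 and 4096 pages, which
is why a fine-grained flush implementation has to look at the low bits
instead of treating every entry as a single page.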

[PATCH v3 2/5] i386: Hyper-V Enlightened MSR bitmap feature

2022-04-19 Thread Vitaly Kuznetsov

The newly introduced enlightenment allows L0 (KVM) and L1 (Hyper-V)
hypervisors to collaborate to avoid unnecessary updates to L2
MSR-Bitmap upon vmexits.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 10 ++
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  1 +
 target/i386/kvm/hyperv-proto.h |  5 +
 target/i386/kvm/kvm.c  |  7 +++
 5 files changed, 25 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 0417c183a3b0..08429124a634 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -225,6 +225,16 @@ default (WS2016).
 Note: hv-version-id-* are not enlightenments and thus don't enable Hyper-V
 identification when specified without any other enlightenments.
 
+3.21. hv-emsr-bitmap
+=
+The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
+enabled, it allows L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to
+avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. While the protocol is
+supported for both VMX (Intel) and SVM (AMD), the VMX implementation requires
+Enlightened VMCS ('hv-evmcs') feature to also be enabled.
+
+Recommended: hv-evmcs (Intel)
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index cb6b5467d067..3f053919685f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6925,6 +6925,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_STIMER_DIRECT, 0),
 DEFINE_PROP_BIT64("hv-avic", X86CPU, hyperv_features,
   HYPERV_FEAT_AVIC, 0),
+DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
+  HYPERV_FEAT_MSR_BITMAP, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 73dc387c52f5..9615c330315f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1084,6 +1084,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_IPI 13
 #define HYPERV_FEAT_STIMER_DIRECT   14
 #define HYPERV_FEAT_AVIC15
+#define HYPERV_FEAT_MSR_BITMAP  16
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 89f81afda7c6..38e25468122d 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -72,6 +72,11 @@
 #define HV_ENLIGHTENED_VMCS_RECOMMENDED (1u << 14)
 #define HV_NO_NONARCH_CORESHARING   (1u << 18)
 
+/*
+ * HV_CPUID_NESTED_FEATURES.EAX bits
+ */
+#define HV_NESTED_MSR_BITMAP(1u << 19)
+
 /*
  * Basic virtualized MSRs
  */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ff79994faa87..4059b46b9449 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -966,6 +966,13 @@ static struct {
  .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
 }
 },
+[HYPERV_FEAT_MSR_BITMAP] = {
+.desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
+.flags = {
+{.func = HV_CPUID_NESTED_FEATURES, .reg = R_EAX,
+ .bits = HV_NESTED_MSR_BITMAP}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.1

[PATCH v3 1/5] i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES

2022-04-19 Thread Vitaly Kuznetsov

Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently
as it was only used to encode the supported eVMCS version range. In fact,
there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In
preparation to adding these features, move HV_CPUID_NESTED_FEATURES leaf
handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.h |  1 -
 target/i386/kvm/kvm.c | 23 +++
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 982c5323537c..73dc387c52f5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1770,7 +1770,6 @@ struct ArchCPU {
 uint32_t hyperv_vendor_id[3];
 uint32_t hyperv_interface_id[4];
 uint32_t hyperv_limits[3];
-uint32_t hyperv_nested[4];
 bool hyperv_enforce_cpuid;
 uint32_t hyperv_ver_id_build;
 uint16_t hyperv_ver_id_major;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 9cf8e036698d..ff79994faa87 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -834,6 +834,8 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
 || env->user_tsc_khz;
 }
 
+#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
+
 static struct {
 const char *desc;
 struct {
@@ -1241,6 +1243,13 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 }
 }
 
+/* HV_CPUID_NESTED_FEATURES.EAX also encodes the supported eVMCS range */
+if (func == HV_CPUID_NESTED_FEATURES && reg == R_EAX) {
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+r |= DEFAULT_EVMCS_VERSION;
+}
+}
+
 return r;
 }
 
@@ -1370,11 +1379,13 @@ static int hyperv_fill_cpuids(CPUState *cs,
 X86CPU *cpu = X86_CPU(cs);
 struct kvm_cpuid_entry2 *c;
 uint32_t cpuid_i = 0;
+uint32_t nested_eax =
+hv_build_cpuid_leaf(cs, HV_CPUID_NESTED_FEATURES, R_EAX);
 
 c = _ent[cpuid_i++];
 c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
-c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
-HV_CPUID_NESTED_FEATURES : HV_CPUID_IMPLEMENT_LIMITS;
+c->eax = nested_eax ? HV_CPUID_NESTED_FEATURES :
+HV_CPUID_IMPLEMENT_LIMITS;
 c->ebx = cpu->hyperv_vendor_id[0];
 c->ecx = cpu->hyperv_vendor_id[1];
 c->edx = cpu->hyperv_vendor_id[2];
@@ -1438,7 +1449,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->ecx = cpu->hyperv_limits[1];
 c->edx = cpu->hyperv_limits[2];
 
-if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+if (nested_eax) {
 uint32_t function;
 
 /* Create zeroed 0x4006..0x4009 leaves */
@@ -1450,7 +1461,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 
 c = _ent[cpuid_i++];
 c->function = HV_CPUID_NESTED_FEATURES;
-c->eax = cpu->hyperv_nested[0];
+c->eax = nested_eax;
 }
 
 return cpuid_i;
@@ -1472,8 +1483,6 @@ static bool evmcs_version_supported(uint16_t evmcs_version,
 (max_version <= max_supported_version);
 }
 
-#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
-
 static int hyperv_init_vcpu(X86CPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -1577,8 +1586,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
  supported_evmcs_version >> 8);
 return -ENOTSUP;
 }
-
-cpu->hyperv_nested[0] = evmcs_version;
 }
 
 if (cpu->hyperv_enforce_cpuid) {
-- 
2.35.1
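
The effect of the patch above can be restated as a standalone C sketch: the
nested features EAX value is the OR of the enabled feature bits plus the eVMCS
version field, and the maximum Hyper-V CPUID leaf depends on whether that
value is non-zero. DEFAULT_EVMCS_VERSION and HV_NESTED_MSR_BITMAP are copied
from the series; the leaf numbers are the TLFS values as I remember them, and
the two function names are invented for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_CPUID_IMPLEMENT_LIMITS  0x40000005u  /* assumed TLFS leaf number */
#define HV_CPUID_NESTED_FEATURES   0x4000000Au  /* assumed TLFS leaf number */
#define HV_NESTED_MSR_BITMAP       (1u << 19)
#define DEFAULT_EVMCS_VERSION      ((1u << 8) | 1u)

/* Build HV_CPUID_NESTED_FEATURES.EAX from the enabled enlightenments. */
static uint32_t build_nested_eax(bool evmcs, bool emsr_bitmap)
{
    uint32_t eax = 0;

    if (emsr_bitmap) {
        eax |= HV_NESTED_MSR_BITMAP;
    }
    /* EAX also encodes the supported eVMCS version range. */
    if (evmcs) {
        eax |= DEFAULT_EVMCS_VERSION;
    }
    return eax;
}

/* The nested leaf is only advertised when it has something in it. */
static uint32_t max_hv_cpuid_leaf(uint32_t nested_eax)
{
    return nested_eax ? HV_CPUID_NESTED_FEATURES : HV_CPUID_IMPLEMENT_LIMITS;
}
```

This shows why the leaf is now exposed whenever any nested feature bit is set
and not only when 'hv-evmcs' is enabled, which the follow-up patches in the
series rely on.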

[PATCH v3 0/5] i386: Enable newly introduced KVM Hyper-V enlightenments

2022-04-19 Thread Vitaly Kuznetsov

This is a continuation of "[PATCH v2 0/3] i386: Add support for Hyper-V
Enlightened MSR-Bitmap and XMM fast hypercall input features":
https://lore.kernel.org/qemu-devel/20220217142949.297454-1-vkuzn...@redhat.com/
work which wasn't merged for 7.0, thus 'v3'.

This series enables four new KVM Hyper-V enlightenments:

'XMM fast hypercall input feature' is supported by KVM since v5.14,
it allows for faster Hyper-V hypercall processing.

'Enlightened MSR-Bitmap' is a new nested-specific enlightenment that speeds
up L2 vmexits by avoiding unnecessary updates to L2 MSR-Bitmap. KVM support
for the feature on Intel CPUs is in v5.17 and in v5.18 for AMD CPUs.

'Extended GVA ranges for TLB flush hypercalls' indicates that extended GVA
ranges are allowed to be passed to Hyper-V TLB flush hypercalls.

'Direct TLB flush hypercall' feature allows L0 (KVM) to directly handle
L2's TLB flush hypercalls without the need to exit to L1 (Hyper-V).

The last two features are not merged in KVM yet:
https://lore.kernel.org/kvm/20220414132013.1588929-1-vkuzn...@redhat.com/
however, there's no direct dependency on the kernel part as thanks to
KVM_GET_SUPPORTED_HV_CPUID no new capabilities are introduced.
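
To illustrate why no new capability is needed: once the supported Hyper-V
CPUID leaves have been fetched from KVM, deciding whether an enlightenment
can be enabled is a plain bit test. The sketch below uses a simplified
stand-in for the CPUID entry structure and hypothetical names; it is not the
QEMU implementation. HV_CPUID_FEATURES is assumed to be leaf 0x40000003.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HV_CPUID_FEATURES                  0x40000003u  /* assumed leaf number */
#define HV_EXT_GVA_RANGES_FLUSH_AVAILABLE  (1u << 14)

/* Simplified stand-in for one entry returned by the kernel. */
struct cpuid_entry {
    uint32_t function;
    uint32_t eax, ebx, ecx, edx;
};

enum cpuid_reg { REG_EAX, REG_EBX, REG_ECX, REG_EDX };

/* True when every requested bit is set in the given leaf/register. */
static bool hv_bits_supported(const struct cpuid_entry *ents, size_t n,
                              uint32_t func, enum cpuid_reg reg, uint32_t bits)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t val = 0;

        if (ents[i].function != func) {
            continue;
        }
        switch (reg) {
        case REG_EAX: val = ents[i].eax; break;
        case REG_EBX: val = ents[i].ebx; break;
        case REG_ECX: val = ents[i].ecx; break;
        case REG_EDX: val = ents[i].edx; break;
        }
        return (val & bits) == bits;
    }
    return false;   /* leaf not reported at all */
}
```

An older kernel simply does not report the bit, so a request for the
corresponding enlightenment fails this check without QEMU having to probe a
dedicated capability.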

Vitaly Kuznetsov (5):
  i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES
  i386: Hyper-V Enlightened MSR bitmap feature
  i386: Hyper-V XMM fast hypercall input feature
  i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls
  i386: Hyper-V Direct TLB flush hypercall

 docs/hyperv.txt| 34 ++
 target/i386/cpu.c  |  8 +
 target/i386/cpu.h  |  5 +++-
 target/i386/kvm/hyperv-proto.h |  9 +-
 target/i386/kvm/kvm.c  | 53 +-
 5 files changed, 99 insertions(+), 10 deletions(-)

-- 
2.35.1

[PATCH v3 5/5] i386: Hyper-V Direct TLB flush hypercall

2022-04-19 Thread Vitaly Kuznetsov

Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's
TLB flush hypercalls handling. With the correct setup, L2's TLB flush
hypercalls can be handled by L0 directly, without the need to exit to
L1.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 11 +++
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  1 +
 target/i386/kvm/hyperv-proto.h |  1 +
 target/i386/kvm/kvm.c  |  8 
 5 files changed, 23 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index acc411eb84cf..9553e5c03c6b 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -248,6 +248,17 @@ Allow for extended GVA ranges to be passed to Hyper-V TLB flush hypercalls
 
 Requires: hv-tlbflush
 
+3.24. hv-tlbflush-direct
+=
+The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
+enabled, it allows L0 (KVM) to directly handle TLB flush hypercalls from L2
+guest without the need to exit to L1 (Hyper-V) hypervisor. While the feature is
+supported for both VMX (Intel) and SVM (AMD), the VMX implementation requires
+Enlightened VMCS ('hv-evmcs') feature to also be enabled.
+
+Requires: hv-vapic
+Recommended: hv-evmcs (Intel)
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f80db9a403bd..e8bbaf24d38d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6931,6 +6931,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_XMM_INPUT, 0),
 DEFINE_PROP_BIT64("hv-tlbflush-ext", X86CPU, hyperv_features,
   HYPERV_FEAT_TLBFLUSH_EXT, 0),
+DEFINE_PROP_BIT64("hv-tlbflush-direct", X86CPU, hyperv_features,
+  HYPERV_FEAT_TLBFLUSH_DIRECT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index ec96b0e7a4cb..2d17d52c00c1 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1087,6 +1087,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_MSR_BITMAP  16
 #define HYPERV_FEAT_XMM_INPUT   17
 #define HYPERV_FEAT_TLBFLUSH_EXT18
+#define HYPERV_FEAT_TLBFLUSH_DIRECT 19
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index b3f42ab92051..28d7759770e1 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -76,6 +76,7 @@
 /*
  * HV_CPUID_NESTED_FEATURES.EAX bits
  */
+#define HV_NESTED_DIRECT_FLUSH  (1u << 17)
 #define HV_NESTED_MSR_BITMAP(1u << 19)
 
 /*
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 8a71de07f3c7..e966ab467b74 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -988,6 +988,14 @@ static struct {
 },
 .dependencies = BIT(HYPERV_FEAT_TLBFLUSH)
 },
+[HYPERV_FEAT_TLBFLUSH_DIRECT] = {
+.desc = "direct TLB flush (hv-tlbflush-direct)",
+.flags = {
+{.func = HV_CPUID_NESTED_FEATURES, .reg = R_EAX,
+ .bits = HV_NESTED_DIRECT_FLUSH}
+},
+.dependencies = BIT(HYPERV_FEAT_VAPIC)
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.1

[PATCH v3 3/5] i386: Hyper-V XMM fast hypercall input feature

2022-04-19 Thread Vitaly Kuznetsov

Hyper-V specification allows to pass parameters for certain hypercalls
using XMM registers ("XMM Fast Hypercall Input"). When the feature is
in use, it allows for faster hypercalls processing as KVM can avoid
reading guest's memory.

KVM supports the feature since v5.14.

Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to
comply with KVM.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 6 ++
 target/i386/cpu.c  | 2 ++
 target/i386/cpu.h  | 1 +
 target/i386/kvm/hyperv-proto.h | 2 +-
 target/i386/kvm/kvm.c  | 7 +++
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 08429124a634..857268d37d61 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -235,6 +235,12 @@ Enlightened VMCS ('hv-evmcs') feature to also be enabled.
 
 Recommended: hv-evmcs (Intel)
 
+3.22. hv-xmm-input
+===
+Hyper-V specification allows to pass parameters for certain hypercalls using XMM
+registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows
+for faster hypercalls processing as KVM can avoid reading guest's memory.
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3f053919685f..c4be8ffe7988 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6927,6 +6927,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_AVIC, 0),
 DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
   HYPERV_FEAT_MSR_BITMAP, 0),
+DEFINE_PROP_BIT64("hv-xmm-input", X86CPU, hyperv_features,
+  HYPERV_FEAT_XMM_INPUT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9615c330315f..ea561e18f934 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1085,6 +1085,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define HYPERV_FEAT_STIMER_DIRECT   14
 #define HYPERV_FEAT_AVIC15
 #define HYPERV_FEAT_MSR_BITMAP  16
+#define HYPERV_FEAT_XMM_INPUT   17
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 38e25468122d..74d91adb7a16 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -51,7 +51,7 @@
 #define HV_GUEST_DEBUGGING_AVAILABLE(1u << 1)
 #define HV_PERF_MONITOR_AVAILABLE   (1u << 2)
 #define HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE   (1u << 3)
-#define HV_HYPERCALL_PARAMS_XMM_AVAILABLE   (1u << 4)
+#define HV_HYPERCALL_XMM_INPUT_AVAILABLE(1u << 4)
 #define HV_GUEST_IDLE_STATE_AVAILABLE   (1u << 5)
 #define HV_FREQUENCY_MSRS_AVAILABLE (1u << 8)
 #define HV_GUEST_CRASH_MSR_AVAILABLE(1u << 10)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4059b46b9449..7f752ef4376a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -973,6 +973,13 @@ static struct {
  .bits = HV_NESTED_MSR_BITMAP}
 }
 },
+[HYPERV_FEAT_XMM_INPUT] = {
+.desc = "XMM fast hypercall input (hv-xmm-input)",
+.flags = {
+{.func = HV_CPUID_FEATURES, .reg = R_EDX,
+ .bits = HV_HYPERCALL_XMM_INPUT_AVAILABLE}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.1

Re: [Qemu-devel] [PATCH 6/8] i386/kvm: hv-stimer requires hv-time and hv-synic

2022-04-12 Thread Vitaly Kuznetsov

Divya Garg  writes:

> On 12/04/22 6:18 pm, Vitaly Kuznetsov wrote:
>> Divya Garg  writes:
>>
>>> Hi Vitaly Kuznetsov !
>>> I was working on hyperv flags and saw that we introduced new
>>> dependencies some time back
>>> (https://urldefense.proofpoint.com/v2/url?u=https-3A__sourcegraph.com_github.com_qemu_qemu_-2D_commit_c686193072a47032d83cb4e131dc49ae30f9e5d7-3Fvisible-3D1=DwIBAg=s883GpUCOChKOHiocYtGcg=2QGHz-fTCVWImEBKe1ZcSe5t6UfasnhvdzD5DcixwOE=ln-t0rKlkFkOEKe97jJTLi2BoKK5E9lLMPHjPihl4kpdbvBStPeD0Ku9wTed7GPf=AtipQDs1Mi-0FQtb1AyvBpR34bpjp64troGF_nr_08E=
>>>  ).
>>> After these changes, if we try to live migrate a vm from older qemu to newer
>>> one having these changes, it fails showing dependency issue.
>>>
>>> I was wondering if this is the expected behaviour or if there is any work
>>> around for handling it ? Or something needs to be done to ensure backward
>>> compatibility ?
>> Hi Divya,
>>
>> configurations with 'hv-stimer' and without 'hv-synic'/'hv-time' were
>> always incorrect as Windows can't use the feature, that's why the
>> dependencies were added. It is true that it doesn't seem to be possible
>> to forward-migrate such VMs to newer QEMU versions. We could've tied
>> these new dependencies to newer machine types I guess (so old machine
>> types would not fail to start) but we didn't do that back in 4.1 and
>> it's been awhile since... Not sure whether it would make much sense to
>> introduce something for pre-4.1 machine types now.
>>
>> Out of curiosity, why do such "incorrect" configurations exist? Can you
>> just update them to include missing flags on older QEMU so they migrate
>> to newer ones without issues?
>>
> Hi Vitaly !
>
> Thanks for the response. I understand that these were incorrect
> configurations and should be corrected. Only issue is, we want to avoid
> power cycling those VMs. But yeah I think, since the configurations were
> wrong we should update and power cycle the VM.  Just for understanding
> purpose, is it possible to disable the feature by throwing out some
> warning message and update libvirt to mitigate this change and handle
> live migration ?
>

I'm not exactly sure about libvirt, I was under the impression it makes
sure that QEMU command line is the same on the destination and on the
source. If there's a way to add something, I'd suggest you add the
missing features (hv-time, hv-synic) on the destination rather than
remove 'hv-stimer' as it is probably safer.

> Or maybe update libvirt not to ask for this feature from qemu during live
> migration and handle different configuration on source and destination
> host ?

You can also modify QEMU locally and throw away these dependencies;
it'll allow these configurations again. Generally speaking, though,
checking that the set of Hyper-V features is exactly the same on the
source and destination is the right thing to do: there are no guarantees
that the guest OS (Windows) will keep behaving sanely when the
corresponding CPUIDs change while it's running; all sorts of things are
possible, I believe.
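
To illustrate the "exactly the same set" rule, here is a minimal sketch in
C. It is not QEMU or libvirt code: the bit positions and the helper names
(hv_features_match, hv_features_diff) are made up for this example, and a
real check would compare the full set of enlightenments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define FEAT_HV_TIME   (UINT64_C(1) << 0)
#define FEAT_HV_SYNIC  (UINT64_C(1) << 1)
#define FEAT_HV_STIMER (UINT64_C(1) << 2)

/*
 * Migration is only considered safe when both sides expose exactly the
 * same set of Hyper-V features: no additions and no removals.
 */
static bool hv_features_match(uint64_t src, uint64_t dst)
{
    return src == dst;
}

/* Bits set on exactly one side, e.g. for building an error message. */
static uint64_t hv_features_diff(uint64_t src, uint64_t dst)
{
    return src ^ dst;
}
```

With a source that has only 'hv-stimer' and a destination that adds
'hv-time' and 'hv-synic', hv_features_match() returns false and
hv_features_diff() reports the two added bits, which is the mismatch the
guest would observe in CPUID.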

-- 
Vitaly

Re: [Qemu-devel] [PATCH 6/8] i386/kvm: hv-stimer requires hv-time and hv-synic

2022-04-12 Thread Vitaly Kuznetsov

Divya Garg  writes:

> Hi Vitaly Kuznetsov !
> I was working on hyperv flags and saw that we introduced new dependencies
> some time back
> (https://sourcegraph.com/github.com/qemu/qemu/-/commit/c686193072a47032d83cb4e131dc49ae30f9e5d7?visible=1).
> After these changes, if we try to live migrate a VM from an older QEMU to a
> newer one having these changes, it fails showing a dependency issue.
>
> I was wondering if this is the expected behaviour or if there is any
> workaround for handling it? Or does something need to be done to ensure
> backward compatibility?

Hi Divya,

configurations with 'hv-stimer' and without 'hv-synic'/'hv-time' were
always incorrect as Windows can't use the feature, that's why the
dependencies were added. It is true that it doesn't seem to be possible
to forward-migrate such VMs to newer QEMU versions. We could've tied
these new dependencies to newer machine types I guess (so old machine
types would not fail to start) but we didn't do that back in 4.1 and
it's been awhile since... Not sure whether it would make much sense to
introduce something for pre-4.1 machine types now.

Out of curiosity, why do such "incorrect" configurations exist? Can you
just update them to include missing flags on older QEMU so they migrate
to newer ones without issues?
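
For reference, the dependency rule discussed above ('hv-stimer' requires
'hv-time' and 'hv-synic') can be sketched as a small table-driven check.
This is only an illustration under assumed names: the bit positions, the
hv_deps table and hv_missing_deps() are hypothetical and do not mirror the
actual QEMU implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define FEAT_HV_TIME   (UINT64_C(1) << 0)
#define FEAT_HV_SYNIC  (UINT64_C(1) << 1)
#define FEAT_HV_STIMER (UINT64_C(1) << 2)

/* Each entry: a feature and the features it depends on. */
static const struct {
    uint64_t feature;
    uint64_t deps;
} hv_deps[] = {
    { FEAT_HV_STIMER, FEAT_HV_TIME | FEAT_HV_SYNIC },
};

/* Return the dependencies that are required but not enabled. */
static uint64_t hv_missing_deps(uint64_t enabled)
{
    uint64_t missing = 0;

    for (size_t i = 0; i < sizeof(hv_deps) / sizeof(hv_deps[0]); i++) {
        if (enabled & hv_deps[i].feature) {
            missing |= hv_deps[i].deps & ~enabled;
        }
    }
    return missing;
}
```

A configuration with only 'hv-stimer' yields both 'hv-time' and 'hv-synic'
as missing; adding those two flags on the older QEMU makes the result zero.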

-- 
Vitaly

Re: [PATCH v2 0/3] i386: Add support for Hyper-V Enlightened MSR-Bitmap and XMM fast hypercall input features

2022-03-06 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> 'XMM fast hypercall input feature' is supported by KVM since v5.14,
> it allows for faster Hyper-V hypercall processing.
>
> 'Enlightened MSR-Bitmap' is a new nested-specific enlightenment which speeds
> up L2 vmexits by avoiding unnecessary updates to L2 MSR-Bitmap. KVM support
> for the feature on Intel CPUs is coming in v5.17 and is queued for 5.18 for
> AMD CPUs.
>

Gentle ping) It seems the time is running out to get this in 7.0...

-- 
Vitaly

[PATCH 1/2] i386: Add Icelake-Server-v6 CPU model with 5-level EPT support

2022-02-21 Thread Vitaly Kuznetsov

Windows 11 with WSL2 enabled (Hyper-V) fails to boot with Icelake-Server
{-v5} CPU model but boots well with '-cpu host'. Apparently, it expects
5-level paging and 5-level EPT support to come as a pair but QEMU's
Icelake-Server CPU model lacks the latter. Introduce 'Icelake-Server-v6'
CPU model with 'vmx-page-walk-5' enabled by default.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index aa9e6368004c..6e25d1333971 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3505,6 +3505,14 @@ static const X86CPUDefinition builtin_x86_defs[] = {
 { /* end of list */ }
 },
 },
+{
+.version = 6,
+.note = "5-level EPT",
+.props = (PropValue[]) {
+{ "vmx-page-walk-5", "on" },
+{ /* end of list */ }
+},
+},
 { /* end of list */ }
 }
 },
-- 
2.35.1

[PATCH 2/2] vmxcap: Add 5-level EPT bit

2022-02-21 Thread Vitaly Kuznetsov

5-level EPT is present in Icelake Server CPUs and is supported by QEMU
('vmx-page-walk-5').

Signed-off-by: Vitaly Kuznetsov 
---
 scripts/kvm/vmxcap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/kvm/vmxcap b/scripts/kvm/vmxcap
index 6fe66d5f5753..f140040104bf 100755
--- a/scripts/kvm/vmxcap
+++ b/scripts/kvm/vmxcap
@@ -249,6 +249,7 @@ controls = [
 bits = {
 0: 'Execute-only EPT translations',
 6: 'Page-walk length 4',
+7: 'Page-walk length 5',
 8: 'Paging-structure memory type UC',
 14: 'Paging-structure memory type WB',
 16: '2MB EPT pages',
-- 
2.35.1

[PATCH v2 3/3] i386: Hyper-V XMM fast hypercall input feature

2022-02-17 Thread Vitaly Kuznetsov

The Hyper-V specification allows passing parameters for certain hypercalls
using XMM registers ("XMM Fast Hypercall Input"). When the feature is
in use, it allows for faster hypercall processing as KVM can avoid
reading the guest's memory.

KVM supports the feature since v5.14.

Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to
comply with KVM.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 6 ++
 target/i386/cpu.c  | 2 ++
 target/i386/cpu.h  | 1 +
 target/i386/kvm/hyperv-proto.h | 2 +-
 target/i386/kvm/kvm.c  | 7 +++
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 08429124a634..857268d37d61 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -235,6 +235,12 @@ Enlightened VMCS ('hv-evmcs') feature to also be enabled.
 
 Recommended: hv-evmcs (Intel)
 
+3.22. hv-xmm-input
+==================
+Hyper-V specification allows to pass parameters for certain hypercalls using XMM
+registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows
+for faster hypercalls processing as KVM can avoid reading guest's memory.
+
 4. Supplementary features
 =========================
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f7405fdf4fa5..0b171db1d046 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6841,6 +6841,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_AVIC, 0),
 DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
   HYPERV_FEAT_MSR_BITMAP, 0),
+DEFINE_PROP_BIT64("hv-xmm-input", X86CPU, hyperv_features,
+  HYPERV_FEAT_XMM_INPUT, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d6ae9e60a9a0..da251d165d13 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1061,6 +1061,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define HYPERV_FEAT_STIMER_DIRECT   14
 #define HYPERV_FEAT_AVIC            15
 #define HYPERV_FEAT_MSR_BITMAP      16
+#define HYPERV_FEAT_XMM_INPUT       17
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 38e25468122d..74d91adb7a16 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -51,7 +51,7 @@
 #define HV_GUEST_DEBUGGING_AVAILABLE            (1u << 1)
 #define HV_PERF_MONITOR_AVAILABLE               (1u << 2)
 #define HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE   (1u << 3)
-#define HV_HYPERCALL_PARAMS_XMM_AVAILABLE       (1u << 4)
+#define HV_HYPERCALL_XMM_INPUT_AVAILABLE        (1u << 4)
 #define HV_GUEST_IDLE_STATE_AVAILABLE           (1u << 5)
 #define HV_FREQUENCY_MSRS_AVAILABLE             (1u << 8)
 #define HV_GUEST_CRASH_MSR_AVAILABLE            (1u << 10)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f719ef3f8384..8279b116fac6 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -941,6 +941,13 @@ static struct {
  .bits = HV_NESTED_MSR_BITMAP}
 }
 },
+[HYPERV_FEAT_XMM_INPUT] = {
+.desc = "XMM fast hypercall input (hv-xmm-input)",
+.flags = {
+{.func = HV_CPUID_FEATURES, .reg = R_EDX,
+ .bits = HV_HYPERCALL_XMM_INPUT_AVAILABLE}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.1

[PATCH v2 2/3] i386: Hyper-V Enlightened MSR bitmap feature

2022-02-17 Thread Vitaly Kuznetsov

The newly introduced enlightenment allows L0 (KVM) and L1 (Hyper-V)
hypervisors to collaborate to avoid unnecessary updates to L2
MSR-Bitmap upon vmexits.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 10 ++
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  1 +
 target/i386/kvm/hyperv-proto.h |  5 +
 target/i386/kvm/kvm.c  |  7 +++
 5 files changed, 25 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 0417c183a3b0..08429124a634 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -225,6 +225,16 @@ default (WS2016).
 Note: hv-version-id-* are not enlightenments and thus don't enable Hyper-V
 identification when specified without any other enlightenments.
 
+3.21. hv-emsr-bitmap
+====================
+The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
+enabled, it allows L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to
+avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. While the protocol is
+supported for both VMX (Intel) and SVM (AMD), the VMX implementation requires
+Enlightened VMCS ('hv-evmcs') feature to also be enabled.
+
+Recommended: hv-evmcs (Intel)
+
 4. Supplementary features
 =========================
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index aa9e6368004c..f7405fdf4fa5 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6839,6 +6839,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_STIMER_DIRECT, 0),
 DEFINE_PROP_BIT64("hv-avic", X86CPU, hyperv_features,
   HYPERV_FEAT_AVIC, 0),
+DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
+  HYPERV_FEAT_MSR_BITMAP, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 537479d24928..d6ae9e60a9a0 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1060,6 +1060,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define HYPERV_FEAT_IPI             13
 #define HYPERV_FEAT_STIMER_DIRECT   14
 #define HYPERV_FEAT_AVIC            15
+#define HYPERV_FEAT_MSR_BITMAP      16
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 89f81afda7c6..38e25468122d 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -72,6 +72,11 @@
 #define HV_ENLIGHTENED_VMCS_RECOMMENDED (1u << 14)
 #define HV_NO_NONARCH_CORESHARING   (1u << 18)
 
+/*
+ * HV_CPUID_NESTED_FEATURES.EAX bits
+ */
+#define HV_NESTED_MSR_BITMAP                (1u << 19)
+
 /*
  * Basic virtualized MSRs
  */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ceb331db8963..f719ef3f8384 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -934,6 +934,13 @@ static struct {
  .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
 }
 },
+[HYPERV_FEAT_MSR_BITMAP] = {
+.desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
+.flags = {
+{.func = HV_CPUID_NESTED_FEATURES, .reg = R_EAX,
+ .bits = HV_NESTED_MSR_BITMAP}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.35.1

[PATCH v2 0/3] i386: Add support for Hyper-V Enlightened MSR-Bitmap and XMM fast hypercall input features

2022-02-17 Thread Vitaly Kuznetsov

'XMM fast hypercall input feature' is supported by KVM since v5.14,
it allows for faster Hyper-V hypercall processing.

'Enlightened MSR-Bitmap' is a new nested-specific enlightenment which speeds
up L2 vmexits by avoiding unnecessary updates to L2 MSR-Bitmap. KVM support
for the feature on Intel CPUs is coming in v5.17 and is queued for 5.18 for
AMD CPUs.
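
As a rough illustration of how HV_CPUID_NESTED_FEATURES.EAX ends up being
composed once the series is applied, here is a standalone sketch. The two
constants are the values used in the patches of this series; the helper
nested_features_eax() and its boolean parameters are hypothetical and only
stand in for the real feature checks in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as they appear in the patches of this series. */
#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
#define HV_NESTED_MSR_BITMAP  (1u << 19)

/*
 * Hypothetical helper: build HV_CPUID_NESTED_FEATURES.EAX from the two
 * nested enlightenments. The low 16 bits carry the supported eVMCS
 * version range, bit 19 advertises the Enlightened MSR-Bitmap.
 */
static uint32_t nested_features_eax(bool evmcs, bool emsr_bitmap)
{
    uint32_t eax = 0;

    if (emsr_bitmap) {
        eax |= HV_NESTED_MSR_BITMAP;
    }
    if (evmcs) {
        eax |= DEFAULT_EVMCS_VERSION;
    }
    return eax;
}
```

A zero result means there is nothing to advertise in the nested features
leaf, which is the condition patch 1 uses to decide whether the leaf is
exposed at all.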

Vitaly Kuznetsov (3):
  i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES
  i386: Hyper-V Enlightened MSR bitmap feature
  i386: Hyper-V XMM fast hypercall input feature

 docs/hyperv.txt| 16 +++
 target/i386/cpu.c  |  4 
 target/i386/cpu.h  |  3 ++-
 target/i386/kvm/hyperv-proto.h |  7 ++-
 target/i386/kvm/kvm.c  | 37 ++
 5 files changed, 57 insertions(+), 10 deletions(-)

-- 
2.35.1

[PATCH v2 1/3] i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES

2022-02-17 Thread Vitaly Kuznetsov

Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently
as it was only used to encode the supported eVMCS version range. In fact,
there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In
preparation for adding these features, move HV_CPUID_NESTED_FEATURES leaf
handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.h |  1 -
 target/i386/kvm/kvm.c | 23 +++
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9911d7c8711b..537479d24928 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1725,7 +1725,6 @@ struct X86CPU {
 uint32_t hyperv_vendor_id[3];
 uint32_t hyperv_interface_id[4];
 uint32_t hyperv_limits[3];
-uint32_t hyperv_nested[4];
 bool hyperv_enforce_cpuid;
 uint32_t hyperv_ver_id_build;
 uint16_t hyperv_ver_id_major;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 2c8feb4a6f7b..ceb331db8963 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -802,6 +802,8 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
 || env->user_tsc_khz;
 }
 
+#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
+
 static struct {
 const char *desc;
 struct {
@@ -1209,6 +1211,13 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 }
 }
 
+/* HV_CPUID_NESTED_FEATURES.EAX also encodes the supported eVMCS range */
+if (func == HV_CPUID_NESTED_FEATURES && reg == R_EAX) {
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+r |= DEFAULT_EVMCS_VERSION;
+}
+}
+
 return r;
 }
 
@@ -1338,11 +1347,13 @@ static int hyperv_fill_cpuids(CPUState *cs,
 X86CPU *cpu = X86_CPU(cs);
 struct kvm_cpuid_entry2 *c;
 uint32_t cpuid_i = 0;
+uint32_t nested_eax =
+hv_build_cpuid_leaf(cs, HV_CPUID_NESTED_FEATURES, R_EAX);
 
 c = &cpuid_ent[cpuid_i++];
 c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
-c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
-HV_CPUID_NESTED_FEATURES : HV_CPUID_IMPLEMENT_LIMITS;
+c->eax = nested_eax ? HV_CPUID_NESTED_FEATURES :
+HV_CPUID_IMPLEMENT_LIMITS;
 c->ebx = cpu->hyperv_vendor_id[0];
 c->ecx = cpu->hyperv_vendor_id[1];
 c->edx = cpu->hyperv_vendor_id[2];
@@ -1406,7 +1417,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->ecx = cpu->hyperv_limits[1];
 c->edx = cpu->hyperv_limits[2];
 
-if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+if (nested_eax) {
 uint32_t function;
 
 /* Create zeroed 0x40000006..0x40000009 leaves */
@@ -1418,7 +1429,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 
 c = &cpuid_ent[cpuid_i++];
 c->function = HV_CPUID_NESTED_FEATURES;
-c->eax = cpu->hyperv_nested[0];
+c->eax = nested_eax;
 }
 
 return cpuid_i;
@@ -1440,8 +1451,6 @@ static bool evmcs_version_supported(uint16_t evmcs_version,
 (max_version <= max_supported_version);
 }
 
-#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
-
 static int hyperv_init_vcpu(X86CPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -1545,8 +1554,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
  supported_evmcs_version >> 8);
 return -ENOTSUP;
 }
-
-cpu->hyperv_nested[0] = evmcs_version;
 }
 
 if (cpu->hyperv_enforce_cpuid) {
-- 
2.35.1

Re: [PATCH 0/2] i386: Add support for Hyper-V Enlightened MSR-Bitmap feature

2022-01-27 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> The new nested specific enlightenment speeds up L2 vmexits by avoiding
> unnecessary updates to L2 MSR-Bitmap. Support for both VMX and SVM is
> coming to KVM:
> https://lore.kernel.org/kvm/20211129094704.326635-1-vkuzn...@redhat.com/
> https://lore.kernel.org/kvm/20211220152139.418372-1-vkuzn...@redhat.com/
>

Ping)

VMX part made it to KVM in v5.17-rc1:

commit 502d2bf5f2fd7c05adc2d4f057910bd5d4c4c63e
Author: Vitaly Kuznetsov 
Date:   Mon Nov 29 10:47:04 2021 +0100

KVM: nVMX: Implement Enlightened MSR Bitmap feature

SVM part is still pending, will likely go to 5.18. QEMU enablement code
is, however, the same.

> Vitaly Kuznetsov (2):
>   i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES
>   i386: Hyper-V Enlightened MSR bitmap feature
>
>  docs/hyperv.txt| 10 ++
>  target/i386/cpu.c  |  2 ++
>  target/i386/cpu.h  |  2 +-
>  target/i386/kvm/hyperv-proto.h |  5 +
>  target/i386/kvm/kvm.c  | 30 ++
>  5 files changed, 40 insertions(+), 9 deletions(-)

-- 
Vitaly

[PATCH 1/2] i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES

2022-01-05 Thread Vitaly Kuznetsov

Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently
as it was only used to encode the supported eVMCS version range. In fact,
there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In
preparation for adding these features, move HV_CPUID_NESTED_FEATURES leaf
handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.h |  1 -
 target/i386/kvm/kvm.c | 23 +++
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 04f2b790c9fa..a1165215d972 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1722,7 +1722,6 @@ struct X86CPU {
 uint32_t hyperv_vendor_id[3];
 uint32_t hyperv_interface_id[4];
 uint32_t hyperv_limits[3];
-uint32_t hyperv_nested[4];
 bool hyperv_enforce_cpuid;
 uint32_t hyperv_ver_id_build;
 uint16_t hyperv_ver_id_major;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 13f8e30c2a54..c8f4956a4e0e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -801,6 +801,8 @@ static bool tsc_is_stable_and_known(CPUX86State *env)
 || env->user_tsc_khz;
 }
 
+#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
+
 static struct {
 const char *desc;
 struct {
@@ -1208,6 +1210,13 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 }
 }
 
+/* HV_CPUID_NESTED_FEATURES.EAX also encodes the supported eVMCS range */
+if (func == HV_CPUID_NESTED_FEATURES && reg == R_EAX) {
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+r |= DEFAULT_EVMCS_VERSION;
+}
+}
+
 return r;
 }
 
@@ -1337,11 +1346,13 @@ static int hyperv_fill_cpuids(CPUState *cs,
 X86CPU *cpu = X86_CPU(cs);
 struct kvm_cpuid_entry2 *c;
 uint32_t cpuid_i = 0;
+uint32_t nested_eax =
+hv_build_cpuid_leaf(cs, HV_CPUID_NESTED_FEATURES, R_EAX);
 
 c = &cpuid_ent[cpuid_i++];
 c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
-c->eax = hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS) ?
-HV_CPUID_NESTED_FEATURES : HV_CPUID_IMPLEMENT_LIMITS;
+c->eax = nested_eax ? HV_CPUID_NESTED_FEATURES :
+HV_CPUID_IMPLEMENT_LIMITS;
 c->ebx = cpu->hyperv_vendor_id[0];
 c->ecx = cpu->hyperv_vendor_id[1];
 c->edx = cpu->hyperv_vendor_id[2];
@@ -1405,7 +1416,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->ecx = cpu->hyperv_limits[1];
 c->edx = cpu->hyperv_limits[2];
 
-if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
+if (nested_eax) {
 uint32_t function;
 
 /* Create zeroed 0x40000006..0x40000009 leaves */
@@ -1417,7 +1428,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 
 c = &cpuid_ent[cpuid_i++];
 c->function = HV_CPUID_NESTED_FEATURES;
-c->eax = cpu->hyperv_nested[0];
+c->eax = nested_eax;
 }
 
 return cpuid_i;
@@ -1439,8 +1450,6 @@ static bool evmcs_version_supported(uint16_t evmcs_version,
 (max_version <= max_supported_version);
 }
 
-#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
-
 static int hyperv_init_vcpu(X86CPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -1544,8 +1553,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
  supported_evmcs_version >> 8);
 return -ENOTSUP;
 }
-
-cpu->hyperv_nested[0] = evmcs_version;
 }
 
 if (cpu->hyperv_enforce_cpuid) {
-- 
2.33.1

[PATCH 0/2] i386: Add support for Hyper-V Enlightened MSR-Bitmap feature

2022-01-05 Thread Vitaly Kuznetsov

The new nested specific enlightenment speeds up L2 vmexits by avoiding
unnecessary updates to L2 MSR-Bitmap. Support for both VMX and SVM is
coming to KVM:
https://lore.kernel.org/kvm/20211129094704.326635-1-vkuzn...@redhat.com/
https://lore.kernel.org/kvm/20211220152139.418372-1-vkuzn...@redhat.com/

Vitaly Kuznetsov (2):
  i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES
  i386: Hyper-V Enlightened MSR bitmap feature

 docs/hyperv.txt| 10 ++
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  2 +-
 target/i386/kvm/hyperv-proto.h |  5 +
 target/i386/kvm/kvm.c  | 30 ++
 5 files changed, 40 insertions(+), 9 deletions(-)

-- 
2.33.1

[PATCH 2/2] i386: Hyper-V Enlightened MSR bitmap feature

2022-01-05 Thread Vitaly Kuznetsov

The newly introduced enlightenment allows L0 (KVM) and L1 (Hyper-V)
hypervisors to collaborate to avoid unnecessary updates to L2
MSR-Bitmap upon vmexits.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 10 ++
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  1 +
 target/i386/kvm/hyperv-proto.h |  5 +
 target/i386/kvm/kvm.c  |  7 +++
 5 files changed, 25 insertions(+)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 0417c183a3b0..08429124a634 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -225,6 +225,16 @@ default (WS2016).
 Note: hv-version-id-* are not enlightenments and thus don't enable Hyper-V
 identification when specified without any other enlightenments.
 
+3.21. hv-emsr-bitmap
+====================
+The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
+enabled, it allows L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to
+avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. While the protocol is
+supported for both VMX (Intel) and SVM (AMD), the VMX implementation requires
+Enlightened VMCS ('hv-evmcs') feature to also be enabled.
+
+Recommended: hv-evmcs (Intel)
+
 4. Supplementary features
 =========================
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index aa9e6368004c..f7405fdf4fa5 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6839,6 +6839,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_STIMER_DIRECT, 0),
 DEFINE_PROP_BIT64("hv-avic", X86CPU, hyperv_features,
   HYPERV_FEAT_AVIC, 0),
+DEFINE_PROP_BIT64("hv-emsr-bitmap", X86CPU, hyperv_features,
+  HYPERV_FEAT_MSR_BITMAP, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a1165215d972..04e3b38abf25 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1060,6 +1060,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define HYPERV_FEAT_IPI             13
 #define HYPERV_FEAT_STIMER_DIRECT   14
 #define HYPERV_FEAT_AVIC            15
+#define HYPERV_FEAT_MSR_BITMAP      16
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0x
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 89f81afda7c6..38e25468122d 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -72,6 +72,11 @@
 #define HV_ENLIGHTENED_VMCS_RECOMMENDED (1u << 14)
 #define HV_NO_NONARCH_CORESHARING   (1u << 18)
 
+/*
+ * HV_CPUID_NESTED_FEATURES.EAX bits
+ */
+#define HV_NESTED_MSR_BITMAP                (1u << 19)
+
 /*
  * Basic virtualized MSRs
  */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index c8f4956a4e0e..2baa9609e181 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -933,6 +933,13 @@ static struct {
  .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
 }
 },
+[HYPERV_FEAT_MSR_BITMAP] = {
+.desc = "enlightened MSR-Bitmap (hv-emsr-bitmap)",
+.flags = {
+{.func = HV_CPUID_NESTED_FEATURES, .reg = R_EAX,
+ .bits = HV_NESTED_MSR_BITMAP}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
-- 
2.33.1

Re: [PATCH v3] i386: docs: Briefly describe KVM PV features

2021-10-27 Thread Vitaly Kuznetsov

Igor Mammedov  writes:

> On Mon,  4 Oct 2021 16:04:45 +0200
> Vitaly Kuznetsov  wrote:
>

Thanks for the review! As I can see, the patch already made it to
'master':

commit 7f7c8d0ce3630849a4df3d627b11de354fcb3bb0
Author: Vitaly Kuznetsov 
Date:   Mon Oct 4 16:04:45 2021 +0200

i386: docs: Briefly describe KVM PV features

we can send follow-ups, of course. 

>> KVM PV features don't seem to be documented anywhere, in particular, the
>> fact that some of the features are enabled by default and some are not can
>> only be figured out from the code.
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>> Changes since "[PATCH v2 0/8] i386: Assorted KVM PV and Hyper-V feature
>>  improvements" [Paolo Bonzini]:
>> - Convert to 'rst' and move to docs/system/i386/kvm-pv.rst.
>> - Add information about the version of Linux that introduced the particular
>>   PV feature.
>> ---
>>  docs/system/i386/kvm-pv.rst | 100 
>>  docs/system/target-i386.rst |   1 +
>>  2 files changed, 101 insertions(+)
>>  create mode 100644 docs/system/i386/kvm-pv.rst
>> 
>> diff --git a/docs/system/i386/kvm-pv.rst b/docs/system/i386/kvm-pv.rst
>> new file mode 100644
>> index 000000000000..1e5a9923ef45
>> --- /dev/null
>> +++ b/docs/system/i386/kvm-pv.rst
>> @@ -0,0 +1,100 @@
>> +Paravirtualized KVM features
>> +============================
>> +
>> +Description
>> +-----------
>> +
>> +In some cases when implementing hardware interfaces in software is slow, ``KVM``
>> +implements its own paravirtualized interfaces.
>> +
>> +Setup
>> +-----
>> +
>> +Paravirtualized ``KVM`` features are represented as CPU flags. The following
>> +features are enabled by default for any CPU model when ``KVM`` acceleration is
>> +enabled:
>
> /if host kernel supports them
>

It does, as QEMU requires Linux >= 4.5. I'm not sure what happens if it
doesn't; maybe it won't start.

>> +
>> +- ``kvmclock``
>> +- ``kvm-nopiodelay``
>
>> +- ``kvm-asyncpf``
>
> later you say it's not enabled by default since x.y and something else
> should be used instead

The situation is a bit weird. QEMU will still be enabling kvm-asyncpf by
default. This, however, currently has no effect as KVM dropped support
for this feature (in favor of kvm-asyncpf-int, but this one is *not*
enabled by default).

>
> maybe add a kernel version for each item in this list aka: (since: ... [,till])
>
>> +- ``kvm-steal-time``
>> +- ``kvm-pv-eoi``
>> +- ``kvmclock-stable-bit``
>> +
>> +``kvm-msi-ext-dest-id`` feature is enabled by default in x2apic mode with split
>> +irqchip (e.g. "-machine ...,kernel-irqchip=split -cpu ...,x2apic").
>
>
>> +Note: when CPU model ``host`` is used, QEMU passes through all supported
>> +paravirtualized ``KVM`` features to the guest.
>
> Is it true in case of kvm-pv-enforce-cpuid=on ?

Yes, I believe these two things are orthogonal: CPU model 'host' will
give you everything supported by the kernel, while 'kvm-pv-enforce-cpuid'
will tell KVM to forbid using features that are not exposed in guest
visible CPUIDs. Combined with 'host', the set of forbidden features is
going to be empty as all features are enabled.
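
A small sketch of that reasoning, for illustration only. The feature bits
below are placeholders rather than the real KVM CPUID bit positions, and
pv_usable()/pv_forbidden() are hypothetical helpers, not KVM or QEMU
functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bits; not the real KVM CPUID bit positions. */
#define PV_FEAT_CLOCK     (1u << 0)
#define PV_FEAT_STEAL     (1u << 1)
#define PV_FEAT_ASYNC_INT (1u << 2)

/*
 * Features the guest may actually use: without enforcement everything
 * the host supports, with enforcement only what is supported and exposed.
 */
static uint32_t pv_usable(uint32_t supported, uint32_t exposed, bool enforce)
{
    return enforce ? (supported & exposed) : supported;
}

/* Features that enforcement forbids: supported but not exposed. */
static uint32_t pv_forbidden(uint32_t supported, uint32_t exposed, bool enforce)
{
    return enforce ? (supported & ~exposed) : 0;
}
```

With CPU model 'host' the exposed set equals the supported set, so
pv_forbidden() returns 0 regardless of the enforce flag; with a named CPU
model that exposes fewer bits, enforcement trims the usable set down to the
exposed one.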

>
> Also I'd s/passes through/enables/
> on the grounds that host CPUID simply doesn't have such CPUIDs
> so it's a bit confusing.

I meant to say 'passes through' from KVM, not from pCPU but I see why
this is not clear.

>
>
>> +Existing features
>> +-----------------
>> +
>> +``kvmclock``
>> +  Expose a ``KVM`` specific paravirtualized clocksource to the guest. Supported
>> +  since Linux v2.6.26.
>> +
>> +``kvm-nopiodelay``
>> +  The guest doesn't need to perform delays on PIO operations. Supported since
>> +  Linux v2.6.26.
>> +
>> +``kvm-mmu``
>> +  This feature is deprecated.
>> +
>> +``kvm-asyncpf``
>> +  Enable asynchronous page fault mechanism. Supported since Linux v2.6.38.
>> +  Note: since Linux v5.10 the feature is deprecated and not enabled by ``KVM``.
>
>> +  Use ``kvm-asyncpf-int`` instead.
> 'Use' or 'Used' by default?
>

'kvm-asyncpf' is a dead feature now, so in case users want to get
Asynchronous Page Faults they need to enable 'kvm-asyncpf-int'
manually, thus 'use'.

>
>> +``kvm-steal-time``
>> +  Enable stolen (when guest vCPU is not running) time accounting. Supported
>> +  since Linux v3.1.
>> +
>> +``kvm-pv-eoi``
>> +  Enable paravirtualized end-of-interrupt signaling. Supported since Linux
>> +  v3.10.
>> +

[PATCH v3] i386: docs: Briefly describe KVM PV features

2021-10-04 Thread Vitaly Kuznetsov

KVM PV features don't seem to be documented anywhere, in particular, the
fact that some of the features are enabled by default and some are not can
only be figured out from the code.

Signed-off-by: Vitaly Kuznetsov 
---
Changes since "[PATCH v2 0/8] i386: Assorted KVM PV and Hyper-V feature
 improvements" [Paolo Bonzini]:
- Convert to 'rst' and move to docs/system/i386/kvm-pv.rst.
- Add information about the version of Linux that introduced the particular
  PV feature.
---
 docs/system/i386/kvm-pv.rst | 100 
 docs/system/target-i386.rst |   1 +
 2 files changed, 101 insertions(+)
 create mode 100644 docs/system/i386/kvm-pv.rst

diff --git a/docs/system/i386/kvm-pv.rst b/docs/system/i386/kvm-pv.rst
new file mode 100644
index 000000000000..1e5a9923ef45
--- /dev/null
+++ b/docs/system/i386/kvm-pv.rst
@@ -0,0 +1,100 @@
+Paravirtualized KVM features
+============================
+
+Description
+-----------
+
+In some cases when implementing hardware interfaces in software is slow, ``KVM``
+implements its own paravirtualized interfaces.
+
+Setup
+-----
+
+Paravirtualized ``KVM`` features are represented as CPU flags. The following
+features are enabled by default for any CPU model when ``KVM`` acceleration is
+enabled:
+
+- ``kvmclock``
+- ``kvm-nopiodelay``
+- ``kvm-asyncpf``
+- ``kvm-steal-time``
+- ``kvm-pv-eoi``
+- ``kvmclock-stable-bit``
+
+``kvm-msi-ext-dest-id`` feature is enabled by default in x2apic mode with split
+irqchip (e.g. "-machine ...,kernel-irqchip=split -cpu ...,x2apic").
+
+Note: when CPU model ``host`` is used, QEMU passes through all supported
+paravirtualized ``KVM`` features to the guest.
+
+Existing features
+-----------------
+
+``kvmclock``
+  Expose a ``KVM`` specific paravirtualized clocksource to the guest. Supported
+  since Linux v2.6.26.
+
+``kvm-nopiodelay``
+  The guest doesn't need to perform delays on PIO operations. Supported since
+  Linux v2.6.26.
+
+``kvm-mmu``
+  This feature is deprecated.
+
+``kvm-asyncpf``
+  Enable asynchronous page fault mechanism. Supported since Linux v2.6.38.
+  Note: since Linux v5.10 the feature is deprecated and not enabled by ``KVM``.
+  Use ``kvm-asyncpf-int`` instead.
+
+``kvm-steal-time``
+  Enable stolen (when guest vCPU is not running) time accounting. Supported
+  since Linux v3.1.
+
+``kvm-pv-eoi``
+  Enable paravirtualized end-of-interrupt signaling. Supported since Linux
+  v3.10.
+
+``kvm-pv-unhalt``
+  Enable paravirtualized spinlocks support. Supported since Linux v3.12.
+
+``kvm-pv-tlb-flush``
+  Enable paravirtualized TLB flush mechanism. Supported since Linux v4.16.
+
+``kvm-pv-ipi``
+  Enable paravirtualized IPI mechanism. Supported since Linux v4.19.
+
+``kvm-poll-control``
+  Enable host-side polling on HLT control from the guest. Supported since Linux
+  v5.10.
+
+``kvm-pv-sched-yield``
+  Enable paravirtualized sched yield feature. Supported since Linux v5.10.
+
+``kvm-asyncpf-int``
+  Enable interrupt based asynchronous page fault mechanism. Supported since Linux
+  v5.10.
+
+``kvm-msi-ext-dest-id``
+  Support 'Extended Destination ID' for external interrupts. The feature allows
+  to use up to 32768 CPUs without IRQ remapping (but other limits may apply making
+  the number of supported vCPUs for a given configuration lower). Supported since
+  Linux v5.10.
+
+``kvmclock-stable-bit``
+  Tell the guest that guest visible TSC value can be fully trusted for kvmclock
+  computations and no warps are expected. Supported since Linux v2.6.35.
+
+Supplementary features
+----------------------
+
+``kvm-pv-enforce-cpuid``
+  Limit the supported paravirtualized feature set to the exposed features only.
+  Note, by default, ``KVM`` allows the guest to use all currently supported
+  paravirtualized features even when they were not announced in guest visible
+  CPUIDs. Supported since Linux v5.10.
+
+
+Useful links
+
+
+Please refer to Documentation/virt/kvm in Linux for additional details.
diff --git a/docs/system/target-i386.rst b/docs/system/target-i386.rst
index 6a86d638633a..4daa53c35d8f 100644
--- a/docs/system/target-i386.rst
+++ b/docs/system/target-i386.rst
@@ -26,6 +26,7 @@ Architectural features
:maxdepth: 1
 
i386/cpu
+   i386/kvm-pv
i386/sgx
 
 .. _pcsys_005freq:
-- 
2.31.1

Re: [PATCH v2 0/8] i386: Assorted KVM PV and Hyper-V feature improvements

2021-09-30 Thread Vitaly Kuznetsov

Paolo Bonzini  writes:

> On 02/09/21 11:35, Vitaly Kuznetsov wrote:
>> This is a continuation of "[PATCH 0/3] i386/kvm: Paravirtualized features usage
>> enforcement" series, thus v2.
>> 
>> This series implements several unrelated features but as there are code
>> dependencies between them I'm sending it as one series.
>> 
>> PATCH1 adds empty 6.2 machine types and the required compat infrastructure
>> (to be used by PATCH8)
>> PATCH2 adds documentation for KVM PV features
>> PATCH3 adds support for KVM_CAP_ENFORCE_PV_FEATURE_CPUID
>> PATCH4 adds support for KVM_CAP_HYPERV_ENFORCE_CPUID
>> PATCHes5-6 add 'hv-avic' feature
>> PATCH7 makes Hyper-V version info settable
>> PATCH8 changes the default Hyper-V version to 2016
>> 
>> Vitaly Kuznetsov (8):
>>i386: Add 6.2 machine types
>>i386: docs: Briefly describe KVM PV features
>>i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID
>>i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID
>>i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to
>>  hyperv_fill_cpuids()
>>i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment
>>i386: Make Hyper-V version id configurable
>>i386: Change the default Hyper-V version to match WS2016
>> 
>>   docs/hyperv.txt|  41 +++--
>>   docs/kvm-pv.txt| 103 +
>>   hw/core/machine.c  |   3 +
>>   hw/i386/pc.c   |   7 +++
>>   hw/i386/pc_piix.c  |  14 -
>>   hw/i386/pc_q35.c   |  13 -
>>   include/hw/boards.h|   3 +
>>   include/hw/i386/pc.h   |   3 +
>>   target/i386/cpu.c  |  22 +--
>>   target/i386/cpu.h  |  12 +++-
>>   target/i386/kvm/hyperv-proto.h |   1 +
>>   target/i386/kvm/kvm.c  |  62 +++-
>>   12 files changed, 260 insertions(+), 24 deletions(-)
>>   create mode 100644 docs/kvm-pv.txt
>> 
>
> Queued patches 3-8, thanks.

Patch3 with the hunk to docs/kvm-pv.txt dropped I suppose (as PATCH2
introducing the file is not queued)? I can include it in the next
submission then.

Thanks!

-- 
Vitaly

Re: [PATCH v2 0/8] i386: Assorted KVM PV and Hyper-V feature improvements

2021-09-17 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> This is a continuation of "[PATCH 0/3] i386/kvm: Paravirtualized features usage
> enforcement" series, thus v2.
>
> This series implements several unrelated features but as there are code
> dependencies between them I'm sending it as one series.
>
> PATCH1 adds empty 6.2 machine types and the required compat infrastructure
> (to be used by PATCH8)
> PATCH2 adds documentation for KVM PV features
> PATCH3 adds support for KVM_CAP_ENFORCE_PV_FEATURE_CPUID
> PATCH4 adds support for KVM_CAP_HYPERV_ENFORCE_CPUID
> PATCHes5-6 add 'hv-avic' feature
> PATCH7 makes Hyper-V version info settable
> PATCH8 changes the default Hyper-V version to 2016

Eduardo, Paolo, all,

any comments? It seems patches can still be applied to 'master' with no
issues.

-- 
Vitaly

[PATCH v2 5/8] i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids()

2021-09-02 Thread Vitaly Kuznetsov

In preparation for enabling Hyper-V + APICv/AVIC, move the
HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the
'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, while
HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature, which
we may not always want to give.

Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index bd0b53416315..430007c2691a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -821,9 +821,7 @@ static struct {
 .desc = "virtual APIC (hv-vapic)",
 .flags = {
 {.func = HV_CPUID_FEATURES, .reg = R_EAX,
- .bits = HV_APIC_ACCESS_AVAILABLE},
-{.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
- .bits = HV_APIC_ACCESS_RECOMMENDED}
+ .bits = HV_APIC_ACCESS_AVAILABLE}
 }
 },
 [HYPERV_FEAT_TIME] = {
@@ -1366,6 +1364,7 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->ebx |= HV_POST_MESSAGES | HV_SIGNAL_EVENTS;
 }
 
+
 /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
 c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
@@ -1374,6 +1373,10 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX);
 c->ebx = cpu->hyperv_spinlock_attempts;
 
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) {
+c->eax |= HV_APIC_ACCESS_RECOMMENDED;
+}
+
 if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
 c->eax |= HV_NO_NONARCH_CORESHARING;
 } else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
-- 
2.31.1

[PATCH v2 0/8] i386: Assorted KVM PV and Hyper-V feature improvements

2021-09-02 Thread Vitaly Kuznetsov

This is a continuation of "[PATCH 0/3] i386/kvm: Paravirtualized features usage 
enforcement" series, thus v2.

This series implements several unrelated features but as there are code
dependencies between them I'm sending it as one series.

PATCH1 adds empty 6.2 machine types and the required compat infrastructure
(to be used by PATCH8)
PATCH2 adds documentation for KVM PV features
PATCH3 adds support for KVM_CAP_ENFORCE_PV_FEATURE_CPUID
PATCH4 adds support for KVM_CAP_HYPERV_ENFORCE_CPUID
PATCHes5-6 add 'hv-avic' feature
PATCH7 makes Hyper-V version info settable
PATCH8 changes the default Hyper-V version to 2016

Vitaly Kuznetsov (8):
  i386: Add 6.2 machine types
  i386: docs: Briefly describe KVM PV features
  i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID
  i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID
  i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to
hyperv_fill_cpuids()
  i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment
  i386: Make Hyper-V version id configurable
  i386: Change the default Hyper-V version to match WS2016

 docs/hyperv.txt|  41 +++--
 docs/kvm-pv.txt| 103 +
 hw/core/machine.c  |   3 +
 hw/i386/pc.c   |   7 +++
 hw/i386/pc_piix.c  |  14 -
 hw/i386/pc_q35.c   |  13 -
 include/hw/boards.h|   3 +
 include/hw/i386/pc.h   |   3 +
 target/i386/cpu.c  |  22 +--
 target/i386/cpu.h  |  12 +++-
 target/i386/kvm/hyperv-proto.h |   1 +
 target/i386/kvm/kvm.c  |  62 +++-
 12 files changed, 260 insertions(+), 24 deletions(-)
 create mode 100644 docs/kvm-pv.txt

-- 
2.31.1

[PATCH v2 3/8] i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID

2021-09-02 Thread Vitaly Kuznetsov

By default, KVM allows the guest to use all currently supported PV features
even when they were not announced in guest visible CPUIDs. Introduce a new
"kvm-pv-enforce-cpuid" flag to limit the supported feature set to the
exposed features. The feature is supported by Linux >= 5.10 and is not
enabled by default in QEMU.
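
The behavior change described above can be modelled in a few lines. This is only an illustrative sketch, not QEMU or KVM code: the function `pv_feature_usable`, the mask parameters and the `PV_FEAT_*` bit positions are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define PV_FEAT_A (1u << 0)
#define PV_FEAT_B (1u << 1)

/*
 * Model of whether a guest may use a PV feature.
 *   supported_mask: features the hypervisor implements
 *   exposed_mask:   features announced in guest visible CPUIDs
 *   enforce:        models the "kvm-pv-enforce-cpuid" flag
 */
static bool pv_feature_usable(uint32_t supported_mask, uint32_t exposed_mask,
                              bool enforce, uint32_t feature)
{
    if (!(supported_mask & feature)) {
        return false;           /* not implemented at all */
    }
    if (enforce) {
        /* With enforcement, only announced features are usable. */
        return (exposed_mask & feature) != 0;
    }
    return true;                /* default: anything supported is usable */
}
```

With both features supported but only `PV_FEAT_A` exposed, `PV_FEAT_B` is usable in the default mode and rejected once `enforce` is set.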

Signed-off-by: Vitaly Kuznetsov 
---
 docs/kvm-pv.txt   | 13 -
 target/i386/cpu.c |  2 ++
 target/i386/cpu.h |  3 +++
 target/i386/kvm/kvm.c | 10 ++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/docs/kvm-pv.txt b/docs/kvm-pv.txt
index 84ad7fa60f8d..d1aac533feea 100644
--- a/docs/kvm-pv.txt
+++ b/docs/kvm-pv.txt
@@ -87,6 +87,17 @@ the number of supported vCPUs for a given configuration lower).
 Tells the guest that guest visible TSC value can be fully trusted for kvmclock
 computations and no warps are expected.
 
-4. Useful links
+4. Supplementary features
+=
+
+4.1. kvm-pv-enforce-cpuid
+=
+By default, KVM allows the guest to use all currently supported PV features even
+when they were not announced in guest visible CPUIDs. 'kvm-pv-enforce-cpuid'
+feature alters this behavior and limits the supported feature set to the
+exposed features only.
+
+
+5. Useful links
 
 Please refer to Documentation/virt/kvm in Linux for additional detail.
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 97e250e8760d..a70038f172d9 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6691,6 +6691,8 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true),
 DEFINE_PROP_BOOL("kvm-no-smi-migration", X86CPU, kvm_no_smi_migration,
  false),
+DEFINE_PROP_BOOL("kvm-pv-enforce-cpuid", X86CPU, kvm_pv_enforce_cpuid,
+ false),
 DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true),
 DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true),
 DEFINE_PROP_BOOL("x-migrate-smi-count", X86CPU, migrate_smi_count,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 6c50d3ab4f1d..20273a8069dd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1782,6 +1782,9 @@ struct X86CPU {
 /* Stop SMI delivery for migration compatibility with old machines */
 bool kvm_no_smi_migration;
 
+/* Forcefully disable KVM PV features not exposed in guest CPUIDs */
+bool kvm_pv_enforce_cpuid;
+
 /* Number of physical address bits supported */
 uint32_t phys_bits;
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 500d2e0e686f..49f97f345069 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1629,6 +1629,16 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
+if (cpu->kvm_pv_enforce_cpuid) {
+r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
+if (r < 0) {
+fprintf(stderr,
+"failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s",
+strerror(-r));
+abort();
+}
+}
+
 for (i = 0; i <= limit; i++) {
 if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
 fprintf(stderr, "unsupported level value: 0x%x\n", limit);
-- 
2.31.1

[PATCH v2 8/8] i386: Change the default Hyper-V version to match WS2016

2021-09-02 Thread Vitaly Kuznetsov

KVM implements some Hyper-V 2016 functions so providing WS2008R2 version
is somewhat incorrect. While generally guests shouldn't care about it
and always check feature bits, it is known that some tools in Windows
actually check version info.

For compatibility reasons make the change for 6.2 machine types only.
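
How the new default and the compat entries interact can be sketched as follows. This is a simplified model under stated assumptions, not QEMU's GlobalProperty machinery: `struct compat_prop`, `find_prop` and `effective_value` are hypothetical helpers, while the property names and values are the ones this patch uses (build 0x3839 = 14393 with version 10.0 for WS2016, build 0x1bbc = 7100 with version 6.1 for WS2008R2).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a (property, value) pair. */
struct compat_prop {
    const char *name;
    unsigned value;
};

/* New property defaults: WS2016. */
static const struct compat_prop new_defaults[] = {
    { "hv-version-id-build", 0x3839 },
    { "hv-version-id-major", 0x000A },
    { "hv-version-id-minor", 0x0000 },
};

/* Values pinned for 6.1 and older machine types: WS2008R2. */
static const struct compat_prop compat_6_1[] = {
    { "hv-version-id-build", 0x1bbc },
    { "hv-version-id-major", 0x0006 },
    { "hv-version-id-minor", 0x0001 },
};

#define N_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Look up 'name'; on success store the value and return 1. */
static int find_prop(const struct compat_prop *props, size_t n,
                     const char *name, unsigned *value)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].name, name) == 0) {
            *value = props[i].value;
            return 1;
        }
    }
    return 0;
}

/* Compat entries override the defaults on pre-6.2 machine types. */
static unsigned effective_value(const char *name, int pre_6_2_machine)
{
    unsigned value = 0;

    find_prop(new_defaults, N_ELEMS(new_defaults), name, &value);
    if (pre_6_2_machine) {
        find_prop(compat_6_1, N_ELEMS(compat_6_1), name, &value);
    }
    return value;
}
```

In this model a guest on a 6.2 machine type sees major version 0x000A, while the same guest on a 6.1 machine type keeps seeing 0x0006, so the identification does not change for existing machine types.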

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt   | 2 +-
 hw/i386/pc.c  | 6 +-
 target/i386/cpu.c | 6 +++---
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 7803495468b7..5d99fd9a72b8 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -214,7 +214,7 @@ exposing correct vCPU topology and vCPU pinning.
 3.20. hv-version-id-{build,major,minor,spack,sbranch,snumber}
 =
 This changes Hyper-V version identification in CPUID 0x40000002.EAX-EDX from the
-default (WS2008R2).
+default (WS2016).
 - hv-version-id-build sets 'Build Number' (32 bits)
 - hv-version-id-major sets 'Major Version' (16 bits)
 - hv-version-id-minor sets 'Minor Version' (16 bits)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 1276bfeee456..b2e4eef9d211 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -93,7 +93,11 @@
 #include "trace.h"
 #include CONFIG_DEVICES
 
-GlobalProperty pc_compat_6_1[] = {};
+GlobalProperty pc_compat_6_1[] = {
+{ TYPE_X86_CPU, "hv-version-id-build", "0x1bbc" },
+{ TYPE_X86_CPU, "hv-version-id-major", "0x0006" },
+{ TYPE_X86_CPU, "hv-version-id-minor", "0x0001" },
+};
 const size_t pc_compat_6_1_len = G_N_ELEMENTS(pc_compat_6_1);
 
 GlobalProperty pc_compat_6_0[] = {
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5766e720093d..569840deaf93 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6669,11 +6669,11 @@ static Property x86_cpu_properties[] = {
 
 /* WS2008R2 identify by default */
 DEFINE_PROP_UINT32("hv-version-id-build", X86CPU, hyperv_ver_id_build,
-   0x1bbc),
+   0x3839),
 DEFINE_PROP_UINT16("hv-version-id-major", X86CPU, hyperv_ver_id_major,
-   0x0006),
+   0x000A),
 DEFINE_PROP_UINT16("hv-version-id-minor", X86CPU, hyperv_ver_id_minor,
-   0x0001),
+   0x0000),
 DEFINE_PROP_UINT32("hv-version-id-spack", X86CPU, hyperv_ver_id_sp, 0),
 DEFINE_PROP_UINT8("hv-version-id-sbranch", X86CPU, hyperv_ver_id_sb, 0),
 DEFINE_PROP_UINT32("hv-version-id-snumber", X86CPU, hyperv_ver_id_sn, 0),
-- 
2.31.1

[PATCH v2 1/8] i386: Add 6.2 machine types

2021-09-02 Thread Vitaly Kuznetsov

Introduce 6.2 machine types and the required infrastructure for adding
compat properties to pre-6.2 machine types.

Signed-off-by: Vitaly Kuznetsov 
---
 hw/core/machine.c|  3 +++
 hw/i386/pc.c |  3 +++
 hw/i386/pc_piix.c| 14 +-
 hw/i386/pc_q35.c | 13 -
 include/hw/boards.h  |  3 +++
 include/hw/i386/pc.h |  3 +++
 6 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 54e040587dd3..9d0d1194e1ef 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -46,6 +46,9 @@ GlobalProperty hw_compat_6_0[] = {
 };
 const size_t hw_compat_6_0_len = G_N_ELEMENTS(hw_compat_6_0);
 
+GlobalProperty hw_compat_6_1[] = {};
+const size_t hw_compat_6_1_len = G_N_ELEMENTS(hw_compat_6_1);
+
 GlobalProperty hw_compat_5_2[] = {
 { "ICH9-LPC", "smm-compat", "on"},
 { "PIIX4_PM", "smm-compat", "on"},
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 102b22394689..1276bfeee456 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -93,6 +93,9 @@
 #include "trace.h"
 #include CONFIG_DEVICES
 
+GlobalProperty pc_compat_6_1[] = {};
+const size_t pc_compat_6_1_len = G_N_ELEMENTS(pc_compat_6_1);
+
 GlobalProperty pc_compat_6_0[] = {
 { "qemu64" "-" TYPE_X86_CPU, "family", "6" },
 { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1bc30167acc0..c5da7739cef7 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -412,7 +412,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
 machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
 }
 
-static void pc_i440fx_6_1_machine_options(MachineClass *m)
+static void pc_i440fx_6_2_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_i440fx_machine_options(m);
@@ -421,6 +421,18 @@ static void pc_i440fx_6_1_machine_options(MachineClass *m)
 pcmc->default_cpu_version = 1;
 }
 
+DEFINE_I440FX_MACHINE(v6_2, "pc-i440fx-6.2", NULL,
+  pc_i440fx_6_2_machine_options);
+
+static void pc_i440fx_6_1_machine_options(MachineClass *m)
+{
+pc_i440fx_6_2_machine_options(m);
+m->alias = NULL;
+m->is_default = false;
+compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len);
+compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len);
+}
+
 DEFINE_I440FX_MACHINE(v6_1, "pc-i440fx-6.1", NULL,
   pc_i440fx_6_1_machine_options);
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index eeb0b185b118..565fadce540c 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -354,7 +354,7 @@ static void pc_q35_machine_options(MachineClass *m)
 m->max_cpus = 288;
 }
 
-static void pc_q35_6_1_machine_options(MachineClass *m)
+static void pc_q35_6_2_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_machine_options(m);
@@ -362,6 +362,17 @@ static void pc_q35_6_1_machine_options(MachineClass *m)
 pcmc->default_cpu_version = 1;
 }
 
+DEFINE_Q35_MACHINE(v6_2, "pc-q35-6.2", NULL,
+   pc_q35_6_2_machine_options);
+
+static void pc_q35_6_1_machine_options(MachineClass *m)
+{
+pc_q35_6_2_machine_options(m);
+m->alias = NULL;
+compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len);
+compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len);
+}
+
 DEFINE_Q35_MACHINE(v6_1, "pc-q35-6.1", NULL,
pc_q35_6_1_machine_options);
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index accd6eff35ab..463a5514f97d 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -353,6 +353,9 @@ struct MachineState {
 } \
 type_init(machine_initfn##_register_types)
 
+extern GlobalProperty hw_compat_6_1[];
+extern const size_t hw_compat_6_1_len;
+
 extern GlobalProperty hw_compat_6_0[];
 extern const size_t hw_compat_6_0_len;
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 88dffe751724..97b4ab79b534 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -196,6 +196,9 @@ void pc_system_parse_ovmf_flash(uint8_t *flash_ptr, size_t flash_size);
 void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
const CPUArchIdList *apic_ids, GArray *entry);
 
+extern GlobalProperty pc_compat_6_1[];
+extern const size_t pc_compat_6_1_len;
+
 extern GlobalProperty pc_compat_6_0[];
 extern const size_t pc_compat_6_0_len;
 
-- 
2.31.1

[PATCH v2 4/8] i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID

2021-09-02 Thread Vitaly Kuznetsov

By default, KVM allows the guest to use all currently supported Hyper-V
enlightenments once the Hyper-V CPUID interface is exposed, even if
some features were not announced in guest visible CPUIDs. The hv-enforce-cpuid
feature alters this behavior and only allows the guest to use exposed
Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is
not enabled by default in QEMU.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt   | 17 ++---
 target/i386/cpu.c |  1 +
 target/i386/cpu.h |  1 +
 target/i386/kvm/kvm.c |  9 +
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 000638a2fd38..072709a68f47 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -203,8 +203,11 @@ When the option is set to 'on' QEMU will always enable the feature, regardless
 of host setup. To keep guests secure, this can only be used in conjunction with
 exposing correct vCPU topology and vCPU pinning.
 
-4. Development features
-
+4. Supplementary features
+=
+
+4.1. hv-passthrough
+===
 In some cases (e.g. during development) it may make sense to use QEMU in
 'pass-through' mode and give Windows guests all enlightenments currently
 supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
@@ -215,8 +218,16 @@ values from KVM to QEMU. "hv-passthrough" overrides all other "hv-*" settings on
 the command line. Also, enabling this flag effectively prevents migration as the
 list of enabled enlightenments may differ between target and destination hosts.
 
+4.2. hv-enforce-cpuid
+=
+By default, KVM allows the guest to use all currently supported Hyper-V
+enlightenments when Hyper-V CPUID interface was exposed, regardless of if
+some features were not announced in guest visible CPUIDs. 'hv-enforce-cpuid'
+feature alters this behavior and only allows the guest to use exposed Hyper-V
+enlightenments.
+
 
-4. Useful links
+5. Useful links
 
 Hyper-V Top Level Functional specification and other information:
 https://github.com/MicrosoftDocs/Virtualization-Documentation
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a70038f172d9..36e1b6ec9c9b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6665,6 +6665,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
+DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
 DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
 DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 20273a8069dd..8822bea5c9a4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1699,6 +1699,7 @@ struct X86CPU {
 uint32_t hyperv_version_id[4];
 uint32_t hyperv_limits[3];
 uint32_t hyperv_nested[4];
+bool hyperv_enforce_cpuid;
 
 bool check_cpuid;
 bool enforce_cpuid;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 49f97f345069..bd0b53416315 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1531,6 +1531,15 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 cpu->hyperv_nested[0] = evmcs_version;
 }
 
+if (cpu->hyperv_enforce_cpuid) {
+ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENFORCE_CPUID, 0, 1);
+if (ret < 0) {
+error_report("failed to enable KVM_CAP_HYPERV_ENFORCE_CPUID: %s",
+ strerror(-ret));
+return ret;
+}
+}
+
 return 0;
 }
 
-- 
2.31.1

[PATCH v2 7/8] i386: Make Hyper-V version id configurable

2021-09-02 Thread Vitaly Kuznetsov

Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to
WS2008R2 and it is known that certain tools in Windows check this. It
seems useful to provide some flexibility by making it possible to change
this info at will. CPUID information is defined in TLFS as:

EAX: Build Number
EBX Bits 31-16: Major Version
    Bits 15-0:  Minor Version
ECX Service Pack
EDX Bits 31-24: Service Branch
    Bits 23-0:  Service Number
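
The register layout above can be expressed as a small packing routine. This is a sketch for illustration, not the code added by the patch: `struct hv_version_regs` and `hv_pack_version` are hypothetical names, and only the bit positions are taken from the layout quoted from the TLFS.

```c
#include <stdint.h>

/* Hypothetical container for the four CPUID output registers. */
struct hv_version_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Pack the six version fields into EAX..EDX per the layout above. */
static struct hv_version_regs hv_pack_version(uint32_t build,
                                              uint16_t major, uint16_t minor,
                                              uint32_t spack,
                                              uint8_t sbranch,
                                              uint32_t snumber)
{
    struct hv_version_regs r;

    r.eax = build;                              /* Build Number */
    r.ebx = ((uint32_t)major << 16) | minor;    /* Major 31-16, Minor 15-0 */
    r.ecx = spack;                              /* Service Pack */
    /* Service Branch 31-24, Service Number 23-0 (only 24 bits are kept) */
    r.edx = ((uint32_t)sbranch << 24) | (snumber & 0x00ffffffu);
    return r;
}
```

For the WS2008R2 identity, major 0x0006 and minor 0x0001 pack into EBX as 0x00060001, which is the value that was previously hardcoded in `hyperv_version_id[1]`.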

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt   | 14 ++
 target/i386/cpu.c | 15 +++
 target/i386/cpu.h |  7 ++-
 target/i386/kvm/kvm.c | 26 --
 4 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index cd1ea3bbe9d7..7803495468b7 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -211,6 +211,20 @@ When the option is set to 'on' QEMU will always enable the feature, regardless
 of host setup. To keep guests secure, this can only be used in conjunction with
 exposing correct vCPU topology and vCPU pinning.
 
+3.20. hv-version-id-{build,major,minor,spack,sbranch,snumber}
+=
+This changes Hyper-V version identification in CPUID 0x40000002.EAX-EDX from the
+default (WS2008R2).
+- hv-version-id-build sets 'Build Number' (32 bits)
+- hv-version-id-major sets 'Major Version' (16 bits)
+- hv-version-id-minor sets 'Minor Version' (16 bits)
+- hv-version-id-spack sets 'Service Pack' (32 bits)
+- hv-version-id-sbranch sets 'Service Branch' (8 bits)
+- hv-version-id-snumber sets 'Service Number' (24 bits)
+
+Note: hv-version-id-* are not enlightenments and thus don't enable Hyper-V
+identification when specified without any other enlightenments.
+
 4. Supplementary features
 =
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a695e200d409..5766e720093d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6093,10 +6093,6 @@ static void x86_cpu_hyperv_realize(X86CPU *cpu)
 cpu->hyperv_interface_id[2] = 0;
 cpu->hyperv_interface_id[3] = 0;
 
-/* Hypervisor system identity */
-cpu->hyperv_version_id[0] = 0x1bbc;
-cpu->hyperv_version_id[1] = 0x00060001;
-
 /* Hypervisor implementation limits */
 cpu->hyperv_limits[0] = 64;
 cpu->hyperv_limits[1] = 0;
@@ -6671,6 +6667,17 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
 DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
+/* WS2008R2 identify by default */
+DEFINE_PROP_UINT32("hv-version-id-build", X86CPU, hyperv_ver_id_build,
+   0x1bbc),
+DEFINE_PROP_UINT16("hv-version-id-major", X86CPU, hyperv_ver_id_major,
+   0x0006),
+DEFINE_PROP_UINT16("hv-version-id-minor", X86CPU, hyperv_ver_id_minor,
+   0x0001),
+DEFINE_PROP_UINT32("hv-version-id-spack", X86CPU, hyperv_ver_id_sp, 0),
+DEFINE_PROP_UINT8("hv-version-id-sbranch", X86CPU, hyperv_ver_id_sb, 0),
+DEFINE_PROP_UINT32("hv-version-id-snumber", X86CPU, hyperv_ver_id_sn, 0),
+
 DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
 DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
 DEFINE_PROP_BOOL("x-force-features", X86CPU, force_features, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d22a8d259967..5c2bf1079745 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1697,10 +1697,15 @@ struct X86CPU {
 OnOffAuto hyperv_no_nonarch_cs;
 uint32_t hyperv_vendor_id[3];
 uint32_t hyperv_interface_id[4];
-uint32_t hyperv_version_id[4];
 uint32_t hyperv_limits[3];
 uint32_t hyperv_nested[4];
 bool hyperv_enforce_cpuid;
+uint32_t hyperv_ver_id_build;
+uint16_t hyperv_ver_id_major;
+uint16_t hyperv_ver_id_minor;
+uint32_t hyperv_ver_id_sp;
+uint8_t hyperv_ver_id_sb;
+uint32_t hyperv_ver_id_sn;
 
 bool check_cpuid;
 bool enforce_cpuid;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 0f3cb61a9cfd..918472905e73 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1258,14 +1258,18 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 cpu->hyperv_interface_id[3] =
 hv_cpuid_get_host(cs, HV_CPUID_INTERFACE, R_EDX);
 
-cpu->hyperv_version_id[0] =
+cpu->hyperv_ver_id_build =
 hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EAX);
-cpu->hyperv_version_id[1] =
-hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX);
-cpu->hyperv_version_id[2] =
+cpu->hyperv_ver_id_major =
+hv_cpuid_get_host(cs, HV_CPUID_VERSION, R_EBX) >> 16;
+cpu->hyperv_ver_id_minor =
+hv_cpuid_get_host(cs, HV_CPUI

[PATCH v2 6/8] i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment

2021-09-02 Thread Vitaly Kuznetsov

The enlightenment allows using Hyper-V SynIC with hardware APICv/AVIC
enabled. Normally, Hyper-V SynIC disables these hardware features and
suggests that the guest use the paravirtualized AutoEOI feature. Linux-5.15
gains support for conditional APICv/AVIC disablement: the feature
stays on until the guest tries to use the AutoEOI feature with SynIC. With
the 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/
Hyper-V versions should follow the recommendation and not use the
(unwanted) feature.
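
The resulting recommendation bits can be summarized with a small model. This is a sketch rather than the patch itself: `hv_enlightenment_eax` and its two boolean parameters are hypothetical, and only the two bit definitions are copied from hyperv-proto.h as used in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in target/i386/kvm/hyperv-proto.h. */
#define HV_APIC_ACCESS_RECOMMENDED      (1u << 3)
#define HV_DEPRECATING_AEOI_RECOMMENDED (1u << 9)

/*
 * Model of the two recommendation bits in the enlightenment info leaf.
 *   vapic: 'hv-vapic' is enabled
 *   avic:  'hv-avic' ('hv-apicv') is enabled
 */
static uint32_t hv_enlightenment_eax(bool vapic, bool avic)
{
    uint32_t eax = 0;

    if (avic) {
        /* Ask the guest not to use AutoEOI. */
        eax |= HV_DEPRECATING_AEOI_RECOMMENDED;
    }
    if (vapic && !avic) {
        /* Recommend the PV APIC access path only without hv-avic. */
        eax |= HV_APIC_ACCESS_RECOMMENDED;
    }
    return eax;
}
```

In this model, enabling both 'hv-vapic' and 'hv-avic' sets only the AutoEOI deprecation bit; the APIC access feature bit itself is handled separately and is not part of the sketch.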

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt| 10 +-
 target/i386/cpu.c  |  4 
 target/i386/cpu.h  |  1 +
 target/i386/kvm/hyperv-proto.h |  1 +
 target/i386/kvm/kvm.c  | 10 +-
 5 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 072709a68f47..cd1ea3bbe9d7 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -189,7 +189,15 @@ enabled.
 
 Requires: hv-vpindex, hv-synic, hv-time, hv-stimer
 
-3.17. hv-no-nonarch-coresharing=on/off/auto
+3.18. hv-avic (hv-apicv)
+===
+The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled.
+Normally, Hyper-V SynIC disables these hardware features and suggests the guest
+to use paravirtualized AutoEOI feature.
+Note: enabling this feature on old hardware (without APICv/AVIC support) may
+have negative effect on guest's performance.
+
+3.19. hv-no-nonarch-coresharing=on/off/auto
 ===
 This enlightenment tells guest OS that virtual processors will never share a
 physical core unless they are reported as sibling SMT threads. This information
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 36e1b6ec9c9b..a695e200d409 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6477,6 +6477,8 @@ static void x86_cpu_initfn(Object *obj)
 object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
 object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
 
+object_property_add_alias(obj, "hv-apicv", obj, "hv-avic");
+
 if (xcc->model) {
 x86_cpu_load_model(cpu, xcc->model);
 }
@@ -6662,6 +6664,8 @@ static Property x86_cpu_properties[] = {
   HYPERV_FEAT_IPI, 0),
 DEFINE_PROP_BIT64("hv-stimer-direct", X86CPU, hyperv_features,
   HYPERV_FEAT_STIMER_DIRECT, 0),
+DEFINE_PROP_BIT64("hv-avic", X86CPU, hyperv_features,
+  HYPERV_FEAT_AVIC, 0),
 DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
 hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8822bea5c9a4..d22a8d259967 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1038,6 +1038,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define HYPERV_FEAT_EVMCS   12
 #define HYPERV_FEAT_IPI 13
 #define HYPERV_FEAT_STIMER_DIRECT   14
+#define HYPERV_FEAT_AVIC15
 
 #ifndef HYPERV_SPINLOCK_NEVER_NOTIFY
 #define HYPERV_SPINLOCK_NEVER_NOTIFY 0xFFFFFFFF
diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index 5fbb385cc136..89f81afda7c6 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -66,6 +66,7 @@
 #define HV_APIC_ACCESS_RECOMMENDED  (1u << 3)
 #define HV_SYSTEM_RESET_RECOMMENDED (1u << 4)
 #define HV_RELAXED_TIMING_RECOMMENDED   (1u << 5)
+#define HV_DEPRECATING_AEOI_RECOMMENDED (1u << 9)
 #define HV_CLUSTER_IPI_RECOMMENDED  (1u << 10)
 #define HV_EX_PROCESSOR_MASKS_RECOMMENDED   (1u << 11)
 #define HV_ENLIGHTENED_VMCS_RECOMMENDED (1u << 14)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 430007c2691a..0f3cb61a9cfd 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -924,6 +924,13 @@ static struct {
 },
 .dependencies = BIT(HYPERV_FEAT_STIMER)
 },
+[HYPERV_FEAT_AVIC] = {
+.desc = "AVIC/APICv support (hv-avic/hv-apicv)",
+.flags = {
+{.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
+ .bits = HV_DEPRECATING_AEOI_RECOMMENDED}
+}
+},
 };
 
 static struct kvm_cpuid2 *try_get_hv_cpuid(CPUState *cs, int max,
@@ -1373,7 +1380,8 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->eax = hv_build_cpuid_leaf(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EAX);
 c->ebx = cpu->hyperv_spinlock_attempts;
 
-if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC)) {
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VAPIC) &&
+!hyperv_feat_enabled(cpu, HYPERV_FEAT_AVIC)) {
 c->eax |= HV_APIC_ACCESS_RECOMMENDED;
 }
 
-- 
2.31.1

[PATCH v2 2/8] i386: docs: Briefly describe KVM PV features

2021-09-02 Thread Vitaly Kuznetsov

KVM PV features don't seem to be documented anywhere, in particular, the
fact that some of the features are enabled by default and some are not can
only be figured out from the code.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/kvm-pv.txt | 92 +
 1 file changed, 92 insertions(+)
 create mode 100644 docs/kvm-pv.txt

diff --git a/docs/kvm-pv.txt b/docs/kvm-pv.txt
new file mode 100644
index 000000000000..84ad7fa60f8d
--- /dev/null
+++ b/docs/kvm-pv.txt
@@ -0,0 +1,92 @@
+KVM paravirtualized features
+
+
+
+1. Description
+===
+In some cases when implementing a hardware interface in software is slow, KVM
+implements its own paravirtualized interfaces.
+
+2. Setup
+=
+KVM PV features are represented as CPU flags. The following features are enabled
+by default for any CPU model when KVM is enabled:
+  kvmclock
+  kvm-nopiodelay
+  kvm-asyncpf
+  kvm-steal-time
+  kvm-pv-eoi
+  kvmclock-stable-bit
+
+'kvm-msi-ext-dest-id' feature is enabled by default in x2apic mode with split
+irqchip (e.g. "-machine ...,kernel-irqchip=split -cpu ...,x2apic").
+
+Note: when cpu model 'host' is used, QEMU passes through all KVM PV features
+exposed by KVM to the guest.
+
+3. Existing features
+
+
+3.1. kvmclock
+
+This feature exposes KVM specific PV clocksource to the guest.
+
+3.2. kvm-nopiodelay
+===
+The guest doesn't need to perform delays on PIO operations.
+
+3.3. kvm-mmu
+
+This feature is deprecated.
+
+3.4. kvm-asyncpf
+
+Enables asynchronous page fault mechanism. Note: since Linux-5.10 the feature is
+deprecated and not enabled by KVM. Use "kvm-asyncpf-int" instead.
+
+3.5. kvm-steal-time
+===
+Enables stolen (when guest vCPU is not running) time accounting.
+
+3.6. kvm-pv-eoi
+===
+Enables paravirtualized end-of-interrupt signaling.
+
+3.7. kvm-pv-unhalt
+==
+Enables paravirtualized spinlocks support.
+
+3.8. kvm-pv-tlb-flush
+=
+Enables paravirtualized TLB flush mechanism.
+
+3.9. kvm-pv-ipi
+===
+Enables paravirtualized IPI mechanism.
+
+3.10. kvm-poll-control
+==
+Enables host-side polling on HLT control from the guest.
+
+3.11. kvm-pv-sched-yield
+
+Enables paravirtualized sched yield feature.
+
+3.12. kvm-asyncpf-int
+=
+Enables interrupt based asynchronous page fault mechanism.
+
+3.13. kvm-msi-ext-dest-id
+=
+Support 'Extended Destination ID' for external interrupts. The feature allows
+to use up to 32768 CPUs without IRQ remapping (but other limits may apply making
+the number of supported vCPUs for a given configuration lower).
+
+3.14. kvmclock-stable-bit
+=
+Tells the guest that guest visible TSC value can be fully trusted for kvmclock
+computations and no warps are expected.
+
+4. Useful links
+
+Please refer to Documentation/virt/kvm in Linux for additional detail.
-- 
2.31.1

[PATCH 1/3] docs: Briefly describe KVM PV features

2021-07-22 Thread Vitaly Kuznetsov

KVM PV features don't seem to be documented anywhere, in particular, the
fact that some of the features are enabled by default and some are not can
only be figured out from the code.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/kvm-pv.txt | 92 +
 1 file changed, 92 insertions(+)
 create mode 100644 docs/kvm-pv.txt

diff --git a/docs/kvm-pv.txt b/docs/kvm-pv.txt
new file mode 100644
index 000000000000..84ad7fa60f8d
--- /dev/null
+++ b/docs/kvm-pv.txt
@@ -0,0 +1,92 @@
+KVM paravirtualized features
+
+
+
+1. Description
+===
+In some cases when implementing a hardware interface in software is slow, KVM
+implements its own paravirtualized interfaces.
+
+2. Setup
+=
+KVM PV features are represented as CPU flags. The following features are enabled
+by default for any CPU model when KVM is enabled:
+  kvmclock
+  kvm-nopiodelay
+  kvm-asyncpf
+  kvm-steal-time
+  kvm-pv-eoi
+  kvmclock-stable-bit
+
+'kvm-msi-ext-dest-id' feature is enabled by default in x2apic mode with split
+irqchip (e.g. "-machine ...,kernel-irqchip=split -cpu ...,x2apic").
+
+Note: when cpu model 'host' is used, QEMU passes through all KVM PV features
+exposed by KVM to the guest.
+
+3. Existing features
+
+
+3.1. kvmclock
+
+This feature exposes KVM specific PV clocksource to the guest.
+
+3.2. kvm-nopiodelay
+===
+The guest doesn't need to perform delays on PIO operations.
+
+3.3. kvm-mmu
+
+This feature is deprecated.
+
+3.4. kvm-asyncpf
+
+Enables asynchronous page fault mechanism. Note: since Linux-5.10 the feature is
+deprecated and not enabled by KVM. Use "kvm-asyncpf-int" instead.
+
+3.5. kvm-steal-time
+===
+Enables stolen (when guest vCPU is not running) time accounting.
+
+3.6. kvm-pv-eoi
+===
+Enables paravirtualized end-of-interrupt signaling.
+
+3.7. kvm-pv-unhalt
+==
+Enables paravirtualized spinlocks support.
+
+3.8. kvm-pv-tlb-flush
+=
+Enables paravirtualized TLB flush mechanism.
+
+3.9. kvm-pv-ipi
+===
+Enables paravirtualized IPI mechanism.
+
+3.10. kvm-poll-control
+==
+Enables host-side polling on HLT control from the guest.
+
+3.11. kvm-pv-sched-yield
+
+Enables paravirtualized sched yield feature.
+
+3.12. kvm-asyncpf-int
+=
+Enables interrupt based asynchronous page fault mechanism.
+
+3.13. kvm-msi-ext-dest-id
+=
+Support 'Extended Destination ID' for external interrupts. The feature allows
+to use up to 32768 CPUs without IRQ remapping (but other limits may apply making
+the number of supported vCPUs for a given configuration lower).
+
+3.14. kvmclock-stable-bit
+=
+Tells the guest that guest visible TSC value can be fully trusted for kvmclock
+computations and no warps are expected.
+
+4. Useful links
+
+Please refer to Documentation/virt/kvm in Linux for additional detail.
-- 
2.31.1

[PATCH 3/3] i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID

2021-07-22 Thread Vitaly Kuznetsov

By default, KVM allows the guest to use all currently supported Hyper-V
enlightenments once the Hyper-V CPUID interface is exposed, regardless of
whether individual features were announced in guest visible CPUIDs. The
'hv-enforce-cpuid' feature alters this behavior and only allows the guest
to use exposed Hyper-V enlightenments. The feature is supported by
Linux >= 5.14 and is not enabled by default in QEMU.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt   | 17 ++---
 target/i386/cpu.c |  1 +
 target/i386/cpu.h |  1 +
 target/i386/kvm/kvm.c |  9 +
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index 000638a2fd38..072709a68f47 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -203,8 +203,11 @@ When the option is set to 'on' QEMU will always enable the feature, regardless
 of host setup. To keep guests secure, this can only be used in conjunction with
 exposing correct vCPU topology and vCPU pinning.
 
-4. Development features
-
+4. Supplementary features
+=========================
+
+4.1. hv-passthrough
+===================
 In some cases (e.g. during development) it may make sense to use QEMU in
 'pass-through' mode and give Windows guests all enlightenments currently
 supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
@@ -215,8 +218,16 @@ values from KVM to QEMU. "hv-passthrough" overrides all other "hv-*" settings on
 the command line. Also, enabling this flag effectively prevents migration as the
 list of enabled enlightenments may differ between target and destination hosts.
 
+4.2. hv-enforce-cpuid
+=====================
+By default, KVM allows the guest to use all currently supported Hyper-V
+enlightenments when Hyper-V CPUID interface was exposed, regardless of if
+some features were not announced in guest visible CPUIDs. 'hv-enforce-cpuid'
+feature alters this behavior and only allows the guest to use exposed Hyper-V
+enlightenments.
+
 
-4. Useful links
+5. Useful links
 
 Hyper-V Top Level Functional specification and other information:
 https://github.com/MicrosoftDocs/Virtualization-Documentation
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0a0d2cddc9d2..1d4c44c8b762 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6642,6 +6642,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
                             hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
     DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
+    DEFINE_PROP_BOOL("hv-enforce-cpuid", X86CPU, hyperv_enforce_cpuid, false),
 
     DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
     DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 31f1f7caf116..9539f57199fa 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1685,6 +1685,7 @@ struct X86CPU {
     uint32_t hyperv_version_id[4];
     uint32_t hyperv_limits[3];
     uint32_t hyperv_nested[4];
+    bool hyperv_enforce_cpuid;
 
     bool check_cpuid;
     bool enforce_cpuid;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 452b04f469b5..ccbea88080fc 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1519,6 +1519,15 @@ static int hyperv_init_vcpu(X86CPU *cpu)
         cpu->hyperv_nested[0] = evmcs_version;
     }
 
+    if (cpu->hyperv_enforce_cpuid) {
+        ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENFORCE_CPUID, 0, 1);
+        if (ret < 0) {
+            error_report("failed to enable KVM_CAP_HYPERV_ENFORCE_CPUID: %s",
+                         strerror(-ret));
+            return ret;
+        }
+    }
+
     return 0;
 }
 
-- 
2.31.1

[PATCH 2/3] i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID

2021-07-22 Thread Vitaly Kuznetsov

By default, KVM allows the guest to use all currently supported PV features
even when they were not announced in guest visible CPUIDs. Introduce a new
"kvm-pv-enforce-cpuid" flag to limit the supported feature set to the
exposed features. The feature is supported by Linux >= 5.10 and is not
enabled by default in QEMU.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/kvm-pv.txt   | 13 -
 target/i386/cpu.c |  2 ++
 target/i386/cpu.h |  3 +++
 target/i386/kvm/kvm.c | 10 ++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/docs/kvm-pv.txt b/docs/kvm-pv.txt
index 84ad7fa60f8d..d1aac533feea 100644
--- a/docs/kvm-pv.txt
+++ b/docs/kvm-pv.txt
@@ -87,6 +87,17 @@ the number of supported vCPUs for a given configuration lower).
 Tells the guest that guest visible TSC value can be fully trusted for kvmclock
 computations and no warps are expected.
 
-4. Useful links
+4. Supplementary features
+=========================
+
+4.1. kvm-pv-enforce-cpuid
+=========================
+By default, KVM allows the guest to use all currently supported PV features even
+when they were not announced in guest visible CPUIDs. 'kvm-pv-enforce-cpuid'
+feature alters this behavior and limits the supported feature set to the
+exposed features only.
+
+
+5. Useful links
 
 Please refer to Documentation/virt/kvm in Linux for additional detail.
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 48b55ebd0a67..0a0d2cddc9d2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6668,6 +6668,8 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true),
     DEFINE_PROP_BOOL("kvm-no-smi-migration", X86CPU, kvm_no_smi_migration,
                      false),
+    DEFINE_PROP_BOOL("kvm-pv-enforce-cpuid", X86CPU, kvm_pv_enforce_cpuid,
+                     false),
     DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true),
     DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true),
     DEFINE_PROP_BOOL("x-migrate-smi-count", X86CPU, migrate_smi_count,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5d98a4e7c025..31f1f7caf116 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1768,6 +1768,9 @@ struct X86CPU {
     /* Stop SMI delivery for migration compatibility with old machines */
     bool kvm_no_smi_migration;
 
+    /* Forcefully disable KVM PV features not exposed in guest CPUIDs */
+    bool kvm_pv_enforce_cpuid;
+
     /* Number of physical address bits supported */
     uint32_t phys_bits;
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 59ed8327ac13..452b04f469b5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1617,6 +1617,16 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
     cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
+    if (cpu->kvm_pv_enforce_cpuid) {
+        r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
+        if (r < 0) {
+            fprintf(stderr,
+                    "failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s",
+                    strerror(-r));
+            abort();
+        }
+    }
+
     for (i = 0; i <= limit; i++) {
         if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
             fprintf(stderr, "unsupported level value: 0x%x\n", limit);
-- 
2.31.1

[PATCH 0/3] i386/kvm: Paravirtualized features usage enforcement

2021-07-22 Thread Vitaly Kuznetsov

[I know this is probably too late for 6.1 but maybe the first patch of the
series is good as it just adds a missing doc?]

By default, KVM doesn't limit the usage of paravirtualized features (neither
native KVM nor Hyper-V) to what was exposed to the guest in CPUIDs making
it possible to use all of them. KVM_CAP_HYPERV_ENFORCE_CPUID and
KVM_CAP_ENFORCE_PV_FEATURE_CPUID features were recently introduced making
it possible to limit available features to what was actually exposed. Add
support for these to QEMU.
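
For illustration, the request behind the kvm_vcpu_enable_cap(cs, ..., 0, 1)
calls in patches 2 and 3 can be sketched against the raw KVM UAPI roughly as
below. This is only a sketch: the helper names enforce_cap_request() and
vcpu_enable_enforcement() are hypothetical, and 'vcpu_fd' is assumed to be an
already created vCPU file descriptor. struct kvm_enable_cap and
KVM_ENABLE_CAP come from <linux/kvm.h>.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

#include <linux/kvm.h>

/* Hypothetical helper: fill a request that turns 'cap' on (args[0] = 1). */
static void enforce_cap_request(struct kvm_enable_cap *req, uint32_t cap)
{
    memset(req, 0, sizeof(*req));
    req->cap = cap;
    req->flags = 0;
    req->args[0] = 1;
}

/*
 * Hypothetical helper: enable 'cap' on an existing vCPU.
 * Returns the ioctl() result: 0 on success, -1 with errno set on failure.
 */
static int vcpu_enable_enforcement(int vcpu_fd, uint32_t cap)
{
    struct kvm_enable_cap req;

    enforce_cap_request(&req, cap);
    return ioctl(vcpu_fd, KVM_ENABLE_CAP, &req);
}
```

A caller would pass KVM_CAP_ENFORCE_PV_FEATURE_CPUID or
KVM_CAP_HYPERV_ENFORCE_CPUID as 'cap', provided the installed kernel headers
are recent enough to define them.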

While on it, document all currently supported KVM PV features in
docs/kvm-pv.txt.

Vitaly Kuznetsov (3):
  docs: Briefly describe KVM PV features
  i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID
  i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID

 docs/hyperv.txt   |  17 +--
 docs/kvm-pv.txt   | 103 ++
 target/i386/cpu.c |   3 ++
 target/i386/cpu.h |   4 ++
 target/i386/kvm/kvm.c |  19 
 5 files changed, 143 insertions(+), 3 deletions(-)
 create mode 100644 docs/kvm-pv.txt

-- 
2.31.1

Re: [PATCH] qtest/hyperv: Introduce a simple hyper-v test

2021-07-19 Thread Vitaly Kuznetsov

Andrew Jones  writes:

> On Fri, Jul 16, 2021 at 02:55:28PM +0200, Vitaly Kuznetsov wrote:
>> For the beginning, just test 'hv-passthrough' and a couple of custom
>> Hyper-V  enlightenments configurations through QMP. Later, it would
>> be great to complement this by checking CPUID values from within the
>> guest.
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>> - Changes since "[PATCH v8 0/9] i386: KVM: expand Hyper-V features early":
>>  make the test SKIP correctly when KVM is not present.
>> ---
>>  MAINTAINERS   |   1 +
>>  tests/qtest/hyperv-test.c | 228 ++
>>  tests/qtest/meson.build   |   3 +-
>>  3 files changed, 231 insertions(+), 1 deletion(-)
>>  create mode 100644 tests/qtest/hyperv-test.c
>> 
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 148153d74f5b..c1afd744edca 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -1576,6 +1576,7 @@ F: hw/isa/apm.c
>>  F: include/hw/isa/apm.h
>>  F: tests/unit/test-x86-cpuid.c
>>  F: tests/qtest/test-x86-cpuid-compat.c
>> +F: tests/qtest/hyperv-test.c
>>  
>>  PC Chipset
>>  M: Michael S. Tsirkin 
>> diff --git a/tests/qtest/hyperv-test.c b/tests/qtest/hyperv-test.c
>> new file mode 100644
>> index ..2155e5d90970
>> --- /dev/null
>> +++ b/tests/qtest/hyperv-test.c
>> @@ -0,0 +1,228 @@
>> +/*
>> + * Hyper-V emulation CPU feature test cases
>> + *
>> + * Copyright (c) 2021 Red Hat Inc.
>> + * Authors:
>> + *  Vitaly Kuznetsov 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +#include 
>> +#include 
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/bitops.h"
>> +#include "libqos/libqtest.h"
>> +#include "qapi/qmp/qdict.h"
>> +#include "qapi/qmp/qjson.h"
>> +
>> +#define MACHINE_KVM "-machine pc-q35-5.2 -accel kvm "
>> +#define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
>> +"  'arguments': { 'type': 'full', "
>> +#define QUERY_TAIL  "}}"
>> +
>> +static bool kvm_enabled(QTestState *qts)
>> +{
>> +QDict *resp, *qdict;
>> +bool enabled;
>> +
>> +resp = qtest_qmp(qts, "{ 'execute': 'query-kvm' }");
>> +g_assert(qdict_haskey(resp, "return"));
>> +qdict = qdict_get_qdict(resp, "return");
>> +g_assert(qdict_haskey(qdict, "enabled"));
>> +enabled = qdict_get_bool(qdict, "enabled");
>> +qobject_unref(resp);
>> +
>> +return enabled;
>> +}
>> +
>> +static bool kvm_has_cap(int cap)
>> +{
>> +int fd = open("/dev/kvm", O_RDWR);
>> +int ret;
>> +
>> +if (fd < 0) {
>> +return false;
>> +}
>> +
>> +ret = ioctl(fd, KVM_CHECK_EXTENSION, cap);
>> +
>> +close(fd);
>> +
>> +return ret > 0;
>> +}
>> +
>> +static QDict *do_query_no_props(QTestState *qts, const char *cpu_type)
>> +{
>> +return qtest_qmp(qts, QUERY_HEAD "'model': { 'name': %s }"
>> +  QUERY_TAIL, cpu_type);
>> +}
>> +
>> +static bool resp_has_props(QDict *resp)
>> +{
>> +QDict *qdict;
>> +
>> +g_assert(resp);
>> +
>> +if (!qdict_haskey(resp, "return")) {
>> +return false;
>> +}
>> +qdict = qdict_get_qdict(resp, "return");
>> +
>> +if (!qdict_haskey(qdict, "model")) {
>> +return false;
>> +}
>> +qdict = qdict_get_qdict(qdict, "model");
>> +
>> +return qdict_haskey(qdict, "props");
>> +}
>> +
>> +static QDict *resp_get_props(QDict *resp)
>> +{
>> +QDict *qdict;
>> +
>> +g_assert(resp);
>> +g_assert(resp_has_props(resp));
>> +
>> +qdict = qdict_get_qdict(resp, "return");
>> +qdict = qdict_get_qdict(qdict, "model");
>> +qdict = qdict_get_qdict(qdict, "props");
>> +
>> +return qdict;
>> +}
>> +
>> +static bool resp_get_feature(QDict *resp, const char *feature)
>> +{
>> +QDict *props;
>> +
>> +g_assert(resp);
>> +g_assert(resp_has_props(resp));
>> +props = resp_get

[PATCH] qtest/hyperv: Introduce a simple hyper-v test

2021-07-16 Thread Vitaly Kuznetsov

To begin with, just test 'hv-passthrough' and a couple of custom
Hyper-V enlightenment configurations through QMP. Later, it would
be great to complement this by checking CPUID values from within the
guest.
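
As a rough illustration of the QMP query the test relies on, the sketch below
formats a 'query-cpu-model-expansion' command for a CPU model with one boolean
property. QUERY_HEAD and QUERY_TAIL are the macros from the patch; the
build_query() helper is hypothetical and quotes the strings itself, whereas
the test hands the interpolation to qtest_qmp().

```c
#include <stdio.h>

#define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
                    "  'arguments': { 'type': 'full', "
#define QUERY_TAIL  "}}"

/*
 * Illustration only: format the query for 'cpu_type' with a single boolean
 * property into 'buf'. Returns the snprintf() result.
 */
static int build_query(char *buf, size_t len, const char *cpu_type,
                       const char *prop, int enabled)
{
    return snprintf(buf, len,
                    QUERY_HEAD "'model': { 'name': '%s', "
                    "'props': { '%s': %s } }" QUERY_TAIL,
                    cpu_type, prop, enabled ? "true" : "false");
}
```

The 'return.model.props' dictionary of the reply is what helpers such as
resp_get_props() and resp_get_feature() in the patch then inspect.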

Signed-off-by: Vitaly Kuznetsov 
---
- Changes since "[PATCH v8 0/9] i386: KVM: expand Hyper-V features early":
 make the test SKIP correctly when KVM is not present.
---
 MAINTAINERS   |   1 +
 tests/qtest/hyperv-test.c | 228 ++
 tests/qtest/meson.build   |   3 +-
 3 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 tests/qtest/hyperv-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 148153d74f5b..c1afd744edca 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1576,6 +1576,7 @@ F: hw/isa/apm.c
 F: include/hw/isa/apm.h
 F: tests/unit/test-x86-cpuid.c
 F: tests/qtest/test-x86-cpuid-compat.c
+F: tests/qtest/hyperv-test.c
 
 PC Chipset
 M: Michael S. Tsirkin 
diff --git a/tests/qtest/hyperv-test.c b/tests/qtest/hyperv-test.c
new file mode 100644
index 000000000000..2155e5d90970
--- /dev/null
+++ b/tests/qtest/hyperv-test.c
@@ -0,0 +1,228 @@
+/*
+ * Hyper-V emulation CPU feature test cases
+ *
+ * Copyright (c) 2021 Red Hat Inc.
+ * Authors:
+ *  Vitaly Kuznetsov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+
+#include "qemu/osdep.h"
+#include "qemu/bitops.h"
+#include "libqos/libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qjson.h"
+
+#define MACHINE_KVM "-machine pc-q35-5.2 -accel kvm "
+#define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
+"  'arguments': { 'type': 'full', "
+#define QUERY_TAIL  "}}"
+
+static bool kvm_enabled(QTestState *qts)
+{
+QDict *resp, *qdict;
+bool enabled;
+
+resp = qtest_qmp(qts, "{ 'execute': 'query-kvm' }");
+g_assert(qdict_haskey(resp, "return"));
+qdict = qdict_get_qdict(resp, "return");
+g_assert(qdict_haskey(qdict, "enabled"));
+enabled = qdict_get_bool(qdict, "enabled");
+qobject_unref(resp);
+
+return enabled;
+}
+
+static bool kvm_has_cap(int cap)
+{
+int fd = open("/dev/kvm", O_RDWR);
+int ret;
+
+if (fd < 0) {
+return false;
+}
+
+ret = ioctl(fd, KVM_CHECK_EXTENSION, cap);
+
+close(fd);
+
+return ret > 0;
+}
+
+static QDict *do_query_no_props(QTestState *qts, const char *cpu_type)
+{
+return qtest_qmp(qts, QUERY_HEAD "'model': { 'name': %s }"
+  QUERY_TAIL, cpu_type);
+}
+
+static bool resp_has_props(QDict *resp)
+{
+QDict *qdict;
+
+g_assert(resp);
+
+if (!qdict_haskey(resp, "return")) {
+return false;
+}
+qdict = qdict_get_qdict(resp, "return");
+
+if (!qdict_haskey(qdict, "model")) {
+return false;
+}
+qdict = qdict_get_qdict(qdict, "model");
+
+return qdict_haskey(qdict, "props");
+}
+
+static QDict *resp_get_props(QDict *resp)
+{
+QDict *qdict;
+
+g_assert(resp);
+g_assert(resp_has_props(resp));
+
+qdict = qdict_get_qdict(resp, "return");
+qdict = qdict_get_qdict(qdict, "model");
+qdict = qdict_get_qdict(qdict, "props");
+
+return qdict;
+}
+
+static bool resp_get_feature(QDict *resp, const char *feature)
+{
+QDict *props;
+
+g_assert(resp);
+g_assert(resp_has_props(resp));
+props = resp_get_props(resp);
+g_assert(qdict_get(props, feature));
+return qdict_get_bool(props, feature);
+}
+
+#define assert_has_feature(qts, cpu_type, feature) \
+({ \
+QDict *_resp = do_query_no_props(qts, cpu_type);   \
+g_assert(_resp);   \
+g_assert(resp_has_props(_resp));   \
+g_assert(qdict_get(resp_get_props(_resp), feature));   \
+qobject_unref(_resp);  \
+})
+
+#define resp_assert_feature(resp, feature, expected_value) \
+({ \
+QDict *_props; \
+   \
+g_assert(_resp);   \
+g_assert(resp_has_props(_resp));   \
+_props = resp_get_props(_resp);\
+g_assert(qdict_get(_props, feature));

Re: [PATCH v8 9/9] qtest/hyperv: Introduce a simple hyper-v test

2021-07-16 Thread Vitaly Kuznetsov

Igor Mammedov  writes:

> On Thu, 8 Jul 2021 17:02:22 -0400
> Eduardo Habkost  wrote:
>
>> On Tue, Jun 08, 2021 at 02:08:17PM +0200, Vitaly Kuznetsov wrote:
>> > For the beginning, just test 'hv-passthrough' and a couple of custom
>> > Hyper-V  enlightenments configurations through QMP. Later, it would
>> > be great to complement this by checking CPUID values from within the
>> > guest.
>> > 
>> > Signed-off-by: Vitaly Kuznetsov   
>> [...]
>> > +static bool kvm_has_sys_hyperv_cpuid(void)
>> > +{
>> > +int fd = open("/dev/kvm", O_RDWR);
>> > +int ret;
>> > +
>> > +g_assert(fd > 0);  
>> 

g_assert() was overkill, just 'return false' would do.

>> This crashes when /dev/kvm doesn't exist.  See:
>> https://gitlab.com/ehabkost/qemu/-/jobs/1404084459
>
> maybe reuse qtest_has_accel()
>  https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg06864.html
>
> instead of op encoding it.

The purpose of this function is to check if KVM_CAP_SYS_HYPERV_CPUID is
supported by KVM. It is certainly unsupported when KVM is not present
:-) but an ioctl() is needed when it is.

We already have a similar check in tests/qtest/migration-test.c where we
test for KVM_CAP_DIRTY_LOG_RING, maybe we can create a library function
but we don't seem to have any KVM-specific stuff in qtest at this moment
...
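
A library helper of that kind could look roughly like the sketch below. It is
only an illustration, not existing qtest API: the name kvm_has_cap_at() and
its 'dev_path' parameter are hypothetical (the parameter is there so the
"KVM is not present" path can be exercised), while KVM_CHECK_EXTENSION is the
regular ioctl from <linux/kvm.h>.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <linux/kvm.h>

/*
 * Hypothetical helper: report whether the KVM device at 'dev_path'
 * (normally "/dev/kvm") supports capability 'cap'.
 */
static bool kvm_has_cap_at(const char *dev_path, int cap)
{
    int fd = open(dev_path, O_RDWR);
    int ret;

    /* No KVM device (or no permission): the capability is not usable. */
    if (fd < 0) {
        return false;
    }

    /* KVM_CHECK_EXTENSION returns > 0 when the capability is supported. */
    ret = ioctl(fd, KVM_CHECK_EXTENSION, cap);

    close(fd);

    return ret > 0;
}
```

A test would then call kvm_has_cap_at("/dev/kvm", KVM_CAP_SYS_HYPERV_CPUID)
and skip the early-expansion checks when it returns false.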

>> I'm removing it from the queue.

I'll fix g_assert() and send as a separate patch if it's fine.

-- 
Vitaly

[PATCH 2/2] i386: Fix coding style in kvm_hyperv_expand_features()

2021-07-16 Thread Vitaly Kuznetsov

QEMU coding style requires braces around bodies of ifs.

Reported-by: Peter Maydell 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e69abe48e3f8..28ca682b1089 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1219,8 +1219,9 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
     Error *local_err = NULL;
     int feat;
 
-    if (!hyperv_enabled(cpu))
+    if (!hyperv_enabled(cpu)) {
         return true;
+    }
 
     /*
      * When kvm_hyperv_expand_features is called at CPU feature expansion
@@ -1228,8 +1229,9 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
      * when KVM_CAP_SYS_HYPERV_CPUID is supported.
      */
     if (!cs->kvm_state &&
-        !kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID))
+        !kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID)) {
         return true;
+    }
 
     if (cpu->hyperv_passthrough) {
         cpu->hyperv_vendor_id[0] =
-- 
2.31.1

[PATCH 1/2] i386: assert 'cs->kvm_state' is not null

2021-07-16 Thread Vitaly Kuznetsov

Coverity reports potential NULL pointer dereference in
get_supported_hv_cpuid_legacy() when 'cs->kvm_state' is NULL. While
'cs->kvm_state' can indeed be NULL in hv_cpuid_get_host(),
kvm_hyperv_expand_features() makes sure that it only happens when
KVM_CAP_SYS_HYPERV_CPUID is supported and KVM_CAP_SYS_HYPERV_CPUID
implies KVM_CAP_HYPERV_CPUID so get_supported_hv_cpuid_legacy() is
never really called. Add asserts to strengthen the protection against
broken KVM behavior.

Coverity: CID 1458243
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 59ed8327ac13..e69abe48e3f8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -974,6 +974,12 @@ static struct kvm_cpuid2 *get_supported_hv_cpuid(CPUState *cs)
     do_sys_ioctl =
         kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID) > 0;
 
+    /*
+     * Non-empty KVM context is needed when KVM_CAP_SYS_HYPERV_CPUID is
+     * unsupported, kvm_hyperv_expand_features() checks for that.
+     */
+    assert(do_sys_ioctl || cs->kvm_state);
+
     /*
      * When the buffer is too small, KVM_GET_SUPPORTED_HV_CPUID fails with
      * -E2BIG, however, it doesn't report back the right size. Keep increasing
@@ -1105,6 +1111,14 @@ static uint32_t hv_cpuid_get_host(CPUState *cs, uint32_t func, int reg)
         if (kvm_check_extension(kvm_state, KVM_CAP_HYPERV_CPUID) > 0) {
             cpuid = get_supported_hv_cpuid(cs);
         } else {
+            /*
+             * 'cs->kvm_state' may be NULL when Hyper-V features are expanded
+             * before KVM context is created but this is only done when
+             * KVM_CAP_SYS_HYPERV_CPUID is supported and it implies
+             * KVM_CAP_HYPERV_CPUID.
+             */
+            assert(cs->kvm_state);
+
             cpuid = get_supported_hv_cpuid_legacy(cs);
         }
         hv_cpuid_cache = cpuid;
-- 
2.31.1

Re: [PULL 04/11] i386: expand Hyper-V features during CPU feature expansion time

2021-07-16 Thread Vitaly Kuznetsov

Peter Maydell  writes:

> On Tue, 13 Jul 2021 at 17:19, Eduardo Habkost  wrote:
>>
>> From: Vitaly Kuznetsov 
>>
>> To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
>> need to expand and set the corresponding CPUID leaves early. Modify
>> x86_cpu_get_supported_feature_word() to call newly intoduced Hyper-V
>> specific kvm_hv_get_supported_cpuid() instead of
>> kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
>> as Hyper-V specific CPUID leaves intersect with KVM's.
>>
>> Note, early expansion will only happen when KVM supports system wide
>> KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).
>>
>> Reviewed-by: Eduardo Habkost 
>> Signed-off-by: Vitaly Kuznetsov 
>> Message-Id: <20210608120817.1325125-6-vkuzn...@redhat.com>
>> Signed-off-by: Eduardo Habkost 
>
> Hi; Coverity reports an issue in this code (CID 1458243):
>
>> -static bool hyperv_expand_features(CPUState *cs, Error **errp)
>> +bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
>>  {
>> -X86CPU *cpu = X86_CPU(cs);
>> +CPUState *cs = CPU(cpu);
>>
>>  if (!hyperv_enabled(cpu))
>>  return true;
>>
>> +/*
>> + * When kvm_hyperv_expand_features is called at CPU feature expansion
>> + * time per-CPU kvm_state is not available yet so we can only proceed
>> + * when KVM_CAP_SYS_HYPERV_CPUID is supported.
>> + */
>> +if (!cs->kvm_state &&
>> +!kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID))
>> +return true;
>
> Here we check whether cs->kvm_state is NULL, but even if it is
> NULL we can still continue execution further through the function.
>
> Later in the function we call hv_cpuid_get_host(), which in turn
> can call get_supported_hv_cpuid_legacy(), which can dereference
> cs->kvm_state without checking it.

get_supported_hv_cpuid_legacy() is only called when KVM_CAP_HYPERV_CPUID
is not supported and this is not possible with
KVM_CAP_SYS_HYPERV_CPUID. Coverity, of course, can't know that.
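
To spell the argument out, here is a small standalone model of the decision
logic. It is a sketch, not the QEMU code: the enum and the function
hv_cpuid_pick_source() are made up for illustration, and the three booleans
stand in for 'cs->kvm_state != NULL' and the two kvm_check_extension()
results. The first assert encodes the assumption that
KVM_CAP_SYS_HYPERV_CPUID implies KVM_CAP_HYPERV_CPUID.

```c
#include <assert.h>
#include <stdbool.h>

enum hv_cpuid_source {
    HV_CPUID_SKIP,        /* expansion is postponed, nothing is queried */
    HV_CPUID_SYS_IOCTL,   /* system-wide KVM_GET_SUPPORTED_HV_CPUID */
    HV_CPUID_VCPU_IOCTL,  /* per-vCPU KVM_GET_SUPPORTED_HV_CPUID */
    HV_CPUID_LEGACY,      /* get_supported_hv_cpuid_legacy() */
};

static enum hv_cpuid_source hv_cpuid_pick_source(bool have_kvm_state,
                                                 bool cap_sys_hyperv_cpuid,
                                                 bool cap_hyperv_cpuid)
{
    /* Assumed KVM invariant: the system-wide cap implies the per-vCPU one. */
    assert(!cap_sys_hyperv_cpuid || cap_hyperv_cpuid);

    /* Early expansion without a KVM context needs the system ioctl. */
    if (!have_kvm_state && !cap_sys_hyperv_cpuid) {
        return HV_CPUID_SKIP;
    }

    if (cap_hyperv_cpuid) {
        return cap_sys_hyperv_cpuid ? HV_CPUID_SYS_IOCTL
                                    : HV_CPUID_VCPU_IOCTL;
    }

    /* Without the per-vCPU cap there is no system cap, so a context exists. */
    assert(have_kvm_state);
    return HV_CPUID_LEGACY;
}
```

With the invariant in place, the legacy path is never reached when
'have_kvm_state' is false, which is the property the static analyzer cannot
derive on its own.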

>
> So either the check on cs->kvm_state above is unnecessary, or we
> need to handle it being NULL in some way other than falling through.

It seems an assert(cs) before calling get_supported_hv_cpuid_legacy()
(with a proper comment) should do the job.

>
> Side note: this change isn't in line with our coding style, which
> requires braces around the body of the if().

My bad, will fix.

-- 
Vitaly

Re: [PATCH v8 3/9] i386: hardcode supported eVMCS version to '1'

2021-06-16 Thread Vitaly Kuznetsov

Eduardo Habkost  writes:

> On Tue, Jun 08, 2021 at 02:08:11PM +0200, Vitaly Kuznetsov wrote:
>> Currently, the only eVMCS version, supported by KVM (and described in TLFS)
>> is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported
>> eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and
>> puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a
>> problem on migration is expected: it doesn't seem to be possible to migrate
>> from a host supporting eVMCS ver.2 to a host, which only support eVMCS
>> ver.1.
>
> Should we rewrite this as "it wouldn't be possible to migrate",
> as this patch fixes the problem and makes it possible?

Yes, no problem with such an amendment. Currently, there's no issue as
EVMCSv2 just doesn't exist. We, however, expect it to appear some time
in the future and this change allows us to re-use
KVM_CAP_HYPERV_ENLIGHTENED_VMCS in KVM without (potentially) breaking
migrations. Note: the migration will only be broken when we migrate to
KVM/QEMU which does not support EVMCSv2 *and* when the guest is already
using it. As we expose the range of supported versions, it is possible
that guests (esp. older Hyper-V versions) will stick to 'v1' even when
'v2' is supported.
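
As a sketch of the version range handling, the snippet below packs a
[min, max] range the way the patch decodes it (minimum version in the low
byte, maximum in the high byte) and checks that an exposed range fits into a
supported one. The helper names evmcs_range() and evmcs_range_supported() are
made up for this illustration; the check mirrors
evmcs_version_supported() from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack a [min, max] eVMCS version range: low byte min, high byte max. */
static uint16_t evmcs_range(uint8_t min_version, uint8_t max_version)
{
    return (uint16_t)((max_version << 8) | min_version);
}

/* True when the 'exposed' range lies within the 'supported' range. */
static bool evmcs_range_supported(uint16_t exposed, uint16_t supported)
{
    uint8_t min_version = exposed & 0xff;
    uint8_t max_version = exposed >> 8;
    uint8_t min_supported = supported & 0xff;
    uint8_t max_supported = supported >> 8;

    return min_version >= min_supported && max_version <= max_supported;
}
```

With this encoding, evmcs_range(1, 1) equals the patch's
DEFAULT_EVMCS_VERSION, and it still fits into a hypothetical future
evmcs_range(1, 2) reported by KVM, so a guest started with 'hv-evmcs' keeps
seeing version 1 only.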

>
>> 
>> Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer
>> eVMCS versions will have to have their own enablement options (e.g.
>> 'hv-evmcs=2').
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>
> Reviewed-by: Eduardo Habkost 

Thanks! Please let me know if you expect v9 with an amended commit message or
if you're able to alter it upon commit.

-- 
Vitaly

[PATCH v8 9/9] qtest/hyperv: Introduce a simple hyper-v test

2021-06-08 Thread Vitaly Kuznetsov

To begin with, just test 'hv-passthrough' and a couple of custom
Hyper-V enlightenment configurations through QMP. Later, it would
be great to complement this by checking CPUID values from within the
guest.

Signed-off-by: Vitaly Kuznetsov 
---
 MAINTAINERS   |   1 +
 tests/qtest/hyperv-test.c | 221 ++
 tests/qtest/meson.build   |   3 +-
 3 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100644 tests/qtest/hyperv-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7d9cd2904264..6345bad461e8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1545,6 +1545,7 @@ F: hw/isa/apm.c
 F: include/hw/isa/apm.h
 F: tests/unit/test-x86-cpuid.c
 F: tests/qtest/test-x86-cpuid-compat.c
+F: tests/qtest/hyperv-test.c
 
 PC Chipset
 M: Michael S. Tsirkin 
diff --git a/tests/qtest/hyperv-test.c b/tests/qtest/hyperv-test.c
new file mode 100644
index 000000000000..88f7a19e4a85
--- /dev/null
+++ b/tests/qtest/hyperv-test.c
@@ -0,0 +1,221 @@
+/*
+ * Hyper-V emulation CPU feature test cases
+ *
+ * Copyright (c) 2021 Red Hat Inc.
+ * Authors:
+ *  Vitaly Kuznetsov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+
+#include "qemu/osdep.h"
+#include "qemu/bitops.h"
+#include "libqos/libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qjson.h"
+
+#define MACHINE_KVM "-machine pc-q35-5.2 -accel kvm "
+#define QUERY_HEAD  "{ 'execute': 'query-cpu-model-expansion', " \
+"  'arguments': { 'type': 'full', "
+#define QUERY_TAIL  "}}"
+
+static bool kvm_enabled(QTestState *qts)
+{
+QDict *resp, *qdict;
+bool enabled;
+
+resp = qtest_qmp(qts, "{ 'execute': 'query-kvm' }");
+g_assert(qdict_haskey(resp, "return"));
+qdict = qdict_get_qdict(resp, "return");
+g_assert(qdict_haskey(qdict, "enabled"));
+enabled = qdict_get_bool(qdict, "enabled");
+qobject_unref(resp);
+
+return enabled;
+}
+
+static bool kvm_has_sys_hyperv_cpuid(void)
+{
+int fd = open("/dev/kvm", O_RDWR);
+int ret;
+
+g_assert(fd > 0);
+
+ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_SYS_HYPERV_CPUID);
+
+close(fd);
+
+return ret > 0;
+}
+
+static QDict *do_query_no_props(QTestState *qts, const char *cpu_type)
+{
+return qtest_qmp(qts, QUERY_HEAD "'model': { 'name': %s }"
+  QUERY_TAIL, cpu_type);
+}
+
+static bool resp_has_props(QDict *resp)
+{
+QDict *qdict;
+
+g_assert(resp);
+
+if (!qdict_haskey(resp, "return")) {
+return false;
+}
+qdict = qdict_get_qdict(resp, "return");
+
+if (!qdict_haskey(qdict, "model")) {
+return false;
+}
+qdict = qdict_get_qdict(qdict, "model");
+
+return qdict_haskey(qdict, "props");
+}
+
+static QDict *resp_get_props(QDict *resp)
+{
+QDict *qdict;
+
+g_assert(resp);
+g_assert(resp_has_props(resp));
+
+qdict = qdict_get_qdict(resp, "return");
+qdict = qdict_get_qdict(qdict, "model");
+qdict = qdict_get_qdict(qdict, "props");
+
+return qdict;
+}
+
+static bool resp_get_feature(QDict *resp, const char *feature)
+{
+QDict *props;
+
+g_assert(resp);
+g_assert(resp_has_props(resp));
+props = resp_get_props(resp);
+g_assert(qdict_get(props, feature));
+return qdict_get_bool(props, feature);
+}
+
+#define assert_has_feature(qts, cpu_type, feature) \
+({ \
+QDict *_resp = do_query_no_props(qts, cpu_type);   \
+g_assert(_resp);   \
+g_assert(resp_has_props(_resp));   \
+g_assert(qdict_get(resp_get_props(_resp), feature));   \
+qobject_unref(_resp);  \
+})
+
+#define resp_assert_feature(resp, feature, expected_value) \
+({ \
+QDict *_props; \
+   \
+g_assert(_resp);   \
+g_assert(resp_has_props(_resp));   \
+_props = resp_get_props(_resp);\
+g_assert(qdict_get(_props, feature));  \
+g_assert(qdict_get_bool(_props, feature) == (expected_value)); \
+})
+
+#define assert_feature(qts, cpu_type, feature, expected_value) \
+({

[PATCH v8 8/9] i386: Hyper-V SynIC requires POST_MESSAGES/SIGNAL_EVENTS privileges

2021-06-08 Thread Vitaly Kuznetsov

When Hyper-V SynIC is enabled, we may need to allow Windows guests to make
hypercalls (POST_MESSAGES/SIGNAL_EVENTS). No issue is currently observed
because KVM is very permissive, allowing these hypercalls regardless of
guest visible CPUID bits.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/hyperv-proto.h | 6 ++
 target/i386/kvm/kvm.c  | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
index e30d64b4ade4..5fbb385cc136 100644
--- a/target/i386/kvm/hyperv-proto.h
+++ b/target/i386/kvm/hyperv-proto.h
@@ -38,6 +38,12 @@
 #define HV_ACCESS_FREQUENCY_MSRS (1u << 11)
 #define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
 
+/*
+ * HV_CPUID_FEATURES.EBX bits
+ */
+#define HV_POST_MESSAGES (1u << 4)
+#define HV_SIGNAL_EVENTS (1u << 5)
+
 /*
  * HV_CPUID_FEATURES.EDX bits
  */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 33830117fa31..260c563d59a3 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1343,6 +1343,12 @@ static int hyperv_fill_cpuids(CPUState *cs,
     /* Unconditionally required with any Hyper-V enlightenment */
     c->eax |= HV_HYPERCALL_AVAILABLE;
 
+    /* SynIC and Vmbus devices require messages/signals hypercalls */
+    if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC) &&
+        !cpu->hyperv_synic_kvm_only) {
+        c->ebx |= HV_POST_MESSAGES | HV_SIGNAL_EVENTS;
+    }
+
     /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
     c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
-- 
2.31.1

[PATCH v8 3/9] i386: hardcode supported eVMCS version to '1'

2021-06-08 Thread Vitaly Kuznetsov

Currently, the only eVMCS version supported by KVM (and described in TLFS)
is '1'. When the Enlightened VMCS feature is enabled, QEMU takes the supported
eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and
puts it into guest visible CPUIDs. When (and if) eVMCS ver.2 appears, a
problem on migration is expected: it doesn't seem to be possible to migrate
from a host supporting eVMCS ver.2 to a host which only supports eVMCS
ver.1.

Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer
eVMCS versions will have to have their own enablement options (e.g.
'hv-evmcs=2').

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt   |  2 +-
 target/i386/kvm/kvm.c | 39 +++
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index a51953daa833..000638a2fd38 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -170,7 +170,7 @@ Recommended: hv-frequencies
 3.16. hv-evmcs
 ===
 The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
-enabled, it provides Enlightened VMCS feature to the guest. The feature
+enabled, it provides Enlightened VMCS version 1 feature to the guest. The feature
 implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
 hypervisors making L2 exits to the hypervisor faster. The feature is Intel-only.
 Note: some virtualization features (e.g. Posted Interrupts) are disabled when
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index c676ee8b38a7..13d63f576b88 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1406,6 +1406,21 @@ static int hyperv_fill_cpuids(CPUState *cs,
 static Error *hv_passthrough_mig_blocker;
 static Error *hv_no_nonarch_cs_mig_blocker;
 
+/* Checks that the exposed eVMCS version range is supported by KVM */
+static bool evmcs_version_supported(uint16_t evmcs_version,
+uint16_t supported_evmcs_version)
+{
+uint8_t min_version = evmcs_version & 0xff;
+uint8_t max_version = evmcs_version >> 8;
+uint8_t min_supported_version = supported_evmcs_version & 0xff;
+uint8_t max_supported_version = supported_evmcs_version >> 8;
+
+return (min_version >= min_supported_version) &&
+(max_version <= max_supported_version);
+}
+
+#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)
+
 static int hyperv_init_vcpu(X86CPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -1485,17 +1500,33 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 }
 
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_EVMCS)) {
-uint16_t evmcs_version;
+uint16_t evmcs_version = DEFAULT_EVMCS_VERSION;
+uint16_t supported_evmcs_version;
 
 ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
-  (uintptr_t)&evmcs_version);
+  (uintptr_t)&supported_evmcs_version);
 
+/*
+ * KVM is required to support EVMCS ver.1. as that's what 'hv-evmcs'
+ * option sets. Note: we hardcode the maximum supported eVMCS version
+ * to '1' as well so 'hv-evmcs' feature is migratable even when (and if)
+ * ver.2 is implemented. A new option (e.g. 'hv-evmcs=2') will then have
+ * to be added.
+ */
 if (ret < 0) {
-fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
-kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
+error_report("Hyper-V %s is not supported by kernel",
+ kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
 return ret;
 }
 
+if (!evmcs_version_supported(evmcs_version, supported_evmcs_version)) {
+error_report("eVMCS version range [%d..%d] is not supported by "
+ "kernel (supported: [%d..%d])", evmcs_version & 0xff,
+ evmcs_version >> 8, supported_evmcs_version & 0xff,
+ supported_evmcs_version >> 8);
+return -ENOTSUP;
+}
+
 cpu->hyperv_nested[0] = evmcs_version;
 }
 
-- 
2.31.1
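
For illustration only (not part of the patch): a standalone C sketch of the range check introduced above, exercised with a few KVM answers. The version range is packed as (highest << 8) | lowest, so [1..1] is 0x0101. The function body follows the patch; the example "supported" values other than 0x0101 are hypothetical, since KVM reports only [1..1] at the time of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Packed as (highest << 8) | lowest; [1..1] is what 'hv-evmcs' exposes. */
#define DEFAULT_EVMCS_VERSION ((1 << 8) | 1)

/* Same containment test as in the patch: exposed range within KVM's range. */
bool evmcs_version_supported(uint16_t evmcs_version,
                             uint16_t supported_evmcs_version)
{
    uint8_t min_version = evmcs_version & 0xff;
    uint8_t max_version = evmcs_version >> 8;
    uint8_t min_supported_version = supported_evmcs_version & 0xff;
    uint8_t max_supported_version = supported_evmcs_version >> 8;

    return (min_version >= min_supported_version) &&
           (max_version <= max_supported_version);
}

/* Hypothetical KVM answers, chosen only to exercise the check. */
void evmcs_examples(void)
{
    /* KVM supports [1..1]: the hardcoded range fits. */
    assert(evmcs_version_supported(DEFAULT_EVMCS_VERSION, 0x0101));
    /* KVM supports [1..2]: ver.1 is still in range, so it fits. */
    assert(evmcs_version_supported(DEFAULT_EVMCS_VERSION, 0x0201));
    /* KVM supports [2..10]: ver.1 was dropped, so the check fails. */
    assert(!evmcs_version_supported(DEFAULT_EVMCS_VERSION, 0x0a02));
}
```

The second case shows why hardcoding [1..1] keeps 'hv-evmcs' migratable: a newer host that also offers ver.2 still accepts the guest-visible range, while the guest never sees ver.2.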

[PATCH v8 7/9] i386: HV_HYPERCALL_AVAILABLE privilege bit is always needed

2021-06-08 Thread Vitaly Kuznetsov

According to TLFS, a Hyper-V guest is supposed to check the
HV_HYPERCALL_AVAILABLE privilege bit before accessing the
HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL MSRs, but at least some
Windows versions ignore that. As KVM is very permissive and allows
accessing these MSRs unconditionally, no issue is observed. We may,
however, want to tighten the checks eventually. Conforming to the
spec is probably also a good idea.

Enable HV_HYPERCALL_AVAILABLE bit unconditionally.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 1cce0969067e..33830117fa31 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -810,8 +810,6 @@ static struct {
 [HYPERV_FEAT_RELAXED] = {
 .desc = "relaxed timing (hv-relaxed)",
 .flags = {
-{.func = HV_CPUID_FEATURES, .reg = R_EAX,
- .bits = HV_HYPERCALL_AVAILABLE},
 {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
  .bits = HV_RELAXED_TIMING_RECOMMENDED}
 }
@@ -820,7 +818,7 @@ static struct {
 .desc = "virtual APIC (hv-vapic)",
 .flags = {
 {.func = HV_CPUID_FEATURES, .reg = R_EAX,
- .bits = HV_HYPERCALL_AVAILABLE | HV_APIC_ACCESS_AVAILABLE},
+ .bits = HV_APIC_ACCESS_AVAILABLE},
 {.func = HV_CPUID_ENLIGHTMENT_INFO, .reg = R_EAX,
  .bits = HV_APIC_ACCESS_RECOMMENDED}
 }
@@ -829,8 +827,7 @@ static struct {
 .desc = "clocksources (hv-time)",
 .flags = {
 {.func = HV_CPUID_FEATURES, .reg = R_EAX,
- .bits = HV_HYPERCALL_AVAILABLE | HV_TIME_REF_COUNT_AVAILABLE |
- HV_REFERENCE_TSC_AVAILABLE}
+ .bits = HV_TIME_REF_COUNT_AVAILABLE | HV_REFERENCE_TSC_AVAILABLE}
 }
 },
 [HYPERV_FEAT_CRASH] = {
@@ -1343,6 +1340,9 @@ static int hyperv_fill_cpuids(CPUState *cs,
 c->ebx = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EBX);
 c->edx = hv_build_cpuid_leaf(cs, HV_CPUID_FEATURES, R_EDX);
 
+/* Unconditionally required with any Hyper-V enlightenment */
+c->eax |= HV_HYPERCALL_AVAILABLE;
+
 /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
 c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 
-- 
2.31.1

[PATCH v8 6/9] i386: kill off hv_cpuid_check_and_set()

2021-06-08 Thread Vitaly Kuznetsov

hv_cpuid_check_and_set() does too much:
- Checks if the feature is supported by KVM;
- Checks if all dependencies are enabled;
- Sets the feature bit in cpu->hyperv_features for 'passthrough' mode.

To reduce the complexity, move all the logic except for dependencies
check out of it. Also, in 'passthrough' mode we don't really need to
check dependencies because KVM is supposed to provide a consistent
set anyway.
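
For illustration only (not part of the patch): a standalone C sketch of the dependency walk that remains in hv_feature_check_deps(). Each feature carries a bitmask of features it depends on, and the loop visits the set bits from lowest to highest. The feature names and the dependency table below are hypothetical and much smaller than kvm_hyperv_properties[]; lowest_set_bit() stands in for QEMU's ctz64().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced feature list; not QEMU's HYPERV_FEAT_* values. */
enum { FEAT_VPINDEX, FEAT_SYNIC, FEAT_STIMER, FEAT_COUNT };

/* Illustrative dependency masks: bit N set means "requires feature N". */
static const uint64_t feature_deps[FEAT_COUNT] = {
    [FEAT_VPINDEX] = 0,
    [FEAT_SYNIC]   = 1ull << FEAT_VPINDEX,
    [FEAT_STIMER]  = 1ull << FEAT_SYNIC,
};

/* Index of the lowest set bit; the caller guarantees v != 0. */
static int lowest_set_bit(uint64_t v)
{
    int i = 0;

    while (!(v & 1)) {
        v >>= 1;
        i++;
    }
    return i;
}

/*
 * Returns the first dependency of 'feature' that is missing from the
 * 'enabled' mask, or -1 when all dependencies are enabled.
 */
int first_missing_dep(uint64_t enabled, int feature)
{
    uint64_t deps;

    assert(feature >= 0 && feature < FEAT_COUNT);
    deps = feature_deps[feature];
    while (deps) {
        int dep = lowest_set_bit(deps);

        if (!(enabled & (1ull << dep))) {
            return dep;
        }
        deps &= ~(1ull << dep);   /* clear the bit just checked */
    }
    return -1;
}
```

Returning the index of the missing dependency lets the caller build an error message of the "X requires Y" kind, as the patch does with error_setg().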

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 104 +++---
 1 file changed, 36 insertions(+), 68 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index b679dfdfc655..1cce0969067e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1145,16 +1145,12 @@ static bool hyperv_feature_supported(CPUState *cs, int feature)
 return true;
 }
 
-static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
+/* Checks that all feature dependencies are enabled */
+static bool hv_feature_check_deps(X86CPU *cpu, int feature, Error **errp)
 {
-X86CPU *cpu = X86_CPU(cs);
 uint64_t deps;
 int dep_feat;
 
-if (!hyperv_feat_enabled(cpu, feature) && !cpu->hyperv_passthrough) {
-return 0;
-}
-
 deps = kvm_hyperv_properties[feature].dependencies;
 while (deps) {
 dep_feat = ctz64(deps);
@@ -1162,26 +1158,12 @@ static int hv_cpuid_check_and_set(CPUState *cs, int feature, Error **errp)
 error_setg(errp, "Hyper-V %s requires Hyper-V %s",
kvm_hyperv_properties[feature].desc,
kvm_hyperv_properties[dep_feat].desc);
-return 1;
+return false;
 }
 deps &= ~(1ull << dep_feat);
 }
 
-if (!hyperv_feature_supported(cs, feature)) {
-if (hyperv_feat_enabled(cpu, feature)) {
-error_setg(errp, "Hyper-V %s is not supported by kernel",
-   kvm_hyperv_properties[feature].desc);
-return 1;
-} else {
-return 0;
-}
-}
-
-if (cpu->hyperv_passthrough) {
-cpu->hyperv_features |= BIT(feature);
-}
-
-return 0;
+return true;
 }
 
 static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
@@ -1220,6 +1202,8 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
 bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 {
 CPUState *cs = CPU(cpu);
+Error *local_err = NULL;
+int feat;
 
 if (!hyperv_enabled(cpu))
 return true;
@@ -1275,53 +1259,37 @@ bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 
 cpu->hyperv_spinlock_attempts =
 hv_cpuid_get_host(cs, HV_CPUID_ENLIGHTMENT_INFO, R_EBX);
-}
 
-/* Features */
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI, errp)) {
-return false;
-}
-if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT, errp)) {
-return false;
+/*
+ * Mark feature as enabled in 'cpu->hyperv_features' as
+ * hv_build_cpuid_leaf() uses this info to build guest CPUIDs.
+ */
+for (feat = 0; feat < ARRAY_SIZE(kvm_hyperv_properties); feat++) {
+if (hyperv_feature_supported(cs, feat)) {
+cpu->hyperv_features |= BIT(feat);
+}
+}
+} else {
+/* Check features availability and dependencies */
+for (feat = 0; feat < ARRAY_SIZE(kvm_hyperv_properties); feat++) {
+/* If the feature was not requested skip it. */
+if (!hyperv_feat_enabled(cpu, feat)) {
+continue;
+

[PATCH v8 5/9] i386: expand Hyper-V features during CPU feature expansion time

2021-06-08 Thread Vitaly Kuznetsov

To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
need to expand and set the corresponding CPUID leaves early. Modify
x86_cpu_get_supported_feature_word() to call the newly introduced Hyper-V
specific kvm_hv_get_supported_cpuid() instead of
kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
as Hyper-V specific CPUID leaves intersect with KVM's.

Note, early expansion will only happen when KVM supports system wide
KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c  |  4 
 target/i386/kvm/kvm-stub.c |  5 +
 target/i386/kvm/kvm.c  | 24 
 target/i386/kvm/kvm_i386.h |  1 +
 4 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f8ae45be0d53..c5d19216787c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5990,6 +5990,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
 if (env->cpuid_xlevel2 == UINT32_MAX) {
 env->cpuid_xlevel2 = env->cpuid_min_xlevel2;
 }
+
+if (kvm_enabled()) {
+kvm_hyperv_expand_features(cpu, errp);
+}
 }
 
 /*
diff --git a/target/i386/kvm/kvm-stub.c b/target/i386/kvm/kvm-stub.c
index 92f49121b8fa..f6e7e4466e1a 100644
--- a/target/i386/kvm/kvm-stub.c
+++ b/target/i386/kvm/kvm-stub.c
@@ -39,3 +39,8 @@ bool kvm_hv_vpindex_settable(void)
 {
 return false;
 }
+
+bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
+{
+abort();
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 1e6f3c483e28..b679dfdfc655 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1217,13 +1217,22 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
  * of 'hv_passthrough' mode and fills the environment with all supported
  * Hyper-V features.
  */
-static bool hyperv_expand_features(CPUState *cs, Error **errp)
+bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
 {
-X86CPU *cpu = X86_CPU(cs);
+CPUState *cs = CPU(cpu);
 
 if (!hyperv_enabled(cpu))
 return true;
 
+/*
+ * When kvm_hyperv_expand_features is called at CPU feature expansion
+ * time per-CPU kvm_state is not available yet so we can only proceed
+ * when KVM_CAP_SYS_HYPERV_CPUID is supported.
+ */
+if (!cs->kvm_state &&
+!kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID))
+return true;
+
 if (cpu->hyperv_passthrough) {
 cpu->hyperv_vendor_id[0] =
 hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
@@ -1590,8 +1599,15 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
-/* Paravirtualization CPUIDs */
-if (!hyperv_expand_features(cs, &local_err)) {
+/*
+ * kvm_hyperv_expand_features() is called here for the second time in case
+ * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly handle
+ * 'query-cpu-model-expansion' in this case as we don't have a KVM vCPU to
+ * check which Hyper-V enlightenments are supported and which are not, we
+ * can still proceed and check/expand Hyper-V enlightenments here so legacy
+ * behavior is preserved.
+ */
+if (!kvm_hyperv_expand_features(cpu, &local_err)) {
 error_report_err(local_err);
 return -ENOSYS;
 }
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index dc725083891c..54667b35f09c 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -47,6 +47,7 @@ bool kvm_has_x2apic_api(void);
 bool kvm_has_waitpkg(void);
 
 bool kvm_hv_vpindex_settable(void);
+bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
 
 uint64_t kvm_swizzle_msi_ext_dest_id(uint64_t address);
 
-- 
2.31.1
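
For illustration only (not part of the patch): a standalone C model of the guard at the top of kvm_hyperv_expand_features(). The function is called twice, once at CPU feature expansion time and once from kvm_arch_init_vcpu(), and only queries KVM when it is able to. The name can_expand_now() and its boolean parameters are hypothetical; they model hyperv_enabled(cpu), a non-NULL cs->kvm_state, and KVM_CAP_SYS_HYPERV_CPUID support respectively.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not a QEMU function: returns true when the
 * expansion code can actually query KVM for Hyper-V CPUID data.
 */
bool can_expand_now(bool hyperv_enabled, bool vcpu_created,
                    bool sys_hv_cpuid_cap)
{
    if (!hyperv_enabled) {
        return false;   /* no hv-* option set, nothing to expand */
    }
    if (!vcpu_created && !sys_hv_cpuid_cap) {
        return false;   /* early call, and no system-wide ioctl */
    }
    return true;
}

/* The two call sites, on a kernel with and without the capability. */
void expand_examples(void)
{
    /* Early call, new kernel: expansion happens at feature expansion time. */
    assert(can_expand_now(true, false, true));
    /* Early call, old kernel: skipped, deferred to vCPU init. */
    assert(!can_expand_now(true, false, false));
    /* Second call from vCPU init, old kernel: legacy behavior preserved. */
    assert(can_expand_now(true, true, false));
}
```

In the patch both early-exit cases return 'true' (no error) from kvm_hyperv_expand_features(); the model above reports instead whether any work would be done.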

[PATCH v8 2/9] i386: clarify 'hv-passthrough' behavior

2021-06-08 Thread Vitaly Kuznetsov

Clarify the fact that 'hv-passthrough' only enables features which are
already known to QEMU and that it overrides all other 'hv-*' settings.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index e53c581f4586..a51953daa833 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -209,8 +209,11 @@ In some cases (e.g. during development) it may make sense to use QEMU in
 'pass-through' mode and give Windows guests all enlightenments currently
 supported by KVM. This pass-through mode is enabled by "hv-passthrough" CPU
 flag.
-Note: enabling this flag effectively prevents migration as supported features
-may differ between target and destination.
+Note: "hv-passthrough" flag only enables enlightenments which are known to QEMU
+(have corresponding "hv-*" flag) and copies "hv-spinlocks="/"hv-vendor-id="
+values from KVM to QEMU. "hv-passthrough" overrides all other "hv-*" settings on
+the command line. Also, enabling this flag effectively prevents migration as the
+list of enabled enlightenments may differ between target and destination hosts.
 
 
 4. Useful links
-- 
2.31.1

[PATCH v8 0/9] i386: KVM: expand Hyper-V features early

2021-06-08 Thread Vitaly Kuznetsov

Changes since v7:
- Make eVMCS version check future proof [Eduardo]
- Collect R-b tags [Eduardo]
- Drop 'if (!strcmp(arch, "i386") || !strcmp(arch, "x86_64"))' check from qtest
 [Eduardo]
- s/priviliges/privileges/ [Eric]

The last two functional patches are inspired by 'Fine-grained access check
to Hyper-V hypercalls and MSRs' work for KVM:
https://lore.kernel.org/kvm/20210521095204.2161214-1-vkuzn...@redhat.com/

Original description:

Upper layer tools like libvirt want to figure out which Hyper-V features are
supported by the underlying stack (QEMU/KVM) but currently they are unable to
do so. We have a nice 'hv_passthrough' CPU flag supported by QEMU but it has
no effect on e.g. QMP's 

query-cpu-model-expansion type=full model={"name":"host","props":{"hv-passthrough":true}}

command as we parse Hyper-V features after creating KVM vCPUs and not at
feature expansion time. To support the use-case we first need to make 
KVM_GET_SUPPORTED_HV_CPUID ioctl a system-wide ioctl as the existing
vCPU version can't be used that early. This is what KVM part does. With
that done, we can make early Hyper-V feature expansion (this series).

Vitaly Kuznetsov (9):
  i386: avoid hardcoding '12' as 'hyperv_vendor_id' length
  i386: clarify 'hv-passthrough' behavior
  i386: hardcode supported eVMCS version to '1'
  i386: make hyperv_expand_features() return bool
  i386: expand Hyper-V features during CPU feature expansion time
  i386: kill off hv_cpuid_check_and_set()
  i386: HV_HYPERCALL_AVAILABLE privilege bit is always needed
  i386: Hyper-V SynIC requires POST_MESSAGES/SIGNAL_EVENTS privileges
  qtest/hyperv: Introduce a simple hyper-v test

 MAINTAINERS|   1 +
 docs/hyperv.txt|   9 +-
 target/i386/cpu.c  |  13 +-
 target/i386/kvm/hyperv-proto.h |   6 +
 target/i386/kvm/kvm-stub.c |   5 +
 target/i386/kvm/kvm.c  | 189 +++-
 target/i386/kvm/kvm_i386.h |   1 +
 tests/qtest/hyperv-test.c  | 221 +
 tests/qtest/meson.build|   3 +-
 9 files changed, 357 insertions(+), 91 deletions(-)
 create mode 100644 tests/qtest/hyperv-test.c

-- 
2.31.1

[PATCH v8 4/9] i386: make hyperv_expand_features() return bool

2021-06-08 Thread Vitaly Kuznetsov

Return 'false' when hyperv_expand_features() sets an error.

No functional change intended.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/kvm/kvm.c | 40 +---
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 13d63f576b88..1e6f3c483e28 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1217,12 +1217,12 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, uint32_t func, int reg)
  * of 'hv_passthrough' mode and fills the environment with all supported
  * Hyper-V features.
  */
-static void hyperv_expand_features(CPUState *cs, Error **errp)
+static bool hyperv_expand_features(CPUState *cs, Error **errp)
 {
 X86CPU *cpu = X86_CPU(cs);
 
 if (!hyperv_enabled(cpu))
-return;
+return true;
 
 if (cpu->hyperv_passthrough) {
 cpu->hyperv_vendor_id[0] =
@@ -1270,49 +1270,49 @@ static void hyperv_expand_features(CPUState *cs, Error **errp)
 
 /* Features */
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RELAXED, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VAPIC, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TIME, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_CRASH, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RESET, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_VPINDEX, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_RUNTIME, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_SYNIC, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_FREQUENCIES, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_REENLIGHTENMENT, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_TLBFLUSH, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_EVMCS, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_IPI, errp)) {
-return;
+return false;
 }
 if (hv_cpuid_check_and_set(cs, HYPERV_FEAT_STIMER_DIRECT, errp)) {
-return;
+return false;
 }
 
 /* Additional dependencies not covered by kvm_hyperv_properties[] */
@@ -1322,7 +1322,10 @@ static void hyperv_expand_features(CPUState *cs, Error **errp)
 error_setg(errp, "Hyper-V %s requires Hyper-V %s",
kvm_hyperv_properties[HYPERV_FEAT_SYNIC].desc,
kvm_hyperv_properties[HYPERV_FEAT_VPINDEX].desc);
+return false;
 }
+
+return true;
 }
 
 /*
@@ -1588,8 +1591,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
 
 /* Paravirtualization CPUIDs */
-hyperv_expand_features(cs, &local_err);
-if (local_err) {
+if (!hyperv_expand_features(cs, &local_err)) {
 error_report_err(local_err);
 return -ENOSYS;
 }
-- 
2.31.1

[PATCH v8 1/9] i386: avoid hardcoding '12' as 'hyperv_vendor_id' length

2021-06-08 Thread Vitaly Kuznetsov

While this is very unlikely to change, let's avoid hardcoding '12' as
'hyperv_vendor_id' length.

No functional change intended.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Vitaly Kuznetsov 
---
 target/i386/cpu.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a9fe1662d392..f8ae45be0d53 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6057,11 +6057,12 @@ static void x86_cpu_hyperv_realize(X86CPU *cpu)
 &error_abort);
 }
 len = strlen(cpu->hyperv_vendor);
-if (len > 12) {
-warn_report("hv-vendor-id truncated to 12 characters");
-len = 12;
+if (len > sizeof(cpu->hyperv_vendor_id)) {
+warn_report("hv-vendor-id truncated to %zu characters",
+sizeof(cpu->hyperv_vendor_id));
+len = sizeof(cpu->hyperv_vendor_id);
 }
-memset(cpu->hyperv_vendor_id, 0, 12);
+memset(cpu->hyperv_vendor_id, 0, sizeof(cpu->hyperv_vendor_id));
 memcpy(cpu->hyperv_vendor_id, cpu->hyperv_vendor, len);
 
 /* 'Hv#1' interface identification*/
-- 
2.31.1

Re: [PATCH v7 3/9] i386: hardcode supported eVMCS version to '1'

2021-06-07 Thread Vitaly Kuznetsov

Eduardo Habkost  writes:

> On Fri, Jun 04, 2021 at 09:28:15AM +0200, Vitaly Kuznetsov wrote:
>> Eduardo Habkost  writes:
>> 
>> > On Thu, Jun 03, 2021 at 01:48:29PM +0200, Vitaly Kuznetsov wrote:
>> >> Currently, the only eVMCS version, supported by KVM (and described in TLFS)
>> >> is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported
>> >> eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and
>> >> puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a
>> >> problem on migration is expected: it doesn't seem to be possible to migrate
>> >> from a host supporting eVMCS ver.2 to a host, which only support eVMCS
>> >> ver.1.
>> >
>> > Isn't it possible and safe to expose eVMCS ver.1 to the guest on
>> > a host that supports ver.2?
>> 
>> We expose the supported range, guest is free to use any eVMCS version in
>> the range (see below):
>
> Oh, I didn't notice the returned value was a range.
>
>> 
>> >
>> >> 
>> >> Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer
>> >> eVMCS versions will have to have their own enablement options (e.g.
>> >> 'hv-evmcs=2').
>> >> 
>> >> Signed-off-by: Vitaly Kuznetsov 
>> >> ---
>> >>  docs/hyperv.txt   |  2 +-
>> >>  target/i386/kvm/kvm.c | 16 +++-
>> >>  2 files changed, 12 insertions(+), 6 deletions(-)
>> >> 
>> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> >> index a51953daa833..000638a2fd38 100644
>> >> --- a/docs/hyperv.txt
>> >> +++ b/docs/hyperv.txt
>> >> @@ -170,7 +170,7 @@ Recommended: hv-frequencies
>> >>  3.16. hv-evmcs
>> >>  ===
>> >>  The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
>> >> -enabled, it provides Enlightened VMCS feature to the guest. The feature
>> >> +enabled, it provides Enlightened VMCS version 1 feature to the guest. The feature
>> >>  implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
>> >>  hypervisors making L2 exits to the hypervisor faster. The feature is Intel-only.
>> >>  Note: some virtualization features (e.g. Posted Interrupts) are disabled when
>> >> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> >> index c676ee8b38a7..d57eede5dc81 100644
>> >> --- a/target/i386/kvm/kvm.c
>> >> +++ b/target/i386/kvm/kvm.c
>> >> @@ -1490,13 +1490,19 @@ static int hyperv_init_vcpu(X86CPU *cpu)
>> >>  ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
>> >>(uintptr_t)&evmcs_version);
>> >>  
>> >> -if (ret < 0) {
>> >> -fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
>> >> -kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
>> >> +/*
>> >> + * KVM is required to support EVMCS ver.1. as that's what 'hv-evmcs'
>> >> + * option sets. Note: we hardcode the maximum supported eVMCS version
>> >> + * to '1' as well so 'hv-evmcs' feature is migratable even when (and if)
>> >> + * ver.2 is implemented. A new option (e.g. 'hv-evmcs=2') will then have
>> >> + * to be added.
>> >> + */
>> >> +if (ret < 0 || (uint8_t)evmcs_version > 1) {
>> >
>> > Wait, do you really want to get a fatal error every time, after a
>> > kernel upgrade?
>> >
>> 
>> Here, evmcs_version (returned by kvm_vcpu_enable_cap()) represents a
>> *range* of supported eVMCS versions:
>> 
>> (evmcs_highest_supported_version << 8) | evmcs_lowest_supported_version
>> 
>> Currently, this is 0x101 [1..1] range.
>> 
>> The '(uint8_t)evmcs_version > 1' check here means 'eVMCS v1' is no
>> longer supported by KVM. This is not going to happen any time soon, but
>> I can imagine in 10 years or so we'll be dropping v1 so the range (in
>> theory) can be [2..10] -- which would mean eVMCS ver. 1 is NOT
>> supported. And we can't proceed then.
>
> Where is this documented?  The only reference to
> KVM_CAP_HYPERV_ENLIGHTENED_VMCS I've found in linux/Documentatio
