Re: [PATCH] net: appletalk: remove cops support

2023-09-27 Thread Vitaly Kuznetsov
Greg Kroah-Hartman  writes:

> The COPS Appletalk support is very old, was never said to actually work
> properly, and the firmware code for the devices is under a very suspect
> license.  Remove it all to clear up the license issue; if it is still
> needed and actually used by anyone, we can add it back later once the
> license is cleared up.
>
> Reported-by: Prarit Bhargava 
> Cc: Christoph Hellwig 
> Cc: Vitaly Kuznetsov 

FWIW,

Reviewed-by: Vitaly Kuznetsov 

> Cc: jsch...@samba.org
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  .../device_drivers/appletalk/cops.rst |   80 --
>  .../device_drivers/appletalk/index.rst|   18 -
>  .../networking/device_drivers/index.rst   |1 -
>  drivers/net/Space.c   |6 -
>  drivers/net/appletalk/Kconfig |   30 -
>  drivers/net/appletalk/Makefile|1 -
>  drivers/net/appletalk/cops.c  | 1005 -
>  drivers/net/appletalk/cops.h  |   61 -
>  drivers/net/appletalk/cops_ffdrv.h|  532 -
>  drivers/net/appletalk/cops_ltdrv.h|  241 
>  include/net/Space.h   |1 -
>  11 files changed, 1976 deletions(-)
>  delete mode 100644 Documentation/networking/device_drivers/appletalk/cops.rst
>  delete mode 100644 
> Documentation/networking/device_drivers/appletalk/index.rst
>  delete mode 100644 drivers/net/appletalk/cops.c
>  delete mode 100644 drivers/net/appletalk/cops.h
>  delete mode 100644 drivers/net/appletalk/cops_ffdrv.h
>  delete mode 100644 drivers/net/appletalk/cops_ltdrv.h
>
> diff --git a/Documentation/networking/device_drivers/appletalk/cops.rst 
> b/Documentation/networking/device_drivers/appletalk/cops.rst
> deleted file mode 100644
> index 964ba80599a9..
> --- a/Documentation/networking/device_drivers/appletalk/cops.rst
> +++ /dev/null
> @@ -1,80 +0,0 @@
> -.. SPDX-License-Identifier: GPL-2.0
> -
> -
> -The COPS LocalTalk Linux driver (cops.c)
> -
> -
> -By Jay Schulist 
> -
> -This driver has two modes and they are: Dayna mode and Tangent mode.
> -Each mode corresponds with the type of card. It has been found
> -that there are 2 main types of cards and all other cards are
> -the same and just have different names or only have minor differences
> -such as more IO ports. As this driver is tested it will
> -become more clear exactly what cards are supported.
> -
> -Right now these cards are known to work with the COPS driver. The
> -LT-200 cards work in a somewhat more limited capacity than the
> -DL200 cards, which work very well and are in use by many people.
> -
> -TANGENT driver mode:
> - - Tangent ATB-II, Novell NL-1000, Daystar Digital LT-200
> -
> -DAYNA driver mode:
> - - Dayna DL2000/DaynaTalk PC (Half Length), COPS LT-95,
> - - Farallon PhoneNET PC III, Farallon PhoneNET PC II
> -
> -Other cards possibly supported mode unknown though:
> - - Dayna DL2000 (Full length)
> -
> -The COPS driver defaults to using Dayna mode. To change the driver's
> -mode if you built a driver with dual support use board_type=1 or
> -board_type=2 for Dayna or Tangent with insmod.
> -
> -Operation/loading of the driver
> -===
> -
> -Use modprobe like this:  /sbin/modprobe cops.o (IO #) (IRQ #)
> -If you do not specify any options the driver will try and use the IO = 0x240,
> -IRQ = 5. As of right now I would only use IRQ 5 for the card, if autoprobing.
> -
> -To load multiple COPS driver Localtalk cards you can do one of the 
> following::
> -
> - insmod cops io=0x240 irq=5
> - insmod -o cops2 cops io=0x260 irq=3
> -
> -Or in lilo.conf put something like this::
> -
> - append="ether=5,0x240,lt0 ether=3,0x260,lt1"
> -
> -Then bring up the interface with ifconfig. It will look something like this::
> -
> -  lt0   Link encap:UNSPEC  HWaddr 
> 00-00-00-00-00-00-00-F7-00-00-00-00-00-00-00-00
> - inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
> - UP BROADCAST RUNNING NOARP MULTICAST  MTU:600  Metric:1
> - RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> - TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 coll:0
> -
> -Netatalk Configuration
> -==
> -
> -You will need to configure atalkd with something like the following to make
> -it work with the cops.c driver.
> -
> -* For single LTalk card use::
> -
> -dummy -seed -phase 2 -net 2000 -addr 2000.10 -zone "1033"
> -lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"

Re: [PATCH] x86/hyperv: Restrict get_vtl to only VTL platforms

2023-09-14 Thread Vitaly Kuznetsov
Saurabh Sengar  writes:

> For non-VTL platforms the VTL is always 0, and there is no need for the
> get_vtl() function. For VTL platforms get_vtl() should always succeed
> and should return the correct VTL.
>
> Signed-off-by: Saurabh Sengar 
> ---
>  arch/x86/hyperv/hv_init.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 783ed339f341..e589c240565a 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -416,8 +416,8 @@ static u8 __init get_vtl(void)
>   if (hv_result_success(ret)) {
>   ret = output->as64.low & HV_X64_VTL_MASK;
>   } else {
> - pr_err("Failed to get VTL(%lld) and set VTL to zero by 
> default.\n", ret);
> - ret = 0;
> + pr_err("Failed to get VTL(error: %lld) exiting...\n", ret);

Nitpick: arch/x86/hyperv/hv_init.c lacks pr_fmt, so the message won't get
prefixed with "Hyper-V". I'm not sure the 'VTL' abbreviation has only one,
Hyper-V-specific meaning. I'd suggest we add

#define pr_fmt(fmt)  "Hyper-V: " fmt

to the beginning of the file.
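
The usual pattern is to put it before the first #include (printk.h only
installs a default pr_fmt when none is already defined, so the override has
to come first), i.e. roughly:

/* top of arch/x86/hyperv/hv_init.c -- placement sketch only */
#define pr_fmt(fmt)	"Hyper-V: " fmt

#include <linux/kernel.h>
/* ... the existing #includes ... */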

> + BUG();
>   }
>  
>   local_irq_restore(flags);
> @@ -604,8 +604,10 @@ void __init hyperv_init(void)
>   hv_query_ext_cap(0);
>  
>   /* Find the VTL */
> - if (!ms_hyperv.paravisor_present && hv_isolation_type_snp())
> + if (IS_ENABLED(CONFIG_HYPERV_VTL_MODE))
>   ms_hyperv.vtl = get_vtl();
> + else
> + ms_hyperv.vtl = 0;

Is the 'else' branch really needed? 'ms_hyperv' seems to be a statically
allocated global, so 'vtl' is already zero-initialized. But instead of doing
this, what about putting the whole get_vtl() function under
'#if IS_ENABLED(CONFIG_HYPERV_VTL_MODE)', i.e.:

#if IS_ENABLED(CONFIG_HYPERV_VTL_MODE)
static u8 __init get_vtl(void)
{
u64 control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_REGISTERS;
...
}
#else
static inline u8 get_vtl(void) { return 0; }
#endif

and then we can always do

  ms_hyperv.vtl = get_vtl();

unconditionally?

>  
>   return;

-- 
Vitaly




Re: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

2021-04-20 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> On 4/16/2021 4:36 AM, Vitaly Kuznetsov wrote:
>>
>>>   struct kvm_vm_stat {
>>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>>> index 58fa8c029867..614b4448a028 100644
>>> --- a/arch/x86/kvm/hyperv.c
>>> +++ b/arch/x86/kvm/hyperv.c
>> I still think that using arch/x86/kvm/hyperv.[ch] for KVM-on-Hyper-V is
>> misleading. Currently, these are dedicated to emulating Hyper-V
>> interface to KVM guests and this is orthogonal to nesting KVM on
>> Hyper-V. As a solution, I'd suggest you either:
>> - Put the stuff in x86.c
>> - Create a dedicated set of files, e.g. 'kvmonhyperv.[ch]' (I also
>> thought about 'hyperv_host.[ch]' but then I realized it's equally
>> misleading as one can read this as 'KVM is acting as Hyper-V host').
>>
>> Personally, I'd vote for the latter. Besides eliminating confusion, the
>> benefit of having dedicated files is that we can avoid compiling them
>> completely when !IS_ENABLED(CONFIG_HYPERV) (#ifdefs in C are ugly).
> Makes sense, creating a new set of files looks good to me. The default
> hyperv.c for hyperv emulation also seems misleading - probably we should
> rename it to hyperv_host_emul.[ch] or similar. That way, I can probably use
> hyperv.[ch] for kvm on hyperv code. If you feel that's too big of a churn,
> I shall use kvm_on_hyperv.[ch] (to avoid reading the file differently).
> What do you think?

I agree that 'hyperv.[ch]' is not ideal but I'm on the fence whether
renaming it is worth it. If we were to rename it, I'd suggest just
'hyperv_emul.[ch]' to indicate that here we're emulating Hyper-V.

I don't think reusing 'hyperv.[ch]' for KVM-on-Hyper-V is a good idea,
it would be doubly misleading and not friendly to backporters. Let's not
do that.
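
(For the "don't even compile it when !CONFIG_HYPERV" part I mentioned above,
I'd expect a simple conditional object in arch/x86/kvm/Makefile to be enough
-- a rough sketch only, with the object name standing in for whatever file
name we settle on:

ifdef CONFIG_HYPERV
kvm-y += kvmonhyperv.o
endif

so none of the KVM-on-Hyper-V code gets built on !CONFIG_HYPERV configs.)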

>
>
>>> @@ -10470,7 +10474,6 @@ void kvm_arch_free_vm(struct kvm *kvm)
>>> vfree(kvm);
>>>   }
>>>   
>>> -
>> Stray change?
> It was kinda leftover, but I thought I'd keep it as it removes an
> unnecessary line.

The idea is to keep meaningful patches as concise as possible, splitting
off cleanup / preparatory patches which don't actually change anything;
this way big series are much easier to review.

>
> Thanks,
> Vineeth
>

-- 
Vitaly



Re: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload

2021-04-20 Thread Vitaly Kuznetsov
Michael Kelley  writes:

> When running in Azure, disks may be connected to a Linux VM with
> read/write caching enabled. If a VM panics and issues a VMbus
> UNLOAD request to Hyper-V, the response is delayed until all dirty
> data in the disk cache is flushed.  In extreme cases, this flushing
> can take 10's of seconds, depending on the disk speed and the amount
> of dirty data. If kdump is configured for the VM, the current 10 second
> timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD
> complete message may arrive well after the kdump kernel is already
> running, causing problems.  Note that no problem occurs if kdump is
> not enabled because Hyper-V waits for the cache flush before doing
> a reboot through the BIOS/UEFI code.
>
> Fix this problem by increasing the timeout in vmbus_wait_for_unload()
> to 100 seconds. Also output periodic messages so that if anyone is
> watching the serial console, they won't think the VM is completely
> hung.
>
> Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")
> Signed-off-by: Michael Kelley 
> ---
>
> Changed in v2: Fixed silly error in the argument to mdelay()
>
> ---
>  drivers/hv/channel_mgmt.c | 30 +-
>  1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index f3cf4af..ef4685c 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -755,6 +755,12 @@ static void init_vp_index(struct vmbus_channel *channel)
>   free_cpumask_var(available_mask);
>  }
>  
> +#define UNLOAD_DELAY_UNIT_MS 10  /* 10 milliseconds */
> +#define UNLOAD_WAIT_MS   (100*1000)  /* 100 seconds */
> +#define UNLOAD_WAIT_LOOPS(UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)
> +#define UNLOAD_MSG_MS(5*1000)/* Every 5 seconds */
> +#define UNLOAD_MSG_LOOPS (UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)
> +
>  static void vmbus_wait_for_unload(void)
>  {
>   int cpu;
> @@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)
>* vmbus_connection.unload_event. If not, the last thing we can do is
>* read message pages for all CPUs directly.
>*
> -  * Wait no more than 10 seconds so that the panic path can't get
> -  * hung forever in case the response message isn't seen.
> +  * Wait up to 100 seconds since an Azure host must writeback any dirty
> +  * data in its disk cache before the VMbus UNLOAD request will
> +  * complete. This flushing has been empirically observed to take up
> +  * to 50 seconds in cases with a lot of dirty data, so allow additional
> +  * leeway and for inaccuracies in mdelay(). But eventually time out so
> +  * that the panic path can't get hung forever in case the response
> +  * message isn't seen.

I vaguely remember debugging cases where CHANNELMSG_UNLOAD_RESPONSE never
arrives; proceeding to kexec was kind of pointless, as attempts to
reconnect VMbus devices were failing (no devices were offered after
CHANNELMSG_REQUESTOFFERS, AFAIR). Would it maybe make sense to just do an
emergency reboot instead of proceeding to kexec when this happens? Just
wondering.
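
(Purely as an illustration of what I mean -- a hypothetical sketch on top of
this patch, not something I'm asking you to add here:

	if (!completion_done(&vmbus_connection.unload_event)) {
		pr_crit("VMBus UNLOAD did not complete, rebooting instead of kexec\n");
		emergency_restart();
	}

i.e. give up on kexec entirely when the host never sends the response.)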

>*/
> - for (i = 0; i < 1000; i++) {
> + for (i = 1; i <= UNLOAD_WAIT_LOOPS; i++) {
>   if (completion_done(&vmbus_connection.unload_event))
> - break;
> + goto completed;
>  
>   for_each_online_cpu(cpu) {
>   struct hv_per_cpu_context *hv_cpu
> @@ -800,9 +811,18 @@ static void vmbus_wait_for_unload(void)
>   vmbus_signal_eom(msg, message_type);
>   }
>  
> - mdelay(10);
> + /*
> +  * Give a notice periodically so someone watching the
> +  * serial output won't think it is completely hung.
> +  */
> + if (!(i % UNLOAD_MSG_LOOPS))
> + pr_notice("Waiting for VMBus UNLOAD to complete\n");
> +
> + mdelay(UNLOAD_DELAY_UNIT_MS);
>   }
> + pr_err("Continuing even though VMBus UNLOAD did not complete\n");
>  
> +completed:
>   /*
>* We're crashing and already got the UNLOAD_RESPONSE, cleanup all
>* maybe-pending messages on all CPUs to be able to receive new

This is definitely an improvement,

Reviewed-by: Vitaly Kuznetsov 

-- 
Vitaly



[PATCH 16/30] KVM: x86: hyper-v: Honor HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4, HV_X64_MSR_CRASH_CTL are only
available to the guest when the HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE bit is
exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index f416c9de73cb..43ebb53b6b38 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1253,6 +1253,10 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_TSC_EMULATION_STATUS:
return hv_vcpu->cpuid_cache.features_eax &
HV_ACCESS_REENLIGHTENMENT;
+   case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+   case HV_X64_MSR_CRASH_CTL:
+   return hv_vcpu->cpuid_cache.features_edx &
+   HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 28/30] KVM: selftests: move Hyper-V MSR definitions to hyperv.h

2021-04-19 Thread Vitaly Kuznetsov
These defines can be shared by multiple tests; move them to a dedicated
header.

Signed-off-by: Vitaly Kuznetsov 
---
 .../selftests/kvm/include/x86_64/hyperv.h | 19 +++
 .../selftests/kvm/x86_64/hyperv_clock.c   |  8 +---
 2 files changed, 20 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/hyperv.h

diff --git a/tools/testing/selftests/kvm/include/x86_64/hyperv.h 
b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
new file mode 100644
index ..443c6572512b
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * tools/testing/selftests/kvm/include/x86_64/hyperv.h
+ *
+ * Copyright (C) 2021, Red Hat, Inc.
+ *
+ */
+
+#ifndef SELFTEST_KVM_HYPERV_H
+#define SELFTEST_KVM_HYPERV_H
+
+#define HV_X64_MSR_GUEST_OS_ID 0x40000000
+#define HV_X64_MSR_TIME_REF_COUNT  0x40000020
+#define HV_X64_MSR_REFERENCE_TSC   0x40000021
+#define HV_X64_MSR_TSC_FREQUENCY   0x40000022
+#define HV_X64_MSR_REENLIGHTENMENT_CONTROL 0x40000106
+#define HV_X64_MSR_TSC_EMULATION_CONTROL   0x40000107
+
+#endif /* !SELFTEST_KVM_HYPERV_H */
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c 
b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
index 7f1d2765572c..489625acc9cf 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
@@ -7,6 +7,7 @@
 #include "test_util.h"
 #include "kvm_util.h"
 #include "processor.h"
+#include "hyperv.h"
 
 struct ms_hyperv_tsc_page {
volatile u32 tsc_sequence;
@@ -15,13 +16,6 @@ struct ms_hyperv_tsc_page {
volatile s64 tsc_offset;
 } __packed;
 
-#define HV_X64_MSR_GUEST_OS_ID 0x40000000
-#define HV_X64_MSR_TIME_REF_COUNT  0x40000020
-#define HV_X64_MSR_REFERENCE_TSC   0x40000021
-#define HV_X64_MSR_TSC_FREQUENCY   0x40000022
-#define HV_X64_MSR_REENLIGHTENMENT_CONTROL 0x40000106
-#define HV_X64_MSR_TSC_EMULATION_CONTROL   0x40000107
-
 /* Simplified mul_u64_u64_shr() */
 static inline u64 mul_u64_u64_shr64(u64 a, u64 b)
 {
-- 
2.30.2



[PATCH 29/30] KVM: selftests: Move evmcs.h to x86_64/

2021-04-19 Thread Vitaly Kuznetsov
evmcs.h is an x86_64-only thing; move it to the x86_64/ subdirectory.

Signed-off-by: Vitaly Kuznetsov 
---
 tools/testing/selftests/kvm/include/{ => x86_64}/evmcs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename tools/testing/selftests/kvm/include/{ => x86_64}/evmcs.h (99%)

diff --git a/tools/testing/selftests/kvm/include/evmcs.h 
b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
similarity index 99%
rename from tools/testing/selftests/kvm/include/evmcs.h
rename to tools/testing/selftests/kvm/include/x86_64/evmcs.h
index a034438b6266..c9af97abd622 100644
--- a/tools/testing/selftests/kvm/include/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * tools/testing/selftests/kvm/include/vmx.h
+ * tools/testing/selftests/kvm/include/x86_64/evmcs.h
  *
  * Copyright (C) 2018, Red Hat, Inc.
  *
-- 
2.30.2



[PATCH 30/30] KVM: selftests: Introduce hyperv_features test

2021-04-19 Thread Vitaly Kuznetsov
The initial implementation of the test only checks that access to Hyper-V
MSRs and hypercalls is in compliance with guest-visible CPUID feature bits.

Signed-off-by: Vitaly Kuznetsov 
---
 tools/testing/selftests/kvm/.gitignore|   1 +
 tools/testing/selftests/kvm/Makefile  |   1 +
 .../selftests/kvm/include/x86_64/hyperv.h | 166 +
 .../selftests/kvm/x86_64/hyperv_features.c| 649 ++
 4 files changed, 817 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_features.c

diff --git a/tools/testing/selftests/kvm/.gitignore 
b/tools/testing/selftests/kvm/.gitignore
index 7bd7e776c266..45213384a443 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -12,6 +12,7 @@
 /x86_64/kvm_pv_test
 /x86_64/hyperv_clock
 /x86_64/hyperv_cpuid
+/x86_64/hyperv_features
 /x86_64/mmio_warning_test
 /x86_64/platform_info_test
 /x86_64/set_boot_cpu_id
diff --git a/tools/testing/selftests/kvm/Makefile 
b/tools/testing/selftests/kvm/Makefile
index 67eebb53235f..8d610c87beee 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -44,6 +44,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
 TEST_GEN_PROGS_x86_64 += x86_64/get_cpuid_test
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_clock
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
+TEST_GEN_PROGS_x86_64 += x86_64/hyperv_features
 TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
 TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
 TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/hyperv.h 
b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
index 443c6572512b..412eaee7884a 100644
--- a/tools/testing/selftests/kvm/include/x86_64/hyperv.h
+++ b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
@@ -9,11 +9,177 @@
 #ifndef SELFTEST_KVM_HYPERV_H
 #define SELFTEST_KVM_HYPERV_H
 
+#define HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS  0x40000000
+#define HYPERV_CPUID_INTERFACE 0x40000001
+#define HYPERV_CPUID_VERSION   0x40000002
+#define HYPERV_CPUID_FEATURES  0x40000003
+#define HYPERV_CPUID_ENLIGHTMENT_INFO  0x40000004
+#define HYPERV_CPUID_IMPLEMENT_LIMITS  0x40000005
+#define HYPERV_CPUID_CPU_MANAGEMENT_FEATURES   0x40000007
+#define HYPERV_CPUID_NESTED_FEATURES   0x4000000A
+#define HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS   0x40000080
+#define HYPERV_CPUID_SYNDBG_INTERFACE  0x40000081
+#define HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES  0x40000082
+
 #define HV_X64_MSR_GUEST_OS_ID 0x40000000
+#define HV_X64_MSR_HYPERCALL   0x40000001
+#define HV_X64_MSR_VP_INDEX    0x40000002
+#define HV_X64_MSR_RESET   0x40000003
+#define HV_X64_MSR_VP_RUNTIME  0x40000010
 #define HV_X64_MSR_TIME_REF_COUNT  0x40000020
 #define HV_X64_MSR_REFERENCE_TSC   0x40000021
 #define HV_X64_MSR_TSC_FREQUENCY   0x40000022
+#define HV_X64_MSR_APIC_FREQUENCY  0x40000023
+#define HV_X64_MSR_EOI 0x40000070
+#define HV_X64_MSR_ICR 0x40000071
+#define HV_X64_MSR_TPR 0x40000072
+#define HV_X64_MSR_VP_ASSIST_PAGE  0x40000073
+#define HV_X64_MSR_SCONTROL    0x40000080
+#define HV_X64_MSR_SVERSION    0x40000081
+#define HV_X64_MSR_SIEFP   0x40000082
+#define HV_X64_MSR_SIMP    0x40000083
+#define HV_X64_MSR_EOM 0x40000084
+#define HV_X64_MSR_SINT0   0x40000090
+#define HV_X64_MSR_SINT1   0x40000091
+#define HV_X64_MSR_SINT2   0x40000092
+#define HV_X64_MSR_SINT3   0x40000093
+#define HV_X64_MSR_SINT4   0x40000094
+#define HV_X64_MSR_SINT5   0x40000095
+#define HV_X64_MSR_SINT6   0x40000096
+#define HV_X64_MSR_SINT7   0x40000097
+#define HV_X64_MSR_SINT8   0x40000098
+#define HV_X64_MSR_SINT9   0x40000099
+#define HV_X64_MSR_SINT10  0x4000009A
+#define HV_X64_MSR_SINT11  0x4000009B
+#define HV_X64_MSR_SINT12  0x4000009C
+#define HV_X64_MSR_SINT13  0x4000009D
+#define HV_X64_MSR_SINT14  0x4000009E
+#define HV_X64_MSR_SINT15  0x4000009F
+#define HV_X64_MSR_STIMER0_CONFIG  0x400000B0
+#define HV_X64_MSR_STIMER0_COUNT   0x400000B1
+#define HV_X64_MSR_STIMER1_CONFIG  0x400000B2
+#define HV_X64_MSR_STIMER1_COUNT   0x400000B3
+#define HV_X64_MSR_STIMER2_CONFIG  0x400000B4
+#define HV_X64_MSR_STIMER2_COUNT   0x400000B5
+#define

[PATCH 27/30] KVM: x86: hyper-v: Honor HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED bit

2021-04-19 Thread Vitaly Kuznetsov
Hypercalls which use extended processor masks are only available when the
HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED privilege bit is exposed (and
'RECOMMENDED' is rather a misnomer).

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ba5af4d27ccf..4ad27e7cdb05 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2046,11 +2046,19 @@ static bool hv_check_hypercall_access(struct 
kvm_vcpu_hv *hv_vcpu, u16 code)
hv_vcpu->cpuid_cache.features_ebx & HV_DEBUGGING;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
+   if (!(hv_vcpu->cpuid_cache.enlightenments_eax &
+ HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))
+   return false;
+   fallthrough;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST:
case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
return hv_vcpu->cpuid_cache.enlightenments_eax &
HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
case HVCALL_SEND_IPI_EX:
+   if (!(hv_vcpu->cpuid_cache.enlightenments_eax &
+ HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))
+   return false;
+   fallthrough;
case HVCALL_SEND_IPI:
return hv_vcpu->cpuid_cache.enlightenments_eax &
HV_X64_CLUSTER_IPI_RECOMMENDED;
-- 
2.30.2



[PATCH 25/30] KVM: x86: hyper-v: Honor HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED bit

2021-04-19 Thread Vitaly Kuznetsov
A Hyper-V partition must possess the 'HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED'
privilege ('recommended' is rather a misnomer) to issue
HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST/SPACE hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 36ec688cda4e..f99072e092d0 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2044,6 +2044,12 @@ static bool hv_check_hypercall_access(struct kvm_vcpu_hv 
*hv_vcpu, u16 code)
 */
return !kvm_hv_is_syndbg_enabled(hv_vcpu->vcpu) ||
hv_vcpu->cpuid_cache.features_ebx & HV_DEBUGGING;
+   case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
+   case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
+   case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST:
+   case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
+   return hv_vcpu->cpuid_cache.enlightenments_eax &
+   HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
default:
break;
}
-- 
2.30.2



[PATCH 26/30] KVM: x86: hyper-v: Honor HV_X64_CLUSTER_IPI_RECOMMENDED bit

2021-04-19 Thread Vitaly Kuznetsov
A Hyper-V partition must possess the 'HV_X64_CLUSTER_IPI_RECOMMENDED'
privilege ('recommended' is rather a misnomer) to issue
HVCALL_SEND_IPI hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index f99072e092d0..ba5af4d27ccf 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2050,6 +2050,10 @@ static bool hv_check_hypercall_access(struct kvm_vcpu_hv 
*hv_vcpu, u16 code)
case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
return hv_vcpu->cpuid_cache.enlightenments_eax &
HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
+   case HVCALL_SEND_IPI_EX:
+   case HVCALL_SEND_IPI:
+   return hv_vcpu->cpuid_cache.enlightenments_eax &
+   HV_X64_CLUSTER_IPI_RECOMMENDED;
default:
break;
}
-- 
2.30.2



[PATCH 24/30] KVM: x86: hyper-v: Honor HV_DEBUGGING privilege bit

2021-04-19 Thread Vitaly Kuznetsov
A Hyper-V partition must possess the 'HV_DEBUGGING' privilege to issue
HVCALL_POST_DEBUG_DATA/HVCALL_RETRIEVE_DEBUG_DATA/
HVCALL_RESET_DEBUG_SESSION hypercalls.

Note, when SynDBG is disabled, hv_check_hypercall_access() returns
'true' (like for any other unknown hypercall) so the result will
be HV_STATUS_INVALID_HYPERCALL_CODE and not HV_STATUS_ACCESS_DENIED.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 523f63287636..36ec688cda4e 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2035,6 +2035,15 @@ static bool hv_check_hypercall_access(struct kvm_vcpu_hv 
*hv_vcpu, u16 code)
return hv_vcpu->cpuid_cache.features_ebx & HV_POST_MESSAGES;
case HVCALL_SIGNAL_EVENT:
return hv_vcpu->cpuid_cache.features_ebx & HV_SIGNAL_EVENTS;
+   case HVCALL_POST_DEBUG_DATA:
+   case HVCALL_RETRIEVE_DEBUG_DATA:
+   case HVCALL_RESET_DEBUG_SESSION:
+   /*
+* Return 'true' when SynDBG is disabled so the resulting code
+* will be HV_STATUS_INVALID_HYPERCALL_CODE.
+*/
+   return !kvm_hv_is_syndbg_enabled(hv_vcpu->vcpu) ||
+   hv_vcpu->cpuid_cache.features_ebx & HV_DEBUGGING;
default:
break;
}
-- 
2.30.2



[PATCH 21/30] KVM: x86: hyper-v: Check access to HVCALL_NOTIFY_LONG_SPIN_WAIT hypercall

2021-04-19 Thread Vitaly Kuznetsov
TLFS 6.0b states that a partition issuing HVCALL_NOTIFY_LONG_SPIN_WAIT must
possess the 'UseHypercallForLongSpinWait' privilege, but there's no
corresponding feature bit. Instead, we have "Recommended number of attempts
to retry a spinlock failure before notifying the hypervisor about the
failures. 0xFFFFFFFF indicates never notify." Use this to check access to
the hypercall. Also, check against zero, as the corresponding CPUID must
be set (and '0' attempts before re-try is weird anyway).

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 4f0ab0c50c44..bd424f2d4294 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2024,6 +2024,17 @@ static u16 kvm_hvcall_signal_event(struct kvm_vcpu 
*vcpu, bool fast, u64 param)
 
 static bool hv_check_hypercall_access(struct kvm_vcpu_hv *hv_vcpu, u16 code)
 {
+   if (!hv_vcpu->enforce_cpuid)
+   return true;
+
+   switch (code) {
+   case HVCALL_NOTIFY_LONG_SPIN_WAIT:
+   return hv_vcpu->cpuid_cache.enlightenments_ebx &&
+   hv_vcpu->cpuid_cache.enlightenments_ebx != U32_MAX;
+   default:
+   break;
+   }
+
return true;
 }
 
-- 
2.30.2



[PATCH 22/30] KVM: x86: hyper-v: Honor HV_POST_MESSAGES privilege bit

2021-04-19 Thread Vitaly Kuznetsov
A Hyper-V partition must possess the 'HV_POST_MESSAGES' privilege to issue
HVCALL_POST_MESSAGE hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index bd424f2d4294..ff86c00d1396 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2031,6 +2031,8 @@ static bool hv_check_hypercall_access(struct kvm_vcpu_hv 
*hv_vcpu, u16 code)
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
return hv_vcpu->cpuid_cache.enlightenments_ebx &&
hv_vcpu->cpuid_cache.enlightenments_ebx != U32_MAX;
+   case HVCALL_POST_MESSAGE:
+   return hv_vcpu->cpuid_cache.features_ebx & HV_POST_MESSAGES;
default:
break;
}
-- 
2.30.2



[PATCH 23/30] KVM: x86: hyper-v: Honor HV_SIGNAL_EVENTS privilege bit

2021-04-19 Thread Vitaly Kuznetsov
A Hyper-V partition must possess the 'HV_SIGNAL_EVENTS' privilege to issue
HVCALL_SIGNAL_EVENT hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ff86c00d1396..523f63287636 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2033,6 +2033,8 @@ static bool hv_check_hypercall_access(struct kvm_vcpu_hv 
*hv_vcpu, u16 code)
hv_vcpu->cpuid_cache.enlightenments_ebx != U32_MAX;
case HVCALL_POST_MESSAGE:
return hv_vcpu->cpuid_cache.features_ebx & HV_POST_MESSAGES;
+   case HVCALL_SIGNAL_EVENT:
+   return hv_vcpu->cpuid_cache.features_ebx & HV_SIGNAL_EVENTS;
default:
break;
}
-- 
2.30.2



[PATCH 20/30] KVM: x86: hyper-v: Prepare to check access to Hyper-V hypercalls

2021-04-19 Thread Vitaly Kuznetsov
Introduce hv_check_hypercall_access() to check whether a particular hypercall
should be available to the guest; this will be used with the
KVM_CAP_HYPERV_ENFORCE_CPUID mode.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 12b6803de1b7..4f0ab0c50c44 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2022,6 +2022,11 @@ static u16 kvm_hvcall_signal_event(struct kvm_vcpu 
*vcpu, bool fast, u64 param)
return HV_STATUS_SUCCESS;
 }
 
+static bool hv_check_hypercall_access(struct kvm_vcpu_hv *hv_vcpu, u16 code)
+{
+   return true;
+}
+
 int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 {
u64 param, ingpa, outgpa, ret = HV_STATUS_SUCCESS;
@@ -2061,6 +2066,11 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 
trace_kvm_hv_hypercall(code, fast, rep_cnt, rep_idx, ingpa, outgpa);
 
+   if (unlikely(!hv_check_hypercall_access(to_hv_vcpu(vcpu), code))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   goto hypercall_complete;
+   }
+
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
if (unlikely(rep)) {
@@ -2167,6 +2177,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
break;
}
 
+hypercall_complete:
return kvm_hv_hypercall_complete(vcpu, ret);
 }
 
-- 
2.30.2



[PATCH 19/30] KVM: x86: hyper-v: Honor HV_STIMER_DIRECT_MODE_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
Synthetic timers can only be configured in 'direct' mode when the
HV_STIMER_DIRECT_MODE_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ec065177531b..12b6803de1b7 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -630,11 +630,17 @@ static int stimer_set_config(struct kvm_vcpu_hv_stimer 
*stimer, u64 config,
union hv_stimer_config new_config = {.as_uint64 = config},
old_config = {.as_uint64 = stimer->config.as_uint64};
struct kvm_vcpu *vcpu = hv_stimer_to_vcpu(stimer);
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
struct kvm_vcpu_hv_synic *synic = to_hv_synic(vcpu);
 
if (!synic->active && !host)
return 1;
 
+   if (unlikely(!host && hv_vcpu->enforce_cpuid && new_config.direct_mode 
&&
+!(hv_vcpu->cpuid_cache.features_edx &
+  HV_STIMER_DIRECT_MODE_AVAILABLE)))
+   return 1;
+
trace_kvm_hv_stimer_set_config(hv_stimer_to_vcpu(stimer)->vcpu_id,
   stimer->index, config, host);
 
-- 
2.30.2



[PATCH 17/30] KVM: x86: hyper-v: Honor HV_FEATURE_DEBUG_MSRS_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
Synthetic debugging MSRs (HV_X64_MSR_SYNDBG_CONTROL,
HV_X64_MSR_SYNDBG_STATUS, HV_X64_MSR_SYNDBG_SEND_BUFFER,
HV_X64_MSR_SYNDBG_RECV_BUFFER, HV_X64_MSR_SYNDBG_PENDING_BUFFER,
HV_X64_MSR_SYNDBG_OPTIONS) are only available to the guest when the
HV_FEATURE_DEBUG_MSRS_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 43ebb53b6b38..f54385ffcdc0 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1257,6 +1257,10 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_CRASH_CTL:
return hv_vcpu->cpuid_cache.features_edx &
HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
+   case HV_X64_MSR_SYNDBG_OPTIONS:
+   case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
+   return hv_vcpu->cpuid_cache.features_edx &
+   HV_FEATURE_DEBUG_MSRS_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 18/30] KVM: x86: hyper-v: Inverse the default in hv_check_msr_access()

2021-04-19 Thread Vitaly Kuznetsov
Access to all MSRs is now properly checked. To avoid 'forgetting' to
properly check access to new MSRs in the future, change the default
to 'false', meaning 'no access'.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index f54385ffcdc0..ec065177531b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1265,7 +1265,7 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
break;
}
 
-   return true;
+   return false;
 }
 
 static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
-- 
2.30.2



[PATCH 15/30] KVM: x86: hyper-v: Honor HV_ACCESS_REENLIGHTENMENT privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_REENLIGHTENMENT_CONTROL/HV_X64_MSR_TSC_EMULATION_CONTROL/
HV_X64_MSR_TSC_EMULATION_STATUS are only available to the guest when the
HV_ACCESS_REENLIGHTENMENT bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 8821981e17d3..f416c9de73cb 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1248,6 +1248,11 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_APIC_FREQUENCY:
return hv_vcpu->cpuid_cache.features_eax &
HV_ACCESS_FREQUENCY_MSRS;
+   case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
+   case HV_X64_MSR_TSC_EMULATION_CONTROL:
+   case HV_X64_MSR_TSC_EMULATION_STATUS:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT;
default:
break;
}
-- 
2.30.2



[PATCH 14/30] KVM: x86: hyper-v: Honor HV_ACCESS_FREQUENCY_MSRS privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_TSC_FREQUENCY/HV_X64_MSR_APIC_FREQUENCY are only available to the
guest when the HV_ACCESS_FREQUENCY_MSRS bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a41ad21768ed..8821981e17d3 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1244,6 +1244,10 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_APIC_ACCESS_AVAILABLE;
break;
+   case HV_X64_MSR_TSC_FREQUENCY:
+   case HV_X64_MSR_APIC_FREQUENCY:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_FREQUENCY_MSRS;
default:
break;
}
-- 
2.30.2



[PATCH 12/30] KVM: x86: hyper-v: Honor HV_MSR_SYNTIMER_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
Synthetic timer MSRs (HV_X64_MSR_STIMER[0-3]_CONFIG,
HV_X64_MSR_STIMER[0-3]_COUNT) are only available to the guest when the
HV_MSR_SYNTIMER_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 17bdf8e8196e..2582c23126fa 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1227,6 +1227,16 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_SYNIC_AVAILABLE;
+   case HV_X64_MSR_STIMER0_CONFIG:
+   case HV_X64_MSR_STIMER1_CONFIG:
+   case HV_X64_MSR_STIMER2_CONFIG:
+   case HV_X64_MSR_STIMER3_CONFIG:
+   case HV_X64_MSR_STIMER0_COUNT:
+   case HV_X64_MSR_STIMER1_COUNT:
+   case HV_X64_MSR_STIMER2_COUNT:
+   case HV_X64_MSR_STIMER3_COUNT:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_SYNTIMER_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 13/30] KVM: x86: hyper-v: Honor HV_MSR_APIC_ACCESS_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_EOI, HV_X64_MSR_ICR, HV_X64_MSR_TPR, and
HV_X64_MSR_VP_ASSIST_PAGE are only available to the guest when the
HV_MSR_APIC_ACCESS_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 2582c23126fa..a41ad21768ed 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1237,6 +1237,13 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_STIMER3_COUNT:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_SYNTIMER_AVAILABLE;
+   case HV_X64_MSR_EOI:
+   case HV_X64_MSR_ICR:
+   case HV_X64_MSR_TPR:
+   case HV_X64_MSR_VP_ASSIST_PAGE:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE;
+   break;
default:
break;
}
-- 
2.30.2



[PATCH 11/30] KVM: x86: hyper-v: Honor HV_MSR_SYNIC_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
SynIC MSRs (HV_X64_MSR_SCONTROL, HV_X64_MSR_SVERSION, HV_X64_MSR_SIEFP,
HV_X64_MSR_SIMP, HV_X64_MSR_EOM, HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15)
are only available to the guest when the HV_MSR_SYNIC_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index be6156a27bd7..17bdf8e8196e 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1219,6 +1219,14 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_REFERENCE_TSC:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_REFERENCE_TSC_AVAILABLE;
+   case HV_X64_MSR_SCONTROL:
+   case HV_X64_MSR_SVERSION:
+   case HV_X64_MSR_SIEFP:
+   case HV_X64_MSR_SIMP:
+   case HV_X64_MSR_EOM:
+   case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_SYNIC_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 10/30] KVM: x86: hyper-v: Honor HV_MSR_REFERENCE_TSC_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_REFERENCE_TSC is only available to the guest when the
HV_MSR_REFERENCE_TSC_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index affc6e0cda09..be6156a27bd7 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1216,6 +1216,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_RESET:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_RESET_AVAILABLE;
+   case HV_X64_MSR_REFERENCE_TSC:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_REFERENCE_TSC_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 07/30] KVM: x86: hyper-v: Honor HV_MSR_TIME_REF_COUNT_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_TIME_REF_COUNT is only available to the guest when the
HV_MSR_TIME_REF_COUNT_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 152d991ed033..0b2261a50ee8 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1207,6 +1207,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_VP_RUNTIME:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_VP_RUNTIME_AVAILABLE;
+   case HV_X64_MSR_TIME_REF_COUNT:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_TIME_REF_COUNT_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 09/30] KVM: x86: hyper-v: Honor HV_MSR_RESET_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_RESET is only available to the guest when the
HV_MSR_RESET_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0f3f30f6ca69..affc6e0cda09 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1213,6 +1213,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_VP_INDEX:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_VP_INDEX_AVAILABLE;
+   case HV_X64_MSR_RESET:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_RESET_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 08/30] KVM: x86: hyper-v: Honor HV_MSR_VP_INDEX_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_VP_INDEX is only available to the guest when the
HV_MSR_VP_INDEX_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0b2261a50ee8..0f3f30f6ca69 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1210,6 +1210,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_TIME_REF_COUNT:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_TIME_REF_COUNT_AVAILABLE;
+   case HV_X64_MSR_VP_INDEX:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_VP_INDEX_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 06/30] KVM: x86: hyper-v: Honor HV_MSR_VP_RUNTIME_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_VP_RUNTIME is only available to the guest when the
HV_MSR_VP_RUNTIME_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 13011803ebbd..152d991ed033 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1204,6 +1204,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv 
*hv_vcpu, u32 msr)
case HV_X64_MSR_HYPERCALL:
return hv_vcpu->cpuid_cache.features_eax &
HV_MSR_HYPERCALL_AVAILABLE;
+   case HV_X64_MSR_VP_RUNTIME:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_VP_RUNTIME_AVAILABLE;
default:
break;
}
-- 
2.30.2



[PATCH 04/30] KVM: x86: hyper-v: Prepare to check access to Hyper-V MSRs

2021-04-19 Thread Vitaly Kuznetsov
Introduce hv_check_msr_access() to check whether a particular MSR
should be accessible by the guest; this will be used with the
KVM_CAP_HYPERV_ENFORCE_CPUID mode.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ccb298cfc933..b5bc16ea2595 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1193,12 +1193,21 @@ void kvm_hv_invalidate_tsc_page(struct kvm *kvm)
mutex_unlock(&hv->hv_lock);
 }
 
+
+static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
+{
+   return true;
+}
+
 static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
 bool host)
 {
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = to_kvm_hv(kvm);
 
+   if (unlikely(!host && !hv_check_msr_access(to_hv_vcpu(vcpu), msr)))
+   return 1;
+
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
hv->hv_guest_os_id = data;
@@ -1327,6 +1336,9 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, 
u64 data, bool host)
 {
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
+   if (unlikely(!host && !hv_check_msr_access(hv_vcpu, msr)))
+   return 1;
+
switch (msr) {
case HV_X64_MSR_VP_INDEX: {
struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
@@ -1441,6 +1453,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 
msr, u64 *pdata,
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = to_kvm_hv(kvm);
 
+   if (unlikely(!host && !hv_check_msr_access(to_hv_vcpu(vcpu), msr)))
+   return 1;
+
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
data = hv->hv_guest_os_id;
@@ -1490,6 +1505,9 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, 
u64 *pdata,
u64 data = 0;
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
+   if (unlikely(!host && !hv_check_msr_access(hv_vcpu, msr)))
+   return 1;
+
switch (msr) {
case HV_X64_MSR_VP_INDEX:
data = hv_vcpu->vp_index;
-- 
2.30.2



[PATCH 02/30] KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_ENFORCE_CPUID

2021-04-19 Thread Vitaly Kuznetsov
Modeled after KVM_CAP_ENFORCE_PV_FEATURE_CPUID, the new capability allows
for limiting Hyper-V features to those exposed to the guest in Hyper-V
CPUIDs (0x40000003, 0x40000004, ...).

Signed-off-by: Vitaly Kuznetsov 
---
 Documentation/virt/kvm/api.rst  | 11 +++
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/hyperv.c   | 21 +
 arch/x86/kvm/hyperv.h   |  1 +
 arch/x86/kvm/x86.c  |  4 
 include/uapi/linux/kvm.h|  1 +
 6 files changed, 39 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 307f2fcf1b02..cdcaacf3d783 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6727,3 +6727,14 @@ vcpu_info is set.
 The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related
 features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are
 supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls.
+
+8.31 KVM_CAP_HYPERV_ENFORCE_CPUID
+-
+
+Architectures: x86
+
+When enabled, KVM will disable emulated Hyper-V features provided to the
+guest according to the bits in Hyper-V CPUID feature leaves. Otherwise, all
+currently implemented Hyper-V features are provided unconditionally when
+Hyper-V identification is set in the HYPERV_CPUID_INTERFACE (0x40000001)
+leaf.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3768819693e5..dc40897c41bc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -530,6 +530,7 @@ struct kvm_vcpu_hv {
struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
cpumask_t tlb_flush;
+   bool enforce_cpuid;
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index f98370a39936..557897c453a9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1809,6 +1809,27 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
vcpu->arch.hyperv_enabled = false;
 }
 
+int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
+{
+   struct kvm_vcpu_hv *hv_vcpu;
+   int ret = 0;
+
+   if (!to_hv_vcpu(vcpu)) {
+   if (enforce) {
+   ret = kvm_hv_vcpu_init(vcpu);
+   if (ret)
+   return ret;
+   } else {
+   return 0;
+   }
+   }
+
+   hv_vcpu = to_hv_vcpu(vcpu);
+   hv_vcpu->enforce_cpuid = enforce;
+
+   return ret;
+}
+
 bool kvm_hv_hypercall_enabled(struct kvm_vcpu *vcpu)
 {
return vcpu->arch.hyperv_enabled && 
to_kvm_hv(vcpu->kvm)->hv_guest_os_id;
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index 60547d5cb6d7..730da8537d05 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -138,6 +138,7 @@ void kvm_hv_invalidate_tsc_page(struct kvm *kvm);
 void kvm_hv_init_vm(struct kvm *kvm);
 void kvm_hv_destroy_vm(struct kvm *kvm);
 void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu);
+int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce);
 int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
 int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 struct kvm_cpuid_entry2 __user *entries);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eca63625aee4..a06a6f48386d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3745,6 +3745,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long 
ext)
case KVM_CAP_HYPERV_TLBFLUSH:
case KVM_CAP_HYPERV_SEND_IPI:
case KVM_CAP_HYPERV_CPUID:
+   case KVM_CAP_HYPERV_ENFORCE_CPUID:
case KVM_CAP_SYS_HYPERV_CPUID:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
@@ -4669,6 +4670,9 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu 
*vcpu,
 
return static_call(kvm_x86_enable_direct_tlbflush)(vcpu);
 
+   case KVM_CAP_HYPERV_ENFORCE_CPUID:
+   return kvm_hv_set_enforce_cpuid(vcpu, cap->args[0]);
+
case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
vcpu->arch.pv_cpuid.enforce = cap->args[0];
if (vcpu->arch.pv_cpuid.enforce)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6afee209620..723bd729787f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1078,6 +1078,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING 192
 #define KVM_CAP_X86_BUS_LOCK_EXIT 193
 #define KVM_CAP_PPC_DAWR1 194
+#define KVM_CAP_HYPERV_ENFORCE_CPUID 195
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.30.2



[PATCH 03/30] KVM: x86: hyper-v: Cache guest CPUID leaves determining features availability

2021-04-19 Thread Vitaly Kuznetsov
Limiting exposed Hyper-V features requires a fast way to check whether a
particular feature is exposed in guest-visible CPUIDs or not. To avoid
looping through all CPUID entries on every hypercall/MSR access, cache
the required leaves on CPUID update.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/include/asm/kvm_host.h |  8 ++
 arch/x86/kvm/hyperv.c   | 49 ++---
 2 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dc40897c41bc..6525d2716b09 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -531,6 +531,14 @@ struct kvm_vcpu_hv {
DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
cpumask_t tlb_flush;
bool enforce_cpuid;
+   struct {
+   u32 features_eax; /* HYPERV_CPUID_FEATURES.EAX */
+   u32 features_ebx; /* HYPERV_CPUID_FEATURES.EBX */
+   u32 features_edx; /* HYPERV_CPUID_FEATURES.EDX */
+   u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
+   u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
+   u32 syndbg_cap_eax; /* 
HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+   } cpuid_cache;
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 557897c453a9..ccb298cfc933 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -273,15 +273,10 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 
 static bool kvm_hv_is_syndbg_enabled(struct kvm_vcpu *vcpu)
 {
-   struct kvm_cpuid_entry2 *entry;
-
-   entry = kvm_find_cpuid_entry(vcpu,
-HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES,
-0);
-   if (!entry)
-   return false;
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
-   return entry->eax & HV_X64_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING;
+   return hv_vcpu->cpuid_cache.syndbg_cap_eax &
+   HV_X64_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING;
 }
 
 static int kvm_hv_syndbg_complete_userspace(struct kvm_vcpu *vcpu)
@@ -1801,12 +1796,46 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, u64 
ingpa, u64 outgpa,
 void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *entry;
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_INTERFACE, 0);
-   if (entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX)
+   if (entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX) {
vcpu->arch.hyperv_enabled = true;
-   else
+   } else {
vcpu->arch.hyperv_enabled = false;
+   return;
+   }
+
+   if (!to_hv_vcpu(vcpu) && kvm_hv_vcpu_init(vcpu))
+   return;
+
+   hv_vcpu = to_hv_vcpu(vcpu);
+
+   entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_FEATURES, 0);
+   if (entry) {
+   hv_vcpu->cpuid_cache.features_eax = entry->eax;
+   hv_vcpu->cpuid_cache.features_ebx = entry->ebx;
+   hv_vcpu->cpuid_cache.features_edx = entry->edx;
+   } else {
+   hv_vcpu->cpuid_cache.features_eax = 0;
+   hv_vcpu->cpuid_cache.features_ebx = 0;
+   hv_vcpu->cpuid_cache.features_edx = 0;
+   }
+
+   entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_ENLIGHTMENT_INFO, 0);
+   if (entry) {
+   hv_vcpu->cpuid_cache.enlightenments_eax = entry->eax;
+   hv_vcpu->cpuid_cache.enlightenments_ebx = entry->ebx;
+   } else {
+   hv_vcpu->cpuid_cache.enlightenments_eax = 0;
+   hv_vcpu->cpuid_cache.enlightenments_ebx = 0;
+   }
+
+   entry = kvm_find_cpuid_entry(vcpu, 
HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES, 0);
+   if (entry)
+   hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
+   else
+   hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
 }
 
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
-- 
2.30.2



[PATCH 05/30] KVM: x86: hyper-v: Honor HV_MSR_HYPERCALL_AVAILABLE privilege bit

2021-04-19 Thread Vitaly Kuznetsov
HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL are only available to the guest
when the HV_MSR_HYPERCALL_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b5bc16ea2595..13011803ebbd 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1196,6 +1196,18 @@ void kvm_hv_invalidate_tsc_page(struct kvm *kvm)
 
 static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
 {
+   if (!hv_vcpu->enforce_cpuid)
+   return true;
+
+   switch (msr) {
+   case HV_X64_MSR_GUEST_OS_ID:
+   case HV_X64_MSR_HYPERCALL:
+   return hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_HYPERCALL_AVAILABLE;
+   default:
+   break;
+   }
+
return true;
 }
 
-- 
2.30.2



[PATCH 01/30] asm-generic/hyperv: add HV_STATUS_ACCESS_DENIED definition

2021-04-19 Thread Vitaly Kuznetsov
From TLFSv6.0b, this status means: "The caller did not possess sufficient
access rights to perform the requested operation."

Signed-off-by: Vitaly Kuznetsov 
Acked-by: Wei Liu 
---
 include/asm-generic/hyperv-tlfs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/asm-generic/hyperv-tlfs.h 
b/include/asm-generic/hyperv-tlfs.h
index 83448e837ded..e01a3bade13a 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -187,6 +187,7 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_STATUS_INVALID_HYPERCALL_INPUT  3
 #define HV_STATUS_INVALID_ALIGNMENT4
 #define HV_STATUS_INVALID_PARAMETER5
+#define HV_STATUS_ACCESS_DENIED6
 #define HV_STATUS_OPERATION_DENIED 8
 #define HV_STATUS_INSUFFICIENT_MEMORY  11
 #define HV_STATUS_INVALID_PORT_ID  17
-- 
2.30.2



[PATCH 00/30] KVM: x86: hyper-v: Fine-grained access check to Hyper-V hypercalls and MSRs

2021-04-19 Thread Vitaly Kuznetsov
Changes since RFC:
- KVM_CAP_HYPERV_ENFORCE_CPUID introduced. Turns out that at least QEMU
  is not doing a great job setting Hyper-V CPUID entries for various
  configurations (when not all enlightenments are enabled).
- Added a selftest.
- Add Wei's A-b tag to PATCH1.

Currently, all implemented Hyper-V features (MSRs and hypercalls) are
available unconditionally to all Hyper-V enabled guests. This is not
ideal as KVM userspace may decide to provide only a subset of the
currently implemented features to emulate an older Hyper-V version,
to reduce the attack surface, and so on. Implement checks against guest-visible
CPUIDs for all currently implemented MSRs and hypercalls.
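
For illustration, userspace opts in per-vCPU via the usual KVM_ENABLE_CAP
ioctl (rough sketch, headers and error handling omitted):

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
		.args[0] = 1,
	};

	ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);

With the cap enabled, MSR accesses and hypercalls not backed by the CPUID
bits set through KVM_SET_CPUID2 are refused (#GP for MSRs,
HV_STATUS_ACCESS_DENIED for hypercalls).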

Vitaly Kuznetsov (30):
  asm-generic/hyperv: add HV_STATUS_ACCESS_DENIED definition
  KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_ENFORCE_CPUID
  KVM: x86: hyper-v: Cache guest CPUID leaves determining features
availability
  KVM: x86: hyper-v: Prepare to check access to Hyper-V MSRs
  KVM: x86: hyper-v: Honor HV_MSR_HYPERCALL_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_VP_RUNTIME_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_TIME_REF_COUNT_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_VP_INDEX_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_RESET_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_REFERENCE_TSC_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_SYNIC_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_SYNTIMER_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_APIC_ACCESS_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_ACCESS_FREQUENCY_MSRS privilege bit
  KVM: x86: hyper-v: Honor HV_ACCESS_REENLIGHTENMENT privilege bit
  KVM: x86: hyper-v: Honor HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
privilege bit
  KVM: x86: hyper-v: Honor HV_FEATURE_DEBUG_MSRS_AVAILABLE privilege bit
  KVM: x86: hyper-v: Inverse the default in hv_check_msr_access()
  KVM: x86: hyper-v: Honor HV_STIMER_DIRECT_MODE_AVAILABLE privilege bit
  KVM: x86: hyper-v: Prepare to check access to Hyper-V hypercalls
  KVM: x86: hyper-v: Check access to HVCALL_NOTIFY_LONG_SPIN_WAIT
hypercall
  KVM: x86: hyper-v: Honor HV_POST_MESSAGES privilege bit
  KVM: x86: hyper-v: Honor HV_SIGNAL_EVENTS privilege bit
  KVM: x86: hyper-v: Honor HV_DEBUGGING privilege bit
  KVM: x86: hyper-v: Honor HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED bit
  KVM: x86: hyper-v: Honor HV_X64_CLUSTER_IPI_RECOMMENDED bit
  KVM: x86: hyper-v: Honor HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED bit
  KVM: selftests: move Hyper-V MSR definitions to hyperv.h
  KVM: selftests: Move evmcs.h to x86_64/
  KVM: selftests: Introduce hyperv_features test

 Documentation/virt/kvm/api.rst|  11 +
 arch/x86/include/asm/kvm_host.h   |   9 +
 arch/x86/kvm/hyperv.c | 216 +-
 arch/x86/kvm/hyperv.h |   1 +
 arch/x86/kvm/x86.c|   4 +
 include/asm-generic/hyperv-tlfs.h |   1 +
 include/uapi/linux/kvm.h  |   1 +
 tools/testing/selftests/kvm/.gitignore|   1 +
 tools/testing/selftests/kvm/Makefile  |   1 +
 .../kvm/include/{ => x86_64}/evmcs.h  |   2 +-
 .../selftests/kvm/include/x86_64/hyperv.h | 185 +
 .../selftests/kvm/x86_64/hyperv_clock.c   |   8 +-
 .../selftests/kvm/x86_64/hyperv_features.c| 649 ++
 13 files changed, 1071 insertions(+), 18 deletions(-)
 rename tools/testing/selftests/kvm/include/{ => x86_64}/evmcs.h (99%)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/hyperv.h
 create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_features.c

-- 
2.30.2



Re: [PATCH v2 5/7] KVM: SVM: hyper-v: Remote TLB flush for SVM

2021-04-16 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Enable remote TLB flush for SVM.
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/kvm/svm/svm.c | 37 +
>  1 file changed, 37 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 2ad1f55c88d0..de141d5ae5fb 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include "trace.h"
> @@ -44,6 +45,8 @@
>  #include "svm.h"
>  #include "svm_ops.h"
>  
> +#include "hyperv.h"
> +
>  #define __ex(x) __kvm_handle_fault_on_reboot(x)
>  
>  MODULE_AUTHOR("Qumranet");
> @@ -931,6 +934,8 @@ static __init void svm_set_cpu_caps(void)
>   kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
>  }
>  
> +static struct kvm_x86_ops svm_x86_ops;
> +
>  static __init int svm_hardware_setup(void)
>  {
>   int cpu;
> @@ -1000,6 +1005,16 @@ static __init int svm_hardware_setup(void)
>   kvm_configure_mmu(npt_enabled, get_max_npt_level(), PG_LEVEL_1G);
>   pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB
> + && npt_enabled) {
> + pr_info("kvm: Hyper-V enlightened NPT TLB flush enabled\n");
> + svm_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
> + svm_x86_ops.tlb_remote_flush_with_range =
> + kvm_hv_remote_flush_tlb_with_range;
> + }
> +#endif
> +
>   if (nrips) {
>   if (!boot_cpu_has(X86_FEATURE_NRIPS))
>   nrips = false;
> @@ -1120,6 +1135,21 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
>   }
>  }
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static void hv_init_vmcb(struct vmcb *vmcb)
> +{
> + struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
> +
> + if (npt_enabled &&
> + ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)

Nitpick: we can probably have a 'static inline' for 

 "npt_enabled && ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB"

e.g. 'hv_svm_enlightened_tlbflush()'
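
Something along these lines, untested and only to illustrate the idea
(reusing the names from the patch):

static inline bool hv_svm_enlightened_tlbflush(void)
{
	return npt_enabled &&
	       (ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB);
}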

> + hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
> +}
> +#else
> +static inline void hv_init_vmcb(struct vmcb *vmcb)
> +{
> +}
> +#endif
> +
>  static void init_vmcb(struct vcpu_svm *svm)
>  {
>   struct vmcb_control_area *control = &svm->vmcb->control;
> @@ -1282,6 +1312,8 @@ static void init_vmcb(struct vcpu_svm *svm)
>   }
>   }
>  
> + hv_init_vmcb(svm->vmcb);
> +
>   vmcb_mark_all_dirty(svm->vmcb);
>  
>   enable_gif(svm);
> @@ -3975,6 +4007,11 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, 
> unsigned long root,
>   svm->vmcb->control.nested_cr3 = cr3;
>   vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (kvm_x86_ops.tlb_remote_flush)
> + kvm_update_arch_tdp_pointer(vcpu->kvm, vcpu, cr3);
> +#endif
> +
>   /* Loading L2's CR3 is handled by enter_svm_guest_mode.  */
>   if (!test_bit(VCPU_EXREG_CR3, (ulong *)>arch.regs_avail))
>   return;

-- 
Vitaly



Re: [PATCH v2 4/7] KVM: SVM: hyper-v: Nested enlightenments in VMCB

2021-04-16 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Add Hyper-V specific fields in VMCB to support SVM enlightenments.
> Also a small refactoring of VMCB clean bits handling.
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/include/asm/svm.h | 24 +++-
>  arch/x86/kvm/svm/svm.c |  8 
>  arch/x86/kvm/svm/svm.h | 30 --
>  3 files changed, 59 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 1c561945b426..3586d7523ce8 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -322,9 +322,31 @@ static inline void __unused_size_checks(void)
>   BUILD_BUG_ON(sizeof(struct ghcb)!= EXPECTED_GHCB_SIZE);
>  }
>  
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +struct __packed hv_enlightenments {
> + struct __packed hv_enlightenments_control {
> + u32 nested_flush_hypercall:1;
> + u32 msr_bitmap:1;
> + u32 enlightened_npt_tlb: 1;
> + u32 reserved:29;
> + } hv_enlightenments_control;
> + u32 hv_vp_id;
> + u64 hv_vm_id;
> + u64 partition_assist_page;
> + u64 reserved;
> +};

Enlightened VMCS seems to have the same part:

struct {
u32 nested_flush_hypercall:1;
u32 msr_bitmap:1;
u32 reserved:30;
}  __packed hv_enlightenments_control;
u32 hv_vp_id;
u64 hv_vm_id;
u64 partition_assist_page;

Would it maybe make sense to unify these two (in case they are the same
thing in Hyper-V, of course)?
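
E.g. something along these lines (purely illustrative; the names are made
up and whether the NPT TLB bit makes sense on the VMX side would need to
be checked against the TLFS):

/* hypothetical common definition shared by eVMCS and SVM enlightenments */
struct hv_nested_enlightenments_control {
	u32 nested_flush_hypercall:1;
	u32 msr_bitmap:1;
	u32 enlightened_npt_tlb:1;
	u32 reserved:29;
} __packed;

struct hv_nested_enlightenments {
	struct hv_nested_enlightenments_control control;
	u32 hv_vp_id;
	u64 hv_vm_id;
	u64 partition_assist_page;
} __packed;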


> +#define VMCB_CONTROL_END 992 // 32 bytes for Hyper-V
> +#else
> +#define VMCB_CONTROL_END 1024
> +#endif
> +
>  struct vmcb {
>   struct vmcb_control_area control;
> - u8 reserved_control[1024 - sizeof(struct vmcb_control_area)];
> + u8 reserved_control[VMCB_CONTROL_END - sizeof(struct 
> vmcb_control_area)];
> +#if IS_ENABLED(CONFIG_HYPERV)
> + struct hv_enlightenments hv_enlightenments;
> +#endif
>   struct vmcb_save_area save;
>  } __packed;
>  
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index baee91c1e936..2ad1f55c88d0 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -122,6 +123,8 @@ bool npt_enabled = true;
>  bool npt_enabled;
>  #endif
>  
> +u32 __read_mostly vmcb_all_clean_mask = VMCB_ALL_CLEAN_MASK;
> +
>  /*
>   * These 2 parameters are used to config the controls for Pause-Loop Exiting:
>   * pause_filter_count: On processors that support Pause filtering(indicated
> @@ -1051,6 +1054,11 @@ static __init int svm_hardware_setup(void)
>*/
>   allow_smaller_maxphyaddr = !npt_enabled;
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (hypervisor_is_type(X86_HYPER_MS_HYPERV))
> + vmcb_all_clean_mask |= VMCB_HYPERV_CLEAN_MASK;
> +#endif
> +
>   return 0;
>  
>  err:
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 39e071fdab0c..63ed05c8027b 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -33,6 +33,11 @@ static const u32 host_save_user_msrs[] = {
>  extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
>  extern bool npt_enabled;
>  
> +/*
> + * Clean bits in VMCB.
> + * VMCB_ALL_CLEAN_MASK and VMCB_HYPERV_CLEAN_MASK might
> + * also need to be updated if this enum is modified.
> + */
>  enum {
>   VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
>   pause filter count */
> @@ -50,12 +55,28 @@ enum {
> * AVIC PHYSICAL_TABLE pointer,
> * AVIC LOGICAL_TABLE pointer
> */
> - VMCB_DIRTY_MAX,
> +#if IS_ENABLED(CONFIG_HYPERV)
> + VMCB_HV_NESTED_ENLIGHTENMENTS = 31,
> +#endif
>  };
>  
> +#define VMCB_ALL_CLEAN_MASK (\
> + (1U << VMCB_INTERCEPTS) | (1U << VMCB_PERM_MAP) |   \
> + (1U << VMCB_ASID) | (1U << VMCB_INTR) | \
> + (1U << VMCB_NPT) | (1U << VMCB_CR) | (1U << VMCB_DR) |  \
> + (1U << VMCB_DT) | (1U << VMCB_SEG) | (1U << VMCB_CR2) | \
> + (1U << VMCB_LBR) | (1U << VMCB_AVIC)\
> + )

What if we preserve VMCB_DIRTY_MAX and drop this newly introduced
VMCB_ALL_CLEAN_MASK (which basically lists all the members of the enum
above)? '1 << VMCB_DIRTY_MAX' can still work. (If the 'VMCB_DIRTY_MAX'
name becomes misleading we can e.g. rename it to VMCB_NATIVE_DIRTY_MAX
or something, but I'm not sure it's worth it.)
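
I.e. something like this (sketch only, assuming VMCB_DIRTY_MAX stays the
last 'native' member of the enum and the Hyper-V bit stays at 31):

	u32 clean_mask = (1U << VMCB_DIRTY_MAX) - 1;

#if IS_ENABLED(CONFIG_HYPERV)
	if (hypervisor_is_type(X86_HYPER_MS_HYPERV))
		clean_mask |= BIT(VMCB_HV_NESTED_ENLIGHTENMENTS);
#endif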

> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +#define VMCB_HYPERV_CLEAN_MASK (1U << VMCB_HV_NESTED_ENLIGHTENMENTS)
> +#endif

VMCB_HYPERV_CLEAN_MASK is a single bit, why do we need it at all?
(BIT(VMCB_HV_NESTED_ENLIGHTENMENTS) is not super long.)

> +
>  /* TPR and CR2 are always written before VMRUN */
>  #define VMCB_ALWAYS_DIRTY_MASK   ((1U << VMCB_INTR) | (1U << 

Re: [PATCH v2 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

2021-04-16 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Currently the remote TLB flush logic is specific to VMX.
> Move it to a common place so that SVM can use it as well.
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/include/asm/kvm_host.h | 14 +
>  arch/x86/kvm/hyperv.c   | 87 +
>  arch/x86/kvm/hyperv.h   | 20 +++
>  arch/x86/kvm/vmx/vmx.c  | 97 +++--
>  arch/x86/kvm/vmx/vmx.h  | 10 
>  arch/x86/kvm/x86.c  |  9 ++-
>  6 files changed, 136 insertions(+), 101 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 877a4025d8da..ed84c15d18bc 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -838,6 +838,15 @@ struct kvm_vcpu_arch {
>  
>   /* Protected Guests */
>   bool guest_state_protected;
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> + /*
> +  * Two Dimensional paging CR3
> +  * EPTP for Intel
> +  * nCR3 for AMD
> +  */
> + u64 tdp_pointer;
> +#endif
>  };
>  
>  struct kvm_lpage_info {
> @@ -1079,6 +1088,11 @@ struct kvm_arch {
>*/
>   spinlock_t tdp_mmu_pages_lock;
>  #endif /* CONFIG_X86_64 */
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> + int tdp_pointers_match;
> + spinlock_t tdp_pointer_lock;
> +#endif
>  };
>  
>  struct kvm_vm_stat {
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 58fa8c029867..614b4448a028 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c

I still think that using arch/x86/kvm/hyperv.[ch] for KVM-on-Hyper-V is
misleading. Currently, these files are dedicated to emulating the Hyper-V
interface to KVM guests, and that is orthogonal to nesting KVM on
Hyper-V. As a solution, I'd suggest you either:
- Put the stuff in x86.c
- Create a dedicated set of files, e.g. 'kvmonhyperv.[ch]' (I also
thought about 'hyperv_host.[ch]' but then I realized it's equally
misleading as one can read this as 'KVM is acting as Hyper-V host').

Personally, I'd vote for the latter. Besides eliminating confusion, the
benefit of having dedicated files is that we can avoid compiling them
completely when !IS_ENABLED(CONFIG_HYPERV) (#ifdefs in C are ugly).


> @@ -32,6 +32,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
>  
>  #include "trace.h"
> @@ -2180,3 +2181,89 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct 
> kvm_cpuid2 *cpuid,
>  
>   return 0;
>  }
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +/* check_tdp_pointer() should be under protection of tdp_pointer_lock. */
> +static void check_tdp_pointer_match(struct kvm *kvm)
> +{
> + u64 tdp_pointer = INVALID_PAGE;
> + bool valid_tdp = false;
> + struct kvm_vcpu *vcpu;
> + int i;
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (!valid_tdp) {
> + tdp_pointer = vcpu->arch.tdp_pointer;
> + valid_tdp = true;
> + continue;
> + }
> +
> + if (tdp_pointer != vcpu->arch.tdp_pointer) {
> + kvm->arch.tdp_pointers_match = TDP_POINTERS_MISMATCH;
> + return;
> + }
> + }
> +
> + kvm->arch.tdp_pointers_match = TDP_POINTERS_MATCH;
> +}
> +
> +static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list 
> *flush,
> + void *data)
> +{
> + struct kvm_tlb_range *range = data;
> +
> + return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
> + range->pages);
> +}
> +
> +static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
> +{
> + u64 tdp_pointer = vcpu->arch.tdp_pointer;
> +
> + /*
> +  * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> +  * of the base of EPT PML4 table, strip off EPT configuration
> +  * information.
> +  */
> + if (range)
> + return hyperv_flush_guest_mapping_range(tdp_pointer & PAGE_MASK,
> + kvm_fill_hv_flush_list_func, (void *)range);
> + else
> + return hyperv_flush_guest_mapping(tdp_pointer & PAGE_MASK);
> +}
> +
> +int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_tlb_range *range)
> +{
> + struct kvm_vcpu *vcpu;
> + int ret = 0, i;
> +
> + spin_lock(&kvm->arch.tdp_pointer_lock);
> +
> + if (kvm->arch.tdp_pointers_match == TDP_POINTERS_CHECK)
> + check_tdp_pointer_match(kvm);
> +
> + if (kvm->arch.tdp_pointers_match != TDP_POINTERS_MATCH) {
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + /* If tdp_pointer is invalid pointer, bypass flush 
> request. */
> + if (VALID_PAGE(vcpu->arch.tdp_pointer))
> + ret |= __hv_remote_flush_tlb_with_range(
> + kvm, vcpu, range);
> + }
> + } else {
> + 

Re: [PATCH v2 1/7] hyperv: Detect Nested virtualization support for SVM

2021-04-16 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Detect nested features exposed by Hyper-V if SVM is enabled.
>

It may make sense to expand this a bit as it is probably unclear how the
change is related to SVM.

Something like:

HYPERV_CPUID_NESTED_FEATURES CPUID leaf can be present on both Intel and
AMD Hyper-V guests. Previously, the code was using the
HV_X64_ENLIGHTENED_VMCS_RECOMMENDED feature bit to determine the
availability of the nested features leaf, and this complies with the TLFS:
"Recommend a nested hypervisor using the enlightened VMCS interface. 
Also indicates that additional nested enlightenments may be available
(see leaf 0x400A)". Enlightened VMCS, however, is an Intel-only
feature, so the detection method doesn't work for AMD. Use
HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS.EAX CPUID information ("The
maximum input value for hypervisor CPUID information.") instead; this
works for both AMD and Intel.


> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 3546d3e21787..c6f812851e37 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -252,6 +252,7 @@ static void __init hv_smp_prepare_cpus(unsigned int 
> max_cpus)
>  
>  static void __init ms_hyperv_init_platform(void)
>  {
> + int hv_max_functions_eax;
>   int hv_host_info_eax;
>   int hv_host_info_ebx;
>   int hv_host_info_ecx;
> @@ -269,6 +270,8 @@ static void __init ms_hyperv_init_platform(void)
>   ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
>   ms_hyperv.hints= cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);
>  
> + hv_max_functions_eax = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);
> +
>   pr_info("Hyper-V: privilege flags low 0x%x, high 0x%x, hints 0x%x, misc 
> 0x%x\n",
>   ms_hyperv.features, ms_hyperv.priv_high, ms_hyperv.hints,
>   ms_hyperv.misc_features);
> @@ -298,8 +301,7 @@ static void __init ms_hyperv_init_platform(void)
>   /*
>* Extract host information.
>*/
> - if (cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS) >=
> - HYPERV_CPUID_VERSION) {
> + if (hv_max_functions_eax >= HYPERV_CPUID_VERSION) {
>   hv_host_info_eax = cpuid_eax(HYPERV_CPUID_VERSION);
>   hv_host_info_ebx = cpuid_ebx(HYPERV_CPUID_VERSION);
>   hv_host_info_ecx = cpuid_ecx(HYPERV_CPUID_VERSION);
> @@ -325,9 +327,11 @@ static void __init ms_hyperv_init_platform(void)
>   ms_hyperv.isolation_config_a, 
> ms_hyperv.isolation_config_b);
>   }
>  
> - if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
> + if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) {
>   ms_hyperv.nested_features =
>   cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
> +     pr_info("Hyper-V: Nested features: 0x%x\n",
> + ms_hyperv.nested_features);
>   }
>  
>   /*

With the commit message expanded,

Reviewed-by: Vitaly Kuznetsov 

-- 
Vitaly



Re: [PATCH RFC 01/22] asm-generic/hyperv: add HV_STATUS_ACCESS_DENIED definition

2021-04-15 Thread Vitaly Kuznetsov
Wei Liu  writes:

> On Tue, Apr 13, 2021 at 02:26:09PM +0200, Vitaly Kuznetsov wrote:
>> From TLFSv6.0b, this status means: "The caller did not possess sufficient
>> access rights to perform the requested operation."
>> 
>> Signed-off-by: Vitaly Kuznetsov 
>
> This can be applied to hyperv-next right away. Let me know what you
> think.
>

In case there's no immediate need for this constant outside of KVM, I'd
suggest you just give Paolo your 'Acked-by' so I can carry the patch in
the series for the time being. This will eliminate the need to track
dependencies between hyperv-next and kvm-next.

Thanks!

-- 
Vitaly



[PATCH 5/5] x86/kvm: Unify kvm_pv_guest_cpu_reboot() with kvm_guest_cpu_offline()

2021-04-14 Thread Vitaly Kuznetsov
Simplify the code by making PV feature shutdown happen in one place.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kernel/kvm.c | 42 +-
 1 file changed, 17 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 1754b7c3f754..7da7bea96745 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -384,31 +384,6 @@ static void kvm_disable_steal_time(void)
wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
-static void kvm_pv_guest_cpu_reboot(void *unused)
-{
-   /*
-* We disable PV EOI before we load a new kernel by kexec,
-* since MSR_KVM_PV_EOI_EN stores a pointer into old kernel's memory.
-* New kernel can re-enable when it boots.
-*/
-   if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
-   wrmsrl(MSR_KVM_PV_EOI_EN, 0);
-   kvm_pv_disable_apf();
-   kvm_disable_steal_time();
-}
-
-static int kvm_pv_reboot_notify(struct notifier_block *nb,
-   unsigned long code, void *unused)
-{
-   if (code == SYS_RESTART)
-   on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
-   return NOTIFY_DONE;
-}
-
-static struct notifier_block kvm_pv_reboot_nb = {
-   .notifier_call = kvm_pv_reboot_notify,
-};
-
 static u64 kvm_steal_clock(int cpu)
 {
u64 steal;
@@ -664,6 +639,23 @@ static struct syscore_ops kvm_syscore_ops = {
.resume = kvm_resume,
 };
 
+static void kvm_pv_guest_cpu_reboot(void *unused)
+{
+   kvm_guest_cpu_offline(true);
+}
+
+static int kvm_pv_reboot_notify(struct notifier_block *nb,
+   unsigned long code, void *unused)
+{
+   if (code == SYS_RESTART)
+   on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block kvm_pv_reboot_nb = {
+   .notifier_call = kvm_pv_reboot_notify,
+};
+
 /*
  * After a PV feature is registered, the host will keep writing to the
  * registered memory location. If the guest happens to shutdown, this memory
-- 
2.30.2



[PATCH 4/5] x86/kvm: Disable all PV features on crash

2021-04-14 Thread Vitaly Kuznetsov
The crash shutdown handler only disables kvmclock and steal time; other PV
features remain active, so we risk corrupting memory or getting side
effects in the kdump kernel. Move the crash handler to kvm.c and unify it
with the CPU offline path.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/include/asm/kvm_para.h |  6 -
 arch/x86/kernel/kvm.c   | 44 -
 arch/x86/kernel/kvmclock.c  | 21 
 3 files changed, 32 insertions(+), 39 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 9c56e0defd45..69299878b200 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -92,7 +92,6 @@ unsigned int kvm_arch_para_hints(void);
 void kvm_async_pf_task_wait_schedule(u32 token);
 void kvm_async_pf_task_wake(u32 token);
 u32 kvm_read_and_reset_apf_flags(void);
-void kvm_disable_steal_time(void);
 bool __kvm_handle_async_pf(struct pt_regs *regs, u32 token);
 
 DECLARE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@@ -137,11 +136,6 @@ static inline u32 kvm_read_and_reset_apf_flags(void)
return 0;
 }
 
-static inline void kvm_disable_steal_time(void)
-{
-   return;
-}
-
 static __always_inline bool kvm_handle_async_pf(struct pt_regs *regs, u32 
token)
 {
return false;
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index df00d44f7424..1754b7c3f754 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@@ -375,6 +376,14 @@ static void kvm_pv_disable_apf(void)
pr_info("disable async PF for cpu %d\n", smp_processor_id());
 }
 
+static void kvm_disable_steal_time(void)
+{
+   if (!has_steal_clock)
+   return;
+
+   wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
+}
+
 static void kvm_pv_guest_cpu_reboot(void *unused)
 {
/*
@@ -417,14 +426,6 @@ static u64 kvm_steal_clock(int cpu)
return steal;
 }
 
-void kvm_disable_steal_time(void)
-{
-   if (!has_steal_clock)
-   return;
-
-   wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
-}
-
 static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
 {
early_set_memory_decrypted((unsigned long) ptr, size);
@@ -588,13 +589,14 @@ static void __init kvm_smp_prepare_boot_cpu(void)
kvm_spinlock_init();
 }
 
-static void kvm_guest_cpu_offline(void)
+static void kvm_guest_cpu_offline(bool shutdown)
 {
kvm_disable_steal_time();
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
kvm_pv_disable_apf();
-   apf_task_wake_all();
+   if (!shutdown)
+   apf_task_wake_all();
kvmclock_disable();
 }
 
@@ -613,7 +615,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu)
unsigned long flags;
 
local_irq_save(flags);
-   kvm_guest_cpu_offline();
+   kvm_guest_cpu_offline(false);
local_irq_restore(flags);
return 0;
 }
@@ -647,7 +649,7 @@ static void kvm_flush_tlb_others(const struct cpumask 
*cpumask,
 
 static int kvm_suspend(void)
 {
-   kvm_guest_cpu_offline();
+   kvm_guest_cpu_offline(false);
 
return 0;
 }
@@ -662,6 +664,20 @@ static struct syscore_ops kvm_syscore_ops = {
.resume = kvm_resume,
 };
 
+/*
+ * After a PV feature is registered, the host will keep writing to the
+ * registered memory location. If the guest happens to shutdown, this memory
+ * won't be valid. In cases like kexec, in which you install a new kernel, this
+ * means a random memory location will be kept being written.
+ */
+#ifdef CONFIG_KEXEC_CORE
+static void kvm_crash_shutdown(struct pt_regs *regs)
+{
+   kvm_guest_cpu_offline(true);
+   native_machine_crash_shutdown(regs);
+}
+#endif
+
 static void __init kvm_guest_init(void)
 {
int i;
@@ -704,6 +720,10 @@ static void __init kvm_guest_init(void)
kvm_guest_cpu_init();
 #endif
 
+#ifdef CONFIG_KEXEC_CORE
+   machine_ops.crash_shutdown = kvm_crash_shutdown;
+#endif
+
register_syscore_ops(&kvm_syscore_ops);
 
/*
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index cf869de98eec..b825c87c12ef 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -20,7 +20,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 static int kvmclock __initdata = 1;
@@ -203,23 +202,6 @@ static void kvm_setup_secondary_clock(void)
 }
 #endif
 
-/*
- * After the clock is registered, the host will keep writing to the
- * registered memory location. If the guest happens to shutdown, this memory
- * won't be valid. In cases like kexec, in which you install a new kernel, this
- * means a random memory location will be kept being written. So before any
- * kind of shutdown from our side, we unregister the clock by writing anything
- * that does not have the 'enable' bit set in the msr
- */
-#ifdef CONFIG_KEXEC_CORE
-s

[PATCH 2/5] x86/kvm: Teardown PV features on boot CPU as well

2021-04-14 Thread Vitaly Kuznetsov
Various PV features (Async PF, PV EOI, steal time) work through memory
shared with the hypervisor. When we restore from hibernation we must
properly tear down all these features to make sure the hypervisor doesn't
write to stale locations after we jump to the previously hibernated kernel
(which can try to place anything there). For secondary CPUs the job is
already done by kvm_cpu_down_prepare(); register syscore ops to do
the same for the boot CPU.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kernel/kvm.c | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 79dddcc178e3..6b16a9bb4ecd 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -598,17 +599,21 @@ static void kvm_guest_cpu_offline(void)
 
 static int kvm_cpu_online(unsigned int cpu)
 {
-   local_irq_disable();
+   unsigned long flags;
+
+   local_irq_save(flags);
kvm_guest_cpu_init();
-   local_irq_enable();
+   local_irq_restore(flags);
return 0;
 }
 
 static int kvm_cpu_down_prepare(unsigned int cpu)
 {
-   local_irq_disable();
+   unsigned long flags;
+
+   local_irq_save(flags);
kvm_guest_cpu_offline();
-   local_irq_enable();
+   local_irq_restore(flags);
return 0;
 }
 #endif
@@ -639,6 +644,23 @@ static void kvm_flush_tlb_others(const struct cpumask 
*cpumask,
native_flush_tlb_others(flushmask, info);
 }
 
+static int kvm_suspend(void)
+{
+   kvm_guest_cpu_offline();
+
+   return 0;
+}
+
+static void kvm_resume(void)
+{
+   kvm_cpu_online(raw_smp_processor_id());
+}
+
+static struct syscore_ops kvm_syscore_ops = {
+   .suspend= kvm_suspend,
+   .resume = kvm_resume,
+};
+
 static void __init kvm_guest_init(void)
 {
int i;
@@ -681,6 +703,8 @@ static void __init kvm_guest_init(void)
kvm_guest_cpu_init();
 #endif
 
+   register_syscore_ops(&kvm_syscore_ops);
+
/*
 * Hard lockup detection is enabled by default. Disable it, as guests
 * can get false positives too easily, for example if the host is
-- 
2.30.2



[PATCH 3/5] x86/kvm: Disable kvmclock on all CPUs on shutdown

2021-04-14 Thread Vitaly Kuznetsov
Currently, we disable kvmclock from the machine_shutdown() hook and this
only happens for the boot CPU. We need to disable it for all CPUs to
guard against memory corruption, e.g. on restore from hibernation.

Note: writing '0' to the kvmclock MSR doesn't clear the memory location,
it just prevents the hypervisor from updating it. For the short while
between the write and the CPU actually going away, the clock remains usable
and correct, so we don't need to switch to some other clocksource.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/include/asm/kvm_para.h | 4 ++--
 arch/x86/kernel/kvm.c   | 1 +
 arch/x86/kernel/kvmclock.c  | 5 +
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 338119852512..9c56e0defd45 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -7,8 +7,6 @@
 #include 
 #include 
 
-extern void kvmclock_init(void);
-
 #ifdef CONFIG_KVM_GUEST
 bool kvm_check_and_clear_guest_paused(void);
 #else
@@ -86,6 +84,8 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned 
long p1,
 }
 
 #ifdef CONFIG_KVM_GUEST
+void kvmclock_init(void);
+void kvmclock_disable(void);
 bool kvm_para_available(void);
 unsigned int kvm_arch_para_features(void);
 unsigned int kvm_arch_para_hints(void);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 6b16a9bb4ecd..df00d44f7424 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -595,6 +595,7 @@ static void kvm_guest_cpu_offline(void)
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
kvm_pv_disable_apf();
apf_task_wake_all();
+   kvmclock_disable();
 }
 
 static int kvm_cpu_online(unsigned int cpu)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1fc0962c89c0..cf869de98eec 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -220,11 +220,9 @@ static void kvm_crash_shutdown(struct pt_regs *regs)
 }
 #endif
 
-static void kvm_shutdown(void)
+void kvmclock_disable(void)
 {
native_write_msr(msr_kvm_system_time, 0, 0);
-   kvm_disable_steal_time();
-   native_machine_shutdown();
 }
 
 static void __init kvmclock_init_mem(void)
@@ -351,7 +349,6 @@ void __init kvmclock_init(void)
 #endif
x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
-   machine_ops.shutdown  = kvm_shutdown;
 #ifdef CONFIG_KEXEC_CORE
machine_ops.crash_shutdown  = kvm_crash_shutdown;
 #endif
-- 
2.30.2



[PATCH 1/5] x86/kvm: Fix pr_info() for async PF setup/teardown

2021-04-14 Thread Vitaly Kuznetsov
'pr_fmt' already has the 'kvm-guest: ' prefix, so the 'KVM' prefix is
redundant. "Unregister pv shared memory" is also very ambiguous; it's hard
to say which particular PV feature it refers to.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kernel/kvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 78bb0fae3982..79dddcc178e3 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -345,7 +345,7 @@ static void kvm_guest_cpu_init(void)
 
wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
__this_cpu_write(apf_reason.enabled, 1);
-   pr_info("KVM setup async PF for cpu %d\n", smp_processor_id());
+   pr_info("setup async PF for cpu %d\n", smp_processor_id());
}
 
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
@@ -371,7 +371,7 @@ static void kvm_pv_disable_apf(void)
wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
__this_cpu_write(apf_reason.enabled, 0);
 
-   pr_info("Unregister pv shared memory for cpu %d\n", smp_processor_id());
+   pr_info("disable async PF for cpu %d\n", smp_processor_id());
 }
 
 static void kvm_pv_guest_cpu_reboot(void *unused)
-- 
2.30.2



[PATCH 0/5] x86/kvm: Refactor KVM PV features teardown and fix restore from hibernation

2021-04-14 Thread Vitaly Kuznetsov
This series is a successor of Lenny's "[PATCH] x86/kvmclock: Stop kvmclocks
for hibernate restore". While reviewing his patch I realized that the PV
feature teardown we have is a bit messy: it is scattered across kvm.c
and kvmclock.c, and not all features are shut down on all paths.
This series unifies all teardown paths in kvm.c and makes sure all
features are disabled when needed.

Vitaly Kuznetsov (5):
  x86/kvm: Fix pr_info() for async PF setup/teardown
  x86/kvm: Teardown PV features on boot CPU as well
  x86/kvm: Disable kvmclock on all CPUs on shutdown
  x86/kvm: Disable all PV features on crash
  x86/kvm: Unify kvm_pv_guest_cpu_reboot() with kvm_guest_cpu_offline()

 arch/x86/include/asm/kvm_para.h |  10 +--
 arch/x86/kernel/kvm.c   | 113 +---
 arch/x86/kernel/kvmclock.c  |  26 +---
 3 files changed, 78 insertions(+), 71 deletions(-)

-- 
2.30.2



Re: [PATCH v2 4/4] KVM: hyper-v: Advertise support for fast XMM hypercalls

2021-04-13 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> Now that all extant hypercalls that can use XMM registers (based on
> spec) for input/outputs are patched to support them, we can start
> advertising this feature to guests.
>
> Cc: Alexander Graf 
> Cc: Evgeny Iakovlev 
> Signed-off-by: Siddharth Chandrasekaran 
> ---
>  arch/x86/include/asm/hyperv-tlfs.h | 7 ++-
>  arch/x86/kvm/hyperv.c  | 2 ++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index e6cd3fee562b..716f12be411e 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -52,7 +52,7 @@
>   * Support for passing hypercall input parameter block via XMM
>   * registers is available
>   */
> -#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4)
> +#define HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE BIT(4)
>  /* Support for a virtual guest idle state is available */
>  #define HV_X64_GUEST_IDLE_STATE_AVAILABLEBIT(5)
>  /* Frequency MSRs available */
> @@ -61,6 +61,11 @@
>  #define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(10)
>  /* Support for debug MSRs available */
>  #define HV_FEATURE_DEBUG_MSRS_AVAILABLE  BIT(11)
> +/*
> + * Support for returning hypercall ouput block via XMM
> + * registers is available
> + */
> +#define HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLEBIT(15)
>  /* stimer Direct Mode is available */
>  #define HV_STIMER_DIRECT_MODE_AVAILABLE  BIT(19)
>  
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 1f9959aba70d..55838c266bcd 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2254,6 +2254,8 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct 
> kvm_cpuid2 *cpuid,
>   ent->ebx |= HV_POST_MESSAGES;
>   ent->ebx |= HV_SIGNAL_EVENTS;
>  
> + ent->edx |= HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE;
> + ent->edx |= HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE;
>   ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
>   ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;

With 'ouput' typo fixed,

Reviewed-by: Vitaly Kuznetsov 

-- 
Vitaly



Re: [PATCH v2 3/4] KVM: x86: kvm_hv_flush_tlb use inputs from XMM registers

2021-04-13 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> Hyper-V supports the use of XMM registers to perform fast hypercalls.
> This allows guests to take advantage of the improved performance of the
> fast hypercall interface even though a hypercall may require more than
> (the current maximum of) two input registers.
>
> The XMM fast hypercall interface uses six additional XMM registers (XMM0
> to XMM5) to allow the guest to pass an input parameter block of up to
> 112 bytes. Hyper-V can also return data back to the guest in the
> remaining XMM registers that are not used by the current hypercall.
>
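(For reference, the 112 bytes match the x64 fast hypercall register layout
as I read the TLFS -- two GPRs plus the six XMM registers:

	2 GPRs (RDX, R8)   : 2 *  8 = 16 bytes
	6 XMM  (XMM0-XMM5) : 6 * 16 = 96 bytes
	                     total  = 112 bytes
)
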
> Add framework to read/write to XMM registers in kvm_hv_hypercall() and
> use the additional hypercall inputs from XMM registers in
> kvm_hv_flush_tlb() when possible.
>
> Cc: Alexander Graf 
> Co-developed-by: Evgeny Iakovlev 
> Signed-off-by: Evgeny Iakovlev 
> Signed-off-by: Siddharth Chandrasekaran 
> ---
>  arch/x86/kvm/hyperv.c | 109 ++
>  1 file changed, 90 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 8f6babd1ea0d..1f9959aba70d 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -36,6 +36,7 @@
>  
>  #include "trace.h"
>  #include "irq.h"
> +#include "fpu.h"
>  
>  /* "Hv#1" signature */
>  #define HYPERV_CPUID_SIGNATURE_EAX 0x31237648
> @@ -1623,6 +1624,8 @@ static __always_inline unsigned long 
> *sparse_set_to_vcpu_mask(
>   return vcpu_bitmap;
>  }
>  
> +#define KVM_HV_HYPERCALL_MAX_XMM_REGISTERS  6

Nitpick: this is not KVM-specific so could probably go to 
arch/x86/include/asm/hyperv-tlfs.h

> +
>  struct kvm_hv_hcall {
>   u64 param;
>   u64 ingpa;
> @@ -1632,10 +1635,14 @@ struct kvm_hv_hcall {
>   u16 rep_idx;
>   bool fast;
>   bool rep;
> + sse128_t xmm[KVM_HV_HYPERCALL_MAX_XMM_REGISTERS];
> + bool xmm_dirty;
>  };
>  
>  static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, 
> bool ex)
>  {
> + int i, j;
> + gpa_t gpa;
>   struct kvm *kvm = vcpu->kvm;
>   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>   struct hv_tlb_flush_ex flush_ex;
> @@ -1649,8 +1656,15 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, 
> struct kvm_hv_hcall *hc, bool
>   bool all_cpus;
>  
>   if (!ex) {
> - if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush, sizeof(flush))))
> - return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + if (hc->fast) {
> + flush.address_space = hc->ingpa;
> + flush.flags = hc->outgpa;
> + flush.processor_mask = sse128_lo(hc->xmm[0]);
> + } else {
> + if (unlikely(kvm_read_guest(kvm, hc->ingpa,
> + &flush, sizeof(flush))))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + }
>  
>   trace_kvm_hv_flush_tlb(flush.processor_mask,
>  flush.address_space, flush.flags);
> @@ -1668,9 +1682,16 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, 
> struct kvm_hv_hcall *hc, bool
>   all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) ||
>   flush.processor_mask == 0;
>   } else {
> - if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush_ex,
> - sizeof(flush_ex))))
> - return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + if (hc->fast) {
> + flush_ex.address_space = hc->ingpa;
> + flush_ex.flags = hc->outgpa;
> + memcpy(&flush_ex.hv_vp_set,
> + &hc->xmm[0], sizeof(hc->xmm[0]));
> + } else {
> + if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush_ex,
> + sizeof(flush_ex))))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + }
>  
>   trace_kvm_hv_flush_tlb_ex(flush_ex.hv_vp_set.valid_bank_mask,
> flush_ex.hv_vp_set.format,
> @@ -1681,20 +1702,29 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, 
> struct kvm_hv_hcall *hc, bool
>   all_cpus = flush_ex.hv_vp_set.format !=
>   HV_GENERIC_SET_SPARSE_4K;
>  
> - sparse_banks_len =
> - bitmap_weight((unsigned long *)&valid_bank_mask, 64) *
> - sizeof(sparse_banks[0]);
> + sparse_banks_len = bitmap_weight((unsigned long *)&valid_bank_mask, 64);
>  
>   if (!sparse_banks_len && !all_cpus)
>   goto ret_success;
>  
> - if (!all_cpus &&
> - kvm_read_guest(kvm,
> -hc->ingpa + offsetof(struct hv_tlb_flush_ex,
> - 
> hv_vp_set.bank_contents),
> -

Re: [PATCH v2 2/4] KVM: hyper-v: Collect hypercall params into struct

2021-04-13 Thread Vitaly Kuznetsov
ID_HYPERCALL_INPUT;
>   break;
>   }
> - ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, false);
> + ret = kvm_hv_flush_tlb(vcpu, &hc, false);
>   break;
>   case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
> - if (unlikely(fast || !rep_cnt || rep_idx)) {
> + if (unlikely(hc.fast || !hc.rep_cnt || hc.rep_idx)) {
>   ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
>   break;
>   }
> - ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
> + ret = kvm_hv_flush_tlb(vcpu, &hc, true);
>   break;
>   case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
> - if (unlikely(fast || rep)) {
> + if (unlikely(hc.fast || hc.rep)) {
>   ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
>   break;
>   }
> - ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
> + ret = kvm_hv_flush_tlb(vcpu, &hc, true);
>   break;
>   case HVCALL_SEND_IPI:
> - if (unlikely(rep)) {
> + if (unlikely(hc.rep)) {
>   ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
>   break;
>   }
> - ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
> + ret = kvm_hv_send_ipi(vcpu, &hc, false);
>   break;
>   case HVCALL_SEND_IPI_EX:
> - if (unlikely(fast || rep)) {
> +     if (unlikely(hc.fast || hc.rep)) {
>   ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
>   break;
>   }
> - ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, true, false);
> + ret = kvm_hv_send_ipi(vcpu, &hc, true);
>   break;
>   case HVCALL_POST_DEBUG_DATA:
>   case HVCALL_RETRIEVE_DEBUG_DATA:
> - if (unlikely(fast)) {
> + if (unlikely(hc.fast)) {
>   ret = HV_STATUS_INVALID_PARAMETER;
>   break;
>   }
> @@ -2012,9 +2023,9 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>   }
>   vcpu->run->exit_reason = KVM_EXIT_HYPERV;
>   vcpu->run->hyperv.type = KVM_EXIT_HYPERV_HCALL;
> - vcpu->run->hyperv.u.hcall.input = param;
> - vcpu->run->hyperv.u.hcall.params[0] = ingpa;
> - vcpu->run->hyperv.u.hcall.params[1] = outgpa;
> + vcpu->run->hyperv.u.hcall.input = hc.param;
> + vcpu->run->hyperv.u.hcall.params[0] = hc.ingpa;
> + vcpu->run->hyperv.u.hcall.params[1] = hc.outgpa;
>   vcpu->arch.complete_userspace_io =
>   kvm_hv_hypercall_complete_userspace;
>   return 0;

With or without the nitpicks from above addressed,

Reviewed-by: Vitaly Kuznetsov 

-- 
Vitaly



Re: [PATCH v2 1/4] KVM: x86: Move FPU register accessors into fpu.h

2021-04-13 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> Hyper-v XMM fast hypercalls use XMM registers to pass input/output
> parameters. To access these, hyperv.c can reuse some FPU register
> accessors defined in emulator.c. Move them to a common location so both
> can access them.
>
> While at it, reorder the parameters of these accessor methods to make
> them more readable.
>
> Cc: Alexander Graf 
> Cc: Evgeny Iakovlev 
> Signed-off-by: Siddharth Chandrasekaran 
> ---
>  arch/x86/kvm/emulate.c | 138 ++--
>  arch/x86/kvm/fpu.h | 140 +
>  2 files changed, 158 insertions(+), 120 deletions(-)
>  create mode 100644 arch/x86/kvm/fpu.h
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index f7970ba6219f..296f8f3ce988 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -22,7 +22,6 @@
>  #include "kvm_cache_regs.h"
>  #include "kvm_emulate.h"
>  #include 
> -#include 
>  #include 
>  #include 
>  
> @@ -30,6 +29,7 @@
>  #include "tss.h"
>  #include "mmu.h"
>  #include "pmu.h"
> +#include "fpu.h"
>  
>  /*
>   * Operand types
> @@ -1081,116 +1081,14 @@ static void fetch_register_operand(struct operand 
> *op)
>   }
>  }
>  
> -static void emulator_get_fpu(void)
> -{
> - fpregs_lock();
> -
> - fpregs_assert_state_consistent();
> - if (test_thread_flag(TIF_NEED_FPU_LOAD))
> - switch_fpu_return();
> -}
> -
> -static void emulator_put_fpu(void)
> -{
> - fpregs_unlock();
> -}
> -
> -static void read_sse_reg(sse128_t *data, int reg)
> -{
> - emulator_get_fpu();
> - switch (reg) {
> - case 0: asm("movdqa %%xmm0, %0" : "=m"(*data)); break;
> - case 1: asm("movdqa %%xmm1, %0" : "=m"(*data)); break;
> - case 2: asm("movdqa %%xmm2, %0" : "=m"(*data)); break;
> - case 3: asm("movdqa %%xmm3, %0" : "=m"(*data)); break;
> - case 4: asm("movdqa %%xmm4, %0" : "=m"(*data)); break;
> - case 5: asm("movdqa %%xmm5, %0" : "=m"(*data)); break;
> - case 6: asm("movdqa %%xmm6, %0" : "=m"(*data)); break;
> - case 7: asm("movdqa %%xmm7, %0" : "=m"(*data)); break;
> -#ifdef CONFIG_X86_64
> - case 8: asm("movdqa %%xmm8, %0" : "=m"(*data)); break;
> - case 9: asm("movdqa %%xmm9, %0" : "=m"(*data)); break;
> - case 10: asm("movdqa %%xmm10, %0" : "=m"(*data)); break;
> - case 11: asm("movdqa %%xmm11, %0" : "=m"(*data)); break;
> - case 12: asm("movdqa %%xmm12, %0" : "=m"(*data)); break;
> - case 13: asm("movdqa %%xmm13, %0" : "=m"(*data)); break;
> - case 14: asm("movdqa %%xmm14, %0" : "=m"(*data)); break;
> - case 15: asm("movdqa %%xmm15, %0" : "=m"(*data)); break;
> -#endif
> - default: BUG();
> - }
> - emulator_put_fpu();
> -}
> -
> -static void write_sse_reg(sse128_t *data, int reg)
> -{
> - emulator_get_fpu();
> - switch (reg) {
> - case 0: asm("movdqa %0, %%xmm0" : : "m"(*data)); break;
> - case 1: asm("movdqa %0, %%xmm1" : : "m"(*data)); break;
> - case 2: asm("movdqa %0, %%xmm2" : : "m"(*data)); break;
> - case 3: asm("movdqa %0, %%xmm3" : : "m"(*data)); break;
> - case 4: asm("movdqa %0, %%xmm4" : : "m"(*data)); break;
> - case 5: asm("movdqa %0, %%xmm5" : : "m"(*data)); break;
> - case 6: asm("movdqa %0, %%xmm6" : : "m"(*data)); break;
> - case 7: asm("movdqa %0, %%xmm7" : : "m"(*data)); break;
> -#ifdef CONFIG_X86_64
> - case 8: asm("movdqa %0, %%xmm8" : : "m"(*data)); break;
> - case 9: asm("movdqa %0, %%xmm9" : : "m"(*data)); break;
> - case 10: asm("movdqa %0, %%xmm10" : : "m"(*data)); break;
> - case 11: asm("movdqa %0, %%xmm11" : : "m"(*data)); break;
> - case 12: asm("movdqa %0, %%xmm12" : : "m"(*data)); break;
> - case 13: asm("movdqa %0, %%xmm13" : : "m"(*data)); break;
> - case 14: asm("movdqa %0, %%xmm14" : : "m"(*data)); break;
> - case 15: asm("movdqa %0, %%xmm15" : : "m"(*data)); break;
> -#endif
> - default: BUG();
> - }
> - emulator_put_fpu();
> -}
> -
> -static void read_mmx_reg(u64 *data, int reg)
> -{
> - emulator_get_fpu();
> - switch (reg) {
> - case 0: asm("movq %%mm0, %0" : "=m"(*data)); break;
> - case 1: asm("movq %%mm1, %0" : "=m"(*data)); break;
> - case 2: asm("movq %%mm2, %0" : "=m"(*data)); break;
> - case 3: asm("movq %%mm3, %0" : "=m"(*data)); break;
> - case 4: asm("movq %%mm4, %0" : "=m"(*data)); break;
> - case 5: asm("movq %%mm5, %0" : "=m"(*data)); break;
> - case 6: asm("movq %%mm6, %0" : "=m"(*data)); break;
> - case 7: asm("movq %%mm7, %0" : "=m"(*data)); break;
> - default: BUG();
> - }
> - emulator_put_fpu();
> -}
> -
> -static void write_mmx_reg(u64 *data, int reg)
> -{
> - emulator_get_fpu();
> - switch (reg) {
> - case 0: asm("movq %0, %%mm0" : : "m"(*data)); break;
> - case 1: asm("movq %0, %%mm1" : : "m"(*data)); break;
> - case 2: asm("movq %0, %%mm2" : : "m"(*data)); break;
> - case 3: asm("movq %0, %%mm3" : : 

[PATCH RFC 16/22] KVM: x86: hyper-v: Honor HV_STIMER_DIRECT_MODE_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
Synthetic timers can only be configured in 'direct' mode when the
HV_STIMER_DIRECT_MODE_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 1299847c89ba..0df18187d908 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -646,6 +646,11 @@ static int stimer_set_config(struct kvm_vcpu_hv_stimer 
*stimer, u64 config,
 HV_MSR_SYNTIMER_AVAILABLE))))
return 1;
 
+   if (unlikely(!host && new_config.direct_mode &&
+!(to_hv_vcpu(vcpu)->cpuid_cache.features_edx &
+  HV_STIMER_DIRECT_MODE_AVAILABLE)))
+   return 1;
+
trace_kvm_hv_stimer_set_config(hv_stimer_to_vcpu(stimer)->vcpu_id,
   stimer->index, config, host);
 
-- 
2.30.2



[PATCH RFC 22/22] KVM: x86: hyper-v: Check access to HVCALL_NOTIFY_LONG_SPIN_WAIT hypercall

2021-04-13 Thread Vitaly Kuznetsov
TLFS 6.0b states that a partition issuing HVCALL_NOTIFY_LONG_SPIN_WAIT must
possess the 'UseHypercallForLongSpinWait' privilege but there's no
corresponding feature bit. Instead, we have "Recommended number of attempts
to retry a spinlock failure before notifying the hypervisor about the
failures. 0xFFFFFFFF indicates never notify." Use this to check access to
the hypercall. Also, check against zero as the corresponding CPUID must
be set (and '0' attempts before re-try is weird anyway).

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 37b8ff30fc1d..325446833bbe 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2113,6 +2113,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 
switch (code) {
case HVCALL_NOTIFY_LONG_SPIN_WAIT:
+   if (unlikely(!hv_vcpu->cpuid_cache.enlightenments_ebx ||
+hv_vcpu->cpuid_cache.enlightenments_ebx == 
U32_MAX)) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
-- 
2.30.2



[PATCH RFC 19/22] KVM: x86: hyper-v: Honor HV_DEBUGGING privilege bit

2021-04-13 Thread Vitaly Kuznetsov
Hyper-V partition must possess 'HV_DEBUGGING' privilege to issue
HVCALL_POST_DEBUG_DATA/HVCALL_RETRIEVE_DEBUG_DATA/
HVCALL_RESET_DEBUG_SESSION hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b661f92d90c8..7cb1dd1a9fc1 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2211,6 +2211,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
break;
}
 
+   if (unlikely(!(hv_vcpu->cpuid_cache.features_ebx &
+  HV_DEBUGGING))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (!(syndbg->options & HV_X64_SYNDBG_OPTION_USE_HCALLS)) {
ret = HV_STATUS_OPERATION_DENIED;
break;
-- 
2.30.2



[PATCH RFC 20/22] KVM: x86: hyper-v: Honor HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED bit

2021-04-13 Thread Vitaly Kuznetsov
Hyper-V partition must possess 'HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED'
privilege ('recommended' is rather a misnomer) to issue
HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST/SPACE hypercalls. '_EX' versions of these
hypercalls also require HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 7cb1dd1a9fc1..3e8a34c08aef 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2155,6 +2155,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
kvm_hv_hypercall_complete_userspace;
return 0;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST:
+   if (unlikely(!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(fast || !rep_cnt || rep_idx)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
@@ -2162,6 +2168,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, false);
break;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE:
+   if (unlikely(!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(fast || rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
@@ -2169,6 +2181,14 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, false);
break;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX:
+   if (unlikely(!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED) ||
+!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(fast || !rep_cnt || rep_idx)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
@@ -2176,6 +2196,14 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
break;
case HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX:
+   if (unlikely(!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED) ||
+!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(fast || rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
-- 
2.30.2



[PATCH RFC 21/22] KVM: x86: hyper-v: Honor HV_X64_CLUSTER_IPI_RECOMMENDED bit

2021-04-13 Thread Vitaly Kuznetsov
Hyper-V partition must possess 'HV_X64_CLUSTER_IPI_RECOMMENDED'
privilege ('recommended' is rather a misnomer) to issue
HVCALL_SEND_IPI hypercalls. 'HVCALL_SEND_IPI_EX' version of the
hypercall also requires HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 3e8a34c08aef..37b8ff30fc1d 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2211,6 +2211,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
break;
case HVCALL_SEND_IPI:
+   if (unlikely(!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_CLUSTER_IPI_RECOMMENDED))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
@@ -2218,6 +2224,14 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
break;
case HVCALL_SEND_IPI_EX:
+   if (unlikely(!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_CLUSTER_IPI_RECOMMENDED) ||
+!(hv_vcpu->cpuid_cache.enlightenments_eax &
+  HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(fast || rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
-- 
2.30.2



[PATCH RFC 18/22] KVM: x86: hyper-v: Honor HV_SIGNAL_EVENTS privilege bit

2021-04-13 Thread Vitaly Kuznetsov
Hyper-V partition must possess 'HV_SIGNAL_EVENTS' privilege to issue
HVCALL_SIGNAL_EVENT hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 6e4bf1da9dcf..b661f92d90c8 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2120,6 +2120,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
kvm_vcpu_on_spin(vcpu, true);
break;
case HVCALL_SIGNAL_EVENT:
+   if (unlikely(!(hv_vcpu->cpuid_cache.features_ebx &
+  HV_SIGNAL_EVENTS))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
if (unlikely(rep)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
break;
-- 
2.30.2



[PATCH RFC 17/22] KVM: x86: hyper-v: Honor HV_POST_MESSAGES privilege bit

2021-04-13 Thread Vitaly Kuznetsov
Hyper-V partition must possess 'HV_POST_MESSAGES' privilege to issue
HVCALL_POST_MESSAGE hypercalls.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0df18187d908..6e4bf1da9dcf 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2073,12 +2073,16 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
u64 param, ingpa, outgpa, ret = HV_STATUS_SUCCESS;
uint16_t code, rep_idx, rep_cnt;
bool fast, rep;
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
/*
 * hypercall generates UD from non zero cpl and real mode
-* per HYPER-V spec
+* per HYPER-V spec. Fail the call when 'hv_vcpu' context
+* was not allocated (e.g. per-vCPU Hyper-V CPUID entries
+* are unset) as well.
 */
-   if (static_call(kvm_x86_get_cpl)(vcpu) != 0 || !is_protmode(vcpu)) {
+   if (static_call(kvm_x86_get_cpl)(vcpu) != 0 || !is_protmode(vcpu) ||
+   !hv_vcpu) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
@@ -2125,6 +2129,12 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
break;
fallthrough;/* maybe userspace knows this conn_id */
case HVCALL_POST_MESSAGE:
+   if (unlikely(!(hv_vcpu->cpuid_cache.features_ebx &
+  HV_POST_MESSAGES))) {
+   ret = HV_STATUS_ACCESS_DENIED;
+   break;
+   }
+
/* don't bother userspace if it has no way to handle it */
if (unlikely(rep || !to_hv_synic(vcpu)->active)) {
ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
-- 
2.30.2



[PATCH RFC 15/22] KVM: x86: hyper-v: Honor HV_FEATURE_DEBUG_MSRS_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
Synthetic debugging MSRs (HV_X64_MSR_SYNDBG_CONTROL,
HV_X64_MSR_SYNDBG_STATUS, HV_X64_MSR_SYNDBG_SEND_BUFFER,
HV_X64_MSR_SYNDBG_RECV_BUFFER, HV_X64_MSR_SYNDBG_PENDING_BUFFER,
HV_X64_MSR_SYNDBG_OPTIONS) are only available to the guest when the
HV_FEATURE_DEBUG_MSRS_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0678f1012ed7..1299847c89ba 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -312,7 +312,9 @@ static int syndbg_set_msr(struct kvm_vcpu *vcpu, u32 msr, 
u64 data, bool host)
 {
struct kvm_hv_syndbg *syndbg = to_hv_syndbg(vcpu);
 
-   if (!kvm_hv_is_syndbg_enabled(vcpu) && !host)
+   if (unlikely(!host && (!kvm_hv_is_syndbg_enabled(vcpu) ||
+  !(to_hv_vcpu(vcpu)->cpuid_cache.features_edx &
+HV_FEATURE_DEBUG_MSRS_AVAILABLE
return 1;
 
trace_kvm_hv_syndbg_set_msr(vcpu->vcpu_id,
@@ -351,7 +353,9 @@ static int syndbg_get_msr(struct kvm_vcpu *vcpu, u32 msr, 
u64 *pdata, bool host)
 {
struct kvm_hv_syndbg *syndbg = to_hv_syndbg(vcpu);
 
-   if (!kvm_hv_is_syndbg_enabled(vcpu) && !host)
+   if (unlikely(!host && (!kvm_hv_is_syndbg_enabled(vcpu) ||
+  !(to_hv_vcpu(vcpu)->cpuid_cache.features_edx &
+HV_FEATURE_DEBUG_MSRS_AVAILABLE))))
return 1;
 
switch (msr) {
-- 
2.30.2



[PATCH RFC 14/22] KVM: x86: hyper-v: Honor HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4, HV_X64_MSR_CRASH_CTL are only
available to guest when HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE bit is
exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 259badd3a139..0678f1012ed7 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1300,10 +1300,18 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 
msr, u64 data,
}
break;
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_edx &
+   HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)))
+   return 1;
+
return kvm_hv_msr_set_crash_data(kvm,
 msr - HV_X64_MSR_CRASH_P0,
 data);
case HV_X64_MSR_CRASH_CTL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_edx &
+   HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)))
+   return 1;
+
if (host)
return kvm_hv_msr_set_crash_ctl(kvm, data);
 
@@ -1541,10 +1549,18 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 
msr, u64 *pdata,
data = hv->hv_tsc_page;
break;
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_edx &
+   HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)))
+   return 1;
+
return kvm_hv_msr_get_crash_data(kvm,
 msr - HV_X64_MSR_CRASH_P0,
 pdata);
case HV_X64_MSR_CRASH_CTL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_edx &
+   HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)))
+   return 1;
+
return kvm_hv_msr_get_crash_ctl(kvm, pdata);
case HV_X64_MSR_RESET:
if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
-- 
2.30.2



[PATCH RFC 12/22] KVM: x86: hyper-v: Honor HV_ACCESS_FREQUENCY_MSRS privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_TSC_FREQUENCY/HV_X64_MSR_APIC_FREQUENCY are only available to
guest when HV_ACCESS_FREQUENCY_MSRS bit is exposed.

Note, writing to HV_X64_MSR_TSC_FREQUENCY/HV_X64_MSR_APIC_FREQUENCY is
unsupported so kvm_hv_set_msr() doesn't need an additional check.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 9c4454873e00..e92a1109ad9b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1637,9 +1637,17 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 
msr, u64 *pdata,
pdata, host);
}
case HV_X64_MSR_TSC_FREQUENCY:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_FREQUENCY_MSRS)))
+   return 1;
+
data = (u64)vcpu->arch.virtual_tsc_khz * 1000;
break;
case HV_X64_MSR_APIC_FREQUENCY:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_FREQUENCY_MSRS)))
+   return 1;
+
data = APIC_BUS_FREQUENCY;
break;
default:
-- 
2.30.2



[PATCH RFC 10/22] KVM: x86: hyper-v: Honor HV_MSR_SYNTIMER_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
Synthetic timers MSRs (HV_X64_MSR_STIMER[0-3]_CONFIG,
HV_X64_MSR_STIMER[0-3]_COUNT) are only available to guest when
HV_MSR_SYNTIMER_AVAILABLE bit is exposed.

While at it, complement stimer_get_config()/stimer_get_count() with
the same '!synic->active' check we have in stimer_set_config()/
stimer_set_count().

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 34 --
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index d85c441011c4..032305ad5615 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -637,7 +639,9 @@ static int stimer_set_config(struct kvm_vcpu_hv_stimer *stimer, u64 config,
struct kvm_vcpu *vcpu = hv_stimer_to_vcpu(stimer);
struct kvm_vcpu_hv_synic *synic = to_hv_synic(vcpu);
 
-   if (!synic->active && !host)
+   if (unlikely(!host && (!synic->active ||
+  !(to_hv_vcpu(vcpu)->cpuid_cache.features_eax &
+HV_MSR_SYNTIMER_AVAILABLE))))
return 1;
 
trace_kvm_hv_stimer_set_config(hv_stimer_to_vcpu(stimer)->vcpu_id,
@@ -661,7 +663,9 @@ static int stimer_set_count(struct kvm_vcpu_hv_stimer *stimer, u64 count,
struct kvm_vcpu *vcpu = hv_stimer_to_vcpu(stimer);
struct kvm_vcpu_hv_synic *synic = to_hv_synic(vcpu);
 
-   if (!synic->active && !host)
+   if (unlikely(!host && (!synic->active ||
+  !(to_hv_vcpu(vcpu)->cpuid_cache.features_eax &
+HV_MSR_SYNTIMER_AVAILABLE))))
return 1;
 
trace_kvm_hv_stimer_set_count(hv_stimer_to_vcpu(stimer)->vcpu_id,
@@ -680,14 +684,32 @@ static int stimer_set_count(struct kvm_vcpu_hv_stimer *stimer, u64 count,
return 0;
 }
 
-static int stimer_get_config(struct kvm_vcpu_hv_stimer *stimer, u64 *pconfig)
+static int stimer_get_config(struct kvm_vcpu_hv_stimer *stimer, u64 *pconfig,
+bool host)
 {
+   struct kvm_vcpu *vcpu = hv_stimer_to_vcpu(stimer);
+   struct kvm_vcpu_hv_synic *synic = to_hv_synic(vcpu);
+
+   if (unlikely(!host && (!synic->active ||
+  !(to_hv_vcpu(vcpu)->cpuid_cache.features_eax &
+HV_MSR_SYNTIMER_AVAILABLE))))
+   return 1;
+
*pconfig = stimer->config.as_uint64;
return 0;
 }
 
-static int stimer_get_count(struct kvm_vcpu_hv_stimer *stimer, u64 *pcount)
+static int stimer_get_count(struct kvm_vcpu_hv_stimer *stimer, u64 *pcount,
+   bool host)
 {
+   struct kvm_vcpu *vcpu = hv_stimer_to_vcpu(stimer);
+   struct kvm_vcpu_hv_synic *synic = to_hv_synic(vcpu);
+
+   if (unlikely(!host && (!synic->active ||
+  !(to_hv_vcpu(vcpu)->cpuid_cache.features_eax &
+HV_MSR_SYNTIMER_AVAILABLE))))
+   return 1;
+
*pcount = stimer->count;
return 0;
 }
@@ -1571,7 +1593,7 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
int timer_index = (msr - HV_X64_MSR_STIMER0_CONFIG)/2;
 
return stimer_get_config(to_hv_stimer(vcpu, timer_index),
-pdata);
+pdata, host);
}
case HV_X64_MSR_STIMER0_COUNT:
case HV_X64_MSR_STIMER1_COUNT:
@@ -1580,7 +1602,7 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
int timer_index = (msr - HV_X64_MSR_STIMER0_COUNT)/2;
 
return stimer_get_count(to_hv_stimer(vcpu, timer_index),
-   pdata);
+   pdata, host);
}
case HV_X64_MSR_TSC_FREQUENCY:
data = (u64)vcpu->arch.virtual_tsc_khz * 1000;
-- 
2.30.2



[PATCH RFC 13/22] KVM: x86: hyper-v: Honor HV_ACCESS_REENLIGHTENMENT privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_REENLIGHTENMENT_CONTROL/HV_X64_MSR_TSC_EMULATION_CONTROL/
HV_X64_MSR_TSC_EMULATION_STATUS are only available to guest when
HV_ACCESS_REENLIGHTENMENT bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e92a1109ad9b..259badd3a139 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1330,13 +1330,22 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
}
break;
case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT)))
+   return 1;
+
hv->hv_reenlightenment_control = data;
break;
case HV_X64_MSR_TSC_EMULATION_CONTROL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT)))
+   return 1;
+
hv->hv_tsc_emulation_control = data;
break;
case HV_X64_MSR_TSC_EMULATION_STATUS:
-   if (data && !host)
+   if (unlikely(!host && (!(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT) || data)))
return 1;
 
hv->hv_tsc_emulation_status = data;
@@ -1545,12 +1554,24 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
data = 0;
break;
case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT)))
+   return 1;
+
data = hv->hv_reenlightenment_control;
break;
case HV_X64_MSR_TSC_EMULATION_CONTROL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT)))
+   return 1;
+
data = hv->hv_tsc_emulation_control;
break;
case HV_X64_MSR_TSC_EMULATION_STATUS:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_ACCESS_REENLIGHTENMENT)))
+   return 1;
+
data = hv->hv_tsc_emulation_status;
break;
case HV_X64_MSR_SYNDBG_OPTIONS:
-- 
2.30.2



[PATCH RFC 11/22] KVM: x86: hyper-v: Honor HV_MSR_APIC_ACCESS_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_EOI, HV_X64_MSR_ICR, HV_X64_MSR_TPR, and
HV_X64_MSR_VP_ASSIST_PAGE  are only available to guest when
HV_MSR_APIC_ACCESS_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 032305ad5615..9c4454873e00 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1401,6 +1401,10 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
u64 gfn;
unsigned long addr;
 
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
if (!(data & HV_X64_MSR_VP_ASSIST_PAGE_ENABLE)) {
hv_vcpu->hv_vapic = data;
if (kvm_lapic_enable_pv_eoi(vcpu, 0, 0))
@@ -1428,10 +1432,22 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
break;
}
case HV_X64_MSR_EOI:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
return kvm_hv_vapic_msr_write(vcpu, APIC_EOI, data);
case HV_X64_MSR_ICR:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
return kvm_hv_vapic_msr_write(vcpu, APIC_ICR, data);
case HV_X64_MSR_TPR:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
return kvm_hv_vapic_msr_write(vcpu, APIC_TASKPRI, data);
case HV_X64_MSR_VP_RUNTIME:
if (!host)
@@ -1564,12 +1580,28 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
data = hv_vcpu->vp_index;
break;
case HV_X64_MSR_EOI:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
return kvm_hv_vapic_msr_read(vcpu, APIC_EOI, pdata);
case HV_X64_MSR_ICR:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
return kvm_hv_vapic_msr_read(vcpu, APIC_ICR, pdata);
case HV_X64_MSR_TPR:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
return kvm_hv_vapic_msr_read(vcpu, APIC_TASKPRI, pdata);
case HV_X64_MSR_VP_ASSIST_PAGE:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_APIC_ACCESS_AVAILABLE)))
+   return 1;
+
data = hv_vcpu->hv_vapic;
break;
case HV_X64_MSR_VP_RUNTIME:
-- 
2.30.2



[PATCH RFC 08/22] KVM: x86: hyper-v: Honor HV_MSR_REFERENCE_TSC_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_REFERENCE_TSC is only available to guest when
HV_MSR_REFERENCE_TSC_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 15d557ce32b5..48215ad72b6c 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1257,6 +1257,10 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
break;
}
case HV_X64_MSR_REFERENCE_TSC:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_REFERENCE_TSC_AVAILABLE)))
+   return 1;
+
hv->hv_tsc_page = data;
if (hv->hv_tsc_page & HV_X64_MSR_TSC_REFERENCE_ENABLE) {
if (!host)
@@ -1478,6 +1482,10 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
data = get_time_ref_counter(kvm);
break;
case HV_X64_MSR_REFERENCE_TSC:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_REFERENCE_TSC_AVAILABLE)))
+   return 1;
+
data = hv->hv_tsc_page;
break;
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
-- 
2.30.2



[PATCH RFC 07/22] KVM: x86: hyper-v: Honor HV_MSR_RESET_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_RESET is only available to guest when HV_MSR_RESET_AVAILABLE bit
is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 07f1fc8575e5..15d557ce32b5 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1289,6 +1289,10 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
}
break;
case HV_X64_MSR_RESET:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_RESET_AVAILABLE)))
+   return 1;
+
if (data == 1) {
vcpu_debug(vcpu, "hyper-v reset requested\n");
kvm_make_request(KVM_REQ_HV_RESET, vcpu);
@@ -1483,6 +1487,10 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
case HV_X64_MSR_CRASH_CTL:
return kvm_hv_msr_get_crash_ctl(kvm, pdata);
case HV_X64_MSR_RESET:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_RESET_AVAILABLE)))
+   return 1;
+
data = 0;
break;
case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
-- 
2.30.2



[PATCH RFC 09/22] KVM: x86: hyper-v: Honor HV_MSR_SYNIC_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
SynIC MSRs (HV_X64_MSR_SCONTROL, HV_X64_MSR_SVERSION, HV_X64_MSR_SIEFP,
HV_X64_MSR_SIMP, HV_X64_MSR_EOM, HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15)
are only available to guest when HV_MSR_SYNIC_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 48215ad72b6c..d85c441011c4 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -211,7 +211,9 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
int ret;
 
-   if (!synic->active && !host)
+   if (unlikely(!host &&
+(!synic->active || !(to_hv_vcpu(vcpu)->cpuid_cache.features_eax &
+ HV_MSR_SYNIC_AVAILABLE))))
return 1;
 
trace_kvm_hv_synic_set_msr(vcpu->vcpu_id, msr, data, host);
@@ -383,9 +385,12 @@ static int syndbg_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
 static int synic_get_msr(struct kvm_vcpu_hv_synic *synic, u32 msr, u64 *pdata,
 bool host)
 {
+   struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
int ret;
 
-   if (!synic->active && !host)
+   if (unlikely(!host &&
+(!synic->active || !(to_hv_vcpu(vcpu)->cpuid_cache.features_eax &
+ HV_MSR_SYNIC_AVAILABLE))))
return 1;
 
ret = 0;
-- 
2.30.2



[PATCH RFC 04/22] KVM: x86: hyper-v: Honor HV_MSR_TIME_REF_COUNT_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_TIME_REF_COUNT is only available to guest when
HV_MSR_TIME_REF_COUNT_AVAILABLE bit is exposed.

Note, writing to HV_X64_MSR_TIME_REF_COUNT is unsupported so
kvm_hv_set_msr_pw() doesn't need an additional check.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b39445aabbc2..efb3d69c98fd 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1440,6 +1440,7 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
u64 data = 0;
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = to_kvm_hv(kvm);
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
@@ -1449,6 +1450,10 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
data = hv->hv_hypercall;
break;
case HV_X64_MSR_TIME_REF_COUNT:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_TIME_REF_COUNT_AVAILABLE)))
+   return 1;
+
data = get_time_ref_counter(kvm);
break;
case HV_X64_MSR_REFERENCE_TSC:
-- 
2.30.2



[PATCH RFC 05/22] KVM: x86: hyper-v: Honor HV_MSR_HYPERCALL_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL are only available to guest
when HV_MSR_HYPERCALL_AVAILABLE bit is exposed.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index efb3d69c98fd..7fdd9b9c50d6 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1198,9 +1198,14 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
 {
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = to_kvm_hv(kvm);
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_HYPERCALL_AVAILABLE)))
+   return 1;
+
hv->hv_guest_os_id = data;
/* setting guest os id to zero disables hypercall page */
if (!hv->hv_guest_os_id)
@@ -1211,6 +1216,10 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
int i = 0;
u64 addr;
 
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_HYPERCALL_AVAILABLE)))
+   return 1;
+
/* if guest os id is not set hypercall should remain disabled */
if (!hv->hv_guest_os_id)
break;
@@ -1444,9 +1453,17 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
 
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_HYPERCALL_AVAILABLE)))
+   return 1;
+
data = hv->hv_guest_os_id;
break;
case HV_X64_MSR_HYPERCALL:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_HYPERCALL_AVAILABLE)))
+   return 1;
+
data = hv->hv_hypercall;
break;
case HV_X64_MSR_TIME_REF_COUNT:
-- 
2.30.2



[PATCH RFC 06/22] KVM: x86: hyper-v: Honor HV_MSR_VP_INDEX_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_VP_INDEX is only available to guest when
HV_MSR_VP_INDEX_AVAILABLE bit is exposed.

Note, writing to HV_X64_MSR_VP_INDEX is only available from the host so
kvm_hv_set_msr() doesn't need an additional check.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 7fdd9b9c50d6..07f1fc8575e5 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1514,6 +1514,10 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
 
switch (msr) {
case HV_X64_MSR_VP_INDEX:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_VP_INDEX_AVAILABLE)))
+   return 1;
+
data = hv_vcpu->vp_index;
break;
case HV_X64_MSR_EOI:
-- 
2.30.2



[PATCH RFC 03/22] KVM: x86: hyper-v: Honor HV_MSR_VP_RUNTIME_AVAILABLE privilege bit

2021-04-13 Thread Vitaly Kuznetsov
HV_X64_MSR_VP_RUNTIME is only available to guest when
HV_MSR_VP_RUNTIME_AVAILABLE bit is exposed.

Note, writing to HV_X64_MSR_VP_RUNTIME is only available from the host so
kvm_hv_set_msr() doesn't need an additional check.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/hyperv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 781f9da9a418..b39445aabbc2 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1504,6 +1504,10 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
data = hv_vcpu->hv_vapic;
break;
case HV_X64_MSR_VP_RUNTIME:
+   if (unlikely(!host && !(hv_vcpu->cpuid_cache.features_eax &
+   HV_MSR_VP_RUNTIME_AVAILABLE)))
+   return 1;
+
data = current_task_runtime_100ns() + hv_vcpu->runtime_offset;
break;
case HV_X64_MSR_SCONTROL:
-- 
2.30.2



[PATCH RFC 02/22] KVM: x86: hyper-v: Cache guest CPUID leaves determining features availability

2021-04-13 Thread Vitaly Kuznetsov
Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/include/asm/kvm_host.h |  8 ++
 arch/x86/kvm/hyperv.c   | 50 ++---
 2 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3768819693e5..04bddcaa8cad 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -530,6 +530,14 @@ struct kvm_vcpu_hv {
struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
cpumask_t tlb_flush;
+   struct {
+   u32 features_eax; /* HYPERV_CPUID_FEATURES.EAX */
+   u32 features_ebx; /* HYPERV_CPUID_FEATURES.EBX */
+   u32 features_edx; /* HYPERV_CPUID_FEATURES.EDX */
+   u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
+   u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
+   u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+   } cpuid_cache;
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index f98370a39936..781f9da9a418 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -273,15 +273,10 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 
 static bool kvm_hv_is_syndbg_enabled(struct kvm_vcpu *vcpu)
 {
-   struct kvm_cpuid_entry2 *entry;
-
-   entry = kvm_find_cpuid_entry(vcpu,
-HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES,
-0);
-   if (!entry)
-   return false;
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
-   return entry->eax & HV_X64_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING;
+   return hv_vcpu->cpuid_cache.syndbg_cap_eax &
+   HV_X64_SYNDBG_CAP_ALLOW_KERNEL_DEBUGGING;
 }
 
 static int kvm_hv_syndbg_complete_userspace(struct kvm_vcpu *vcpu)
@@ -1801,12 +1796,47 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, u64 ingpa, u64 outgpa,
 void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *entry;
+   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 
entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_INTERFACE, 0);
-   if (entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX)
+   if (entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX) {
vcpu->arch.hyperv_enabled = true;
-   else
+   } else {
vcpu->arch.hyperv_enabled = false;
+   return;
+   }
+
+   if (!to_hv_vcpu(vcpu) && kvm_hv_vcpu_init(vcpu))
+   return;
+
+   hv_vcpu = to_hv_vcpu(vcpu);
+
+   entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_FEATURES, 0);
+   if (entry) {
+   hv_vcpu->cpuid_cache.features_eax = entry->eax;
+   hv_vcpu->cpuid_cache.features_ebx = entry->ebx;
+   hv_vcpu->cpuid_cache.features_edx = entry->edx;
+   } else {
+   hv_vcpu->cpuid_cache.features_eax = 0;
+   hv_vcpu->cpuid_cache.features_ebx = 0;
+   hv_vcpu->cpuid_cache.features_edx = 0;
+   }
+
+   entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_ENLIGHTMENT_INFO, 0);
+   if (entry) {
+   hv_vcpu->cpuid_cache.enlightenments_eax = entry->eax;
+   hv_vcpu->cpuid_cache.enlightenments_ebx = entry->ebx;
+   } else {
+   hv_vcpu->cpuid_cache.enlightenments_eax = 0;
+   hv_vcpu->cpuid_cache.enlightenments_ebx = 0;
+   }
+
+   entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES, 0);
+   if (entry) {
+   hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
+   } else {
+   hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
+   }
 }
 
 bool kvm_hv_hypercall_enabled(struct kvm_vcpu *vcpu)
-- 
2.30.2



[PATCH RFC 01/22] asm-generic/hyperv: add HV_STATUS_ACCESS_DENIED definition

2021-04-13 Thread Vitaly Kuznetsov
From TLFSv6.0b, this status means: "The caller did not possess sufficient
access rights to perform the requested operation."

Signed-off-by: Vitaly Kuznetsov 
---
 include/asm-generic/hyperv-tlfs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 83448e837ded..e01a3bade13a 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -187,6 +187,7 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_STATUS_INVALID_HYPERCALL_INPUT  3
 #define HV_STATUS_INVALID_ALIGNMENT4
 #define HV_STATUS_INVALID_PARAMETER5
+#define HV_STATUS_ACCESS_DENIED6
 #define HV_STATUS_OPERATION_DENIED 8
 #define HV_STATUS_INSUFFICIENT_MEMORY  11
 #define HV_STATUS_INVALID_PORT_ID  17
-- 
2.30.2



[PATCH RFC 00/22] KVM: x86: hyper-v: Fine-grained access check to Hyper-V hypercalls and MSRs

2021-04-13 Thread Vitaly Kuznetsov
Currently, all implemented Hyper-V features (MSRs and hypercalls) are
available unconditionally to all Hyper-V enabled guests. This is not
ideal as KVM userspace may decide to provide only a subset of the
currently implemented features to emulate an older Hyper-V version,
to reduce the attack surface, and so on. Implement checks against guest visible
CPUIDs for all currently implemented MSRs and hypercalls.

RFC part:
- KVM has KVM_CAP_ENFORCE_PV_FEATURE_CPUID for KVM PV features. Should
 we use it for Hyper-V as well or should we rather add a Hyper-V specific
 CAP (or neither)?

TODO:
- Write a selftest
- Check with various Windows/Hyper-V versions that CPUID feature bits
 are actually respected.

Vitaly Kuznetsov (22):
  asm-generic/hyperv: add HV_STATUS_ACCESS_DENIED definition
  KVM: x86: hyper-v: Cache guest CPUID leaves determining features
availability
  KVM: x86: hyper-v: Honor HV_MSR_VP_RUNTIME_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_TIME_REF_COUNT_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_HYPERCALL_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_VP_INDEX_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_RESET_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_REFERENCE_TSC_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_SYNIC_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_SYNTIMER_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_MSR_APIC_ACCESS_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_ACCESS_FREQUENCY_MSRS privilege bit
  KVM: x86: hyper-v: Honor HV_ACCESS_REENLIGHTENMENT privilege bit
  KVM: x86: hyper-v: Honor HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
privilege bit
  KVM: x86: hyper-v: Honor HV_FEATURE_DEBUG_MSRS_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_STIMER_DIRECT_MODE_AVAILABLE privilege bit
  KVM: x86: hyper-v: Honor HV_POST_MESSAGES privilege bit
  KVM: x86: hyper-v: Honor HV_SIGNAL_EVENTS privilege bit
  KVM: x86: hyper-v: Honor HV_DEBUGGING privilege bit
  KVM: x86: hyper-v: Honor HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED bit
  KVM: x86: hyper-v: Honor HV_X64_CLUSTER_IPI_RECOMMENDED bit
  KVM: x86: hyper-v: Check access to HVCALL_NOTIFY_LONG_SPIN_WAIT
hypercall

 arch/x86/include/asm/kvm_host.h   |   8 +
 arch/x86/kvm/hyperv.c | 305 +++---
 include/asm-generic/hyperv-tlfs.h |   1 +
 3 files changed, 291 insertions(+), 23 deletions(-)

-- 
2.30.2



Re: [PATCH 4/4] KVM: hyper-v: Advertise support for fast XMM hypercalls

2021-04-12 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> On Thu, Apr 08, 2021 at 04:44:23PM +0200, Vitaly Kuznetsov wrote:
>> Siddharth Chandrasekaran  writes:
>> > On Thu, Apr 08, 2021 at 02:05:53PM +0200, Vitaly Kuznetsov wrote:
>> >> Siddharth Chandrasekaran  writes:
>> >> > Now that all extant hypercalls that can use XMM registers (based on
>> >> > spec) for input/outputs are patched to support them, we can start
>> >> > advertising this feature to guests.
>> >> >
>> >> > Cc: Alexander Graf 
>> >> > Cc: Evgeny Iakovlev 
>> >> > Signed-off-by: Siddharth Chandrasekaran 
>> >> > ---
>> >> >  arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
>> >> >  arch/x86/kvm/hyperv.c  | 1 +
>> >> >  2 files changed, 3 insertions(+), 2 deletions(-)
>> >> >
>> >> > diff --git a/arch/x86/include/asm/hyperv-tlfs.h 
>> >> > b/arch/x86/include/asm/hyperv-tlfs.h
>> >> > index e6cd3fee562b..1f160ef60509 100644
>> >> > --- a/arch/x86/include/asm/hyperv-tlfs.h
>> >> > +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> >> > @@ -49,10 +49,10 @@
>> >> >  /* Support for physical CPU dynamic partitioning events is available*/
>> >> >  #define HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLEBIT(3)
>> >> >  /*
>> >> > - * Support for passing hypercall input parameter block via XMM
>> >> > + * Support for passing hypercall input and output parameter block via 
>> >> > XMM
>> >> >   * registers is available
>> >> >   */
>> >> > -#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4)
>> >> > +#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4) | 
>> >> > BIT(15)
>> >>
>> >> TLFS 6.0b states that there are two distinct bits for input and output:
>> >>
>> >> CPUID Leaf 0x4003.EDX:
>> >> Bit 4: support for passing hypercall input via XMM registers is available.
>> >> Bit 15: support for returning hypercall output via XMM registers is 
>> >> available.
>> >>
>> >> and HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE is not currently used
>> >> anywhere, I'd suggest we just rename
>> >>
>> >> HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE to 
>> >> HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE
>> >> and add HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE (bit 15).
>> >
>> > That is how I had it initially; but then noticed that we would never
>> > need to use either of them separately. So it seemed like a reasonable
>> > abstraction to put them together.
>> >
>> 
>> Actually, we may. In theory, KVM userspace may decide to expose just
>> one of these two to the guest as it is not obliged to copy everything
>> from KVM_GET_SUPPORTED_HV_CPUID so we will need separate
>> guest_cpuid_has() checks.
>
> Looks like guest_cpuid_has() check is for x86 CPU features only (if I'm
> not mistaken) and I don't see a suitable alternative that looks into
> vcpu->arch.cpuid_entries[]. So I plan to add a new method
> hv_guest_cpuid_has() in hyperv.c to have this check; does that sound
> right to you?
> If you can give a quick go-ahead, I'll make the changes requested so
> far and send v2 this series.

Sorry my mistake, guest_cpuid_has() was the wrong function to name. In the
meantime I started working on fine-grained access to the existing
Hyper-V enlightenments as well and I think the best approach would be to
cache CPUID 0x40000003 (EAX, EBX, EDX) in kvm_hv_set_cpuid() to avoid
looping through all guest CPUID entries on every hypercall. Your check
will then look like

 if (hv_vcpu->cpuid_cache.features_edx & HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE)
 ...


 if (hv_vcpu->cpuid_cache.features_edx & HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE)
 ...

We can wrap this into a hv_guest_cpuid_has() helper indeed, it'll look like:

 if (hv_guest_cpuid_has(vcpu, HYPERV_CPUID_FEATURES, CPUID_EDX, HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE))
 ...

but I'm not sure it's worth it, maybe raw check is shorter and better.
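
For illustration, such a helper could be shaped roughly like this (a sketch
only, assuming the cpuid_cache fields from the caching approach above; the
CPUID_EAX/EBX/EDX register selectors here are just placeholders):

 static bool hv_guest_cpuid_has(struct kvm_vcpu *vcpu, u32 leaf, int reg, u32 bit)
 {
 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	u32 val = 0;

 	/* only the HYPERV_CPUID_FEATURES leaf is cached in this sketch */
 	if (leaf == HYPERV_CPUID_FEATURES) {
 		switch (reg) {
 		case CPUID_EAX: val = hv_vcpu->cpuid_cache.features_eax; break;
 		case CPUID_EBX: val = hv_vcpu->cpuid_cache.features_ebx; break;
 		case CPUID_EDX: val = hv_vcpu->cpuid_cache.features_edx; break;
 		}
 	}

 	return !!(val & bit);
 }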

I plan to send something out in a day or two, I'll Cc: you. Feel free to
do v2 without this, if your series gets merged first I can just add the
'fine-grained access' to mine.

Thanks!

-- 
Vitaly



Re: [PATCH 0/4] Add support for XMM fast hypercalls

2021-04-09 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> On Thu, Apr 08, 2021 at 04:30:18PM +, Wei Liu wrote:
>> On Thu, Apr 08, 2021 at 05:54:43PM +0200, Siddharth Chandrasekaran wrote:
>> > On Thu, Apr 08, 2021 at 05:48:19PM +0200, Paolo Bonzini wrote:
>> > > On 08/04/21 17:40, Siddharth Chandrasekaran wrote:
>> > > > > > > Although the Hyper-v TLFS mentions that a guest cannot use this 
>> > > > > > > feature
>> > > > > > > unless the hypervisor advertises support for it, some hypercalls 
>> > > > > > > which
>> > > > > > > we plan on upstreaming in future uses them anyway.
>> > > > > > No, please don't do this. Check the feature bit(s) before you issue
>> > > > > > hypercalls which rely on the extended interface.
>> > > > > Perhaps Siddharth should clarify this, but I read it as Hyper-V being
>> > > > > buggy and using XMM arguments unconditionally.
>> > > > The guest is at fault here as it expects Hyper-V to consume arguments
>> > > > from XMM registers for certain hypercalls (that we are working) even if
>> > > > we didn't expose the feature via CPUID bits.
>> > >
>> > > What guest is that?
>> >
>> > It is a Windows Server 2016.
>> 
>> Can you be more specific? Are you implementing some hypercalls from
>> TLFS? If so, which ones?
>
> Yes all of them are from TLFS. We are implementing VSM and there are a
> bunch of hypercalls that we have implemented to manage VTL switches,
> memory protection and virtual interrupts.

Wow, sounds awesome! Do you plan to upstream this work?

> The following 3 hypercalls that use the XMM fast hypercalls are relevant
> to this patch set:
>
> HvCallModifyVtlProtectionMask
> HvGetVpRegisters 
> HvSetVpRegisters 

It seems AccessVSM and AccessVpRegisters privileges have implicit
dependency on XMM input/output. This will need to be enforced in KVM
userspace.

-- 
Vitaly



Re: [PATCH 4/4] KVM: hyper-v: Advertise support for fast XMM hypercalls

2021-04-09 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> On Thu, Apr 08, 2021 at 04:44:23PM +0200, Vitaly Kuznetsov wrote:
>> CAUTION: This email originated from outside of the organization. Do not 
>> click links or open attachments unless you can confirm the sender and know 
>> the content is safe.
>>
>>
>>
>> Siddharth Chandrasekaran  writes:
>>
>> > On Thu, Apr 08, 2021 at 02:05:53PM +0200, Vitaly Kuznetsov wrote:
>> >> Siddharth Chandrasekaran  writes:
>> >>
>> >> > Now that all extant hypercalls that can use XMM registers (based on
>> >> > spec) for input/outputs are patched to support them, we can start
>> >> > advertising this feature to guests.
>> >> >
>> >> > Cc: Alexander Graf 
>> >> > Cc: Evgeny Iakovlev 
>> >> > Signed-off-by: Siddharth Chandrasekaran 
>> >> > ---
>> >> >  arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
>> >> >  arch/x86/kvm/hyperv.c  | 1 +
>> >> >  2 files changed, 3 insertions(+), 2 deletions(-)
>> >> >
>> >> > diff --git a/arch/x86/include/asm/hyperv-tlfs.h 
>> >> > b/arch/x86/include/asm/hyperv-tlfs.h
>> >> > index e6cd3fee562b..1f160ef60509 100644
>> >> > --- a/arch/x86/include/asm/hyperv-tlfs.h
>> >> > +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> >> > @@ -49,10 +49,10 @@
>> >> >  /* Support for physical CPU dynamic partitioning events is available*/
>> >> >  #define HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLEBIT(3)
>> >> >  /*
>> >> > - * Support for passing hypercall input parameter block via XMM
>> >> > + * Support for passing hypercall input and output parameter block via 
>> >> > XMM
>> >> >   * registers is available
>> >> >   */
>> >> > -#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4)
>> >> > +#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4) | 
>> >> > BIT(15)
>> >>
>> >> TLFS 6.0b states that there are two distinct bits for input and output:
>> >>
>> >> CPUID Leaf 0x4003.EDX:
>> >> Bit 4: support for passing hypercall input via XMM registers is available.
>> >> Bit 15: support for returning hypercall output via XMM registers is 
>> >> available.
>> >>
>> >> and HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE is not currently used
>> >> anywhere, I'd suggest we just rename
>> >>
>> >> HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE to 
>> >> HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE
>> >> and add HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE (bit 15).
>> >
>> > That is how I had it initially; but then noticed that we would never
>> > need to use either of them separately. So it seemed like a reasonable
>> > abstraction to put them together.
>> >
>>
>> Actually, we may. In theory, KVM userspace may decide to expose just
>> one of these two to the guest as it is not obliged to copy everything
>> from KVM_GET_SUPPORTED_HV_CPUID so we will need separate
>> guest_cpuid_has() checks.
>
> Makes sense. I'll split them and add the checks.
>
>> (This reminds me of something I didn't see in your series:
>> we need to check that XMM hypercall parameters support was actually
>> exposed to the guest as it is illegal for a guest to use it otherwise --
>> and we will likely need two checks, for input and output).
>
> We observed that Windows expects Hyper-V to support XMM params even if
> we don't advertise this feature but if userspace wants to hide this
> feature and the guest does it anyway, then it makes sense to treat it as
> an illegal OP.
>

Out of pure curiosity, which Windows version behaves like that? And how
does this work with KVM without your patches?

Sane KVM userspaces will certainly expose both XMM input and output
capabilities together but having an ability to hide one or both of them
may come handy while debugging.

Also, we weren't enforcing the rule that enlightenments not exposed to
the guest don't work, even the whole Hyper-V emulation interface was
available to all guests who were smart enough to know how to enable it!
I don't like this for two reasons: security (large attack surface) and
the fact that someone 'smart' may decide to use Hyper-V emulation
features on KVM as 'general purpose' features, saying 'they're always
available anyway'; this risks becoming an ABI.

Let's at least properly check if the feature was exposed to the guest
for all new enlightenments.

-- 
Vitaly



Re: [PATCH 4/4] KVM: hyper-v: Advertise support for fast XMM hypercalls

2021-04-08 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> On Thu, Apr 08, 2021 at 02:05:53PM +0200, Vitaly Kuznetsov wrote:
>> Siddharth Chandrasekaran  writes:
>>
>> > Now that all extant hypercalls that can use XMM registers (based on
>> > spec) for input/outputs are patched to support them, we can start
>> > advertising this feature to guests.
>> >
>> > Cc: Alexander Graf 
>> > Cc: Evgeny Iakovlev 
>> > Signed-off-by: Siddharth Chandrasekaran 
>> > ---
>> >  arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
>> >  arch/x86/kvm/hyperv.c  | 1 +
>> >  2 files changed, 3 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/arch/x86/include/asm/hyperv-tlfs.h 
>> > b/arch/x86/include/asm/hyperv-tlfs.h
>> > index e6cd3fee562b..1f160ef60509 100644
>> > --- a/arch/x86/include/asm/hyperv-tlfs.h
>> > +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> > @@ -49,10 +49,10 @@
>> >  /* Support for physical CPU dynamic partitioning events is available*/
>> >  #define HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLEBIT(3)
>> >  /*
>> > - * Support for passing hypercall input parameter block via XMM
>> > + * Support for passing hypercall input and output parameter block via XMM
>> >   * registers is available
>> >   */
>> > -#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4)
>> > +#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLEBIT(4) | 
>> > BIT(15)
>>
>> TLFS 6.0b states that there are two distinct bits for input and output:
>>
>> CPUID Leaf 0x4003.EDX:
>> Bit 4: support for passing hypercall input via XMM registers is available.
>> Bit 15: support for returning hypercall output via XMM registers is 
>> available.
>>
>> and HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE is not currently used
>> anywhere, I'd suggest we just rename
>>
>> HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE to HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE
>> and add HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE (bit 15).
>
> That is how I had it initially; but then noticed that we would never
> need to use either of them separately. So it seemed like a reasonable
> abstraction to put them together.
>

Actually, we may. In theory, KVM userspace may decide to expose just
one of these two to the guest as it is not obliged to copy everything
from KVM_GET_SUPPORTED_HV_CPUID so we will need separate
guest_cpuid_has() checks.

(This reminds me of something I didn't see in your series:
we need to check that XMM hypercall parameters support was actually
exposed to the guest as it is illegal for a guest to use it otherwise --
and we will likely need two checks, for input and output).
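
Roughly, the kind of check meant here would be something like (a sketch
only, not a patch: is_xmm_fast_hypercall() is a hypothetical helper telling
whether the particular hypercall takes XMM input, and the cpuid_cache field
assumes the cached-CPUID approach discussed elsewhere in the thread):

 	if (unlikely(hc.fast && is_xmm_fast_hypercall(&hc) &&
 		     !(hv_vcpu->cpuid_cache.features_edx &
 		       HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE)))
 		return HV_STATUS_ACCESS_DENIED;

plus an equivalent HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE check once output
via XMM registers is supported.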

Also, (and that's what triggered my comment) all other HV_ACCESS_* in
kvm_get_hv_cpuid() are single bits so my first impression was that you
forgot one bit, but then I saw that you combined them together.

-- 
Vitaly



Re: [PATCH 3/4] KVM: x86: kvm_hv_flush_tlb use inputs from XMM registers

2021-04-08 Thread Vitaly Kuznetsov
Paolo Bonzini  writes:

> On 08/04/21 14:01, Vitaly Kuznetsov wrote:
>> 
>> Also, we can probably defer kvm_hv_hypercall_read_xmm() until we know
>> how many regs we actually need to not read them all (we will always
>> need xmm[0] I guess so we can as well read it here).
>
> The cost is get/put FPU, so I think there's not much to gain from that.
>

Maybe, I just think that in most cases we will only need xmm0. To make
the optimization work we can probably do kvm_get_fpu() once we figured
out that we're dealing with XMM hypercall and do kvm_put_fpu() when
we're done processing hypercall parameters. This way we don't need to do
get/put twice. We can certainly leave this idea to the (possible) future
optimizations.
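
To illustrate the idea (a sketch, not a patch; is_xmm_fast_hypercall() is a
hypothetical "this hypercall takes XMM input" helper and the exact call
sites are made up, but the get/put names match the ones above):

 	if (hc.fast && is_xmm_fast_hypercall(&hc)) {
 		kvm_get_fpu(vcpu);
 		/* read only the XMM regs this particular hypercall needs */
 		kvm_hv_hypercall_read_xmm(&hc);
 		kvm_put_fpu(vcpu);
 	}

i.e. a single get/put pair around parameter parsing instead of one per
register access.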

-- 
Vitaly



Re: [PATCH 4/4] KVM: hyper-v: Advertise support for fast XMM hypercalls

2021-04-08 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> Now that all extant hypercalls that can use XMM registers (based on
> spec) for input/outputs are patched to support them, we can start
> advertising this feature to guests.
>
> Cc: Alexander Graf 
> Cc: Evgeny Iakovlev 
> Signed-off-by: Siddharth Chandrasekaran 
> ---
>  arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
>  arch/x86/kvm/hyperv.c  | 1 +
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h 
> b/arch/x86/include/asm/hyperv-tlfs.h
> index e6cd3fee562b..1f160ef60509 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -49,10 +49,10 @@
>  /* Support for physical CPU dynamic partitioning events is available*/
>  #define HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE	BIT(3)
>  /*
> - * Support for passing hypercall input parameter block via XMM
> + * Support for passing hypercall input and output parameter block via XMM
>   * registers is available
>   */
> -#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE	BIT(4)
> +#define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE	BIT(4) | BIT(15)

TLFS 6.0b states that there are two distinct bits for input and output:

CPUID Leaf 0x4003.EDX:
Bit 4: support for passing hypercall input via XMM registers is available.
Bit 15: support for returning hypercall output via XMM registers is available.

and HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE is not currently used
anywhere, I'd suggest we just rename 

HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE to HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE
and add HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE (bit 15).

>  /* Support for a virtual guest idle state is available */
>  #define HV_X64_GUEST_IDLE_STATE_AVAILABLE	BIT(5)
>  /* Frequency MSRs available */
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index bf2f86f263f1..dd462c1d641d 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2254,6 +2254,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct 
> kvm_cpuid2 *cpuid,
>   ent->ebx |= HV_POST_MESSAGES;
>   ent->ebx |= HV_SIGNAL_EVENTS;
>  
> + ent->edx |= HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE;
>   ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
>   ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;

-- 
Vitaly



Re: [PATCH 3/4] KVM: x86: kvm_hv_flush_tlb use inputs from XMM registers

2021-04-08 Thread Vitaly Kuznetsov
Siddharth Chandrasekaran  writes:

> Hyper-V supports the use of XMM registers to perform fast hypercalls.
> This allows guests to take advantage of the improved performance of the
> fast hypercall interface even though a hypercall may require more than
> (the current maximum of) two input registers.
>
> The XMM fast hypercall interface uses six additional XMM registers (XMM0
> to XMM5) to allow the guest to pass an input parameter block of up to
> 112 bytes. Hyper-V can also return data back to the guest in the
> remaining XMM registers that are not used by the current hypercall.
>
> Add framework to read/write to XMM registers in kvm_hv_hypercall() and
> use the additional hypercall inputs from XMM registers in
> kvm_hv_flush_tlb() when possible.
>
> Cc: Alexander Graf 
> Co-developed-by: Evgeny Iakovlev 
> Signed-off-by: Evgeny Iakovlev 
> Signed-off-by: Siddharth Chandrasekaran 
> ---
>  arch/x86/kvm/hyperv.c | 109 ++
>  1 file changed, 90 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 8f6babd1ea0d..bf2f86f263f1 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -36,6 +36,7 @@
>  
>  #include "trace.h"
>  #include "irq.h"
> +#include "fpu.h"
>  
>  /* "Hv#1" signature */
>  #define HYPERV_CPUID_SIGNATURE_EAX 0x31237648
> @@ -1623,6 +1624,8 @@ static __always_inline unsigned long 
> *sparse_set_to_vcpu_mask(
>   return vcpu_bitmap;
>  }
>  
> +#define KVM_HV_HYPERCALL_MAX_XMM_REGISTERS  6
> +
>  struct kvm_hv_hcall {
>   u64 param;
>   u64 ingpa;
> @@ -1632,10 +1635,14 @@ struct kvm_hv_hcall {
>   u16 rep_idx;
>   bool fast;
>   bool rep;
> + sse128_t xmm[KVM_HV_HYPERCALL_MAX_XMM_REGISTERS];
> + bool xmm_dirty;
>  };
>  
>  static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, 
> bool ex)
>  {
> + int i, j;
> + gpa_t gpa;
>   struct kvm *kvm = vcpu->kvm;
>   struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>   struct hv_tlb_flush_ex flush_ex;
> @@ -1649,8 +1656,15 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, 
> struct kvm_hv_hcall *hc, bool
>   bool all_cpus;
>  
>   if (!ex) {
> - if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush, sizeof(flush))))
> - return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + if (hc->fast) {
> + flush.address_space = hc->ingpa;
> + flush.flags = hc->outgpa;
> + flush.processor_mask = sse128_lo(hc->xmm[0]);
> + } else {
> + if (unlikely(kvm_read_guest(kvm, hc->ingpa,
> + &flush, sizeof(flush))))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + }
>  
>   trace_kvm_hv_flush_tlb(flush.processor_mask,
>  flush.address_space, flush.flags);
> @@ -1668,9 +1682,16 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, 
> struct kvm_hv_hcall *hc, bool
>   all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) ||
>   flush.processor_mask == 0;
>   } else {
> - if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush_ex,
> - sizeof(flush_ex))))
> - return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + if (hc->fast) {
> + flush_ex.address_space = hc->ingpa;
> + flush_ex.flags = hc->outgpa;
> + memcpy(&flush_ex.hv_vp_set,
> + &hc->xmm[0], sizeof(hc->xmm[0]));
> + } else {
> + if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush_ex,
> + sizeof(flush_ex))))
> + return HV_STATUS_INVALID_HYPERCALL_INPUT;
> + }
>  
>   trace_kvm_hv_flush_tlb_ex(flush_ex.hv_vp_set.valid_bank_mask,
> flush_ex.hv_vp_set.format,
> @@ -1681,20 +1702,29 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, 
> struct kvm_hv_hcall *hc, bool
>   all_cpus = flush_ex.hv_vp_set.format !=
>   HV_GENERIC_SET_SPARSE_4K;
>  
> - sparse_banks_len =
> - bitmap_weight((unsigned long *)&valid_bank_mask, 64) *
> - sizeof(sparse_banks[0]);
> + sparse_banks_len = bitmap_weight((unsigned long *)&valid_bank_mask, 64);
>  
>   if (!sparse_banks_len && !all_cpus)
>   goto ret_success;
>  
> - if (!all_cpus &&
> - kvm_read_guest(kvm,
> -hc->ingpa + offsetof(struct hv_tlb_flush_ex,
> - 
> hv_vp_set.bank_contents),
> -sparse_banks,
> -sparse_banks_len))
> - 

Re: [PATCH 7/7] KVM: SVM: hyper-v: Direct Virtual Flush support

2021-04-08 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> From Hyper-V TLFS:
>  "The hypervisor exposes hypercalls (HvFlushVirtualAddressSpace,
>   HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressList, and
>   HvFlushVirtualAddressListEx) that allow operating systems to more
>   efficiently manage the virtual TLB. The L1 hypervisor can choose to
>   allow its guest to use those hypercalls and delegate the responsibility
>   to handle them to the L0 hypervisor. This requires the use of a
>   partition assist page."
>
> Add the Direct Virtual Flush support for SVM.
>
> Related VMX changes:
> commit 6f6a657c9998 ("KVM/Hyper-V/VMX: Add direct tlb flush support")
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/kvm/svm/svm.c | 48 ++
>  1 file changed, 48 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3562a247b7e8..c6d3f3a7c986 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -440,6 +440,32 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
>   vcpu->arch.osvw.status |= 1;
>  }
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
> +{
> + struct hv_enlightenments *hve;
> + struct hv_partition_assist_pg **p_hv_pa_pg =
> + &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
> +
> + if (!*p_hv_pa_pg)
> + *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
> +
> + if (!*p_hv_pa_pg)
> + return -ENOMEM;
> +
> + hve = (struct hv_enlightenments *)&to_svm(vcpu)->vmcb->hv_enlightenments;
> +
> + hve->partition_assist_page = __pa(*p_hv_pa_pg);
> + hve->hv_vm_id = (unsigned long)vcpu->kvm;
> + if (!hve->hv_enlightenments_control.nested_flush_hypercall) {
> + hve->hv_enlightenments_control.nested_flush_hypercall = 1;
> + vmcb_mark_dirty(to_svm(vcpu)->vmcb, 
> VMCB_HV_NESTED_ENLIGHTENMENTS);
> + }
> +
> + return 0;
> +}
> +#endif
> +
>  static int has_svm(void)
>  {
>   const char *msg;
> @@ -1034,6 +1060,21 @@ static __init int svm_hardware_setup(void)
>   svm_x86_ops.tlb_remote_flush_with_range =
>   kvm_hv_remote_flush_tlb_with_range;
>   }
> +
> + if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) {
> + pr_info("kvm: Hyper-V Direct TLB Flush enabled\n");
> + for_each_online_cpu(cpu) {
> + struct hv_vp_assist_page *vp_ap =
> + hv_get_vp_assist_page(cpu);
> +
> + if (!vp_ap)
> + continue;
> +
> + vp_ap->nested_control.features.directhypercall = 1;
> + }
> + svm_x86_ops.enable_direct_tlbflush =
> + hv_enable_direct_tlbflush;
> + }
>  #endif
>  
>   if (nrips) {
> @@ -3913,6 +3954,13 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct 
> kvm_vcpu *vcpu)
>   }
>   svm->vmcb->save.cr2 = vcpu->arch.cr2;
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (svm->vmcb->hv_enlightenments.hv_vp_id != 
> to_hv_vcpu(vcpu)->vp_index) {

This looks wrong (see my previous comment about mixing KVM-on-Hyper-V
and Windows/Hyper-V-on-KVM). 'to_hv_vcpu(vcpu)->vp_index' is
'Windows/Hyper-V-on-KVM' thingy, it does not exist when we run without
any Hyper-V enlightenments exposed (e.g. when we run Linux as our
guest).

> + svm->vmcb->hv_enlightenments.hv_vp_id = 
> to_hv_vcpu(vcpu)->vp_index;
> + vmcb_mark_dirty(svm->vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);
> + }
> +#endif
> +
>   /*
>* Run with all-zero DR6 unless needed, so that we can get the exact 
> cause
>* of a #DB.

-- 
Vitaly



Re: [PATCH 6/7] KVM: SVM: hyper-v: Enlightened MSR-Bitmap support

2021-04-08 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Enlightened MSR-Bitmap as per TLFS:
>
>  "The L1 hypervisor may collaborate with the L0 hypervisor to make MSR
>   accesses more efficient. It can enable enlightened MSR bitmaps by setting
>   the corresponding field in the enlightened VMCS to 1. When enabled, L0
>   hypervisor does not monitor the MSR bitmaps for changes. Instead, the L1
>   hypervisor must invalidate the corresponding clean field after making
>   changes to one of the MSR bitmaps."
>
> Enable this for SVM.
>
> Related VMX changes:
> commit ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/kvm/svm/svm.c | 27 +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 6287cab61f15..3562a247b7e8 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -646,6 +646,27 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, 
> u32 msr)
>   return !!test_bit(bit_write, &tmp);
>  }
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
> +{
> + struct vmcb *vmcb = to_svm(vcpu)->vmcb;
> +
> + /*
> +  * vmcb can be NULL if called during early vcpu init.
> +  * And its okay not to mark vmcb dirty during vcpu init
> +  * as we mark it dirty unconditionally towards end of vcpu
> +  * init phase.
> +  */
> + if (vmcb && vmcb_is_clean(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS) &&
> + vmcb->hv_enlightenments.hv_enlightenments_control.msr_bitmap)
> + vmcb_mark_dirty(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);

vmcb_is_clean() check seems to be superfluous, vmcb_mark_dirty() does no
harm if the bit was already cleared.

> +}
> +#else
> +static inline void hv_vmcb_dirty_nested_enlightenments(struct kvm_vcpu *vcpu)
> +{
> +}
> +#endif
> +
>  static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
>   u32 msr, int read, int write)
>  {
> @@ -677,6 +698,9 @@ static void set_msr_interception_bitmap(struct kvm_vcpu 
> *vcpu, u32 *msrpm,
>   write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);
>  
>   msrpm[offset] = tmp;
> +
> + hv_vmcb_dirty_nested_enlightenments(vcpu);
> +
>  }
>  
>  void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
> @@ -1135,6 +1159,9 @@ static void hv_init_vmcb(struct vmcb *vmcb)
>   if (npt_enabled &&
>   ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
>   hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
> +
> + if (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)
> + hve->hv_enlightenments_control.msr_bitmap = 1;
>  }
>  #else
>  static inline void hv_init_vmcb(struct vmcb *vmcb)

-- 
Vitaly



Re: [PATCH 5/7] KVM: SVM: hyper-v: Remote TLB flush for SVM

2021-04-08 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Enable remote TLB flush for SVM.
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/kvm/svm/svm.c | 35 +++
>  1 file changed, 35 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index baee91c1e936..6287cab61f15 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include "trace.h"
> @@ -43,6 +44,8 @@
>  #include "svm.h"
>  #include "svm_ops.h"
>  
> +#include "hyperv.h"
> +
>  #define __ex(x) __kvm_handle_fault_on_reboot(x)
>  
>  MODULE_AUTHOR("Qumranet");
> @@ -928,6 +931,8 @@ static __init void svm_set_cpu_caps(void)
>   kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
>  }
>  
> +static struct kvm_x86_ops svm_x86_ops;
> +
>  static __init int svm_hardware_setup(void)
>  {
>   int cpu;
> @@ -997,6 +1002,16 @@ static __init int svm_hardware_setup(void)
>   kvm_configure_mmu(npt_enabled, get_max_npt_level(), PG_LEVEL_1G);
>   pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> + if (ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB
> + && npt_enabled) {
> + pr_info("kvm: Hyper-V enlightened NPT TLB flush enabled\n");
> + svm_x86_ops.tlb_remote_flush = kvm_hv_remote_flush_tlb;
> + svm_x86_ops.tlb_remote_flush_with_range =
> + kvm_hv_remote_flush_tlb_with_range;
> + }
> +#endif
> +
>   if (nrips) {
>   if (!boot_cpu_has(X86_FEATURE_NRIPS))
>   nrips = false;
> @@ -1112,6 +1127,21 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
>   }
>  }
>  
> +#if IS_ENABLED(CONFIG_HYPERV)
> +static void hv_init_vmcb(struct vmcb *vmcb)
> +{
> + struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
> +
> + if (npt_enabled &&
> + ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)
> + hve->hv_enlightenments_control.enlightened_npt_tlb = 1;
> +}
> +#else
> +static inline void hv_init_vmcb(struct vmcb *vmcb)
> +{
> +}
> +#endif
> +
>  static void init_vmcb(struct vcpu_svm *svm)
>  {
>   struct vmcb_control_area *control = &svm->vmcb->control;
> @@ -1274,6 +1304,8 @@ static void init_vmcb(struct vcpu_svm *svm)
>   }
>   }
>  
> + hv_init_vmcb(svm->vmcb);
> +
>   vmcb_mark_all_dirty(svm->vmcb);
>  
>   enable_gif(svm);
> @@ -3967,6 +3999,9 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, 
> unsigned long root,
>   svm->vmcb->control.nested_cr3 = cr3;
>   vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
>  
> + if (kvm_x86_ops.tlb_remote_flush)
> + kvm_update_arch_tdp_pointer(vcpu->kvm, vcpu, cr3);
> +

VMX has "#if IS_ENABLED(CONFIG_HYPERV)" around this, should we add it
here too?

>   /* Loading L2's CR3 is handled by enter_svm_guest_mode.  */
>   if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
>   return;

-- 
Vitaly



Re: [PATCH 3/7] KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx

2021-04-08 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Currently the remote TLB flush logic is specific to VMX.
> Move it to a common place so that SVM can use it as well.
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/include/asm/kvm_host.h | 15 +
>  arch/x86/kvm/hyperv.c   | 89 ++
>  arch/x86/kvm/hyperv.h   | 12 
>  arch/x86/kvm/vmx/vmx.c  | 97 +++--
>  arch/x86/kvm/vmx/vmx.h  | 10 
>  5 files changed, 123 insertions(+), 100 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 877a4025d8da..336716124b7e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -530,6 +530,12 @@ struct kvm_vcpu_hv {
>   struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
>   DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
>   cpumask_t tlb_flush;
> + /*
> +  * Two Dimensional paging CR3
> +  * EPTP for Intel
> +  * nCR3 for AMD
> +  */
> + u64 tdp_pointer;
>  };

'struct kvm_vcpu_hv' is only allocated when we emulate Hyper-V in KVM
(run Windows/Hyper-V guests on top of KVM). Remote TLB flush is used
when we run KVM on Hyper-V and this is a very different beast. Let's not
mix these things together. I understand that some unification is needed
to bring the AMD specific feature but let's do it differently.

E.g. 'ept_pointer' and friends from 'struct kvm_vmx' can just go to
'struct kvm_vcpu_arch' (in case they really need to be unified).

>  
>  /* Xen HVM per vcpu emulation context */
> @@ -884,6 +890,12 @@ struct kvm_hv_syndbg {
>   u64 options;
>  };
>  
> +enum tdp_pointers_status {
> + TDP_POINTERS_CHECK = 0,
> + TDP_POINTERS_MATCH = 1,
> + TDP_POINTERS_MISMATCH = 2
> +};
> +
>  /* Hyper-V emulation context */
>  struct kvm_hv {
>   struct mutex hv_lock;
> @@ -908,6 +920,9 @@ struct kvm_hv {
>  
>   struct hv_partition_assist_pg *hv_pa_pg;
>   struct kvm_hv_syndbg hv_syndbg;
> +
> + enum tdp_pointers_status tdp_pointers_match;
> + spinlock_t tdp_pointer_lock;
>  };
>  
>  struct msr_bitmap_range {
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 58fa8c029867..c5bec598bf28 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -32,6 +32,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
>  
>  #include "trace.h"
> @@ -913,6 +914,8 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
>   for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
>   stimer_init(&hv_vcpu->stimer[i], i);
>  
> + hv_vcpu->tdp_pointer = INVALID_PAGE;
> +
>   hv_vcpu->vp_index = kvm_vcpu_get_idx(vcpu);
>  
>   return 0;
> @@ -1960,6 +1963,7 @@ void kvm_hv_init_vm(struct kvm *kvm)
>  {
>   struct kvm_hv *hv = to_kvm_hv(kvm);
>  
> + spin_lock_init(&hv->tdp_pointer_lock);
>   mutex_init(&hv->hv_lock);
>   idr_init(&hv->conn_to_evt);
>  }
> @@ -2180,3 +2184,88 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct 
> kvm_cpuid2 *cpuid,
>  
>   return 0;
>  }
> +
> +/* check_tdp_pointer() should be under protection of tdp_pointer_lock. */
> +static void check_tdp_pointer_match(struct kvm *kvm)
> +{
> + u64 tdp_pointer = INVALID_PAGE;
> + bool valid_tdp = false;
> + struct kvm_vcpu *vcpu;
> + int i;
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + if (!valid_tdp) {
> + tdp_pointer = to_hv_vcpu(vcpu)->tdp_pointer;
> + valid_tdp = true;
> + continue;
> + }
> +
> + if (tdp_pointer != to_hv_vcpu(vcpu)->tdp_pointer) {
> + to_kvm_hv(kvm)->tdp_pointers_match
> + = TDP_POINTERS_MISMATCH;
> + return;
> + }
> + }
> +
> + to_kvm_hv(kvm)->tdp_pointers_match = TDP_POINTERS_MATCH;
> +}
> +
> +static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list 
> *flush,
> + void *data)
> +{
> + struct kvm_tlb_range *range = data;
> +
> + return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
> + range->pages);
> +}
> +
> +static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
> +{
> + u64 tdp_pointer = to_hv_vcpu(vcpu)->tdp_pointer;
> +
> + /*
> +  * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> +  * of the base of EPT PML4 table, strip off EPT configuration
> +  * information.
> +  */
> + if (range)
> + return hyperv_flush_guest_mapping_range(tdp_pointer & PAGE_MASK,
> + kvm_fill_hv_flush_list_func, (void *)range);
> + else
> + return hyperv_flush_guest_mapping(tdp_pointer & PAGE_MASK);
> +}
> +
> +int kvm_hv_remote_flush_tlb_with_range(struct kvm *kvm,
> + struct kvm_tlb_range *range)
> +{
> +

Re: [PATCH 1/7] hyperv: Detect Nested virtualization support for SVM

2021-04-08 Thread Vitaly Kuznetsov
Vineeth Pillai  writes:

> Detect nested features exposed by Hyper-V if SVM is enabled.
>
> Signed-off-by: Vineeth Pillai 
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 3546d3e21787..4d364acfe95d 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -325,9 +325,17 @@ static void __init ms_hyperv_init_platform(void)
>   ms_hyperv.isolation_config_a, 
> ms_hyperv.isolation_config_b);
>   }
>  
> - if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
> + /*
> +  * AMD does not need enlightened VMCS as VMCB is already a
> +  * datastructure in memory. 

Well, VMCS is also a structure in memory, isn't it? It's just that we
don't have a 'clean field' concept for it and we can't use normal memory
accesses.

>   We need to get the nested
> +  * features if SVM is enabled.
> +  */
> + if (boot_cpu_has(X86_FEATURE_SVM) ||
> + ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {

Do I understand correctly that we can just look at CPUID.0x40000000.EAX
and in case it is >= 0x4000000A we can read the HYPERV_CPUID_NESTED_FEATURES
leaf? I'd suggest we do that instead then.
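
Completely untested sketch of what I mean (the local variable name is
just for illustration):

	u32 hv_max_leaf = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);

	if (hv_max_leaf >= HYPERV_CPUID_NESTED_FEATURES) {
		ms_hyperv.nested_features =
			cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
		pr_info("Hyper-V nested_features: 0x%x\n",
			ms_hyperv.nested_features);
	}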

>   ms_hyperv.nested_features =
>   cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
> + pr_info("Hyper-V nested_features: 0x%x\n",
> + ms_hyperv.nested_features);
>   }
>  
>   /*

-- 
Vitaly



Re: linux-next: Fixes tag needs some work in the pm tree

2021-04-08 Thread Vitaly Kuznetsov
Stephen Rothwell  writes:

> Hi all,
>
> In commit
>
>   fa26d0c778b4 ("ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m")
>
> Fixes tag
>
>   Fixes: 8c182bd7 ("CPI: processor: Fix CPU0 wakeup in 
> acpi_idle_play_dead()")

"A" in "ACPI" seems to be missing

>
> has these problem(s):
>
>   - Subject does not match target commit subject
> Just use
>   git log -1 --format='Fixes: %h ("%s")'

This is an extremely unlucky fix :-( 

-- 
Vitaly



Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-07 Thread Vitaly Kuznetsov
Wei Liu  writes:

> On Wed, Apr 07, 2021 at 09:38:21AM +0200, Vitaly Kuznetsov wrote:
>
>> One more thought: it is probably a good idea to introduce selftests for
>> /dev/mshv (similar to KVM's selftests in
>> /tools/testing/selftests/kvm). Selftests don't really need a stable ABI
>> as they live in the same linux.git and can be updated in the same patch
>> series which changes /dev/mshv behavior. Selftests are very useful for
>> checking there are no regressions, especially in the situation when
>> there's no publicly available userspace for /dev/mshv.
>
> I think this can wait until we merge the first implementation in tree.
> There are still a lot of moving parts. Our (currently limited) internal
> test cases need more cleaning up before they are ready. I certainly
> don't want to distract Nuno from getting the foundation right.
>

I'm absolutely fine with this approach, selftests are a nice add-on, not
a requirement for the initial implementation. Also, to make them more
useful to mere mortals, a doc on how to run Linux as a root Hyper-V
partition would come in handy)

-- 
Vitaly



Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-07 Thread Vitaly Kuznetsov
Nuno Das Neves  writes:

> On 3/5/2021 1:18 AM, Vitaly Kuznetsov wrote:
>> Nuno Das Neves  writes:
>> 
>>> On 2/9/2021 5:11 AM, Vitaly Kuznetsov wrote:
>>>> Nuno Das Neves  writes:
>>>>
>> ...
>>>>> +
>>>>> +3.1 MSHV_REQUEST_VERSION
>>>>> +
>>>>> +:Type: /dev/mshv ioctl
>>>>> +:Parameters: pointer to a u32
>>>>> +:Returns: 0 on success
>>>>> +
>>>>> +Before issuing any other ioctls, a MSHV_REQUEST_VERSION ioctl must be 
>>>>> called to
>>>>> +establish the interface version with the kernel module.
>>>>> +
>>>>> +The caller should pass the MSHV_VERSION as an argument.
>>>>> +
>>>>> +The kernel module will check which interface versions it supports and 
>>>>> return 0
>>>>> +if one of them matches.
>>>>> +
>>>>> +This /dev/mshv file descriptor will remain 'locked' to that version as 
>>>>> long as
>>>>> +it is open - this ioctl can only be called once per open.
>>>>> +
>>>>
>>>> KVM used to have KVM_GET_API_VERSION too but this turned out to be not
>>>> very convenient so we use capabilities (KVM_CHECK_EXTENSION/KVM_ENABLE_CAP)
>>>> instead.
>>>>
>>>
>>> The goal of MSHV_REQUEST_VERSION is to support changes to APIs in the core 
>>> set.
>>> When we add new features/ioctls beyond the core we can use an 
>>> extension/capability
>>> approach like KVM.
>>>
>> 
>> Driver versions are a very bad idea from the distribution/stable kernel
>> point of view as they presume that the history is linear. It is not.
>> 
>> Imagine you have the following history upstream:
>> 
>> MSHV_REQUEST_VERSION = 1
>> <100 commits with features/fixes>
>> MSHV_REQUEST_VERSION = 2
>> 
>> MSHV_REQUEST_VERSION = 2
>> 
>> Now I'm a linux distribution / stable kernel maintainer. My kernel is at
>> MSHV_REQUEST_VERSION = 1. Now I want to backport 1 feature from between
>> VER=1 and VER=2 and another feature from between VER=2 and VER=3. My
>> history now looks like
>> 
>> MSHV_REQUEST_VERSION = 1
>> <5 commits from between VER=1 and VER=2>
>>    <- Which version should I declare here?
>> <5 commits from between VER=2 and VER=3>
>>    <- Which version should I declare here?
>> 
>> If I keep VER=1 then userspace will think that I don't have any extra
>> features added and just won't use them. If I change VER to 2/3, it'll
>> think I have *all* features from between these versions.
>> 
>> The only reasonable way to manage this is to attach a "capability" to
>> every ABI change and expose this capability *in the same commit which
>> introduces the change to the ABI*. This way userspace will know exactly
>> which ioctls are available and what their interfaces are.
>> 
>> Also, trying to define "core set" is hard but you don't really need
>> to.
>> 
>
> We've had some internal discussion on this.
>
> There is bound to be some iteration before this ABI is stable, since even the
> underlying Microsoft hypervisor interfaces aren't stable just yet.
>
> It might make more sense to just have an IOCTL to check if the API is stable 
> yet.
>> This would be analogous to checking if KVM_GET_API_VERSION returns 12.
>
> How does this sound as a proposal?
> An MSHV_CHECK_EXTENSION ioctl to query extensions to the core /dev/mshv API.
>
> It takes a single argument, an integer named MSHV_CAP_* corresponding to
> the extension to check the existence of.
>
> The ioctl will return 0 if the extension is unsupported, or a positive integer
> if supported.
>
> We can initially include a capability called MSHV_CAP_CORE_API_STABLE.
> If supported, the core APIs are stable.

This sounds reasonable, I'd suggest you reserve MSHV_CAP_CORE_API_STABLE
right away but don't expose it yet so it's clear the API is not yet
stable. The test userspace you have can always assume it's running on the
latest kernel.

Also, please be clear about the fact that /dev/mshv doesn't
provide a stable API yet so nobody builds an application on top of
it.
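
From the userspace side that would look roughly like this (the ioctl and
capability names are the ones proposed in this thread, nothing here
exists in any uapi header yet):

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>

	int fd = open("/dev/mshv", O_RDWR | O_CLOEXEC);
	int ret = ioctl(fd, MSHV_CHECK_EXTENSION, MSHV_CAP_CORE_API_STABLE);

	if (ret <= 0)
		fprintf(stderr, "core /dev/mshv API not stable yet, not using it\n");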

One more thought: it is probably a good idea to introduce selftests for
/dev/mshv (similar to KVM's selftests in
/tools/testing/selftests/kvm). Selftests don't really need a stable ABI
as they live in the same linux.git and can be updated in the same patch
series which changes /dev/mshv behavior. Selftests are very useful for
checking there are no regressions, especially in the situation when
there's no publicly available userspace for /dev/mshv.

-- 
Vitaly



[PATCH v3] ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m

2021-04-06 Thread Vitaly Kuznetsov
Commit 8c182bd7 ("ACPI: processor: Fix CPU0 wakeup in
acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()/mwait_play_dead()
into acpi_idle_play_dead(). The problem is that these functions are not
exported to modules, so the build fails when CONFIG_ACPI_PROCESSOR=m.

The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
(the latter from assembly) but it seems putting the whole pattern into a
new function and exporting it instead is better.

Reported-by: kernel test robot 
Fixes: 8c182bd7 ("CPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()")
Cc:  # 5.10+
Signed-off-by: Vitaly Kuznetsov 
---
Changes since v2:
- Use proper kerneldoc format [Rafael J. Wysocki]
---
 arch/x86/include/asm/smp.h|  2 +-
 arch/x86/kernel/smpboot.c | 26 --
 drivers/acpi/processor_idle.c |  4 +---
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 57ef2094af93..630ff08532be 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,7 +132,7 @@ void native_play_dead(void);
 void play_dead_common(void);
 void wbinvd_on_cpu(int cpu);
 int wbinvd_on_all_cpus(void);
-bool wakeup_cpu0(void);
+void cond_wakeup_cpu0(void);
 
 void native_smp_send_reschedule(int cpu);
 void native_send_call_func_ipi(const struct cpumask *mask);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f877150a91da..16703c35a944 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1659,13 +1659,17 @@ void play_dead_common(void)
local_irq_disable();
 }
 
-bool wakeup_cpu0(void)
+/**
+ * cond_wakeup_cpu0 - Wake up CPU0 if needed.
+ *
+ * If NMI wants to wake up CPU0, start CPU0.
+ */
+void cond_wakeup_cpu0(void)
 {
if (smp_processor_id() == 0 && enable_start_cpu0)
-   return true;
-
-   return false;
+   start_cpu0();
 }
+EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
 
 /*
  * We need to flush the caches before going to sleep, lest we have
@@ -1734,11 +1738,8 @@ static inline void mwait_play_dead(void)
__monitor(mwait_ptr, 0, 0);
mb();
__mwait(eax, 0);
-   /*
-* If NMI wants to wake up CPU0, start CPU0.
-*/
-   if (wakeup_cpu0())
-   start_cpu0();
+
+   cond_wakeup_cpu0();
}
 }
 
@@ -1749,11 +1750,8 @@ void hlt_play_dead(void)
 
while (1) {
native_halt();
-   /*
-* If NMI wants to wake up CPU0, start CPU0.
-*/
-   if (wakeup_cpu0())
-   start_cpu0();
+
+   cond_wakeup_cpu0();
}
 }
 
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 768a6b4d2368..4e2d76b8b697 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -544,9 +544,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, 
int index)
return -ENODEV;
 
 #if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
-   /* If NMI wants to wake up CPU0, start CPU0. */
-   if (wakeup_cpu0())
-   start_cpu0();
+   cond_wakeup_cpu0();
 #endif
}
 
-- 
2.30.2



Re: [PATCH v2] ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m

2021-04-06 Thread Vitaly Kuznetsov
"Rafael J. Wysocki"  writes:

> On Tue, Apr 6, 2021 at 4:01 PM Vitaly Kuznetsov  wrote:
>>
>> Commit 8c182bd7 ("ACPI: processor: Fix CPU0 wakeup in
>> acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
>> wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()/mwait_play_dead()
>> into acpi_idle_play_dead(). The problem is that these functions are not
>> exported to modules, so the build fails when CONFIG_ACPI_PROCESSOR=m.
>>
>> The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
>> (the latter from assembly) but it seems putting the whole pattern into a
>> new function and exporting it instead is better.
>>
>> Reported-by: kernel test robot 
>> Fixes: 8c182bd7 ("CPI: processor: Fix CPU0 wakeup in 
>> acpi_idle_play_dead()")
>> Cc:  # 5.10+
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>> Changes since v1:
>> - Rename wakeup_cpu0() to cond_wakeup_cpu0() and fold wakeup_cpu0() in
>>  as it has no other users [Rafael J. Wysocki]
>> ---
>>  arch/x86/include/asm/smp.h|  2 +-
>>  arch/x86/kernel/smpboot.c | 24 ++--
>>  drivers/acpi/processor_idle.c |  4 +---
>>  3 files changed, 12 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
>> index 57ef2094af93..630ff08532be 100644
>> --- a/arch/x86/include/asm/smp.h
>> +++ b/arch/x86/include/asm/smp.h
>> @@ -132,7 +132,7 @@ void native_play_dead(void);
>>  void play_dead_common(void);
>>  void wbinvd_on_cpu(int cpu);
>>  int wbinvd_on_all_cpus(void);
>> -bool wakeup_cpu0(void);
>> +void cond_wakeup_cpu0(void);
>>
>>  void native_smp_send_reschedule(int cpu);
>>  void native_send_call_func_ipi(const struct cpumask *mask);
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index f877150a91da..147f1bba9736 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1659,13 +1659,15 @@ void play_dead_common(void)
>> local_irq_disable();
>>  }
>>
>> -bool wakeup_cpu0(void)
>> +/*
>> + * If NMI wants to wake up CPU0, start CPU0.
>> + */
>
> Hasn't checkpatch.pl complained about this?
>

No, it didn't.

> A proper kerneldoc would be something like:
>
> /**
>  * cond_wakeup_cpu0 - Wake up CPU0 if needed.
>  *
>  * If NMI wants to wake up CPU0, start CPU0.
>  */

Yea, I didn't do that partly because of my laziness but partly because
I don't see much usage of this format in arch/x86/kernel/[smpboot.c]. I
can certainly do v3 if it's preferred.

>
>> +void cond_wakeup_cpu0(void)
>>  {
>> if (smp_processor_id() == 0 && enable_start_cpu0)
>> -   return true;
>> -
>> -   return false;
>> +   start_cpu0();
>>  }
>> +EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
>>
>>  /*
>>   * We need to flush the caches before going to sleep, lest we have
>> @@ -1734,11 +1736,8 @@ static inline void mwait_play_dead(void)
>> __monitor(mwait_ptr, 0, 0);
>> mb();
>> __mwait(eax, 0);
>> -   /*
>> -* If NMI wants to wake up CPU0, start CPU0.
>> -*/
>> -   if (wakeup_cpu0())
>> -   start_cpu0();
>> +
>> +   cond_wakeup_cpu0();
>> }
>>  }
>>
>> @@ -1749,11 +1748,8 @@ void hlt_play_dead(void)
>>
>> while (1) {
>> native_halt();
>> -   /*
>> -* If NMI wants to wake up CPU0, start CPU0.
>> -*/
>> -   if (wakeup_cpu0())
>> -   start_cpu0();
>> +
>> +   cond_wakeup_cpu0();
>> }
>>  }
>>
>> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
>> index 768a6b4d2368..4e2d76b8b697 100644
>> --- a/drivers/acpi/processor_idle.c
>> +++ b/drivers/acpi/processor_idle.c
>> @@ -544,9 +544,7 @@ static int acpi_idle_play_dead(struct cpuidle_device 
>> *dev, int index)
>> return -ENODEV;
>>
>>  #if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
>> -   /* If NMI wants to wake up CPU0, start CPU0. */
>> -   if (wakeup_cpu0())
>> -   start_cpu0();
>> +   cond_wakeup_cpu0();
>>  #endif
>> }
>>
>> --
>> 2.30.2
>>
>

-- 
Vitaly



Re: [PATCH v3 1/4] KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID

2021-04-06 Thread Vitaly Kuznetsov
Emanuele Giuseppe Esposito  writes:

> When retrieving emulated CPUID entries, check for an insufficient array
> size if and only if KVM is actually inserting an entry.
> If userspace has a priori knowledge of the exact array size,
> KVM_GET_EMULATED_CPUID will incorrectly fail due to effectively requiring
> an extra, unused entry.
>
> Signed-off-by: Emanuele Giuseppe Esposito 
> ---
>  arch/x86/kvm/cpuid.c | 33 -
>  1 file changed, 16 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 6bd2f8b830e4..27059ddf9f0a 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -567,34 +567,33 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct 
> kvm_cpuid_array *array,
>  
>  static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>  {
> - struct kvm_cpuid_entry2 *entry;
> -
> - if (array->nent >= array->maxnent)
> - return -E2BIG;
> + struct kvm_cpuid_entry2 entry;
>  
> - entry = &array->entries[array->nent];
> - entry->function = func;
> - entry->index = 0;
> - entry->flags = 0;
> + memset(&entry, 0, sizeof(entry));
> + entry.function = func;
>  
>   switch (func) {
>   case 0:
> - entry->eax = 7;
> - ++array->nent;
> + entry.eax = 7;
>   break;
>   case 1:
> - entry->ecx = F(MOVBE);
> - ++array->nent;
> + entry.ecx = F(MOVBE);
>   break;
>   case 7:
> - entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> - entry->eax = 0;
> - entry->ecx = F(RDPID);
> - ++array->nent;
> - default:
> + entry.flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> + entry.eax = 0;

Nitpick: there's no need to set entry.eax = 0 as the whole structure was
zeroed. Also, '|=' for flags could be just '='.
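
i.e. for 'case 7' just:

	case 7:
		entry.flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
		entry.ecx = F(RDPID);
		break;

(entry.eax is already zero after the memset above).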

> + entry.ecx = F(RDPID);
>   break;
> + default:
> +     goto out;
>   }
>  
> + if (array->nent >= array->maxnent)
> + return -E2BIG;
> +
> + memcpy(&array->entries[array->nent++], &entry, sizeof(entry));
> +
> +out:
>   return 0;
>  }

Reviewed-by: Vitaly Kuznetsov 

-- 
Vitaly



[PATCH v2] ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m

2021-04-06 Thread Vitaly Kuznetsov
Commit 8c182bd7 ("ACPI: processor: Fix CPU0 wakeup in
acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()/mwait_play_dead()
into acpi_idle_play_dead(). The problem is that these functions are not
exported to modules, so the build fails when CONFIG_ACPI_PROCESSOR=m.

The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
(the latter from assembly) but it seems putting the whole pattern into a
new function and exporting it instead is better.

Reported-by: kernel test robot 
Fixes: 8c182bd7 ("CPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()")
Cc:  # 5.10+
Signed-off-by: Vitaly Kuznetsov 
---
Changes since v1:
- Rename wakeup_cpu0() to cond_wakeup_cpu0() and fold wakeup_cpu0() in
 as it has no other users [Rafael J. Wysocki]
---
 arch/x86/include/asm/smp.h|  2 +-
 arch/x86/kernel/smpboot.c | 24 ++--
 drivers/acpi/processor_idle.c |  4 +---
 3 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 57ef2094af93..630ff08532be 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,7 +132,7 @@ void native_play_dead(void);
 void play_dead_common(void);
 void wbinvd_on_cpu(int cpu);
 int wbinvd_on_all_cpus(void);
-bool wakeup_cpu0(void);
+void cond_wakeup_cpu0(void);
 
 void native_smp_send_reschedule(int cpu);
 void native_send_call_func_ipi(const struct cpumask *mask);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f877150a91da..147f1bba9736 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1659,13 +1659,15 @@ void play_dead_common(void)
local_irq_disable();
 }
 
-bool wakeup_cpu0(void)
+/*
+ * If NMI wants to wake up CPU0, start CPU0.
+ */
+void cond_wakeup_cpu0(void)
 {
if (smp_processor_id() == 0 && enable_start_cpu0)
-   return true;
-
-   return false;
+   start_cpu0();
 }
+EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
 
 /*
  * We need to flush the caches before going to sleep, lest we have
@@ -1734,11 +1736,8 @@ static inline void mwait_play_dead(void)
__monitor(mwait_ptr, 0, 0);
mb();
__mwait(eax, 0);
-   /*
-* If NMI wants to wake up CPU0, start CPU0.
-*/
-   if (wakeup_cpu0())
-   start_cpu0();
+
+   cond_wakeup_cpu0();
}
 }
 
@@ -1749,11 +1748,8 @@ void hlt_play_dead(void)
 
while (1) {
native_halt();
-   /*
-* If NMI wants to wake up CPU0, start CPU0.
-*/
-   if (wakeup_cpu0())
-   start_cpu0();
+
+   cond_wakeup_cpu0();
}
 }
 
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 768a6b4d2368..4e2d76b8b697 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -544,9 +544,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, 
int index)
return -ENODEV;
 
 #if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
-   /* If NMI wants to wake up CPU0, start CPU0. */
-   if (wakeup_cpu0())
-   start_cpu0();
+   cond_wakeup_cpu0();
 #endif
}
 
-- 
2.30.2



Re: [PATCH] ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m

2021-04-06 Thread Vitaly Kuznetsov
"Rafael J. Wysocki"  writes:

> On Tue, Apr 6, 2021 at 2:50 PM Vitaly Kuznetsov  wrote:
>>
>> Commit 8c182bd7 ("ACPI: processor: Fix CPU0 wakeup in
>> acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
>> wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()/mwait_play_dead()
>> into acpi_idle_play_dead(). The problem is that these functions are not
>> exported to modules, so the build fails when CONFIG_ACPI_PROCESSOR=m.
>>
>> The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
>> (the latter from assembly) but it seems putting the whole pattern into a
>> new function and exporting it instead is better.
>>
>> Reported-by: kernel test robot 
>> Fixes: 8c182bd7 ("CPI: processor: Fix CPU0 wakeup in 
>> acpi_idle_play_dead()")
>> Cc:  # 5.10+
>> Signed-off-by: Vitaly Kuznetsov 
>> ---
>>  arch/x86/include/asm/smp.h|  2 +-
>>  arch/x86/kernel/smpboot.c | 15 ++-
>>  drivers/acpi/processor_idle.c |  3 +--
>>  3 files changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
>> index 57ef2094af93..6f79deb1f970 100644
>> --- a/arch/x86/include/asm/smp.h
>> +++ b/arch/x86/include/asm/smp.h
>> @@ -132,7 +132,7 @@ void native_play_dead(void);
>>  void play_dead_common(void);
>>  void wbinvd_on_cpu(int cpu);
>>  int wbinvd_on_all_cpus(void);
>> -bool wakeup_cpu0(void);
>> +void wakeup_cpu0_if_needed(void);
>>
>>  void native_smp_send_reschedule(int cpu);
>>  void native_send_call_func_ipi(const struct cpumask *mask);
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index f877150a91da..9547d870ee27 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1659,7 +1659,7 @@ void play_dead_common(void)
>> local_irq_disable();
>>  }
>>
>> -bool wakeup_cpu0(void)
>> +static bool wakeup_cpu0(void)
>>  {
>> if (smp_processor_id() == 0 && enable_start_cpu0)
>> return true;
>> @@ -1667,6 +1667,13 @@ bool wakeup_cpu0(void)
>> return false;
>>  }
>>
>> +void wakeup_cpu0_if_needed(void)
>> +{
>> +   if (wakeup_cpu0())
>> +   start_cpu0();
>
> Note that all of the callers of wakeup_cpu0 do the above, so maybe
> make them all use the new function?
>
> In which case it can be rewritten in the following way
>
> void cond_wakeup_cpu0(void)
> {
> if (smp_processor_id() == 0 && enable_start_cpu0)
> start_cpu0();
> }
> EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
>

Sure, separate wakeup_cpu0() is no longer needed.

> Also please add a proper kerneldoc comment to it and maybe drop the
> comments at the call sites?

Also a good idea. v2 is coming, thanks!

>
>
>> +}
>> +EXPORT_SYMBOL_GPL(wakeup_cpu0_if_needed);
>> +
>>  /*
>>   * We need to flush the caches before going to sleep, lest we have
>>   * dirty data in our caches when we come back up.
>> @@ -1737,8 +1744,7 @@ static inline void mwait_play_dead(void)
>> /*
>>  * If NMI wants to wake up CPU0, start CPU0.
>>  */
>> -   if (wakeup_cpu0())
>> -   start_cpu0();
>> +   wakeup_cpu0_if_needed();
>> }
>>  }
>>
>> @@ -1752,8 +1758,7 @@ void hlt_play_dead(void)
>> /*
>>  * If NMI wants to wake up CPU0, start CPU0.
>>  */
>> -   if (wakeup_cpu0())
>> -   start_cpu0();
>> +   wakeup_cpu0_if_needed();
>> }
>>  }
>>
>> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
>> index 768a6b4d2368..de15116b754a 100644
>> --- a/drivers/acpi/processor_idle.c
>> +++ b/drivers/acpi/processor_idle.c
>> @@ -545,8 +545,7 @@ static int acpi_idle_play_dead(struct cpuidle_device 
>> *dev, int index)
>>
>>  #if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
>> /* If NMI wants to wake up CPU0, start CPU0. */
>> -   if (wakeup_cpu0())
>> -   start_cpu0();
>> +   wakeup_cpu0_if_needed();
>>  #endif
>> }
>>
>> --
>> 2.30.2
>>
>

-- 
Vitaly



[PATCH] ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m

2021-04-06 Thread Vitaly Kuznetsov
Commit 8c182bd7 ("ACPI: processor: Fix CPU0 wakeup in
acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()/mwait_play_dead()
into acpi_idle_play_dead(). The problem is that these functions are not
exported to modules, so the build fails when CONFIG_ACPI_PROCESSOR=m.

The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
(the latter from assembly) but it seems putting the whole pattern into a
new function and exporting it instead is better.

Reported-by: kernel test robot 
Fixes: 8c182bd7 ("CPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()")
Cc:  # 5.10+
Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/include/asm/smp.h|  2 +-
 arch/x86/kernel/smpboot.c | 15 ++-
 drivers/acpi/processor_idle.c |  3 +--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 57ef2094af93..6f79deb1f970 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,7 +132,7 @@ void native_play_dead(void);
 void play_dead_common(void);
 void wbinvd_on_cpu(int cpu);
 int wbinvd_on_all_cpus(void);
-bool wakeup_cpu0(void);
+void wakeup_cpu0_if_needed(void);
 
 void native_smp_send_reschedule(int cpu);
 void native_send_call_func_ipi(const struct cpumask *mask);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f877150a91da..9547d870ee27 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1659,7 +1659,7 @@ void play_dead_common(void)
local_irq_disable();
 }
 
-bool wakeup_cpu0(void)
+static bool wakeup_cpu0(void)
 {
if (smp_processor_id() == 0 && enable_start_cpu0)
return true;
@@ -1667,6 +1667,13 @@ bool wakeup_cpu0(void)
return false;
 }
 
+void wakeup_cpu0_if_needed(void)
+{
+   if (wakeup_cpu0())
+   start_cpu0();
+}
+EXPORT_SYMBOL_GPL(wakeup_cpu0_if_needed);
+
 /*
  * We need to flush the caches before going to sleep, lest we have
  * dirty data in our caches when we come back up.
@@ -1737,8 +1744,7 @@ static inline void mwait_play_dead(void)
/*
 * If NMI wants to wake up CPU0, start CPU0.
 */
-   if (wakeup_cpu0())
-   start_cpu0();
+   wakeup_cpu0_if_needed();
}
 }
 
@@ -1752,8 +1758,7 @@ void hlt_play_dead(void)
/*
 * If NMI wants to wake up CPU0, start CPU0.
 */
-   if (wakeup_cpu0())
-   start_cpu0();
+   wakeup_cpu0_if_needed();
}
 }
 
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 768a6b4d2368..de15116b754a 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -545,8 +545,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, 
int index)
 
 #if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
/* If NMI wants to wake up CPU0, start CPU0. */
-   if (wakeup_cpu0())
-   start_cpu0();
+   wakeup_cpu0_if_needed();
 #endif
}
 
-- 
2.30.2



Re: [PATCH v2 2/2] KVM: nSVM: improve SYSENTER emulation on AMD

2021-04-01 Thread Vitaly Kuznetsov
Paolo Bonzini  writes:

> On 01/04/21 15:03, Vitaly Kuznetsov wrote:
>>> +   svm->sysenter_eip_hi = guest_cpuid_is_intel(vcpu) ? (data >> 
>>> 32) : 0;
>> 
>> (Personal taste) I'd suggest we keep the whole 'sysenter_eip'/'sysenter_esp'
>> even if we only use the upper 32 bits of it. That would reduce the code
>> churn a little bit (no need to change 'struct vcpu_svm').
>
> Would there really be less changes?  Consider that you'd have to look at 
> the VMCB anyway because svm_get_msr can be reached not just for guest 
> RDMSR but also for ioctls.
>

I was thinking about the hunk in arch/x86/kvm/svm/svm.h tweaking
vcpu_svm. My opinion is not strong at all here)

-- 
Vitaly



Re: [PATCH v2 2/2] KVM: nSVM: improve SYSENTER emulation on AMD

2021-04-01 Thread Vitaly Kuznetsov
Maxim Levitsky  writes:

> Currently to support Intel->AMD migration, if CPU vendor is GenuineIntel,
> we emulate the full 64 value for MSR_IA32_SYSENTER_{EIP|ESP}
> msrs, and we also emulate the sysenter/sysexit instruction in long mode.
>
> (Emulator does still refuse to emulate sysenter in 64 bit mode, on the
> ground that the code for that wasn't tested and likely has no users)
>
> However when virtual vmload/vmsave is enabled, the vmload instruction will
> update these 32 bit msrs without triggering their msr intercept,
> which will lead to having stale values in kvm's shadow copy of these msrs,
> which relies on the intercept to be up to date.
>
> Fix/optimize this by doing the following:
>
> 1. Enable the MSR intercepts for SYSENTER MSRs iff vendor=GenuineIntel
>(This is both a tiny optimization and also ensures that in case
>the guest cpu vendor is AMD, the msrs will be 32 bit wide as
>AMD defined).
>
> 2. Store only high 32 bit part of these msrs on interception and combine
>it with hardware msr value on intercepted read/writes
>iff vendor=GenuineIntel.
>
> 3. Disable vmload/vmsave virtualization if vendor=GenuineIntel.
>(It is somewhat insane to set vendor=GenuineIntel and still enable
>SVM for the guest but well whatever).
>Then zero the high 32 bit parts when kvm intercepts and emulates vmload.
>
> Thanks a lot to Paulo Bonzini for helping me with fixing this in the most

s/Paulo/Paolo/ :-)

> correct way.
>
> This patch fixes nested migration of 32 bit nested guests, that was
> broken because incorrect cached values of SYSENTER msrs were stored in
> the migration stream if L1 changed these msrs with
> vmload prior to L2 entry.
>
> Signed-off-by: Maxim Levitsky 
> ---
>  arch/x86/kvm/svm/svm.c | 99 +++---
>  arch/x86/kvm/svm/svm.h |  6 +--
>  2 files changed, 68 insertions(+), 37 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 271196400495..6c39b0cd6ec6 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -95,6 +95,8 @@ static const struct svm_direct_access_msrs {
>  } direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
>   { .index = MSR_STAR,.always = true  },
>   { .index = MSR_IA32_SYSENTER_CS,.always = true  },
> + { .index = MSR_IA32_SYSENTER_EIP,   .always = false },
> + { .index = MSR_IA32_SYSENTER_ESP,   .always = false },
>  #ifdef CONFIG_X86_64
>   { .index = MSR_GS_BASE, .always = true  },
>   { .index = MSR_FS_BASE, .always = true  },
> @@ -1258,16 +1260,6 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>   if (kvm_vcpu_apicv_active(vcpu))
>   avic_init_vmcb(svm);
>  
> - /*
> -  * If hardware supports Virtual VMLOAD VMSAVE then enable it
> -  * in VMCB and clear intercepts to avoid #VMEXIT.
> -  */
> - if (vls) {
> - svm_clr_intercept(svm, INTERCEPT_VMLOAD);
> - svm_clr_intercept(svm, INTERCEPT_VMSAVE);
> - svm->vmcb->control.virt_ext |= 
> VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
> - }
> -
>   if (vgif) {
>   svm_clr_intercept(svm, INTERCEPT_STGI);
>   svm_clr_intercept(svm, INTERCEPT_CLGI);
> @@ -2133,9 +2125,11 @@ static int vmload_vmsave_interception(struct kvm_vcpu 
> *vcpu, bool vmload)
>  
>   ret = kvm_skip_emulated_instruction(vcpu);
>  
> - if (vmload)
> + if (vmload) {
>   nested_svm_vmloadsave(vmcb12, svm->vmcb);
> - else
> + svm->sysenter_eip_hi = 0;
> + svm->sysenter_esp_hi = 0;
> + } else
>   nested_svm_vmloadsave(svm->vmcb, vmcb12);

Nitpicking: {} are now needed for both branches here.
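
i.e. (kernel coding style wants braces on both branches once one of them
needs braces):

	if (vmload) {
		nested_svm_vmloadsave(vmcb12, svm->vmcb);
		svm->sysenter_eip_hi = 0;
		svm->sysenter_esp_hi = 0;
	} else {
		nested_svm_vmloadsave(svm->vmcb, vmcb12);
	}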

>  
>   kvm_vcpu_unmap(vcpu, &map, true);
> @@ -2677,10 +2671,14 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct 
> msr_data *msr_info)
>   msr_info->data = svm->vmcb01.ptr->save.sysenter_cs;
>   break;
>   case MSR_IA32_SYSENTER_EIP:
> - msr_info->data = svm->sysenter_eip;
> + msr_info->data = (u32)svm->vmcb01.ptr->save.sysenter_eip;
> + if (guest_cpuid_is_intel(vcpu))
> + msr_info->data |= (u64)svm->sysenter_eip_hi << 32;
>   break;
>   case MSR_IA32_SYSENTER_ESP:
> - msr_info->data = svm->sysenter_esp;
> + msr_info->data = svm->vmcb01.ptr->save.sysenter_esp;
> + if (guest_cpuid_is_intel(vcpu))
> + msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
>   break;
>   case MSR_TSC_AUX:
>   if (!boot_cpu_has(X86_FEATURE_RDTSCP))
> @@ -2885,12 +2883,19 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct 
> msr_data *msr)
>   svm->vmcb01.ptr->save.sysenter_cs = data;
>   break;
>   case MSR_IA32_SYSENTER_EIP:
> - svm->sysenter_eip = 

Re: [PATCH 4/4] selftests: kvm: add get_emulated_cpuid test

2021-04-01 Thread Vitaly Kuznetsov
Emanuele Giuseppe Esposito  writes:

> Introduce a new selftest for the KVM_GET_EMULATED_CPUID
> ioctl. Since the behavior and functionality is similar to
> get_cpuid_test, the test checks:
>
> 1) checks for corner case in the nent field of the struct kvm_cpuid2.
> 2) sets and gets it as cpuid from the guest VM
>
> Signed-off-by: Emanuele Giuseppe Esposito 
> ---
>  tools/testing/selftests/kvm/.gitignore|   1 +
>  tools/testing/selftests/kvm/Makefile  |   1 +
>  .../selftests/kvm/x86_64/get_emulated_cpuid.c | 183 ++
>  3 files changed, 185 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/get_emulated_cpuid.c
>
> diff --git a/tools/testing/selftests/kvm/.gitignore 
> b/tools/testing/selftests/kvm/.gitignore
> index 7bd7e776c266..f1523f3bfd04 100644
> --- a/tools/testing/selftests/kvm/.gitignore
> +++ b/tools/testing/selftests/kvm/.gitignore
> @@ -8,6 +8,7 @@
>  /x86_64/debug_regs
>  /x86_64/evmcs_test
>  /x86_64/get_cpuid_test
> +x86_64/get_emulated_cpuid
>  /x86_64/get_msr_index_features
>  /x86_64/kvm_pv_test
>  /x86_64/hyperv_clock
> diff --git a/tools/testing/selftests/kvm/Makefile 
> b/tools/testing/selftests/kvm/Makefile
> index 67eebb53235f..0d8d3bd5a7c7 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -40,6 +40,7 @@ LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c 
> lib/s390x/diag318_test_ha
>  
>  TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test
>  TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
> +TEST_GEN_PROGS_x86_64 += x86_64/get_emulated_cpuid
>  TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
>  TEST_GEN_PROGS_x86_64 += x86_64/get_cpuid_test
>  TEST_GEN_PROGS_x86_64 += x86_64/hyperv_clock
> diff --git a/tools/testing/selftests/kvm/x86_64/get_emulated_cpuid.c 
> b/tools/testing/selftests/kvm/x86_64/get_emulated_cpuid.c
> new file mode 100644
> index ..f5294dc4b8ff
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/get_emulated_cpuid.c
> @@ -0,0 +1,183 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021, Red Hat Inc.
> + *
> + * Generic tests for KVM CPUID set/get ioctls
> + */
> +#include 
> +#include 
> +#include 
> +
> +#include "test_util.h"
> +#include "kvm_util.h"
> +#include "processor.h"
> +
> +#define VCPU_ID 0
> +#define MAX_NENT 1000
> +
> +/* CPUIDs known to differ */
> +struct {
> + u32 function;
> + u32 index;
> +} mangled_cpuids[] = {
> + {.function = 0xd, .index = 0},
> +};
> +
> +static void guest_main(void)
> +{
> +
> +}
> +
> +static bool is_cpuid_mangled(struct kvm_cpuid_entry2 *entrie)
> +{
> + int i;
> +
> + for (i = 0; i < sizeof(mangled_cpuids); i++) {
> + if (mangled_cpuids[i].function == entrie->function &&
> + mangled_cpuids[i].index == entrie->index)
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static void check_cpuid(struct kvm_cpuid2 *cpuid, struct kvm_cpuid_entry2 
> *entrie)
> +{
> + int i;
> +
> + for (i = 0; i < cpuid->nent; i++) {
> + if (cpuid->entries[i].function == entrie->function &&
> + cpuid->entries[i].index == entrie->index) {
> + if (is_cpuid_mangled(entrie))
> + return;
> +
> + TEST_ASSERT(cpuid->entries[i].eax == entrie->eax &&
> + cpuid->entries[i].ebx == entrie->ebx &&
> + cpuid->entries[i].ecx == entrie->ecx &&
> + cpuid->entries[i].edx == entrie->edx,
> + "CPUID 0x%x.%x differ: 0x%x:0x%x:0x%x:0x%x 
> vs 0x%x:0x%x:0x%x:0x%x",
> + entrie->function, entrie->index,
> + cpuid->entries[i].eax, 
> cpuid->entries[i].ebx,
> + cpuid->entries[i].ecx, 
> cpuid->entries[i].edx,
> + entrie->eax, entrie->ebx, entrie->ecx, 
> entrie->edx);
> + return;
> + }
> + }
> +
> + TEST_ASSERT(false, "CPUID 0x%x.%x not found", entrie->function, 
> entrie->index);
> +}
> +
> +static void compare_cpuids(struct kvm_cpuid2 *cpuid1,
> +struct kvm_cpuid2 *cpuid2)
> +{
> + int i;
> +
> + for (i = 0; i < cpuid1->nent; i++)
> + check_cpuid(cpuid2, &cpuid1->entries[i]);
> +
> + for (i = 0; i < cpuid2->nent; i++)
> + check_cpuid(cpuid1, &cpuid2->entries[i]);
> +}

CPUID comparison here seems to be borrowed from get_cpuid_test.c, I
think we can either put it to a library or (my preference) just merge
these two selftests together. 'get_cpuid_test' name is generic enough to
be used for KVM_GET_EMULATED_CPUID too.

> +
> +struct kvm_cpuid2 *vcpu_alloc_cpuid(struct kvm_vm *vm, vm_vaddr_t *p_gva, 
> struct kvm_cpuid2 *cpuid)
> +{
> + int size = sizeof(*cpuid) + cpuid->nent * 

Re: [PATCH] KVM: x86: Fix potential memory access error

2021-04-01 Thread Vitaly Kuznetsov
Sean Christopherson  writes:

> On Wed, Mar 31, 2021, Yang Li wrote:
>> Using __set_bit() to set a bit in an integer is not a good idea, since
>> the function expects an unsigned long as argument, which can be 64bit wide.
>> Coverity reports this problem as
>> 
>> High:Out-of-bounds access(INCOMPATIBLE_CAST)
>> CWE119: Out-of-bounds access to a scalar
>> Pointer ">arch.regs_avail" points to an object whose effective
>> type is "unsigned int" (32 bits, unsigned) but is dereferenced as a
>> wider "unsigned long" (64 bits, unsigned). This may lead to memory
>> corruption.
>> 
>> /home/heyuan.shy/git-repo/linux/arch/x86/kvm/kvm_cache_regs.h:
>> kvm_register_is_available
>> 
>> Just use BIT instead.
>
> Meh, we're hosed either way.  Using BIT() will either result in undefined
> behavior due to SHL shifting beyond the size of a u64, or setting random bits
> if the truncated shift ends up being less than 63.
>

A stupid question: why can't we just make 'regs_avail'/'regs_dirty'
'unsigned long' and drop a bunch of '(unsigned long *)' casts? 
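
Something like this (untested, assuming 'regs_avail'/'regs_dirty' are
still plain u32 fields in kvm_host.h):

	/* arch/x86/include/asm/kvm_host.h */
	-	u32 regs_avail;
	-	u32 regs_dirty;
	+	unsigned long regs_avail;
	+	unsigned long regs_dirty;

and then kvm_cache_regs.h could pass &vcpu->arch.regs_avail /
&vcpu->arch.regs_dirty to __set_bit()/test_bit() directly, without the
(unsigned long *) casts.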

-- 
Vitaly



Re: [PATCH -next] ACPI: processor: Fix a prepocessor warning

2021-04-01 Thread Vitaly Kuznetsov
Shixin Liu  writes:

> When compiling with defconfig on x86_64, I got a warning:
>
> drivers/acpi/processor_idle.c: In function ‘acpi_idle_play_dead’:
> drivers/acpi/processor_idle.c:542:15: warning: extra tokens at end of #ifdef 
> directive
>   542 | #ifdef defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
>   |
>
> Fixes: bc5706eaeae0 ("ACPI: processor: Fix CPU0 wakeup in 
> acpi_idle_play_dead()")
> Signed-off-by: Shixin Liu 
> ---
>  drivers/acpi/processor_idle.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 19fb28a8005b..0925b1477230 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -539,7 +539,7 @@ static int acpi_idle_play_dead(struct cpuidle_device 
> *dev, int index)
>   } else
>   return -ENODEV;
>  
> -#ifdef defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
> +#if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU)
>   /* If NMI wants to wake up CPU0, start CPU0. */
>   if (wakeup_cpu0())
>   start_cpu0();

Thank you for the patch,

this was already reported by Stephen Rothwell and I suggested the exact
same fix to Rafael:

https://lore.kernel.org/lkml/87czvfu9j5@vitty.brq.redhat.com/

It would probably be better if we fold the fix in (if still possible).

-- 
Vitaly



[PATCH v3 1/2] KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update()

2021-03-31 Thread Vitaly Kuznetsov
When guest time is reset with KVM_SET_CLOCK(0), it is possible for
'hv_clock->system_time' to become a small negative number. This happens
because in KVM_SET_CLOCK handling we set 'kvm->arch.kvmclock_offset' based
on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled,
kvm_guest_time_update() does (masterclock in use case):

hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset;

And 'master_kernel_ns' represents the last time when masterclock
got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a
problem, the difference is very small, e.g. I'm observing
hv_clock.system_time = -70 ns. The issue comes from the fact that
'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in
compute_tsc_page_parameters() becomes a very big number.
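
To illustrate with that value: -70 stored in the unsigned 64-bit
'system_time' is 18446744073709551546, and even after the '/ 100' in
compute_tsc_page_parameters() the result is still ~1.8 * 10^17 rather
than the almost-zero value that was intended.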

Use 'master_kernel_ns' instead of get_kvmclock_ns() when masterclock is in
use and get_kvmclock_base_ns() when it's not to prevent 'system_time' from
going negative.

Signed-off-by: Vitaly Kuznetsov 
---
 arch/x86/kvm/x86.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2bfd00da465f..2f54beed0105 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5728,6 +5728,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
 #endif
case KVM_SET_CLOCK: {
+   struct kvm_arch *ka = &kvm->arch;
struct kvm_clock_data user_ns;
u64 now_ns;
 
@@ -5746,8 +5747,22 @@ long kvm_arch_vm_ioctl(struct file *filp,
 * pvclock_update_vm_gtod_copy().
 */
kvm_gen_update_masterclock(kvm);
-   now_ns = get_kvmclock_ns(kvm);
-   kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+
+   /*
+* This pairs with kvm_guest_time_update(): when masterclock is
+* in use, we use master_kernel_ns + kvmclock_offset to set
+* unsigned 'system_time' so if we use get_kvmclock_ns() (which
+* is slightly ahead) here we risk going negative on unsigned
+* 'system_time' when 'user_ns.clock' is very small.
+*/
+   spin_lock_irq(&ka->pvclock_gtod_sync_lock);
+   if (kvm->arch.use_master_clock)
+   now_ns = ka->master_kernel_ns;
+   else
+   now_ns = get_kvmclock_base_ns();
+   ka->kvmclock_offset = user_ns.clock - now_ns;
+   spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
+
kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
break;
}
-- 
2.30.2



  1   2   3   4   5   6   7   8   9   10   >