Re: kvm-unit-tests: psci_cpu_on_test FAILed

2019-08-02 Thread Marc Zyngier
On 02/08/2019 11:56, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on a Kunpeng 920, we get
> the following failure info:
> 
>   [...]
>   FAIL psci (4 tests, 1 unexpected failures)
>   [...]
> and
>   [...]
>   INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>   FAIL: cpu-on
>   SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue that was fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> is no longer protected by the kvm->lock mutex, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least ensure mutual exclusion
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?

The thing is that the way we reset registers is marginally insane.
Yes, it catches most reset bugs. It also introduces many more in
the rest of the paths.

The fun part is that there is hardly a need for resetting MPIDR.
It has already been set when we've created the vcpu. It is the
poisoning of the sysreg array that creates a situation where
the MPIDR is temporarily invalid.

So instead of poisoning the array, how about we just keep
track of the registers for which we've called a reset function?
It should be enough to track the most obvious bugs... I've
cobbled the following patch together, which seems to fix the
issue on my TX2 with 64 vcpus.

Thoughts?

M.

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f26e181d881c..17f46ee7dc83 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 }
 
 static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
- const struct sys_reg_desc *table, size_t num)
+   const struct sys_reg_desc *table, size_t num,
+   unsigned long *bmap)
 {
unsigned long i;
 
for (i = 0; i < num; i++)
-   if (table[i].reset)
+   if (table[i].reset) {
table[i].reset(vcpu, &table[i]);
+   if (bmap)
+   set_bit(i, bmap);
+   }
 }
 
 /**
@@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
  */
 void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
 {
+   unsigned long *bmap;
size_t num;
const struct sys_reg_desc *table;
 
-   /* Catch someone adding a register without putting in reset entry. */
-   memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
+   bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
 
/* Generic chip reset first (so target could override). */
-   reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+   reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
 
table = get_target_table(vcpu->arch.target, true, &num);
-   reset_sys_reg_descs(vcpu, table, num);
+   reset_sys_reg_descs(vcpu, table, num, bmap);
 
for (num = 1; num < NR_SYS_REGS; num++) {
-   if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+   if (WARN(bmap && !test_bit(num, bmap),
 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
break;
}
+
+   kfree(bmap);
 }


-- 
Jazz is not dead, it just smells funny...


[PATCH 6/9] KVM: arm64: Provide a PV_TIME device to user space

2019-08-02 Thread Steven Price
Allow user space to inform the KVM host where in the physical memory
map the paravirtualized time structures should be located.

A device is created which provides the base address of an array of
Stolen Time (ST) structures, one for each VCPU. There must be (64 *
total number of VCPUs) bytes of memory available at this location.

The address is given in terms of the physical address visible to
the guest and must be 64-byte aligned. The memory should be marked as
reserved to the guest to stop the guest from allocating it for other
purposes.
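
As a rough sketch (not part of this patch), user space would drive this
along the following lines, using the constants added below; it assumes
<linux/kvm.h> and <sys/ioctl.h>, and error handling plus the reservation
of the guest memory are elided:

static int setup_pv_time(int vm_fd, __u64 st_ipa)
{
	struct kvm_create_device cd = { .type = KVM_DEV_TYPE_ARM_PV_TIME };
	struct kvm_device_attr attr = {
		.group = KVM_DEV_ARM_PV_TIME_PADDR,
		.attr  = KVM_DEV_ARM_PV_TIME_ST,
		/* st_ipa: 64-byte aligned, 64 * nr_vcpus bytes reserved */
		.addr  = (__u64)(unsigned long)&st_ipa,
	};

	if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd))
		return -1;
	/* cd.fd now refers to the new PV_TIME device */
	return ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr);
}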

Signed-off-by: Steven Price 
---
 arch/arm64/include/asm/kvm_mmu.h  |   2 +
 arch/arm64/include/uapi/asm/kvm.h |   6 +
 arch/arm64/kvm/Makefile   |   1 +
 include/uapi/linux/kvm.h  |   2 +
 virt/kvm/arm/mmu.c|  44 +++
 virt/kvm/arm/pvtime.c | 190 ++
 6 files changed, 245 insertions(+)
 create mode 100644 virt/kvm/arm/pvtime.c

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index befe37d4bc0e..88c8a4b2836f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -157,6 +157,8 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm);
 void kvm_free_stage2_pgd(struct kvm *kvm);
 int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
  phys_addr_t pa, unsigned long size, bool writable);
+int kvm_phys_addr_memremap(struct kvm *kvm, phys_addr_t guest_ipa,
+ phys_addr_t pa, unsigned long size, bool writable);
 
 int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
 
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9a507716ae2f..95516a4198ea 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -367,6 +367,12 @@ struct kvm_vcpu_events {
 #define KVM_PSCI_RET_INVAL PSCI_RET_INVALID_PARAMS
 #define KVM_PSCI_RET_DENIED	PSCI_RET_DENIED
 
+/* Device Control API: PV_TIME */
+#define KVM_DEV_ARM_PV_TIME_PADDR  0
+#define  KVM_DEV_ARM_PV_TIME_ST	0
+#define KVM_DEV_ARM_PV_TIME_STATE_SIZE 1
+#define KVM_DEV_ARM_PV_TIME_STATE  2
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 73dce4d47d47..5ffbdc39e780 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -14,6 +14,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hypercalls.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/pvtime.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a7c19540ce21..04bffafa0708 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1222,6 +1222,8 @@ enum kvm_device_type {
 #define KVM_DEV_TYPE_ARM_VGIC_ITS  KVM_DEV_TYPE_ARM_VGIC_ITS
KVM_DEV_TYPE_XIVE,
 #define KVM_DEV_TYPE_XIVE  KVM_DEV_TYPE_XIVE
+   KVM_DEV_TYPE_ARM_PV_TIME,
+#define KVM_DEV_TYPE_ARM_PV_TIME   KVM_DEV_TYPE_ARM_PV_TIME
KVM_DEV_TYPE_MAX,
 };
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 38b4c910b6c3..be28a4aee451 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1368,6 +1368,50 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
return ret;
 }
 
+/**
+ * kvm_phys_addr_memremap - map a memory range to guest IPA
+ *
+ * @kvm:   The KVM pointer
+ * @guest_ipa: The IPA at which to insert the mapping
+ * @pa:	The physical address of the memory
+ * @size:	The size of the mapping
+ * @writable:	Whether or not to create a writable mapping
+ */
+int kvm_phys_addr_memremap(struct kvm *kvm, phys_addr_t guest_ipa,
+ phys_addr_t pa, unsigned long size, bool writable)
+{
+   phys_addr_t addr, end;
+   int ret = 0;
+   unsigned long pfn;
+   struct kvm_mmu_memory_cache cache = { 0, };
+
+   end = (guest_ipa + size + PAGE_SIZE - 1) & PAGE_MASK;
+   pfn = __phys_to_pfn(pa);
+
+   for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
+   pte_t pte = pfn_pte(pfn, PAGE_S2);
+
+   if (writable)
+   pte = kvm_s2pte_mkwrite(pte);
+
+   ret = mmu_topup_memory_cache(&cache,
+kvm_mmu_cache_min_pages(kvm),
+KVM_NR_MEM_OBJS);
+   if (ret)
+   goto out;
+   spin_lock(&kvm->mmu_lock);
+   ret = stage2_set_pte(kvm, &cache, addr, &pte, 0);
+   spin_unlock(&kvm->mmu_lock);
+   if (ret)
+   goto out;
+
+   pfn++;
+   }
+
+out:
+   mmu_free_memory_cache(&cache);
+   

[PATCH 9/9] arm64: Retrieve stolen time as paravirtualized guest

2019-08-02 Thread Steven Price
Enable paravirtualization features when running under a hypervisor
supporting the PV_TIME_ST hypercall.

For each (v)CPU, we ask the hypervisor for the location of a shared
page which the hypervisor will use to report stolen time to us. We set
pv_time_ops to the stolen time function which simply reads the stolen
value from the shared page for a VCPU. We guarantee single-copy
atomicity using READ_ONCE, which means we can also safely read the
stolen time of a VCPU other than the currently running one while the
hypervisor is potentially updating it.

Signed-off-by: Steven Price 
---
 arch/arm64/kernel/Makefile |   1 +
 arch/arm64/kernel/kvm.c| 155 +
 include/linux/cpuhotplug.h |   1 +
 3 files changed, 157 insertions(+)
 create mode 100644 arch/arm64/kernel/kvm.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 478491f07b4f..eb36edf9b930 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_CRASH_CORE)  += crash_core.o
 obj-$(CONFIG_ARM_SDE_INTERFACE)+= sdei.o
 obj-$(CONFIG_ARM64_SSBD)   += ssbd.o
 obj-$(CONFIG_ARM64_PTR_AUTH)   += pointer_auth.o
+obj-$(CONFIG_PARAVIRT) += kvm.o
 
 obj-y  += vdso/ probes/
 obj-$(CONFIG_COMPAT_VDSO)  += vdso32/
diff --git a/arch/arm64/kernel/kvm.c b/arch/arm64/kernel/kvm.c
new file mode 100644
index ..245398c79dae
--- /dev/null
+++ b/arch/arm64/kernel/kvm.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Arm Ltd.
+
+#define pr_fmt(fmt) "kvmarm-pv: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+struct kvmarm_stolen_time_region {
+   struct pvclock_vcpu_stolen_time_info *kaddr;
+};
+
+static DEFINE_PER_CPU(struct kvmarm_stolen_time_region, stolen_time_region);
+
+static bool steal_acc = true;
+static int __init parse_no_stealacc(char *arg)
+{
+   steal_acc = false;
+   return 0;
+}
+early_param("no-steal-acc", parse_no_stealacc);
+
+/* return stolen time in ns by asking the hypervisor */
+static u64 kvm_steal_clock(int cpu)
+{
+   struct kvmarm_stolen_time_region *reg;
+
+   reg = per_cpu_ptr(&stolen_time_region, cpu);
+   if (!reg->kaddr) {
+   pr_warn_once("stolen time enabled but not configured for cpu %d\n",
+                cpu);
+   return 0;
+   }
+
+   return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
+}
+
+static int disable_stolen_time_current_cpu(void)
+{
+   struct kvmarm_stolen_time_region *reg;
+
+   reg = this_cpu_ptr(&stolen_time_region);
+   if (!reg->kaddr)
+   return 0;
+
+   memunmap(reg->kaddr);
+   memset(reg, 0, sizeof(*reg));
+
+   return 0;
+}
+
+static int stolen_time_dying_cpu(unsigned int cpu)
+{
+   return disable_stolen_time_current_cpu();
+}
+
+static int init_stolen_time_cpu(unsigned int cpu)
+{
+   struct kvmarm_stolen_time_region *reg;
+   struct arm_smccc_res res;
+
+   reg = this_cpu_ptr(&stolen_time_region);
+
+   if (reg->kaddr)
+   return 0;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_ST, &res);
+
+   if ((long)res.a0 < 0)
+   return -EINVAL;
+
+   reg->kaddr = memremap(res.a0,
+   sizeof(struct pvclock_vcpu_stolen_time_info),
+   MEMREMAP_WB);
+
+   if (reg->kaddr == NULL) {
+   pr_warn("Failed to map stolen time data structure\n");
+   return -EINVAL;
+   }
+
+   if (le32_to_cpu(reg->kaddr->revision) != 0 ||
+   le32_to_cpu(reg->kaddr->attributes) != 0) {
+   pr_warn("Unexpected revision or attributes in stolen time data\n");
+   return -ENXIO;
+   }
+
+   return 0;
+}
+
+static int kvm_arm_init_stolen_time(void)
+{
+   int ret;
+
+   ret = cpuhp_setup_state(CPUHP_AP_ARM_KVMPV_STARTING,
+   "hypervisor/kvmarm/pv:starting",
+   init_stolen_time_cpu, stolen_time_dying_cpu);
+   if (ret < 0)
+   return ret;
+   return 0;
+}
+
+static bool has_kvm_steal_clock(void)
+{
+   struct arm_smccc_res res;
+
+   /* To detect the presence of PV time support we require SMCCC 1.1+ */
+   if (psci_ops.smccc_version < SMCCC_VERSION_1_1)
+   return false;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+                ARM_SMCCC_HV_PV_FEATURES, &res);
+
+   if (res.a0 != SMCCC_RET_SUCCESS)
+   return false;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_FEATURES,
+                ARM_SMCCC_HV_PV_TIME_ST, &res);
+
+   if (res.a0 != SMCCC_RET_SUCCESS)
+   return false;
+
+   return true;
+}
+
+static int __init kvm_guest_init(void)
+{
+   int ret = 0;
+
+   if 

[PATCH 7/9] arm/arm64: Provide a wrapper for SMCCC 1.1 calls

2019-08-02 Thread Steven Price
SMCCC 1.1 calls may use either HVC or SMC depending on the PSCI
conduit. Rather than coding this in every call site provide a macro
which uses the correct instruction. The macro also handles the case
where no PSCI conduit is configured returning a not supported error
in res, along with returning the conduit used for the call.

This allows us to remove some duplicated code and will be useful later
when adding paravirtualized time hypervisor calls.
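
As an illustration, a call site then reduces to something like the
following sketch (mirroring the conversions in the next patch):

	struct arm_smccc_res res;
	int conduit;

	conduit = arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
				       ARM_SMCCC_ARCH_WORKAROUND_1, &res);
	/* res.a0 is SMCCC_RET_NOT_SUPPORTED if no conduit is configured */
	if (conduit == PSCI_CONDUIT_NONE || (int)res.a0 < 0)
		return -1;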

Signed-off-by: Steven Price 
---
 include/linux/arm-smccc.h | 44 +++
 1 file changed, 44 insertions(+)

diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index e7f129f26ebd..eee1e832221d 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -303,6 +303,50 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
 #define SMCCC_RET_NOT_SUPPORTED	-1
 #define SMCCC_RET_NOT_REQUIRED -2
 
+/* Like arm_smccc_1_1* but always returns SMCCC_RET_NOT_SUPPORTED.
+ * Used when the PSCI conduit is not defined. The empty asm statement
+ * avoids compiler warnings about unused variables.
+ */
+#define __fail_smccc_1_1(...)  \
+   do {\
+   __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
+   asm ("" __constraints(__count_args(__VA_ARGS__)));  \
+   if (___res) \
+   ___res->a0 = SMCCC_RET_NOT_SUPPORTED;   \
+   } while (0)
+
+/*
+ * arm_smccc_1_1_invoke() - make an SMCCC v1.1 compliant call
+ *
+ * This is a variadic macro taking one to eight source arguments, and
+ * an optional return structure.
+ *
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * This macro will make either an HVC call or an SMC call depending on the
+ * current PSCI conduit. If no valid conduit is available then -1
+ * (SMCCC_RET_NOT_SUPPORTED) is returned in @res.a0 (if supplied).
+ *
+ * The return value also provides the conduit that was used.
+ */
+#define arm_smccc_1_1_invoke(...) ({   \
+   int method = psci_ops.conduit;  \
+   switch (method) {   \
+   case PSCI_CONDUIT_HVC:  \
+   arm_smccc_1_1_hvc(__VA_ARGS__); \
+   break;  \
+   case PSCI_CONDUIT_SMC:  \
+   arm_smccc_1_1_smc(__VA_ARGS__); \
+   break;  \
+   default:\
+   __fail_smccc_1_1(__VA_ARGS__);  \
+   method = PSCI_CONDUIT_NONE; \
+   break;  \
+   }   \
+   method; \
+   })
+
 /* Paravirtualised time calls (defined by ARM DEN0057A) */
 #define ARM_SMCCC_HV_PV_FEATURES   \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
-- 
2.20.1



[PATCH 8/9] arm/arm64: Make use of the SMCCC 1.1 wrapper

2019-08-02 Thread Steven Price
Rather than directly choosing which function to use based on
psci_ops.conduit, use the new arm_smccc_1_1 wrapper instead.

In some cases we still need to do some operations based on the
conduit, but the code duplication is removed.

No functional change.

Signed-off-by: Steven Price 
---
 arch/arm/mm/proc-v7-bugs.c | 13 +++---
 arch/arm64/kernel/cpu_errata.c | 80 --
 2 files changed, 33 insertions(+), 60 deletions(-)

diff --git a/arch/arm/mm/proc-v7-bugs.c b/arch/arm/mm/proc-v7-bugs.c
index 9a07916af8dd..8eb52f3385e7 100644
--- a/arch/arm/mm/proc-v7-bugs.c
+++ b/arch/arm/mm/proc-v7-bugs.c
@@ -78,12 +78,13 @@ static void cpu_v7_spectre_init(void)
if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
break;
 
+   arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+                ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+   if ((int)res.a0 != 0)
+   return;
+
switch (psci_ops.conduit) {
case PSCI_CONDUIT_HVC:
-   arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   if ((int)res.a0 != 0)
-   break;
per_cpu(harden_branch_predictor_fn, cpu) =
call_hvc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
@@ -91,10 +92,6 @@ static void cpu_v7_spectre_init(void)
break;
 
case PSCI_CONDUIT_SMC:
-   arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   if ((int)res.a0 != 0)
-   break;
per_cpu(harden_branch_predictor_fn, cpu) =
call_smc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_smc_switch_mm;
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 1e43ba5c79b7..400a49aaae85 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -215,40 +215,31 @@ static int detect_harden_bp_fw(void)
if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
return -1;
 
+   arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+                ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+
+   switch ((int)res.a0) {
+   case 1:
+   /* Firmware says we're just fine */
+   return 0;
+   case 0:
+   break;
+   default:
+   return -1;
+   }
+
switch (psci_ops.conduit) {
case PSCI_CONDUIT_HVC:
-   arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   switch ((int)res.a0) {
-   case 1:
-   /* Firmware says we're just fine */
-   return 0;
-   case 0:
-   cb = call_hvc_arch_workaround_1;
-   /* This is a guest, no need to patch KVM vectors */
-   smccc_start = NULL;
-   smccc_end = NULL;
-   break;
-   default:
-   return -1;
-   }
+   cb = call_hvc_arch_workaround_1;
+   /* This is a guest, no need to patch KVM vectors */
+   smccc_start = NULL;
+   smccc_end = NULL;
break;
 
case PSCI_CONDUIT_SMC:
-   arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   switch ((int)res.a0) {
-   case 1:
-   /* Firmware says we're just fine */
-   return 0;
-   case 0:
-   cb = call_smc_arch_workaround_1;
-   smccc_start = __smccc_workaround_1_smc_start;
-   smccc_end = __smccc_workaround_1_smc_end;
-   break;
-   default:
-   return -1;
-   }
+   cb = call_smc_arch_workaround_1;
+   smccc_start = __smccc_workaround_1_smc_start;
+   smccc_end = __smccc_workaround_1_smc_end;
break;
 
default:
@@ -338,6 +329,7 @@ void __init arm64_enable_wa2_handling(struct alt_instr *alt,
 
 void arm64_set_ssbd_mitigation(bool state)
 {
+   int conduit;
if (!IS_ENABLED(CONFIG_ARM64_SSBD)) {
pr_info_once("SSBD disabled by kernel configuration\n");
return;
@@ -351,19 +343,10 @@ void arm64_set_ssbd_mitigation(bool state)
return;
}
 
-   switch (psci_ops.conduit) {
-   case PSCI_CONDUIT_HVC:
-

[PATCH 2/9] KVM: arm/arm64: Factor out hypercall handling from PSCI code

2019-08-02 Thread Steven Price
From: Christoffer Dall 

We currently intertwine the KVM PSCI implementation with the general
dispatch of hypercall handling, which makes perfect sense because PSCI
is the only category of hypercalls we support.

However, as we are about to support additional hypercalls, factor out
this functionality into a separate hypercall handler file.

Signed-off-by: Christoffer Dall 
[steven.pr...@arm.com: rebased]
Signed-off-by: Steven Price 
---
 arch/arm/kvm/Makefile|  2 +-
 arch/arm/kvm/handle_exit.c   |  2 +-
 arch/arm64/kvm/Makefile  |  1 +
 arch/arm64/kvm/handle_exit.c |  4 +-
 include/kvm/arm_hypercalls.h | 43 ++
 include/kvm/arm_psci.h   |  2 +-
 virt/kvm/arm/hypercalls.c| 59 +
 virt/kvm/arm/psci.c  | 84 +---
 8 files changed, 110 insertions(+), 87 deletions(-)
 create mode 100644 include/kvm/arm_hypercalls.h
 create mode 100644 virt/kvm/arm/hypercalls.c

diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 531e59f5be9c..ef4d01088efc 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -23,7 +23,7 @@ obj-y += kvm-arm.o init.o interrupts.o
 obj-y += handle_exit.o guest.o emulate.o reset.o
 obj-y += coproc.o coproc_a15.o coproc_a7.o   vgic-v3-coproc.o
 obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
-obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
+obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o $(KVM)/arm/hypercalls.o
 obj-y += $(KVM)/arm/aarch32.o
 
 obj-y += $(KVM)/arm/vgic/vgic.o
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 2a6a1394d26e..e58a89d2f13f 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -9,7 +9,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "trace.h"
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 3ac1a64d2fb9..73dce4d47d47 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += hyp/
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hypercalls.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 706cca23f0d2..aacfc55de44c 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -11,8 +11,6 @@
 #include 
 #include 
 
-#include 
-
 #include 
 #include 
 #include 
@@ -22,6 +20,8 @@
 #include 
 #include 
 
+#include 
+
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
new file mode 100644
index ..35a5abcc4ca3
--- /dev/null
+++ b/include/kvm/arm_hypercalls.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2019 Arm Ltd. */
+
+#ifndef __KVM_ARM_HYPERCALLS_H
+#define __KVM_ARM_HYPERCALLS_H
+
+#include 
+
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
+
+static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 0);
+}
+
+static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 1);
+}
+
+static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 2);
+}
+
+static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 3);
+}
+
+static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
+unsigned long a0,
+unsigned long a1,
+unsigned long a2,
+unsigned long a3)
+{
+   vcpu_set_reg(vcpu, 0, a0);
+   vcpu_set_reg(vcpu, 1, a1);
+   vcpu_set_reg(vcpu, 2, a2);
+   vcpu_set_reg(vcpu, 3, a3);
+}
+
+#endif
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 632e78bdef4d..5b58bd2fe088 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -40,7 +40,7 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu, struct kvm *kvm)
 }
 
 
-int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
+int kvm_psci_call(struct kvm_vcpu *vcpu);
 
 struct kvm_one_reg;
 
diff --git a/virt/kvm/arm/hypercalls.c b/virt/kvm/arm/hypercalls.c
new file mode 100644
index ..f875241bd030
--- /dev/null
+++ b/virt/kvm/arm/hypercalls.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Arm Ltd.
+
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
+{
+   u32 func_id = smccc_get_function(vcpu);
+   u32 val = SMCCC_RET_NOT_SUPPORTED;
+   u32 feature;
+
+   switch (func_id) {
+   

[PATCH 5/9] KVM: Allow kvm_device_ops to be const

2019-08-02 Thread Steven Price
Currently a kvm_device_ops structure cannot be const without triggering
compiler warnings. However the structure doesn't need to be written to
and, by marking it const, it can be read-only in memory. Add some more
const keywords to allow this.

Signed-off-by: Steven Price 
---
 include/linux/kvm_host.h | 4 ++--
 virt/kvm/kvm_main.c  | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5c5b5867024c..be31a6f8351a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1236,7 +1236,7 @@ extern unsigned int halt_poll_ns_grow_start;
 extern unsigned int halt_poll_ns_shrink;
 
 struct kvm_device {
-   struct kvm_device_ops *ops;
+   const struct kvm_device_ops *ops;
struct kvm *kvm;
void *private;
struct list_head vm_node;
@@ -1289,7 +1289,7 @@ struct kvm_device_ops {
 void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
-int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type);
+int kvm_register_device_ops(const struct kvm_device_ops *ops, u32 type);
 void kvm_unregister_device_ops(u32 type);
 
 extern struct kvm_device_ops kvm_mpic_ops;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 887f3b0c2b60..8c12110ec87a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3035,14 +3035,14 @@ struct kvm_device *kvm_device_from_filp(struct file *filp)
return filp->private_data;
 }
 
-static struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
+static const struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
 #ifdef CONFIG_KVM_MPIC
[KVM_DEV_TYPE_FSL_MPIC_20]  = &kvm_mpic_ops,
[KVM_DEV_TYPE_FSL_MPIC_42]  = &kvm_mpic_ops,
 #endif
 };
 
-int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type)
+int kvm_register_device_ops(const struct kvm_device_ops *ops, u32 type)
 {
if (type >= ARRAY_SIZE(kvm_device_ops_table))
return -ENOSPC;
@@ -3063,7 +3063,7 @@ void kvm_unregister_device_ops(u32 type)
 static int kvm_ioctl_create_device(struct kvm *kvm,
   struct kvm_create_device *cd)
 {
-   struct kvm_device_ops *ops = NULL;
+   const struct kvm_device_ops *ops = NULL;
struct kvm_device *dev;
bool test = cd->flags & KVM_CREATE_DEVICE_TEST;
int type;
-- 
2.20.1



[PATCH 4/9] KVM: arm64: Support stolen time reporting via shared structure

2019-08-02 Thread Steven Price
Implement the service call for configuring a shared structure between a
VCPU and the hypervisor in which the hypervisor can write the time
stolen from the VCPU's execution time by other tasks on the host.

The hypervisor allocates memory which is placed at an IPA chosen by user
space. The hypervisor then uses WRITE_ONCE() to update the shared
structure, ensuring single-copy atomicity of the 64-bit unsigned value
that reports stolen time in nanoseconds.

Whenever stolen time is enabled by the guest, the stolen time counter is
reset.

The stolen time itself is retrieved from the sched_info structure
maintained by the Linux scheduler code. We enable SCHEDSTATS when
selecting KVM in Kconfig to ensure this value is meaningful.
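
For illustration, the update described above boils down to something
like the sketch below (update_steal_sketch is a hypothetical name, not
the helper added by this patch; the mapping of the shared structure and
error handling are elided):

static void update_steal_sketch(struct kvm_vcpu *vcpu,
				struct pvclock_vcpu_stolen_time_info *st)
{
	/* run_delay is maintained by the scheduler under SCHEDSTATS */
	u64 now = current->sched_info.run_delay;

	vcpu->arch.steal.steal += now - vcpu->arch.steal.last_steal;
	vcpu->arch.steal.last_steal = now;

	/* Single-copy atomic, little-endian update visible to the guest */
	WRITE_ONCE(st->stolen_time, cpu_to_le64(vcpu->arch.steal.steal));
}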

Signed-off-by: Steven Price 
---
 arch/arm64/include/asm/kvm_host.h | 13 +-
 arch/arm64/kvm/Kconfig|  1 +
 include/kvm/arm_hypercalls.h  |  1 +
 include/linux/kvm_types.h |  2 +
 virt/kvm/arm/arm.c| 18 
 virt/kvm/arm/hypercalls.c | 70 +++
 6 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f656169db8c3..78f270190d43 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -44,6 +44,7 @@
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDING	KVM_ARCH_REQ(1)
 #define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
+#define KVM_REQ_RECORD_STEAL   KVM_ARCH_REQ(3)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -83,6 +84,11 @@ struct kvm_arch {
 
/* Mandated version of PSCI */
u32 psci_version;
+
+   struct kvm_arch_pvtime {
+   void *st;
+   gpa_t st_base;
+   } pvtime;
 };
 
 #define KVM_NR_MEM_OBJS 40
@@ -338,8 +344,13 @@ struct kvm_vcpu_arch {
/* True when deferrable sysregs are loaded on the physical CPU,
 * see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
bool sysregs_loaded_on_cpu;
-};
 
+   /* Guest PV state */
+   struct {
+   u64 steal;
+   u64 last_steal;
+   } steal;
+};
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
  sve_ffr_offset((vcpu)->arch.sve_max_vl)))
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index a67121d419a2..d8b88e40d223 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -39,6 +39,7 @@ config KVM
select IRQ_BYPASS_MANAGER
select HAVE_KVM_IRQ_BYPASS
select HAVE_KVM_VCPU_RUN_PID_CHANGE
+   select SCHEDSTATS
---help---
  Support hosting virtualized guest machines.
  We don't support KVM with 16K page tables yet, due to the multiple
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 35a5abcc4ca3..9f0710ab4292 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -7,6 +7,7 @@
 #include 
 
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
+int kvm_update_stolen_time(struct kvm_vcpu *vcpu);
 
 static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
 {
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index bde5374ae021..1c88e69db3d9 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -35,6 +35,8 @@ typedef unsigned long  gva_t;
typedef u64		gpa_t;
typedef u64		gfn_t;

+#define GPA_INVALID	(~(gpa_t)0)

typedef unsigned long	hva_t;
typedef u64		hpa_t;
typedef u64		hfn_t;
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index f645c0fbf7ec..ebd963d2580b 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -40,6 +40,10 @@
 #include 
 #include 
 
+#include 
+#include 
+#include 
+
 #ifdef REQUIRES_VIRT
 __asm__(".arch_extension   virt");
 #endif
@@ -135,6 +139,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm->arch.max_vcpus = vgic_present ?
kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
 
+   kvm->arch.pvtime.st_base = GPA_INVALID;
return ret;
 out_free_stage2_pgd:
kvm_free_stage2_pgd(kvm);
@@ -371,6 +376,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
kvm_vcpu_load_sysregs(vcpu);
kvm_arch_vcpu_load_fp(vcpu);
kvm_vcpu_pmu_restore_guest(vcpu);
+   kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
 
if (single_task_running())
vcpu_clear_wfe_traps(vcpu);
@@ -617,6 +623,15 @@ static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
smp_rmb();
 }
 
+static void vcpu_req_record_steal(struct kvm_vcpu *vcpu)
+{
+   int idx;
+
+   idx = srcu_read_lock(&vcpu->kvm->srcu);
+   kvm_update_stolen_time(vcpu);
+   srcu_read_unlock(&vcpu->kvm->srcu, idx);
+}
+
 static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
 {
 

[PATCH 3/9] KVM: arm64: Implement PV_FEATURES call

2019-08-02 Thread Steven Price
This provides a mechanism for querying which paravirtualized features
are available in this hypervisor.

Also add the header file which defines the ABI for the paravirtualized
clock features we're about to add.

Signed-off-by: Steven Price 
---
 arch/arm64/include/asm/pvclock-abi.h | 20 
 include/linux/arm-smccc.h| 14 ++
 virt/kvm/arm/hypercalls.c|  9 +
 3 files changed, 43 insertions(+)
 create mode 100644 arch/arm64/include/asm/pvclock-abi.h

diff --git a/arch/arm64/include/asm/pvclock-abi.h b/arch/arm64/include/asm/pvclock-abi.h
new file mode 100644
index ..1f7cdc102691
--- /dev/null
+++ b/arch/arm64/include/asm/pvclock-abi.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2019 Arm Ltd. */
+
+#ifndef __ASM_PVCLOCK_ABI_H
+#define __ASM_PVCLOCK_ABI_H
+
+/* The below structures and constants are defined in ARM DEN0057A */
+
+struct pvclock_vcpu_stolen_time_info {
+   __le32 revision;
+   __le32 attributes;
+   __le64 stolen_time;
+   /* Structure must be 64 byte aligned, pad to that size */
+   u8 padding[48];
+} __packed;
+
+#define PV_VM_TIME_NOT_SUPPORTED   -1
+#define PV_VM_TIME_INVALID_PARAMETERS  -2
+
+#endif
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 080012a6f025..e7f129f26ebd 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -45,6 +45,7 @@
 #define ARM_SMCCC_OWNER_SIP		2
 #define ARM_SMCCC_OWNER_OEM		3
 #define ARM_SMCCC_OWNER_STANDARD	4
+#define ARM_SMCCC_OWNER_STANDARD_HYP	5
 #define ARM_SMCCC_OWNER_TRUSTED_APP	48
 #define ARM_SMCCC_OWNER_TRUSTED_APP_END	49
 #define ARM_SMCCC_OWNER_TRUSTED_OS	50
@@ -302,5 +303,18 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
 #define SMCCC_RET_NOT_SUPPORTED	-1
 #define SMCCC_RET_NOT_REQUIRED -2
 
+/* Paravirtualised time calls (defined by ARM DEN0057A) */
+#define ARM_SMCCC_HV_PV_FEATURES   \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_64,\
+  ARM_SMCCC_OWNER_STANDARD_HYP,\
+  0x20)
+
+#define ARM_SMCCC_HV_PV_TIME_ST\
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_64,\
+  ARM_SMCCC_OWNER_STANDARD_HYP,\
+  0x22)
+
 #endif /*__ASSEMBLY__*/
 #endif /*__LINUX_ARM_SMCCC_H*/
diff --git a/virt/kvm/arm/hypercalls.c b/virt/kvm/arm/hypercalls.c
index f875241bd030..2906b2df99df 100644
--- a/virt/kvm/arm/hypercalls.c
+++ b/virt/kvm/arm/hypercalls.c
@@ -5,6 +5,7 @@
 #include 
 
 #include 
+#include 
 
 #include 
 #include 
@@ -48,6 +49,14 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
break;
}
break;
+   case ARM_SMCCC_HV_PV_FEATURES:
+       feature = smccc_get_arg1(vcpu);
+       switch (feature) {
+       case ARM_SMCCC_HV_PV_FEATURES:
+           val = SMCCC_RET_SUCCESS;
+           break;
+       }
+       break;
default:
-- 
2.20.1



[PATCH 0/9] arm64: Stolen time support

2019-08-02 Thread Steven Price
This series adds support for paravirtualized time for arm64 guests and
KVM hosts following the specification in Arm's document DEN 0057A:

https://developer.arm.com/docs/den0057/a

It implements support for stolen time, allowing the guest to
identify the time during which it is forcibly not executing.

It doesn't implement support for Live Physical Time (LPT) as there are
some concerns about the overheads and approach in the above
specification, and I expect an updated version of the specification to
be released soon with just the stolen time parts.

I previously posted a series including LPT (as well as stolen time):
https://lore.kernel.org/kvmarm/20181212150226.38051-1-steven.pr...@arm.com/

Patches 2, 5, 7 and 8 are cleanup patches and could be taken separately.

Christoffer Dall (1):
  KVM: arm/arm64: Factor out hypercall handling from PSCI code

Steven Price (8):
  KVM: arm64: Document PV-time interface
  KVM: arm64: Implement PV_FEATURES call
  KVM: arm64: Support stolen time reporting via shared structure
  KVM: Allow kvm_device_ops to be const
  KVM: arm64: Provide a PV_TIME device to user space
  arm/arm64: Provide a wrapper for SMCCC 1.1 calls
  arm/arm64: Make use of the SMCCC 1.1 wrapper
  arm64: Retrieve stolen time as paravirtualized guest

 Documentation/virtual/kvm/arm/pvtime.txt | 107 +
 arch/arm/kvm/Makefile|   2 +-
 arch/arm/kvm/handle_exit.c   |   2 +-
 arch/arm/mm/proc-v7-bugs.c   |  13 +-
 arch/arm64/include/asm/kvm_host.h|  13 +-
 arch/arm64/include/asm/kvm_mmu.h |   2 +
 arch/arm64/include/asm/pvclock-abi.h |  20 +++
 arch/arm64/include/uapi/asm/kvm.h|   6 +
 arch/arm64/kernel/Makefile   |   1 +
 arch/arm64/kernel/cpu_errata.c   |  80 --
 arch/arm64/kernel/kvm.c  | 155 ++
 arch/arm64/kvm/Kconfig   |   1 +
 arch/arm64/kvm/Makefile  |   2 +
 arch/arm64/kvm/handle_exit.c |   4 +-
 include/kvm/arm_hypercalls.h |  44 ++
 include/kvm/arm_psci.h   |   2 +-
 include/linux/arm-smccc.h|  58 +++
 include/linux/cpuhotplug.h   |   1 +
 include/linux/kvm_host.h |   4 +-
 include/linux/kvm_types.h|   2 +
 include/uapi/linux/kvm.h |   2 +
 virt/kvm/arm/arm.c   |  18 +++
 virt/kvm/arm/hypercalls.c| 138 
 virt/kvm/arm/mmu.c   |  44 ++
 virt/kvm/arm/psci.c  |  84 +-
 virt/kvm/arm/pvtime.c| 190 +++
 virt/kvm/kvm_main.c  |   6 +-
 27 files changed, 848 insertions(+), 153 deletions(-)
 create mode 100644 Documentation/virtual/kvm/arm/pvtime.txt
 create mode 100644 arch/arm64/include/asm/pvclock-abi.h
 create mode 100644 arch/arm64/kernel/kvm.c
 create mode 100644 include/kvm/arm_hypercalls.h
 create mode 100644 virt/kvm/arm/hypercalls.c
 create mode 100644 virt/kvm/arm/pvtime.c

-- 
2.20.1



[PATCH 1/9] KVM: arm64: Document PV-time interface

2019-08-02 Thread Steven Price
Introduce a paravirtualization interface for KVM/arm64 based on the
"Arm Paravirtualized Time for Arm-Based Systems" specification DEN 0057A.

This only adds the details about "Stolen Time" as the details of "Live
Physical Time" have not been fully agreed.

User space can specify a reserved area of memory for the guest and
inform KVM to populate the memory with information on time that the host
kernel has stolen from the guest.

A hypercall interface is provided for the guest to interrogate the
hypervisor's support for this interface and the location of the shared
memory structures.

Signed-off-by: Steven Price 
---
 Documentation/virtual/kvm/arm/pvtime.txt | 107 +++
 1 file changed, 107 insertions(+)
 create mode 100644 Documentation/virtual/kvm/arm/pvtime.txt

diff --git a/Documentation/virtual/kvm/arm/pvtime.txt b/Documentation/virtual/kvm/arm/pvtime.txt
new file mode 100644
index ..e6ae9799e1d5
--- /dev/null
+++ b/Documentation/virtual/kvm/arm/pvtime.txt
@@ -0,0 +1,107 @@
+Paravirtualized time support for arm64
+==
+
+Arm specification DEN0057/A defined a standard for paravirtualised time
+support for AArch64 guests:
+
+https://developer.arm.com/docs/den0057/a
+
+KVM/Arm64 implements the stolen time part of this specification by providing
+some hypervisor service calls to support a paravirtualized guest obtaining a
+view of the amount of time stolen from its execution.
+
+Two new SMCCC compatible hypercalls are defined:
+
+PV_FEATURES 0xC5000020
+PV_TIME_ST  0xC5000022
+
+These are only available in the SMC64/HVC64 calling convention as
+paravirtualized time is not available to 32 bit Arm guests.
+
+PV_FEATURES
+Function ID:  (uint32)  : 0xC5000020
+PV_func_id:   (uint32)  : Either PV_TIME_LPT or PV_TIME_ST
+Return value: (int32)   : NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
+  PV-time feature is supported by the hypervisor.
+
+PV_TIME_ST
+Function ID:  (uint32)  : 0xC5000022
+Return value: (int64)   : IPA of the stolen time data structure for this
+  (V)CPU. On failure:
+  NOT_SUPPORTED (-1)
+
+Stolen Time
+---
+
+The structure pointed to by the PV_TIME_ST hypercall is as follows:
+
+  Field   | Byte Length | Byte Offset | Description
+  --- | --- | --- | --
+  Revision|  4  |  0  | Must be 0 for version 0.1
+  Attributes  |  4  |  4  | Must be 0
+  Stolen time |  8  |  8  | Stolen time in unsigned
+  | | | nanoseconds indicating how
+  | | | much time this VCPU thread
+  | | | was involuntarily not
+  | | | running on a physical CPU.
+
+The structure will be updated by the hypervisor periodically as time is stolen
+from the VCPU. It will be present within a reserved region of the normal
+memory given to the guest. The guest should not attempt to write into this
+memory. There is one structure per VCPU of the guest.
+
+User space interface
+
+
+User space can request that KVM provide the paravirtualized time interface to
+a guest by creating a KVM_DEV_TYPE_ARM_PV_TIME device, for example:
+
+struct kvm_create_device pvtime_device = {
+.type = KVM_DEV_TYPE_ARM_PV_TIME,
+.attr = 0,
+.flags = 0,
+};
+
+pvtime_fd = ioctl(vm_fd, KVM_CREATE_DEVICE, &pvtime_device);
+
+The guest IPA of the structures must be given to KVM. This is the base address
+of an array of stolen time structures (one for each VCPU). For example:
+
+struct kvm_device_attr st_base = {
+.group = KVM_DEV_ARM_PV_TIME_PADDR,
+.attr = KVM_DEV_ARM_PV_TIME_ST,
+.addr = (u64)(unsigned long)&st_paddr
+};
+
+ioctl(pvtime_fd, KVM_SET_DEVICE_ATTR, &st_base);
+
+For migration (or save/restore) of a guest it is necessary to save the contents
+of the shared page(s) and later restore them. KVM_DEV_ARM_PV_TIME_STATE_SIZE
+provides the size of this data and KVM_DEV_ARM_PV_TIME_STATE allows the state
+to be read/written.
+
+It is also necessary for the physical address to be set identically when
+restoring.
+
+void *save_state(int fd, u64 attr, u32 *size) {
+struct kvm_device_attr get_size = {
+.group = KVM_DEV_ARM_PV_TIME_STATE_SIZE,
+.attr = attr,
+.addr = (u64)(unsigned long)size
+};
+
+ioctl(fd, KVM_GET_DEVICE_ATTR, &get_size);
+
+void *buffer = malloc(*size);
+
+struct kvm_device_attr get_state = {
+.group = KVM_DEV_ARM_PV_TIME_STATE,
+.attr = attr,
+.addr = (u64)(unsigned long)buffer
+};
+
+ioctl(fd, KVM_GET_DEVICE_ATTR, &get_state);
+}
+
+void *st_state = 

Re: [PATCH] arm64/kvm: fix -Wimplicit-fallthrough warnings

2019-08-02 Thread Marc Zyngier
On 02/08/2019 15:23, Qian Cai wrote:
> The commit a892819560c4 ("KVM: arm64: Prepare to handle deferred
> save/restore of 32-bit registers") introduced vcpu_write_spsr32() but
> seems forgot to add "break" between the switch statements and generates
> compilation warnings below. Also, adding a default statement as in
> vcpu_read_spsr32().

See
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git/commit/?id=3d584a3c85d6fe2cf878f220d4ad7145e7f89218

The default statement is pretty pointless by construction.

Thanks,

M.
-- 
Jazz is not dead, it just smells funny...


[PATCH] arm64/kvm: fix -Wimplicit-fallthrough warnings

2019-08-02 Thread Qian Cai
The commit a892819560c4 ("KVM: arm64: Prepare to handle deferred
save/restore of 32-bit registers") introduced vcpu_write_spsr32() but
seems to have forgotten to add "break" between the switch cases, which
generates the compilation warnings below. Also, add a default statement
as in vcpu_read_spsr32().

In file included from ./arch/arm64/include/asm/kvm_emulate.h:19,
 from arch/arm64/kvm/regmap.c:13:
arch/arm64/kvm/regmap.c: In function 'vcpu_write_spsr32':
./arch/arm64/include/asm/kvm_hyp.h:31:3: warning: this statement may
fall through [-Wimplicit-fallthrough=]
   asm volatile(ALTERNATIVE(__msr_s(r##nvh, "%x0"), \
   ^~~
./arch/arm64/include/asm/kvm_hyp.h:46:31: note: in expansion of macro
'write_sysreg_elx'
 #define write_sysreg_el1(v,r) write_sysreg_elx(v, r, _EL1, _EL12)
   ^~~~
arch/arm64/kvm/regmap.c:180:3: note: in expansion of macro
'write_sysreg_el1'
   write_sysreg_el1(v, SYS_SPSR);
   ^~~~
arch/arm64/kvm/regmap.c:181:2: note: here
  case KVM_SPSR_ABT:
  ^~~~
In file included from ./arch/arm64/include/asm/cputype.h:132,
 from ./arch/arm64/include/asm/cache.h:8,
 from ./include/linux/cache.h:6,
 from ./include/linux/printk.h:9,
 from ./include/linux/kernel.h:15,
 from ./include/asm-generic/bug.h:18,
 from ./arch/arm64/include/asm/bug.h:26,
 from ./include/linux/bug.h:5,
 from ./include/linux/mmdebug.h:5,
 from ./include/linux/mm.h:9,
 from arch/arm64/kvm/regmap.c:11:
./arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may
fall through [-Wimplicit-fallthrough=]
  asm volatile("msr " __stringify(r) ", %x0"  \
  ^~~
arch/arm64/kvm/regmap.c:182:3: note: in expansion of macro
'write_sysreg'
   write_sysreg(v, spsr_abt);
   ^~~~
arch/arm64/kvm/regmap.c:183:2: note: here
  case KVM_SPSR_UND:
  ^~~~
In file included from ./arch/arm64/include/asm/cputype.h:132,
 from ./arch/arm64/include/asm/cache.h:8,
 from ./include/linux/cache.h:6,
 from ./include/linux/printk.h:9,
 from ./include/linux/kernel.h:15,
 from ./include/asm-generic/bug.h:18,
 from ./arch/arm64/include/asm/bug.h:26,
 from ./include/linux/bug.h:5,
 from ./include/linux/mmdebug.h:5,
 from ./include/linux/mm.h:9,
 from arch/arm64/kvm/regmap.c:11:
./arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may
fall through [-Wimplicit-fallthrough=]
  asm volatile("msr " __stringify(r) ", %x0"  \
  ^~~
arch/arm64/kvm/regmap.c:184:3: note: in expansion of macro
'write_sysreg'
   write_sysreg(v, spsr_und);
   ^~~~
arch/arm64/kvm/regmap.c:185:2: note: here
  case KVM_SPSR_IRQ:
  ^~~~
In file included from ./arch/arm64/include/asm/cputype.h:132,
 from ./arch/arm64/include/asm/cache.h:8,
 from ./include/linux/cache.h:6,
 from ./include/linux/printk.h:9,
 from ./include/linux/kernel.h:15,
 from ./include/asm-generic/bug.h:18,
 from ./arch/arm64/include/asm/bug.h:26,
 from ./include/linux/bug.h:5,
 from ./include/linux/mmdebug.h:5,
 from ./include/linux/mm.h:9,
 from arch/arm64/kvm/regmap.c:11:
./arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may
fall through [-Wimplicit-fallthrough=]
  asm volatile("msr " __stringify(r) ", %x0"  \
  ^~~
arch/arm64/kvm/regmap.c:186:3: note: in expansion of macro
'write_sysreg'
   write_sysreg(v, spsr_irq);
   ^~~~
arch/arm64/kvm/regmap.c:187:2: note: here
  case KVM_SPSR_FIQ:
  ^~~~

Fixes: a892819560c4 ("KVM: arm64: Prepare to handle deferred save/restore of 32-bit registers")
Signed-off-by: Qian Cai 
---
 arch/arm64/kvm/regmap.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kvm/regmap.c b/arch/arm64/kvm/regmap.c
index 0d60e4f0af66..c94e9bc3e8eb 100644
--- a/arch/arm64/kvm/regmap.c
+++ b/arch/arm64/kvm/regmap.c
@@ -178,13 +178,20 @@ void vcpu_write_spsr32(struct kvm_vcpu *vcpu, unsigned long v)
switch (spsr_idx) {
case KVM_SPSR_SVC:
write_sysreg_el1(v, SYS_SPSR);
+   break;
case KVM_SPSR_ABT:
write_sysreg(v, spsr_abt);
+   break;
case KVM_SPSR_UND:
write_sysreg(v, spsr_und);
+   break;
case KVM_SPSR_IRQ:
write_sysreg(v, spsr_irq);
+   break;
case KVM_SPSR_FIQ:
write_sysreg(v, spsr_fiq);
+   break;
+   default:
+   BUG();
}
 }
-- 
1.8.3.1



Re: kvm-unit-tests: psci_cpu_on_test FAILed

2019-08-02 Thread Andrew Jones
On Fri, Aug 02, 2019 at 06:56:51PM +0800, Zenghui Yu wrote:
> Hi folks,
> 
> Running kvm-unit-tests with Linux 5.3.0-rc2 on a Kunpeng 920, we get
> the following failure info:
> 
>   [...]
>   FAIL psci (4 tests, 1 unexpected failures)
>   [...]
> and
>   [...]
>   INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
>   FAIL: cpu-on
>   SUMMARY: 4 tests, 1 unexpected failures
> 
> 
> I think this is an issue that was fixed once by commit 6c7a5dce22b3
> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> does reset on the MPIDR register whilst another reads it.
> 
> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> itself") later moves the reset work into check_vcpu_requests(), by
> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> is no longer protected by the kvm->lock mutex, and the race shows up
> again...
> 
> Do we need a fix for this issue? At least ensure mutual exclusion
> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> 
>

I noticed this too, but I put it pretty low on my TODO because it's a
safe failure (no host crash, just an unexpected PSCI_RET_INVALID_PARAMS
gets returned because a valid MPIDR doesn't look valid for a moment).
Also, the test is quite pathological, especially when the host has many
CPUs, so I wouldn't expect this to show up on a sane guest. I agree
it would be nice to get it fixed eventually though.

Thanks,
drew


kvm-unit-tests: psci_cpu_on_test FAILed

2019-08-02 Thread Zenghui Yu

Hi folks,

Running kvm-unit-tests with Linux 5.3.0-rc2 on a Kunpeng 920, we get
the following failure info:

[...]
FAIL psci (4 tests, 1 unexpected failures)
[...]
and
[...]
INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
FAIL: cpu-on
SUMMARY: 4 tests, 1 unexpected failures


I think this is an issue that was fixed once by commit 6c7a5dce22b3
("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
does reset on the MPIDR register whilst another reads it.

But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
itself") later moves the reset work into check_vcpu_requests(), by
making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
is no longer protected by the kvm->lock mutex, and the race shows up
again...

Do we need a fix for this issue? At least ensure mutual exclusion
between the reset of MPIDR and kvm_mpidr_to_vcpu()?
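
Roughly, the interleaving I have in mind is the following (a sketch
based on reading the code; the CPU numbers are only illustrative):

    CPU9 (PSCI_CPU_ON caller)          target VCPU (handling reset)
    -------------------------          -----------------------------
    kvm_psci_vcpu_on()                 check_vcpu_requests()
      kvm_mpidr_to_vcpu()                kvm_reset_vcpu()
        reads target's MPIDR  <----->      MPIDR transiently invalid
                                           (reset in progress)
      finds no matching vcpu
      returns PSCI_RET_INVALID_PARAMS (-2)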


Thanks,
zenghui



Re: [PATCH 2/2] KVM: Call kvm_arch_vcpu_blocking early into the blocking sequence

2019-08-02 Thread Paolo Bonzini
On 02/08/19 12:37, Marc Zyngier wrote:
> When a vcpu is about to block by calling kvm_vcpu_block, we call
> back into the arch code to allow any form of synchronization that
> may be required at this point (SVM stops the AVIC, ARM synchronises
> the VMCR and enables GICv4 doorbells). But this synchronization
> comes in quite late, as we've potentially waited for halt_poll_ns
> to expire.
> 
> Instead, let's move kvm_arch_vcpu_blocking() to the beginning of
> kvm_vcpu_block(), which on ARM has several benefits:
> 
> - VMCR gets synchronised early, meaning that any interrupt delivered
>   during the polling window will be evaluated with the correct guest
>   PMR
> - GICv4 doorbells are enabled, which means that any guest interrupt
>   directly injected during that window will be immediately recognised
> 
> Tang Nianyao ran some tests on a GICv4 machine to evaluate such
> change, and reported up to a 10% improvement for netperf:
> 
> 
>   netperf result:
>   D06 as server, intel 8180 server as client
>   with change:
>   package 512 bytes - 5500 Mbits/s
>   package 64 bytes - 760 Mbits/s
>   without change:
>   package 512 bytes - 5000 Mbits/s
>   package 64 bytes - 710 Mbits/s
> 
> 
> Signed-off-by: Marc Zyngier 
> ---
>  virt/kvm/kvm_main.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 887f3b0c2b60..90d429c703cb 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2322,6 +2322,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   bool waited = false;
>   u64 block_ns;
>  
> + kvm_arch_vcpu_blocking(vcpu);
> +
>   start = cur = ktime_get();
>   if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
>   ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
> @@ -2342,8 +2344,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   } while (single_task_running() && ktime_before(cur, stop));
>   }
>  
> - kvm_arch_vcpu_blocking(vcpu);
> -
>   for (;;) {
> prepare_to_swait_exclusive(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
>  
> @@ -2356,9 +2356,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  
>   finish_swait(&vcpu->wq, &wait);
>   cur = ktime_get();
> -
> - kvm_arch_vcpu_unblocking(vcpu);
>  out:
> + kvm_arch_vcpu_unblocking(vcpu);
>   block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>  
>   if (!vcpu_valid_wakeup(vcpu))
> 

Acked-by: Paolo Bonzini 


[PATCH 1/2] KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block

2019-08-02 Thread Marc Zyngier
Since commit 328e56647944 ("KVM: arm/arm64: vgic: Defer
touching GICH_VMCR to vcpu_load/put"), we leave ICH_VMCR_EL2 (or
its GICv2 equivalent) loaded as long as we can, only syncing it
back when we're scheduled out.

There is a small snag with that though: kvm_vgic_vcpu_pending_irq(),
which is indirectly called from kvm_vcpu_check_block(), needs to
evaluate the guest's view of ICC_PMR_EL1. At the point where we
call kvm_vcpu_check_block(), the vcpu is still loaded, and whatever
change was made to PMR is not visible in memory until we do a vcpu_put().

Things go really south if the guest does the following:

mov x0, #0  // or any small value masking interrupts
msr ICC_PMR_EL1, x0

[vcpu preempted, then rescheduled, VMCR sampled]

mov x0, #0xff // allow all interrupts
msr ICC_PMR_EL1, x0
wfi // traps to EL2, so sampling of VMCR

[interrupt arrives just after WFI]

Here, the hypervisor's view of PMR is zero, while the guest has enabled
its interrupts. kvm_vgic_vcpu_pending_irq() will then say that no
interrupts are pending (despite an interrupt being received) and we'll
block for no reason. If the guest doesn't have a periodic interrupt
firing once it has blocked, it will stay there forever.

To avoid this unfortunate situation, let's resync VMCR from
kvm_arch_vcpu_blocking(), ensuring that a following kvm_vcpu_check_block()
will observe the latest value of PMR.

This has been found by booting an arm64 Linux guest with the pseudo NMI
feature, and thus using interrupt priorities to mask interrupts instead
of the usual PSTATE masking.

Cc: sta...@vger.kernel.org # 4.12
Fixes: 328e56647944 ("KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put")
Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h  |  1 +
 virt/kvm/arm/arm.c  | 11 +++
 virt/kvm/arm/vgic/vgic-v2.c |  9 -
 virt/kvm/arm/vgic/vgic-v3.c |  7 ++-
 virt/kvm/arm/vgic/vgic.c| 11 +++
 virt/kvm/arm/vgic/vgic.h|  2 ++
 6 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 46bbc949c20a..7a30524a80ee 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -350,6 +350,7 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 
 void kvm_vgic_load(struct kvm_vcpu *vcpu);
 void kvm_vgic_put(struct kvm_vcpu *vcpu);
+void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu);
 
 #define irqchip_in_kernel(k)   (!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)((k)->arch.vgic.initialized)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index acc43242a310..d9a650bfaf22 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -323,6 +323,17 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 
 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
 {
+   /*
+* If we're about to block (most likely because we've just hit a
+* WFI), we need to sync back the state of the GIC CPU interface
+* so that we have the latest PMR and group enables. This ensures
+* that kvm_arch_vcpu_runnable has up-to-date data to decide
+* whether we have pending interrupts.
+*/
+   preempt_disable();
+   kvm_vgic_vmcr_sync(vcpu);
+   preempt_enable();
+
kvm_vgic_v4_enable_doorbell(vcpu);
 }
 
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 6dd5ad706c92..96aab77d0471 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -484,10 +484,17 @@ void vgic_v2_load(struct kvm_vcpu *vcpu)
   kvm_vgic_global_state.vctrl_base + GICH_APR);
 }
 
-void vgic_v2_put(struct kvm_vcpu *vcpu)
+void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu)
 {
struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 
cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
+}
+
+void vgic_v2_put(struct kvm_vcpu *vcpu)
+{
+   struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
+
+   vgic_v2_vmcr_sync(vcpu);
cpu_if->vgic_apr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_APR);
 }
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index c2c9ce009f63..0c653a1e5215 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -662,12 +662,17 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
__vgic_v3_activate_traps(vcpu);
 }
 
-void vgic_v3_put(struct kvm_vcpu *vcpu)
+void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu)
 {
struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 
if (likely(cpu_if->vgic_sre))
cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
+}
+
+void vgic_v3_put(struct kvm_vcpu *vcpu)
+{
+   vgic_v3_vmcr_sync(vcpu);
 
kvm_call_hyp(__vgic_v3_save_aprs, vcpu);
 
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 04786c8ec77e..13d4b38a94ec 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ 

[PATCH 0/2] KVM: arm/arm64: Fix guest's PMR synchronization when blocking on WFI

2019-08-02 Thread Marc Zyngier
It recently came to light that if we run a guest that actively uses
interrupt priorities to block interrupts, vcpus can end-up being
blocked while they shouldn't, leading to an unresponsive guest (a
slightly less than desirable outcome).

Patch #1 fixes the issue (which has been with us since 4.12), which I plan
to take in for 5.3 with immediate backport to stable.

Patch #2 is more of an RFC, as it also impacts the SVM AVIC support. It
moves the kvm_arch_vcpu_blocking callback to happen earlier, leading to
much better performances on ARM, and leading to the above fix to be
applied at the best possible spot. I'd welcome any comment/testing on
this, specially on non-ARM systems.

Marc Zyngier (2):
  KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
  KVM: Call kvm_arch_vcpu_blocking early into the blocking sequence

 include/kvm/arm_vgic.h      |  1 +
 virt/kvm/arm/arm.c          | 11 +++++++++++
 virt/kvm/arm/vgic/vgic-v2.c |  9 ++++++++-
 virt/kvm/arm/vgic/vgic-v3.c |  7 ++++++-
 virt/kvm/arm/vgic/vgic.c    | 11 +++++++++++
 virt/kvm/arm/vgic/vgic.h    |  2 ++
 virt/kvm/kvm_main.c         |  7 +++----
 7 files changed, 42 insertions(+), 6 deletions(-)

-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 2/2] KVM: Call kvm_arch_vcpu_blocking early into the blocking sequence

2019-08-02 Thread Marc Zyngier
When a vcpu is about to block by calling kvm_vcpu_block, we call
back into the arch code to allow any form of synchronization that
may be required at this point (SVM stops the AVIC, ARM synchronises
the VMCR and enables GICv4 doorbells). But this synchronization
comes in quite late, as we've potentially waited for halt_poll_ns
to expire.

Instead, let's move kvm_arch_vcpu_blocking() to the beginning of
kvm_vcpu_block(), which on ARM has several benefits:

- VMCR gets synchronised early, meaning that any interrupt delivered
  during the polling window will be evaluated with the correct guest
  PMR
- GICv4 doorbells are enabled, which means that any guest interrupt
  directly injected during that window will be immediately recognised
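
For clarity, the resulting shape of kvm_vcpu_block() is sketched below.
This is a simplified view of the diff that follows, not actual KVM code:
poll_for_wakeup() and wait_until_runnable() are placeholder names for
the corresponding blocks of the real function.

	static void kvm_vcpu_block_sketch(struct kvm_vcpu *vcpu)
	{
		kvm_arch_vcpu_blocking(vcpu);	/* now first: VMCR synced,
						   GICv4 doorbells armed */

		if (vcpu->halt_poll_ns)
			poll_for_wakeup(vcpu);	/* polling now sees the
						   correct guest PMR */

		wait_until_runnable(vcpu);	/* the swait loop */

		kvm_arch_vcpu_unblocking(vcpu);	/* single exit path,
						   moved past 'out:' */
	}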

Tang Nianyao ran some tests on a GICv4 machine to evaluate this
change, and reported up to a 10% improvement for netperf:


netperf result:
D06 as server, intel 8180 server as client
with change:
package 512 bytes - 5500 Mbits/s
package 64 bytes - 760 Mbits/s
without change:
package 512 bytes - 5000 Mbits/s
package 64 bytes - 710 Mbits/s


Signed-off-by: Marc Zyngier 
---
 virt/kvm/kvm_main.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 887f3b0c2b60..90d429c703cb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2322,6 +2322,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
bool waited = false;
u64 block_ns;
 
+   kvm_arch_vcpu_blocking(vcpu);
+
start = cur = ktime_get();
if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
@@ -2342,8 +2344,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
} while (single_task_running() && ktime_before(cur, stop));
}
 
-   kvm_arch_vcpu_blocking(vcpu);
-
for (;;) {
		prepare_to_swait_exclusive(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
 
@@ -2356,9 +2356,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 
	finish_swait(&vcpu->wq, &wait);
cur = ktime_get();
-
-   kvm_arch_vcpu_unblocking(vcpu);
 out:
+   kvm_arch_vcpu_unblocking(vcpu);
block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
if (!vcpu_valid_wakeup(vcpu))
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 00/59] KVM: arm64: ARMv8.3 Nested Virtualization support

2019-08-02 Thread Andrew Jones
On Fri, Aug 02, 2019 at 10:11:38AM +, Alexandru Elisei wrote:
> These are the changes that I made to kvm-unit-tests (the diff can be
> applied on top of upstream master, 2130fd4154ad ("tscdeadline_latency:
> Check condition first before loop")):

It's great to hear that you're doing this. You may find these bit-rotted
commits useful too

https://github.com/rhdrjones/kvm-unit-tests/commits/arm64/hyp-mode

Thanks,
drew
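
[As a usage sketch against the quoted diff below (not part of the diff
itself): once psci_init() has picked the conduit from the device tree,
a test can issue PSCI calls without caring whether it was booted at EL1
or EL2, and can query the choice — report_info() being the standard
kvm-unit-tests logging helper:]

	/* In a test's main(), after setup() has parsed the DT: */
	psci_init();
	report_info("PSCI conduit: %s",
		    psci_get_conduit() == PSCI_CONDUIT_SMC ? "smc" : "hvc");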

> 
> diff --git a/arm/cstart64.S b/arm/cstart64.S
> index b0e8baa1a23a..a7631b5a1801 100644
> --- a/arm/cstart64.S
> +++ b/arm/cstart64.S
> @@ -51,6 +51,17 @@ start:
> b   1b
> 
>  1:
> +   mrs x4, CurrentEL
> +   cmp x4, CurrentEL_EL2
> +   b.ne    1f
> +   mrs x4, mpidr_el1
> +   msr vmpidr_el2, x4
> +   mrs x4, midr_el1
> +   msr vpidr_el2, x4
> +   ldr x4, =(HCR_EL2_TGE | HCR_EL2_E2H)
> +   msr hcr_el2, x4
> +   isb
> +1:
> /* set up stack */
> mov x4, #1
> msr spsel, x4
> @@ -101,6 +112,17 @@ get_mmu_off:
> 
>  .globl secondary_entry
>  secondary_entry:
> +   mrs x0, CurrentEL
> +   cmp x0, CurrentEL_EL2
> +   b.ne    1f
> +   mrs x0, mpidr_el1
> +   msr vmpidr_el2, x0
> +   mrs x0, midr_el1
> +   msr vpidr_el2, x0
> +   ldr x0, =(HCR_EL2_TGE | HCR_EL2_E2H)
> +   msr hcr_el2, x0
> +   isb
> +1:
> /* Enable FP/ASIMD */
> mov x0, #(3 << 20)
> msr cpacr_el1, x0
> diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
> index 7b956bf5987d..07297a27e0ce 100644
> --- a/lib/arm/asm/psci.h
> +++ b/lib/arm/asm/psci.h
> @@ -3,6 +3,15 @@
>  #include 
>  #include 
> 
> +enum psci_conduit {
> +   PSCI_CONDUIT_HVC,
> +   PSCI_CONDUIT_SMC,
> +};
> +
> +extern void psci_init(void);
> +extern void psci_set_conduit(enum psci_conduit conduit);
> +extern enum psci_conduit psci_get_conduit(void);
> +
>  extern int psci_invoke(unsigned long function_id, unsigned long arg0,
>unsigned long arg1, unsigned long arg2);
>  extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
> diff --git a/lib/arm/psci.c b/lib/arm/psci.c
> index c3d399064ae3..20ad4b944738 100644
> --- a/lib/arm/psci.c
> +++ b/lib/arm/psci.c
> @@ -6,13 +6,14 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
> 
> -__attribute__((noinline))
> -int psci_invoke(unsigned long function_id, unsigned long arg0,
> +static int psci_invoke_hvc(unsigned long function_id, unsigned long arg0,
> unsigned long arg1, unsigned long arg2)
>  {
> asm volatile(
> @@ -22,6 +23,63 @@ int psci_invoke(unsigned long function_id, unsigned long arg0,
> return function_id;
>  }
> 
> +static int psci_invoke_smc(unsigned long function_id, unsigned long arg0,
> +   unsigned long arg1, unsigned long arg2)
> +{
> +   asm volatile(
> +   "smc #0"
> +   : "+r" (function_id)
> +   : "r" (arg0), "r" (arg1), "r" (arg2));
> +   return function_id;
> +}
> +
> +/*
> + * Initialize to something sensible, so the exit fallback
> + * psci_system_off still works before calling psci_init when booted
> + * at EL1.
> + */
> +static enum psci_conduit psci_conduit = PSCI_CONDUIT_HVC;
> +static int (*psci_fn)(unsigned long, unsigned long, unsigned long,
> +   unsigned long) = &psci_invoke_hvc;
> +
> +void psci_set_conduit(enum psci_conduit conduit)
> +{
> +   psci_conduit = conduit;
> +   if (conduit == PSCI_CONDUIT_HVC)
> +   psci_fn = &psci_invoke_hvc;
> +   else
> +   psci_fn = &psci_invoke_smc;
> +}
> +
> +enum psci_conduit psci_get_conduit(void)
> +{
> +   return psci_conduit;
> +}
> +
> +int psci_invoke(unsigned long function_id, unsigned long arg0,
> +   unsigned long arg1, unsigned long arg2)
> +{
> +   return psci_fn(function_id, arg0, arg1, arg2);
> +}
> +
> +void psci_init(void)
> +{
> +   const char *conduit;
> +   int ret;
> +
> +   ret = dt_get_psci_conduit(&conduit);
> +   assert(ret == 0 || ret == -FDT_ERR_NOTFOUND);
> +
> +   if (ret == -FDT_ERR_NOTFOUND)
> +   conduit = "hvc";
> +
> +   assert(strcmp(conduit, "hvc") == 0 || strcmp(conduit, "smc") == 0);
> +
> +   if (strcmp(conduit, "hvc") == 0)
> +   psci_set_conduit(PSCI_CONDUIT_HVC);
> +   else
> +   psci_set_conduit(PSCI_CONDUIT_SMC);
> +}
> +
>  int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
>  {
>  #ifdef __arm__
> diff --git a/lib/arm/setup.c b/lib/arm/setup.c
> index 4f02fca85607..e0dc9e4801b0 100644
> --- a/lib/arm/setup.c
> +++ b/lib/arm/setup.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "io.h"
> 
> @@ -164,7 +165,11 @@ void setup(const void *fdt)
> freemem += initrd_size;
>

Re: [PATCH 00/59] KVM: arm64: ARMv8.3 Nested Virtualization support

2019-08-02 Thread Alexandru Elisei
Hi,

On 6/21/19 10:37 AM, Marc Zyngier wrote:
> I've taken over the maintenance of this series originally written by
> Jintack and Christoffer. Since then, the series has been substantially
> reworked, new features (and most probably bugs) have been added, and
> the whole thing rebased multiple times. If anything breaks, please
> blame me, and nobody else.
>
> As you can tell, this is quite big. It is also remarkably incomplete
> (we're missing many critical bits to fully emulate EL2), but the idea
> is to start merging things early in order to reduce the maintenance
> headache. What we want to achieve is that with NV disabled, there is
> no performance overhead and no regression. The only thing I intend to
> merge ASAP is the first patch in the series, because it should have
> zero effect and is a reasonable cleanup.
>
> The series is roughly divided in 4 parts: exception handling, memory
> virtualization, interrupts and timers. There are of course some
> dependencies, but you'll hopefully get the gist of it.
>
> For the most courageous of you, I've put out a branch[1] containing this
> and a bit more. Of course, you'll need some userspace. Andre maintains
> a hacked version of kvmtool[2] that takes a --nested option, allowing
> the guest to be started at EL2. You can run the whole stack in the
> Foundation model. Don't be in a hurry ;-).
>
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
> kvm-arm64/nv-wip-5.2-rc5
> [2] git://linux-arm.org/kvmtool.git nv/nv-wip-5.2-rc5
>
> Andre Przywara (4):
>   KVM: arm64: nv: Handle virtual EL2 registers in
> vcpu_read/write_sys_reg()
>   KVM: arm64: nv: Save/Restore vEL2 sysregs
>   KVM: arm64: nv: Handle traps for timer _EL02 and _EL2 sysregs
> accessors
>   KVM: arm64: nv: vgic: Allow userland to set VGIC maintenance IRQ
>
> Christoffer Dall (16):
>   KVM: arm64: nv: Introduce nested virtualization VCPU feature
>   KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
>   KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
>   KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
>   KVM: arm64: nv: Handle trapped ERET from virtual EL2
>   KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
>   KVM: arm64: nv: Trap EL1 VM register accesses in virtual EL2
>   KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2
> changes
>   KVM: arm/arm64: nv: Support multiple nested stage 2 mmu structures
>   KVM: arm64: nv: Implement nested Stage-2 page table walk logic
>   KVM: arm64: nv: Handle shadow stage 2 page faults
>   KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
>   KVM: arm64: nv: arch_timer: Support hyp timer emulation
>   KVM: arm64: nv: vgic-v3: Take cpu_if pointer directly instead of vcpu
>   KVM: arm64: nv: vgic: Emulate the HW bit in software
>   KVM: arm64: nv: Add nested GICv3 tracepoints
>
> Dave Martin (1):
>   KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s
>
> Jintack Lim (21):
>   arm64: Add ARM64_HAS_NESTED_VIRT cpufeature
>   KVM: arm64: nv: Add EL2 system registers to vcpu context
>   KVM: arm64: nv: Support virtual EL2 exceptions
>   KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
>   KVM: arm64: nv: Trap SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
>   KVM: arm64: nv: Trap CPACR_EL1 access in virtual EL2
>   KVM: arm64: nv: Set a handler for the system instruction traps
>   KVM: arm64: nv: Handle PSCI call via smc from the guest
>   KVM: arm64: nv: Respect virtual HCR_EL2.TWX setting
>   KVM: arm64: nv: Respect virtual CPTR_EL2.TFP setting
>   KVM: arm64: nv: Respect the virtual HCR_EL2.NV bit setting
>   KVM: arm64: nv: Respect virtual HCR_EL2.TVM and TRVM settings
>   KVM: arm64: nv: Respect the virtual HCR_EL2.NV1 bit setting
>   KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
>   KVM: arm64: nv: Configure HCR_EL2 for nested virtualization
>   KVM: arm64: nv: Pretend we only support larger-than-host page sizes
>   KVM: arm64: nv: Introduce sys_reg_desc.forward_trap
>   KVM: arm64: nv: Rework the system instruction emulation framework
>   KVM: arm64: nv: Trap and emulate AT instructions from virtual EL2
>   KVM: arm64: nv: Trap and emulate TLBI instructions from virtual EL2
>   KVM: arm64: nv: Nested GICv3 Support
>
> Marc Zyngier (17):
>   KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
>   KVM: arm64: nv: Reset VMPIDR_EL2 and VPIDR_EL2 to sane values
>   KVM: arm64: nv: Handle SPSR_EL2 specially
>   KVM: arm64: nv: Refactor vcpu_{read,write}_sys_reg
>   KVM: arm64: nv: Don't expose SVE to nested guests
>   KVM: arm64: nv: Hide RAS from nested guests
>   KVM: arm/arm64: nv: Factor out stage 2 page table data from struct kvm
>   KVM: arm64: nv: Move last_vcpu_ran to be per s2 mmu
>   KVM: arm64: nv: Don't always start an S2 MMU search from the beginning
>   KVM: arm64: nv: Propagate CNTVOFF_EL2 to the virtual EL1 timer
>   KVM: arm64: nv: Load timer before the GIC
>   KVM: arm64: nv: Implement maintenance