Re: [PATCH v2 4/9] KVM: arm64: Support stolen time reporting via shared structure

2019-08-21 Thread Steven Price
On 19/08/2019 17:40, Marc Zyngier wrote:
> Hi Steven,
> 
> On 19/08/2019 15:04, Steven Price wrote:
>> Implement the service call for configuring a shared structure between a
>> VCPU and the hypervisor in which the hypervisor can write the time
>> stolen from the VCPU's execution time by other tasks on the host.
>>
>> The hypervisor allocates memory which is placed at an IPA chosen by user
>> space. The hypervisor then uses WRITE_ONCE() to update the shared
>> structure ensuring single copy atomicity of the 64-bit unsigned value
>> that reports stolen time in nanoseconds.
>>
>> Whenever stolen time is enabled by the guest, the stolen time counter is
>> reset.
>>
>> The stolen time itself is retrieved from the sched_info structure
>> maintained by the Linux scheduler code. We enable SCHEDSTATS when
>> selecting KVM Kconfig to ensure this value is meaningful.
>>
>> Signed-off-by: Steven Price 
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 15 +++
>>  arch/arm64/include/asm/kvm_host.h | 16 ++-
>>  arch/arm64/kvm/Kconfig            |  1 +
>>  include/linux/kvm_types.h         |  2 +
>>  virt/kvm/arm/arm.c                | 19 +
>>  virt/kvm/arm/hypercalls.c         |  3 ++
>>  virt/kvm/arm/pvtime.c             | 71 +++
>>  7 files changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 369b5d2d54bf..14d61a84c270 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -39,6 +39,7 @@
>>  KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>  #define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
>>  #define KVM_REQ_VCPU_RESET  KVM_ARCH_REQ(2)
>> +#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
>>  
>>  DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>>  
>> @@ -77,6 +78,12 @@ struct kvm_arch {
>>  
>>  /* Mandated version of PSCI */
>>  u32 psci_version;
>> +
>> +struct kvm_arch_pvtime {
>> +struct gfn_to_hva_cache st_ghc;
>> +gpa_t st_base;
>> +u64 st_size;
>> +} pvtime;
> 
> It'd be good if we could avoid having this in the 32bit vcpu structure,
> given that it serves no real purpose (other than being able to compile
> things).

Good point - I think I can fix that with a couple more static inline
functions... It's a little tricky due to header file include order, but
I think I can make it work.
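A minimal sketch of what such empty 32-bit stubs could look like (illustrative only: the function names are assumed from the arm64 side, and ENOTSUPP is a kernel-internal value defined locally here to keep the sketch self-contained):

```c
#include <assert.h>

/* ENOTSUPP is kernel-internal; defined here only so the sketch is
 * self-contained outside the kernel tree. */
#define ENOTSUPP 524
#define SMCCC_RET_NOT_SUPPORTED (-1)

/* Hypothetical empty stubs for 32-bit arm: PV time is not supported,
 * so the hypercall reports NOT_SUPPORTED and the update path fails. */
static inline long kvm_hypercall_stolen_time(void)
{
	return SMCCC_RET_NOT_SUPPORTED;
}

static inline int kvm_update_stolen_time(void)
{
	return -ENOTSUPP;
}
```

With stubs of this shape the arm64 callers compile unchanged while the 32-bit build carries no pvtime state at all.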

[...]
>> +int kvm_update_stolen_time(struct kvm_vcpu *vcpu, bool init)
>> +{
>> +struct kvm *kvm = vcpu->kvm;
>> +struct kvm_arch_pvtime *pvtime = &kvm->arch.pvtime;
>> +u64 steal;
>> +u64 steal_le;
>> +u64 offset;
>> +int idx;
>> +const int stride = sizeof(struct pvclock_vcpu_stolen_time);
>> +
>> +if (pvtime->st_base == GPA_INVALID)
>> +return -ENOTSUPP;
>> +
>> +/* Let's do the local bookkeeping */
>> +steal = vcpu->arch.steal.steal;
>> +steal += current->sched_info.run_delay - vcpu->arch.steal.last_steal;
>> +vcpu->arch.steal.last_steal = current->sched_info.run_delay;
>> +vcpu->arch.steal.steal = steal;
>> +
>> +offset = stride * kvm_vcpu_get_idx(vcpu);
>> +
>> +if (unlikely(offset + stride > pvtime->st_size))
>> +return -EINVAL;
>> +
>> +steal_le = cpu_to_le64(steal);
>> +pagefault_disable();
> 
> What's the reason for doing a pagefault_disable()? What I'd expect is
> for the userspace page to be faulted in and written to, and doing a
> pagefault_disable() seems to be going against this idea.

Umm... this is me screwing up the locking...

The current code is very confused about which locks should/can be held
when kvm_update_stolen_time() is called. vcpu_req_record_steal()
explicitly takes the kvm->srcu read lock - which is then taken again
here. But kvm_hypercall_stolen_time doesn't hold any lock. And obviously
at some point in time I expected this to be called in atomic context...

In general the page is likely to be faulted in (as a guest which is
using stolen time is presumably looking at the numbers there), but
there's no need for the pagefault_disable(). It also shouldn't be the
caller's responsibility to hold kvm->srcu.

Steve
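The bookkeeping in kvm_update_stolen_time() reduces to accumulating the delta of the scheduler's run_delay; it can be sketched in isolation as follows (field names mirror the patch; this is a userspace model, not kernel code):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors vcpu->arch.steal in the patch: a running total plus the
 * run_delay snapshot taken at the previous update. */
struct steal_acc {
	uint64_t steal;      /* total stolen time, in ns */
	uint64_t last_steal; /* run_delay at the last update */
};

/* Fold in the time stolen since the last update, as the patch does
 * before the value is written to the shared structure. */
static uint64_t update_steal(struct steal_acc *s, uint64_t run_delay)
{
	s->steal += run_delay - s->last_steal;
	s->last_steal = run_delay;
	return s->steal;
}
```

Note that because run_delay is monotonic, calling the update twice with the same run_delay leaves the total unchanged.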


Re: [PATCH 16/59] KVM: arm64: nv: Save/Restore vEL2 sysregs

2019-08-21 Thread Alexandru Elisei
On 6/21/19 10:38 AM, Marc Zyngier wrote:
> From: Andre Przywara 
>
> Whenever we need to restore the guest's system registers to the CPU, we
> now need to take care of the EL2 system registers as well. Most of them
> are accessed via traps only, but some have an immediate effect and also
> a guest running in VHE mode would expect them to be accessible via their
> EL1 encoding, which we do not trap.
>
> Split the current __sysreg_{save,restore}_el1_state() functions into
> handling common sysregs, then differentiate between the guest running in
> vEL2 and vEL1.
>
> For vEL2 we write the virtual EL2 registers with an identical format directly
> into their EL1 counterpart, and translate the few registers that have a
> different format for the same effect on the execution when running a
> non-VHE guest hypervisor.
>
>   [ Commit message reworked and many bug fixes applied by Marc Zyngier
> and Christoffer Dall. ]
>
> Signed-off-by: Andre Przywara 
> Signed-off-by: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm64/kvm/hyp/sysreg-sr.c | 160 +++--
>  1 file changed, 153 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> index 62866a68e852..2abb9c3ff24f 100644
> --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /*
>   * Non-VHE: Both host and guest must save everything.
> @@ -51,11 +52,9 @@ static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
>   ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0);
>  }
>  
> -static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
> +static void __hyp_text __sysreg_save_vel1_state(struct kvm_cpu_context *ctxt)
>  {
> - ctxt->sys_regs[CSSELR_EL1]  = read_sysreg(csselr_el1);
>   ctxt->sys_regs[SCTLR_EL1]   = read_sysreg_el1(SYS_SCTLR);
> - ctxt->sys_regs[ACTLR_EL1]   = read_sysreg(actlr_el1);
>   ctxt->sys_regs[CPACR_EL1]   = read_sysreg_el1(SYS_CPACR);
>   ctxt->sys_regs[TTBR0_EL1]   = read_sysreg_el1(SYS_TTBR0);
>   ctxt->sys_regs[TTBR1_EL1]   = read_sysreg_el1(SYS_TTBR1);
> @@ -69,14 +68,58 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
>   ctxt->sys_regs[CONTEXTIDR_EL1]  = read_sysreg_el1(SYS_CONTEXTIDR);
>   ctxt->sys_regs[AMAIR_EL1]   = read_sysreg_el1(SYS_AMAIR);
>   ctxt->sys_regs[CNTKCTL_EL1] = read_sysreg_el1(SYS_CNTKCTL);
> - ctxt->sys_regs[PAR_EL1] = read_sysreg(par_el1);
> - ctxt->sys_regs[TPIDR_EL1]   = read_sysreg(tpidr_el1);
>  
>   ctxt->gp_regs.sp_el1= read_sysreg(sp_el1);
>   ctxt->gp_regs.elr_el1   = read_sysreg_el1(SYS_ELR);
>   ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(SYS_SPSR);
>  }
>  
> +static void __sysreg_save_vel2_state(struct kvm_cpu_context *ctxt)
> +{
> + ctxt->sys_regs[ESR_EL2] = read_sysreg_el1(SYS_ESR);
> + ctxt->sys_regs[AFSR0_EL2]   = read_sysreg_el1(SYS_AFSR0);
> + ctxt->sys_regs[AFSR1_EL2]   = read_sysreg_el1(SYS_AFSR1);
> + ctxt->sys_regs[FAR_EL2] = read_sysreg_el1(SYS_FAR);
> + ctxt->sys_regs[MAIR_EL2]= read_sysreg_el1(SYS_MAIR);
> + ctxt->sys_regs[VBAR_EL2]= read_sysreg_el1(SYS_VBAR);
> + ctxt->sys_regs[CONTEXTIDR_EL2]  = read_sysreg_el1(SYS_CONTEXTIDR);
> + ctxt->sys_regs[AMAIR_EL2]   = read_sysreg_el1(SYS_AMAIR);
> +
> + /*
> +  * In VHE mode those registers are compatible between EL1 and EL2,
> +  * and the guest uses the _EL1 versions on the CPU naturally.
> +  * So we save them into their _EL2 versions here.
> +  * For nVHE mode we trap accesses to those registers, so our
> +  * _EL2 copy in sys_regs[] is always up-to-date and we don't need
> +  * to save anything here.
> +  */
> + if (__vcpu_el2_e2h_is_set(ctxt)) {
> + ctxt->sys_regs[SCTLR_EL2]   = read_sysreg_el1(SYS_SCTLR);
> + ctxt->sys_regs[CPTR_EL2]= read_sysreg_el1(SYS_CPACR);
> + ctxt->sys_regs[TTBR0_EL2]   = read_sysreg_el1(SYS_TTBR0);
> + ctxt->sys_regs[TTBR1_EL2]   = read_sysreg_el1(SYS_TTBR1);
> + ctxt->sys_regs[TCR_EL2] = read_sysreg_el1(SYS_TCR);
> + ctxt->sys_regs[CNTHCTL_EL2] = read_sysreg_el1(SYS_CNTKCTL);
> + }

This can break guests that run with VHE on, then disable it. I stumbled into
this while working on kvm-unit-tests, which uses TTBR0 for the translation
tables. Let's consider the following scenario:

1. Guest sets HCR_EL2.E2H
2. Guest programs translation tables in TTBR0_EL1, which should reflect in
TTBR0_EL2.
3. Guest enables the MMU and does stuff.
4. Guest disables MMU and clears HCR_EL2.E2H
5. Guest turns MMU on. It doesn't change TTBR0_EL2, because it will use the same
translation tables as when running

[PATCH v3 00/10] arm64: Stolen time support

2019-08-21 Thread Steven Price
This series adds support for paravirtualized time for arm64 guests and
KVM hosts following the specification in Arm's document DEN 0057A:

https://developer.arm.com/docs/den0057/a

It implements support for stolen time, allowing the guest to
identify time when it is forcibly not executing.

It doesn't implement support for Live Physical Time (LPT) as there are
some concerns about the overheads and approach in the above
specification, and I expect an updated version of the specification to
be released soon with just the stolen time parts.

NOTE: Patches 8 and 9 will conflict with Mark Rutland's series[1] cleaning
up the SMCCC conduit. I do feel that the addition of an _invoke() call
makes a number of call sites cleaner and it should be possible to
integrate both this and Mark's other cleanups.

[1] 
https://lore.kernel.org/linux-arm-kernel/20190809132245.43505-1-mark.rutl...@arm.com/

Also available as a git tree:
git://linux-arm.org/linux-sp.git stolen_time/v3

Changes from v2:
https://lore.kernel.org/lkml/20190819140436.12207-1-steven.pr...@arm.com/
 * Switched from using gfn_to_hva_cache to a new macro kvm_put_guest()
   that can provide the single-copy atomicity required (on arm64). This
   macro is added in patch 4.
 * Tidied up the locking for kvm_update_stolen_time().
   pagefault_disable() was unnecessary and the caller didn't need to
   take kvm->srcu as the function does it itself.
 * Removed struct kvm_arch_pvtime from the arm implementation, replaced
   instead with inline static functions which are empty for arm.
 * Fixed a few checkpatch --strict warnings.

Changes from v1:
https://lore.kernel.org/lkml/20190802145017.42543-1-steven.pr...@arm.com/
 * Host kernel no longer allocates the stolen time structure, instead it
   is allocated by user space. This means the save/restore functionality
   can be removed.
 * Refactored the code so arm has stub implementations and to avoid
   initcall
 * Rebased to pick up Documentation/{virt->virtual} change
 * Bunch of typo fixes

Christoffer Dall (1):
  KVM: arm/arm64: Factor out hypercall handling from PSCI code

Steven Price (9):
  KVM: arm64: Document PV-time interface
  KVM: arm64: Implement PV_FEATURES call
  KVM: Implement kvm_put_guest()
  KVM: arm64: Support stolen time reporting via shared structure
  KVM: Allow kvm_device_ops to be const
  KVM: arm64: Provide a PV_TIME device to user space
  arm/arm64: Provide a wrapper for SMCCC 1.1 calls
  arm/arm64: Make use of the SMCCC 1.1 wrapper
  arm64: Retrieve stolen time as paravirtualized guest

 Documentation/virt/kvm/arm/pvtime.txt | 100 ++
 arch/arm/include/asm/kvm_host.h   |  30 +
 arch/arm/kvm/Makefile |   2 +-
 arch/arm/kvm/handle_exit.c|   2 +-
 arch/arm/mm/proc-v7-bugs.c|  13 +-
 arch/arm64/include/asm/kvm_host.h |  28 +++-
 arch/arm64/include/asm/paravirt.h |   9 +-
 arch/arm64/include/asm/pvclock-abi.h  |  17 +++
 arch/arm64/include/uapi/asm/kvm.h |   8 ++
 arch/arm64/kernel/cpu_errata.c|  80 ---
 arch/arm64/kernel/paravirt.c  | 148 +
 arch/arm64/kernel/time.c  |   3 +
 arch/arm64/kvm/Kconfig|   1 +
 arch/arm64/kvm/Makefile   |   2 +
 arch/arm64/kvm/handle_exit.c  |   4 +-
 include/kvm/arm_hypercalls.h  |  43 ++
 include/kvm/arm_psci.h|   2 +-
 include/linux/arm-smccc.h |  58 
 include/linux/cpuhotplug.h|   1 +
 include/linux/kvm_host.h  |  28 +++-
 include/linux/kvm_types.h |   2 +
 include/uapi/linux/kvm.h  |   2 +
 virt/kvm/arm/arm.c|  11 ++
 virt/kvm/arm/hypercalls.c |  68 ++
 virt/kvm/arm/psci.c   |  84 +---
 virt/kvm/arm/pvtime.c | 182 ++
 virt/kvm/kvm_main.c   |   6 +-
 27 files changed, 780 insertions(+), 154 deletions(-)
 create mode 100644 Documentation/virt/kvm/arm/pvtime.txt
 create mode 100644 arch/arm64/include/asm/pvclock-abi.h
 create mode 100644 include/kvm/arm_hypercalls.h
 create mode 100644 virt/kvm/arm/hypercalls.c
 create mode 100644 virt/kvm/arm/pvtime.c

-- 
2.20.1



[PATCH v3 02/10] KVM: arm/arm64: Factor out hypercall handling from PSCI code

2019-08-21 Thread Steven Price
From: Christoffer Dall 

We currently intertwine the KVM PSCI implementation with the general
dispatch of hypercall handling, which makes perfect sense because PSCI
is the only category of hypercalls we support.

However, as we are about to support additional hypercalls, factor out
this functionality into a separate hypercall handler file.

Signed-off-by: Christoffer Dall 
[steven.pr...@arm.com: rebased]
Signed-off-by: Steven Price 
---
 arch/arm/kvm/Makefile        |  2 +-
 arch/arm/kvm/handle_exit.c   |  2 +-
 arch/arm64/kvm/Makefile      |  1 +
 arch/arm64/kvm/handle_exit.c |  4 +-
 include/kvm/arm_hypercalls.h | 43 ++
 include/kvm/arm_psci.h       |  2 +-
 virt/kvm/arm/hypercalls.c    | 59 +
 virt/kvm/arm/psci.c          | 84 +---
 8 files changed, 110 insertions(+), 87 deletions(-)
 create mode 100644 include/kvm/arm_hypercalls.h
 create mode 100644 virt/kvm/arm/hypercalls.c

diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 531e59f5be9c..ef4d01088efc 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -23,7 +23,7 @@ obj-y += kvm-arm.o init.o interrupts.o
 obj-y += handle_exit.o guest.o emulate.o reset.o
 obj-y += coproc.o coproc_a15.o coproc_a7.o   vgic-v3-coproc.o
 obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
-obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
+obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o $(KVM)/arm/hypercalls.o
 obj-y += $(KVM)/arm/aarch32.o
 
 obj-y += $(KVM)/arm/vgic/vgic.o
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 2a6a1394d26e..e58a89d2f13f 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -9,7 +9,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "trace.h"
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 3ac1a64d2fb9..73dce4d47d47 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += hyp/
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hypercalls.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 706cca23f0d2..aacfc55de44c 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -11,8 +11,6 @@
 #include 
 #include 
 
-#include 
-
 #include 
 #include 
 #include 
@@ -22,6 +20,8 @@
 #include 
 #include 
 
+#include 
+
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
new file mode 100644
index ..0e2509d27910
--- /dev/null
+++ b/include/kvm/arm_hypercalls.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2019 Arm Ltd. */
+
+#ifndef __KVM_ARM_HYPERCALLS_H
+#define __KVM_ARM_HYPERCALLS_H
+
+#include 
+
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
+
+static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 0);
+}
+
+static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 1);
+}
+
+static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 2);
+}
+
+static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
+{
+   return vcpu_get_reg(vcpu, 3);
+}
+
+static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
+   unsigned long a0,
+   unsigned long a1,
+   unsigned long a2,
+   unsigned long a3)
+{
+   vcpu_set_reg(vcpu, 0, a0);
+   vcpu_set_reg(vcpu, 1, a1);
+   vcpu_set_reg(vcpu, 2, a2);
+   vcpu_set_reg(vcpu, 3, a3);
+}
+
+#endif
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 632e78bdef4d..5b58bd2fe088 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -40,7 +40,7 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu, struct kvm *kvm)
 }
 
 
-int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
+int kvm_psci_call(struct kvm_vcpu *vcpu);
 
 struct kvm_one_reg;
 
diff --git a/virt/kvm/arm/hypercalls.c b/virt/kvm/arm/hypercalls.c
new file mode 100644
index ..f875241bd030
--- /dev/null
+++ b/virt/kvm/arm/hypercalls.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Arm Ltd.
+
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
+{
+   u32 func_id = smccc_get_function(vcpu);
+   u32 val = SMCCC_RET_NOT_SUPPORTED;
+   u32 feature;
+
+   swi

[PATCH v3 03/10] KVM: arm64: Implement PV_FEATURES call

2019-08-21 Thread Steven Price
This provides a mechanism for querying which paravirtualized features
are available in this hypervisor.

Also add the header file which defines the ABI for the paravirtualized
clock features we're about to add.

Signed-off-by: Steven Price 
---
 arch/arm/include/asm/kvm_host.h      |  6 ++
 arch/arm64/include/asm/kvm_host.h    |  2 ++
 arch/arm64/include/asm/pvclock-abi.h | 17 +
 arch/arm64/kvm/Makefile              |  1 +
 include/linux/arm-smccc.h            | 14 ++
 virt/kvm/arm/hypercalls.c            |  6 ++
 virt/kvm/arm/pvtime.c                | 21 +
 7 files changed, 67 insertions(+)
 create mode 100644 arch/arm64/include/asm/pvclock-abi.h
 create mode 100644 virt/kvm/arm/pvtime.c

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 8a37c8e89777..369b5d2d54bf 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -7,6 +7,7 @@
 #ifndef __ARM_KVM_HOST_H__
 #define __ARM_KVM_HOST_H__
 
+#include 
 #include 
 #include 
 #include 
@@ -323,6 +324,11 @@ static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+static inline int kvm_hypercall_pv_features(struct kvm_vcpu *vcpu)
+{
+   return SMCCC_RET_NOT_SUPPORTED;
+}
+
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f656169db8c3..583b3639062a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -478,6 +478,8 @@ void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+int kvm_hypercall_pv_features(struct kvm_vcpu *vcpu);
+
 void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
diff --git a/arch/arm64/include/asm/pvclock-abi.h b/arch/arm64/include/asm/pvclock-abi.h
new file mode 100644
index ..c4f1c0a0789c
--- /dev/null
+++ b/arch/arm64/include/asm/pvclock-abi.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2019 Arm Ltd. */
+
+#ifndef __ASM_PVCLOCK_ABI_H
+#define __ASM_PVCLOCK_ABI_H
+
+/* The below structure is defined in ARM DEN0057A */
+
+struct pvclock_vcpu_stolen_time {
+   __le32 revision;
+   __le32 attributes;
+   __le64 stolen_time;
+   /* Structure must be 64 byte aligned, pad to that size */
+   u8 padding[48];
+} __packed;
+
+#endif
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 73dce4d47d47..5ffbdc39e780 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -14,6 +14,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hypercalls.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/pvtime.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 080012a6f025..e7f129f26ebd 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -45,6 +45,7 @@
 #define ARM_SMCCC_OWNER_SIP             2
 #define ARM_SMCCC_OWNER_OEM             3
 #define ARM_SMCCC_OWNER_STANDARD        4
+#define ARM_SMCCC_OWNER_STANDARD_HYP    5
 #define ARM_SMCCC_OWNER_TRUSTED_APP     48
 #define ARM_SMCCC_OWNER_TRUSTED_APP_END 49
 #define ARM_SMCCC_OWNER_TRUSTED_OS      50
@@ -302,5 +303,18 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
 #define SMCCC_RET_NOT_SUPPORTED        -1
 #define SMCCC_RET_NOT_REQUIRED -2
 
+/* Paravirtualised time calls (defined by ARM DEN0057A) */
+#define ARM_SMCCC_HV_PV_FEATURES   \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_64,\
+  ARM_SMCCC_OWNER_STANDARD_HYP,\
+  0x20)
+
+#define ARM_SMCCC_HV_PV_TIME_ST\
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_64,\
+  ARM_SMCCC_OWNER_STANDARD_HYP,\
+  0x22)
+
 #endif /*__ASSEMBLY__*/
 #endif /*__LINUX_ARM_SMCCC_H*/
diff --git a/virt/kvm/arm/hypercalls.c b/virt/kvm/arm/hypercalls.c
index f875241bd030..63ae629c466a 100644
--- a/virt/kvm/arm/hypercalls.c
+++ b/virt/kvm/arm/hypercalls.c
@@ -48,8 +48,14 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 
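As a cross-check on the PV-time function IDs, the SMCCC 1.1 encoding (fast-call bit 31, 64-bit convention bit 30, owning entity in bits 29:24, function number in bits 15:0) can be modelled as follows. This is a sketch of what ARM_SMCCC_CALL_VAL computes, not the kernel header:

```c
#include <assert.h>
#include <stdint.h>

/* SMCCC 1.1 function ID layout: bit 31 = fast call, bit 30 = 64-bit
 * calling convention, bits 29:24 = owning entity, bits 15:0 = number. */
static uint32_t smccc_call_val(uint32_t fast, uint32_t is64,
			       uint32_t owner, uint32_t num)
{
	return (fast << 31) | (is64 << 30) |
	       ((owner & 0x3f) << 24) | (num & 0xffff);
}
```

Plugging in owner 5 (ARM_SMCCC_OWNER_STANDARD_HYP) and function numbers 0x20 and 0x22 yields the fast, SMC64 IDs used by the PV_FEATURES and PV_TIME_ST calls.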

[PATCH v3 01/10] KVM: arm64: Document PV-time interface

2019-08-21 Thread Steven Price
Introduce a paravirtualization interface for KVM/arm64 based on the
"Arm Paravirtualized Time for Arm-Based Systems" specification DEN 0057A.

This only adds the details about "Stolen Time" as the details of "Live
Physical Time" have not been fully agreed.

User space can specify a reserved area of memory for the guest and
inform KVM to populate the memory with information on time that the host
kernel has stolen from the guest.

A hypercall interface is provided for the guest to interrogate the
hypervisor's support for this interface and the location of the shared
memory structures.

Signed-off-by: Steven Price 
---
 Documentation/virt/kvm/arm/pvtime.txt | 100 ++
 1 file changed, 100 insertions(+)
 create mode 100644 Documentation/virt/kvm/arm/pvtime.txt

diff --git a/Documentation/virt/kvm/arm/pvtime.txt b/Documentation/virt/kvm/arm/pvtime.txt
new file mode 100644
index ..1ceb118694e7
--- /dev/null
+++ b/Documentation/virt/kvm/arm/pvtime.txt
@@ -0,0 +1,100 @@
+Paravirtualized time support for arm64
+==
+
+Arm specification DEN0057/A defined a standard for paravirtualised time
+support for AArch64 guests:
+
+https://developer.arm.com/docs/den0057/a
+
+KVM/arm64 implements the stolen time part of this specification by providing
+some hypervisor service calls to support a paravirtualized guest obtaining a
+view of the amount of time stolen from its execution.
+
+Two new SMCCC compatible hypercalls are defined:
+
+PV_FEATURES 0xC5000020
+PV_TIME_ST  0xC5000022
+
+These are only available in the SMC64/HVC64 calling convention as
+paravirtualized time is not available to 32 bit Arm guests. The existence of
+the PV_FEATURES hypercall should be probed using the SMCCC 1.1 ARCH_FEATURES
+mechanism before calling it.
+
+PV_FEATURES
+Function ID:  (uint32)  : 0xC5000020
+PV_func_id:   (uint32)  : Either PV_TIME_LPT or PV_TIME_ST
+Return value: (int32)   : NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
+  PV-time feature is supported by the hypervisor.
+
+PV_TIME_ST
+Function ID:  (uint32)  : 0xC5000022
+Return value: (int64)   : IPA of the stolen time data structure for this
+  (V)CPU. On failure:
+  NOT_SUPPORTED (-1)
+
+The IPA returned by PV_TIME_ST should be mapped by the guest as normal memory
+with inner and outer write back caching attributes, in the inner shareable
+domain. A total of 16 bytes from the IPA returned are guaranteed to be
+meaningfully filled by the hypervisor (see structure below).
+
+PV_TIME_ST returns the structure for the calling VCPU.
+
+Stolen Time
+---
+
+The structure pointed to by the PV_TIME_ST hypercall is as follows:
+
+  Field   | Byte Length | Byte Offset | Description
+  --- | --- | --- | --
+  Revision|  4  |  0  | Must be 0 for version 0.1
+  Attributes  |  4  |  4  | Must be 0
+  Stolen time |  8  |  8  | Stolen time in unsigned
+  | | | nanoseconds indicating how
+  | | | much time this VCPU thread
+  | | | was involuntarily not
+  | | | running on a physical CPU.
+
+The structure will be updated by the hypervisor prior to scheduling a VCPU. It
+will be present within a reserved region of the normal memory given to the
+guest. The guest should not attempt to write into this memory. There is a
+structure per VCPU of the guest.
+
+User space interface
+
+
+User space can request that KVM provide the paravirtualized time interface to
+a guest by creating a KVM_DEV_TYPE_ARM_PV_TIME device, for example:
+
+struct kvm_create_device pvtime_device = {
+.type = KVM_DEV_TYPE_ARM_PV_TIME,
+.attr = 0,
+.flags = 0,
+};
+
+pvtime_fd = ioctl(vm_fd, KVM_CREATE_DEVICE, &pvtime_device);
+
+Creation of the device should be done after creating the vCPUs of the virtual
+machine.
+
+The IPA of the structures must be given to KVM. This is the base address
+of an array of stolen time structures (one for each VCPU). The base address
+must be page aligned. The size must be at least 64 * number of VCPUs and be a
+multiple of PAGE_SIZE.
+
+The memory for these structures should be added to the guest in the usual
+manner (e.g. using KVM_SET_USER_MEMORY_REGION).
+
+For example:
+
+struct kvm_dev_arm_st_region region = {
+.gpa = <base IPA of the stolen time array>,
+.size = <size of the stolen time array>
+};
+
+struct kvm_device_attr st_base = {
+.group = KVM_DEV_ARM_PV_TIME_PADDR,
+.attr = KVM_DEV_ARM_PV_TIME_ST,
+.addr = (u64)&region
+};
+
+ioctl(pvtime_fd, KVM_SET_DEVICE_ATTR, &st_base);
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
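The table in the document above pins down a 64-byte record; a self-contained model of the structure makes the offsets checkable with sizeof/offsetof. The kernel uses __le32/__le64 here; plain uintN_t types are stand-ins for checking the layout only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the DEN 0057A per-VCPU record: 64 bytes in total, with the
 * stolen time counter at byte offset 8. */
struct pvclock_vcpu_stolen_time {
	uint32_t revision;    /* must be 0 for version 0.1 */
	uint32_t attributes;  /* must be 0 */
	uint64_t stolen_time; /* nanoseconds, written by the hypervisor */
	uint8_t  padding[48]; /* pad the record to 64 bytes */
} __attribute__((packed));
```

The 64-byte size is also what makes the per-VCPU stride in the array of records exactly one cache-line-friendly record per VCPU, matching the "64 * number of VCPUs" sizing rule in the user space interface section.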

[PATCH v3 04/10] KVM: Implement kvm_put_guest()

2019-08-21 Thread Steven Price
kvm_put_guest() is analogous to put_user() - it writes a single value to
the guest physical address. The implementation is built upon put_user()
and so it has the same single copy atomic properties.

Signed-off-by: Steven Price 
---
 include/linux/kvm_host.h | 24 
 1 file changed, 24 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fcb46b3374c6..e154a1897e20 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -746,6 +746,30 @@ int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
  unsigned long len);
 int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
  gpa_t gpa, unsigned long len);
+
+#define __kvm_put_guest(kvm, gfn, offset, value, type) \
+({ \
+   unsigned long __addr = gfn_to_hva(kvm, gfn);\
+   type __user *__uaddr = (type __user *)(__addr + offset);\
+   int __ret = 0;  \
+   \
+   if (kvm_is_error_hva(__addr))   \
+   __ret = -EFAULT;\
+   else\
+   __ret = put_user(value, __uaddr);   \
+   if (!__ret) \
+   mark_page_dirty(kvm, gfn);  \
+   __ret;  \
+})
+
+#define kvm_put_guest(kvm, gpa, value, type)   \
+({ \
+   gpa_t __gpa = gpa;  \
+   struct kvm *__kvm = kvm;\
+   __kvm_put_guest(__kvm, __gpa >> PAGE_SHIFT, \
+   offset_in_page(__gpa), (value), type);  \
+})
+
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
 int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
-- 
2.20.1
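The gpa-to-(gfn, offset) split that kvm_put_guest() performs before handing the pieces to __kvm_put_guest() is plain page arithmetic; a sketch assuming 4K pages (PAGE_SHIFT is an assumption here, not taken from the kernel headers):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumed 4K pages */
#define PAGE_SIZE  (1ull << PAGE_SHIFT)

/* Split a guest physical address into the guest frame number and the
 * in-page offset, the way kvm_put_guest() does before calling
 * __kvm_put_guest(). */
static uint64_t gpa_to_gfn(uint64_t gpa)
{
	return gpa >> PAGE_SHIFT;
}

static uint64_t gpa_offset_in_page(uint64_t gpa)
{
	return gpa & (PAGE_SIZE - 1);
}
```

Because the value written is at most 8 bytes and the stolen time record is 64-byte aligned, the write can never straddle a page boundary, which is what lets the macro lean on put_user() for single-copy atomicity.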



[PATCH v3 08/10] arm/arm64: Provide a wrapper for SMCCC 1.1 calls

2019-08-21 Thread Steven Price
SMCCC 1.1 calls may use either HVC or SMC depending on the PSCI
conduit. Rather than coding this in every call site, provide a macro
which uses the correct instruction. The macro also handles the case
where no PSCI conduit is configured, returning a not-supported error
in res, along with returning the conduit used for the call.

This allows us to remove some duplicated code and will be useful later
when adding paravirtualized time hypervisor calls.

Signed-off-by: Steven Price 
Acked-by: Will Deacon 
---
 include/linux/arm-smccc.h | 44 +++
 1 file changed, 44 insertions(+)

diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index e7f129f26ebd..eee1e832221d 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -303,6 +303,50 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
 #define SMCCC_RET_NOT_SUPPORTED        -1
 #define SMCCC_RET_NOT_REQUIRED -2
 
+/* Like arm_smccc_1_1* but always returns SMCCC_RET_NOT_SUPPORTED.
+ * Used when the PSCI conduit is not defined. The empty asm statement
+ * avoids compiler warnings about unused variables.
+ */
+#define __fail_smccc_1_1(...)  \
+   do {\
+   __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
+   asm ("" __constraints(__count_args(__VA_ARGS__)));  \
+   if (___res) \
+   ___res->a0 = SMCCC_RET_NOT_SUPPORTED;   \
+   } while (0)
+
+/*
+ * arm_smccc_1_1_invoke() - make an SMCCC v1.1 compliant call
+ *
+ * This is a variadic macro taking one to eight source arguments, and
+ * an optional return structure.
+ *
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * This macro will make either an HVC call or an SMC call depending on the
+ * current PSCI conduit. If no valid conduit is available then -1
+ * (SMCCC_RET_NOT_SUPPORTED) is returned in @res.a0 (if supplied).
+ *
+ * The return value also provides the conduit that was used.
+ */
+#define arm_smccc_1_1_invoke(...) ({   \
+   int method = psci_ops.conduit;  \
+   switch (method) {   \
+   case PSCI_CONDUIT_HVC:  \
+   arm_smccc_1_1_hvc(__VA_ARGS__); \
+   break;  \
+   case PSCI_CONDUIT_SMC:  \
+   arm_smccc_1_1_smc(__VA_ARGS__); \
+   break;  \
+   default:\
+   __fail_smccc_1_1(__VA_ARGS__);  \
+   method = PSCI_CONDUIT_NONE; \
+   break;  \
+   }   \
+   method; \
+   })
+
 /* Paravirtualised time calls (defined by ARM DEN0057A) */
 #define ARM_SMCCC_HV_PV_FEATURES   \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
-- 
2.20.1
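The conduit dispatch implemented by arm_smccc_1_1_invoke() can be modelled as a plain switch. In this userspace sketch the HVC/SMC trampolines are stand-ins (fake_hvc/fake_smc are hypothetical; a real kernel emits the hvc or smc instruction):

```c
#include <assert.h>

enum psci_conduit { PSCI_CONDUIT_NONE, PSCI_CONDUIT_SMC, PSCI_CONDUIT_HVC };

struct arm_smccc_res { unsigned long a0, a1, a2, a3; };

#define SMCCC_RET_NOT_SUPPORTED (-1)

/* Stand-ins for the HVC/SMC trampolines; the kernel versions emit the
 * actual hvc/smc instruction. */
static void fake_hvc(unsigned long fid, struct arm_smccc_res *res)
{
	(void)fid;
	res->a0 = 0;
}

static void fake_smc(unsigned long fid, struct arm_smccc_res *res)
{
	(void)fid;
	res->a0 = 0;
}

/* Mirrors the shape of arm_smccc_1_1_invoke(): use the configured
 * conduit, or fail with NOT_SUPPORTED when there is none, and report
 * which conduit was used. */
static enum psci_conduit invoke(enum psci_conduit conduit, unsigned long fid,
				struct arm_smccc_res *res)
{
	switch (conduit) {
	case PSCI_CONDUIT_HVC:
		fake_hvc(fid, res);
		return PSCI_CONDUIT_HVC;
	case PSCI_CONDUIT_SMC:
		fake_smc(fid, res);
		return PSCI_CONDUIT_SMC;
	default:
		res->a0 = (unsigned long)SMCCC_RET_NOT_SUPPORTED;
		return PSCI_CONDUIT_NONE;
	}
}
```

Returning the conduit (rather than only filling res) is what lets later callers, such as the stolen time probing code, distinguish "call failed" from "no conduit at all".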



[PATCH v3 10/10] arm64: Retrieve stolen time as paravirtualized guest

2019-08-21 Thread Steven Price
Enable paravirtualization features when running under a hypervisor
supporting the PV_TIME_ST hypercall.

For each (v)CPU, we ask the hypervisor for the location of a shared
page which the hypervisor will use to report stolen time to us. We set
pv_time_ops to the stolen time function which simply reads the stolen
value from the shared page for a VCPU. We guarantee single-copy
atomicity using READ_ONCE which means we can also read the stolen
time for another VCPU than the currently running one while it is
potentially being updated by the hypervisor.

Signed-off-by: Steven Price 
---
 arch/arm64/include/asm/paravirt.h |   9 +-
 arch/arm64/kernel/paravirt.c  | 148 ++
 arch/arm64/kernel/time.c  |   3 +
 include/linux/cpuhotplug.h        |   1 +
 4 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h
index 799d9dd6f7cc..125c26c42902 100644
--- a/arch/arm64/include/asm/paravirt.h
+++ b/arch/arm64/include/asm/paravirt.h
@@ -21,6 +21,13 @@ static inline u64 paravirt_steal_clock(int cpu)
 {
return pv_ops.time.steal_clock(cpu);
 }
-#endif
+
+int __init kvm_guest_init(void);
+
+#else
+
+#define kvm_guest_init()
+
+#endif // CONFIG_PARAVIRT
 
 #endif
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
index 4cfed91fe256..ea8dbbbd3293 100644
--- a/arch/arm64/kernel/paravirt.c
+++ b/arch/arm64/kernel/paravirt.c
@@ -6,13 +6,161 @@
  * Author: Stefano Stabellini 
  */
 
+#define pr_fmt(fmt) "kvmarm-pv: " fmt
+
+#include 
+#include 
 #include 
+#include 
 #include 
+#include 
+#include 
+#include 
+#include 
 #include 
+
 #include 
+#include 
+#include 
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
 struct paravirt_patch_template pv_ops;
 EXPORT_SYMBOL_GPL(pv_ops);
+
+struct kvmarm_stolen_time_region {
+   struct pvclock_vcpu_stolen_time *kaddr;
+};
+
+static DEFINE_PER_CPU(struct kvmarm_stolen_time_region, stolen_time_region);
+
+static bool steal_acc = true;
+static int __init parse_no_stealacc(char *arg)
+{
+   steal_acc = false;
+   return 0;
+}
+
+early_param("no-steal-acc", parse_no_stealacc);
+
+/* return stolen time in ns by asking the hypervisor */
+static u64 kvm_steal_clock(int cpu)
+{
+   struct kvmarm_stolen_time_region *reg;
+
+   reg = per_cpu_ptr(&stolen_time_region, cpu);
+   if (!reg->kaddr) {
+   pr_warn_once("stolen time enabled but not configured for cpu %d\n",
+cpu);
+   return 0;
+   }
+
+   return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
+}
+
+static int disable_stolen_time_current_cpu(void)
+{
+   struct kvmarm_stolen_time_region *reg;
+
+   reg = this_cpu_ptr(&stolen_time_region);
+   if (!reg->kaddr)
+   return 0;
+
+   memunmap(reg->kaddr);
+   memset(reg, 0, sizeof(*reg));
+
+   return 0;
+}
+
+static int stolen_time_dying_cpu(unsigned int cpu)
+{
+   return disable_stolen_time_current_cpu();
+}
+
+static int init_stolen_time_cpu(unsigned int cpu)
+{
+   struct kvmarm_stolen_time_region *reg;
+   struct arm_smccc_res res;
+
+   reg = this_cpu_ptr(&stolen_time_region);
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_ST, &res);
+
+   if ((long)res.a0 < 0)
+   return -EINVAL;
+
+   reg->kaddr = memremap(res.a0,
+ sizeof(struct pvclock_vcpu_stolen_time),
+ MEMREMAP_WB);
+
+   if (!reg->kaddr) {
+   pr_warn("Failed to map stolen time data structure\n");
+   return -ENOMEM;
+   }
+
+   if (le32_to_cpu(reg->kaddr->revision) != 0 ||
+   le32_to_cpu(reg->kaddr->attributes) != 0) {
+   pr_warn("Unexpected revision or attributes in stolen time data\n");
+   return -ENXIO;
+   }
+
+   return 0;
+}
+
+static int kvm_arm_init_stolen_time(void)
+{
+   int ret;
+
+   ret = cpuhp_setup_state(CPUHP_AP_ARM_KVMPV_STARTING,
+   "hypervisor/kvmarm/pv:starting",
+   init_stolen_time_cpu, stolen_time_dying_cpu);
+   if (ret < 0)
+   return ret;
+   return 0;
+}
+
+static bool has_kvm_steal_clock(void)
+{
+   struct arm_smccc_res res;
+
+   /* To detect the presence of PV time support we require SMCCC 1.1+ */
+   if (psci_ops.smccc_version < SMCCC_VERSION_1_1)
+   return false;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+ARM_SMCCC_HV_PV_FEATURES, &res);
+
+   if (res.a0 != SMCCC_RET_SUCCESS)
+   return false;
+
+   arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_FEATURES,
+ARM_SMCCC_HV_PV_TIME_ST, &res);
+
+   if (res.a0 != SMCCC_RET_SUCCESS)
+   return false;
+
+   return true;
+}
+
+int __init kvm_guest_init(void)
+{
+ 

[PATCH v3 05/10] KVM: arm64: Support stolen time reporting via shared structure

2019-08-21 Thread Steven Price
Implement the service call for configuring a shared structure between a
VCPU and the hypervisor in which the hypervisor can write the time
stolen from the VCPU's execution time by other tasks on the host.

The hypervisor allocates memory which is placed at an IPA chosen by user
space. The hypervisor then updates the shared structure using
kvm_put_guest() to ensure single copy atomicity of the 64-bit value
reporting the stolen time in nanoseconds.

Whenever stolen time is enabled by the guest, the stolen time counter is
reset.

The stolen time itself is retrieved from the sched_info structure
maintained by the Linux scheduler code. We enable SCHEDSTATS when
selecting KVM Kconfig to ensure this value is meaningful.

Signed-off-by: Steven Price 
---
 arch/arm/include/asm/kvm_host.h   | 20 +
 arch/arm64/include/asm/kvm_host.h | 25 +++-
 arch/arm64/kvm/Kconfig|  1 +
 include/linux/kvm_types.h |  2 +
 virt/kvm/arm/arm.c| 10 +
 virt/kvm/arm/hypercalls.c |  3 ++
 virt/kvm/arm/pvtime.c | 67 +++
 7 files changed, 127 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 369b5d2d54bf..47d2ced99421 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -39,6 +39,7 @@
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDINGKVM_ARCH_REQ(1)
 #define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
+#define KVM_REQ_RECORD_STEAL   KVM_ARCH_REQ(3)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -329,6 +330,25 @@ static inline int kvm_hypercall_pv_features(struct kvm_vcpu *vcpu)
return SMCCC_RET_NOT_SUPPORTED;
 }
 
+static inline int kvm_hypercall_stolen_time(struct kvm_vcpu *vcpu)
+{
+   return SMCCC_RET_NOT_SUPPORTED;
+}
+
+static inline int kvm_update_stolen_time(struct kvm_vcpu *vcpu, bool init)
+{
+   return -ENOTSUPP;
+}
+
+static inline void kvm_pvtime_init_vm(struct kvm_arch *kvm_arch)
+{
+}
+
+static inline bool kvm_is_pvtime_enabled(struct kvm_arch *kvm_arch)
+{
+   return false;
+}
+
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 583b3639062a..b6fa7beffd8a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -44,6 +44,7 @@
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDINGKVM_ARCH_REQ(1)
 #define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
+#define KVM_REQ_RECORD_STEAL   KVM_ARCH_REQ(3)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -83,6 +84,11 @@ struct kvm_arch {
 
/* Mandated version of PSCI */
u32 psci_version;
+
+   struct kvm_arch_pvtime {
+   gpa_t st_base;
+   u64 st_size;
+   } pvtime;
 };
 
 #define KVM_NR_MEM_OBJS 40
@@ -338,8 +344,13 @@ struct kvm_vcpu_arch {
/* True when deferrable sysregs are loaded on the physical CPU,
 * see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
bool sysregs_loaded_on_cpu;
-};
 
+   /* Guest PV state */
+   struct {
+   u64 steal;
+   u64 last_steal;
+   } steal;
+};
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
  sve_ffr_offset((vcpu)->arch.sve_max_vl)))
@@ -479,6 +490,18 @@ int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
 int kvm_hypercall_pv_features(struct kvm_vcpu *vcpu);
+int kvm_hypercall_stolen_time(struct kvm_vcpu *vcpu);
+int kvm_update_stolen_time(struct kvm_vcpu *vcpu, bool init);
+
+static inline void kvm_pvtime_init_vm(struct kvm_arch *kvm_arch)
+{
+   kvm_arch->pvtime.st_base = GPA_INVALID;
+}
+
+static inline bool kvm_is_pvtime_enabled(struct kvm_arch *kvm_arch)
+{
+   return (kvm_arch->pvtime.st_base != GPA_INVALID);
+}
 
 void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
 
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index a67121d419a2..d8b88e40d223 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -39,6 +39,7 @@ config KVM
select IRQ_BYPASS_MANAGER
select HAVE_KVM_IRQ_BYPASS
select HAVE_KVM_VCPU_RUN_PID_CHANGE
+   select SCHEDSTATS
---help---
  Support hosting virtualized guest machines.
  We don't support KVM with 16K page tables yet, due to the multiple
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index bde5374ae021..1c88e69db3d9 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -35,6 +35,8 @@ typedef unsigned long  gva_t;
 typedef u64gpa_t;
 typedef u64gfn_t;
 
+#define GPA_INVALID(~(gpa_t)0)

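The host-side accounting described above — accumulating a delta of the scheduler's run_delay into the value published to the guest, and resetting whenever the guest enables stolen time — can be sketched as follows. `record_steal` and `struct vcpu_steal` are illustrative names for this sketch, not the patch's actual helpers:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of the accumulation in kvm_update_stolen_time():
 * run_delay is the scheduler's cumulative "time spent waiting on a
 * runqueue" (sched_info.run_delay in the kernel), and last_steal records
 * how much of it has already been folded into the published counter. */
struct vcpu_steal {
	uint64_t steal;      /* value exposed to the guest, in ns */
	uint64_t last_steal; /* run_delay already accounted for   */
};

static void record_steal(struct vcpu_steal *v, uint64_t run_delay, int init)
{
	if (init) {
		/* Guest (re-)enabled stolen time: reset the counter. */
		v->steal = 0;
		v->last_steal = run_delay;
		return;
	}
	v->steal += run_delay - v->last_steal;
	v->last_steal = run_delay;
}
```

Only the final `steal` value needs single-copy-atomic publication to the shared structure; the delta bookkeeping is host-private.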
[PATCH v3 06/10] KVM: Allow kvm_device_ops to be const

2019-08-21 Thread Steven Price
Currently a kvm_device_ops structure cannot be const without triggering
compiler warnings. However the structure doesn't need to be written to
and, by marking it const, it can be read-only in memory. Add some more
const keywords to allow this.

Signed-off-by: Steven Price 
---
 include/linux/kvm_host.h | 4 ++--
 virt/kvm/kvm_main.c  | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e154a1897e20..c5c1a923f21b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1262,7 +1262,7 @@ extern unsigned int halt_poll_ns_grow_start;
 extern unsigned int halt_poll_ns_shrink;
 
 struct kvm_device {
-   struct kvm_device_ops *ops;
+   const struct kvm_device_ops *ops;
struct kvm *kvm;
void *private;
struct list_head vm_node;
@@ -1315,7 +1315,7 @@ struct kvm_device_ops {
 void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
-int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type);
+int kvm_register_device_ops(const struct kvm_device_ops *ops, u32 type);
 void kvm_unregister_device_ops(u32 type);
 
 extern struct kvm_device_ops kvm_mpic_ops;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c6a91b044d8d..75488ebb87c9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3046,14 +3046,14 @@ struct kvm_device *kvm_device_from_filp(struct file *filp)
return filp->private_data;
 }
 
-static struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
+static const struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
 #ifdef CONFIG_KVM_MPIC
[KVM_DEV_TYPE_FSL_MPIC_20]  = &kvm_mpic_ops,
[KVM_DEV_TYPE_FSL_MPIC_42]  = &kvm_mpic_ops,
 #endif
 };
 
-int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type)
+int kvm_register_device_ops(const struct kvm_device_ops *ops, u32 type)
 {
if (type >= ARRAY_SIZE(kvm_device_ops_table))
return -ENOSPC;
@@ -3074,7 +3074,7 @@ void kvm_unregister_device_ops(u32 type)
 static int kvm_ioctl_create_device(struct kvm *kvm,
   struct kvm_create_device *cd)
 {
-   struct kvm_device_ops *ops = NULL;
+   const struct kvm_device_ops *ops = NULL;
struct kvm_device *dev;
bool test = cd->flags & KVM_CREATE_DEVICE_TEST;
int type;
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

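The pattern the patch enables — a read-only ops structure registered into a table of const pointers — can be shown in miniature. The names below (`dev_ops`, `register_ops`) are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative miniature of the kvm_device_ops table: marking the ops
 * structure const lets the compiler place it in read-only memory, and
 * the registration/lookup paths only ever carry const pointers. */
struct dev_ops {
	const char *name;
	int (*create)(void);
};

static int null_create(void) { return 0; }

static const struct dev_ops example_ops = {
	.name   = "example",
	.create = null_create,
};

#define DEV_TYPE_MAX 4
static const struct dev_ops *ops_table[DEV_TYPE_MAX];

static int register_ops(const struct dev_ops *ops, unsigned int type)
{
	if (type >= DEV_TYPE_MAX)
		return -1;	/* -ENOSPC in the kernel */
	if (ops_table[type])
		return -2;	/* would clobber an existing registration */
	ops_table[type] = ops;
	return 0;
}
```

Nothing ever writes through the stored pointers, which is exactly why the const qualifier can propagate cleanly through the whole call chain.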

[PATCH v3 09/10] arm/arm64: Make use of the SMCCC 1.1 wrapper

2019-08-21 Thread Steven Price
Rather than directly choosing which function to use based on
psci_ops.conduit, use the new arm_smccc_1_1 wrapper instead.

In some cases we still need to do some operations based on the
conduit, but the code duplication is removed.

No functional change.

Signed-off-by: Steven Price 
---
 arch/arm/mm/proc-v7-bugs.c | 13 +++---
 arch/arm64/kernel/cpu_errata.c | 80 --
 2 files changed, 33 insertions(+), 60 deletions(-)

diff --git a/arch/arm/mm/proc-v7-bugs.c b/arch/arm/mm/proc-v7-bugs.c
index 9a07916af8dd..8eb52f3385e7 100644
--- a/arch/arm/mm/proc-v7-bugs.c
+++ b/arch/arm/mm/proc-v7-bugs.c
@@ -78,12 +78,13 @@ static void cpu_v7_spectre_init(void)
if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
break;
 
+   arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+   if ((int)res.a0 != 0)
+   return;
+
switch (psci_ops.conduit) {
case PSCI_CONDUIT_HVC:
-   arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   if ((int)res.a0 != 0)
-   break;
per_cpu(harden_branch_predictor_fn, cpu) =
call_hvc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
@@ -91,10 +92,6 @@ static void cpu_v7_spectre_init(void)
break;
 
case PSCI_CONDUIT_SMC:
-   arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   if ((int)res.a0 != 0)
-   break;
per_cpu(harden_branch_predictor_fn, cpu) =
call_smc_arch_workaround_1;
cpu_do_switch_mm = cpu_v7_smc_switch_mm;
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 1e43ba5c79b7..400a49aaae85 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -215,40 +215,31 @@ static int detect_harden_bp_fw(void)
if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
return -1;
 
+   arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+
+   switch ((int)res.a0) {
+   case 1:
+   /* Firmware says we're just fine */
+   return 0;
+   case 0:
+   break;
+   default:
+   return -1;
+   }
+
switch (psci_ops.conduit) {
case PSCI_CONDUIT_HVC:
-   arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   switch ((int)res.a0) {
-   case 1:
-   /* Firmware says we're just fine */
-   return 0;
-   case 0:
-   cb = call_hvc_arch_workaround_1;
-   /* This is a guest, no need to patch KVM vectors */
-   smccc_start = NULL;
-   smccc_end = NULL;
-   break;
-   default:
-   return -1;
-   }
+   cb = call_hvc_arch_workaround_1;
+   /* This is a guest, no need to patch KVM vectors */
+   smccc_start = NULL;
+   smccc_end = NULL;
break;
 
case PSCI_CONDUIT_SMC:
-   arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
- ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-   switch ((int)res.a0) {
-   case 1:
-   /* Firmware says we're just fine */
-   return 0;
-   case 0:
-   cb = call_smc_arch_workaround_1;
-   smccc_start = __smccc_workaround_1_smc_start;
-   smccc_end = __smccc_workaround_1_smc_end;
-   break;
-   default:
-   return -1;
-   }
+   cb = call_smc_arch_workaround_1;
+   smccc_start = __smccc_workaround_1_smc_start;
+   smccc_end = __smccc_workaround_1_smc_end;
break;
 
default:
@@ -338,6 +329,7 @@ void __init arm64_enable_wa2_handling(struct alt_instr *alt,
 
 void arm64_set_ssbd_mitigation(bool state)
 {
+   int conduit;
if (!IS_ENABLED(CONFIG_ARM64_SSBD)) {
pr_info_once("SSBD disabled by kernel configuration\n");
return;
@@ -351,19 +343,10 @@ void arm64_set_ssbd_mitigation(bool state)
return;
}
 
-   switch (psci_ops.conduit) {
-   case PSCI_CO

[PATCH v3 07/10] KVM: arm64: Provide a PV_TIME device to user space

2019-08-21 Thread Steven Price
Allow user space to inform the KVM host where in the physical memory
map the paravirtualized time structures should be located.

A device is created which provides the base address of an array of
Stolen Time (ST) structures, one for each VCPU. There must be (64 *
total number of VCPUs) bytes of memory available at this location.

The address is given in terms of the physical address visible to
the guest and must be page aligned. The guest will discover the address
via a hypercall.

Signed-off-by: Steven Price 
---
 arch/arm/include/asm/kvm_host.h   |  4 ++
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/include/uapi/asm/kvm.h |  8 +++
 include/uapi/linux/kvm.h  |  2 +
 virt/kvm/arm/arm.c|  1 +
 virt/kvm/arm/pvtime.c | 94 +++
 6 files changed, 110 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 47d2ced99421..b6c8dbc0556b 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -325,6 +325,10 @@ static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+static inline void kvm_pvtime_init(void)
+{
+}
+
 static inline int kvm_hypercall_pv_features(struct kvm_vcpu *vcpu)
 {
return SMCCC_RET_NOT_SUPPORTED;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index b6fa7beffd8a..7b2147f62c16 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -489,6 +489,7 @@ void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+void kvm_pvtime_init(void);
 int kvm_hypercall_pv_features(struct kvm_vcpu *vcpu);
 int kvm_hypercall_stolen_time(struct kvm_vcpu *vcpu);
 int kvm_update_stolen_time(struct kvm_vcpu *vcpu, bool init);
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9a507716ae2f..209c4de67306 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -367,6 +367,14 @@ struct kvm_vcpu_events {
 #define KVM_PSCI_RET_INVAL PSCI_RET_INVALID_PARAMS
 #define KVM_PSCI_RET_DENIEDPSCI_RET_DENIED
 
+/* Device Control API: PV_TIME */
+#define KVM_DEV_ARM_PV_TIME_REGION 0
+#define  KVM_DEV_ARM_PV_TIME_ST0
+struct kvm_dev_arm_st_region {
+   __u64 gpa;
+   __u64 size;
+};
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5e3f12d5359e..265156a984f2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1222,6 +1222,8 @@ enum kvm_device_type {
 #define KVM_DEV_TYPE_ARM_VGIC_ITS  KVM_DEV_TYPE_ARM_VGIC_ITS
KVM_DEV_TYPE_XIVE,
 #define KVM_DEV_TYPE_XIVE  KVM_DEV_TYPE_XIVE
+   KVM_DEV_TYPE_ARM_PV_TIME,
+#define KVM_DEV_TYPE_ARM_PV_TIME   KVM_DEV_TYPE_ARM_PV_TIME
KVM_DEV_TYPE_MAX,
 };
 
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 5e8343e2dd62..bfb5a842e6ab 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1494,6 +1494,7 @@ static int init_subsystems(void)
 
kvm_perf_init();
kvm_coproc_table_init();
+   kvm_pvtime_init();
 
 out:
on_each_cpu(_kvm_arch_hardware_disable, NULL, 1);
diff --git a/virt/kvm/arm/pvtime.c b/virt/kvm/arm/pvtime.c
index 28603689f6e0..3e55c1fb6a49 100644
--- a/virt/kvm/arm/pvtime.c
+++ b/virt/kvm/arm/pvtime.c
@@ -2,7 +2,9 @@
 // Copyright (C) 2019 Arm Ltd.
 
 #include 
+#include 
 
+#include 
 #include 
 
 #include 
@@ -86,3 +88,95 @@ int kvm_hypercall_stolen_time(struct kvm_vcpu *vcpu)
 
return ret;
 }
+
+static int kvm_arm_pvtime_create(struct kvm_device *dev, u32 type)
+{
+   return 0;
+}
+
+static void kvm_arm_pvtime_destroy(struct kvm_device *dev)
+{
+   struct kvm_arch_pvtime *pvtime = &dev->kvm->arch.pvtime;
+
+   pvtime->st_base = GPA_INVALID;
+   kfree(dev);
+}
+
+static int kvm_arm_pvtime_set_attr(struct kvm_device *dev,
+  struct kvm_device_attr *attr)
+{
+   struct kvm *kvm = dev->kvm;
+   struct kvm_arch_pvtime *pvtime = &kvm->arch.pvtime;
+   u64 __user *user = (u64 __user *)attr->addr;
+   struct kvm_dev_arm_st_region region;
+
+   switch (attr->group) {
+   case KVM_DEV_ARM_PV_TIME_REGION:
+   if (copy_from_user(&region, user, sizeof(region)))
+   return -EFAULT;
+   if (region.gpa & ~PAGE_MASK)
+   return -EINVAL;
+   if (region.size & ~PAGE_MASK)
+   return -EINVAL;
+   switch (attr->attr) {
+   case KVM_DEV_ARM_PV_TIME_ST:
+   if (pvtime->st_base != GPA_INVALID)
+   return -EEXIST;
+   pvtime->st_base = region.gpa;
+   pvtime->st_size = region.size;
+
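The constraints the commit message places on the region — page aligned, with 64 bytes available per VCPU — can be captured in a small validation sketch. This mirrors the shape of the checks in kvm_arm_pvtime_set_attr() but is illustrative only (the names, and the combined size check, are this sketch's, not the patch's):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE		4096ULL
#define STOLEN_TIME_ENTRY_SIZE	64ULL	/* one record per VCPU */

/* Illustrative check: the Stolen Time region supplied by user space must
 * be page aligned (base and size) and large enough to hold one 64-byte
 * record for every VCPU in the VM. */
static int st_region_valid(uint64_t gpa, uint64_t size, unsigned int nr_vcpus)
{
	if (gpa & (PAGE_SIZE - 1))
		return 0;	/* unaligned base */
	if (size & (PAGE_SIZE - 1))
		return 0;	/* unaligned size */
	if (size < STOLEN_TIME_ENTRY_SIZE * nr_vcpus)
		return 0;	/* not enough room for all VCPUs */
	return 1;
}
```

Note that `region.gpa & ~PAGE_MASK` in the patch is the kernel spelling of the same alignment test, since PAGE_MASK is ~(PAGE_SIZE - 1).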

[RESEND PATCH] KVM: arm: VGIC: properly initialise private IRQ affinity

2019-08-21 Thread Andre Przywara
At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
--
[ 5659.462377] UBSAN: Undefined behaviour in virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
--
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.

Fix both issues by only initialising vgic_irq->targets for a vGICv2 guest,
and by initialising vgic_irq->mpidr for vGICv3 guests instead. We can't
use the actual MPIDR for that, as the VCPU's system register is not
initialised at this point yet. This is not really an issue, as ->mpidr
is just used for the debugfs output and the IROUTER MMIO register, which
does not exist in redistributors (dealing with SGIs and PPIs).

Signed-off-by: Andre Przywara 
Reported-by: Dave Martin 
---
Hi,

this came up here again, I think it fell through the cracks back in
March:
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/637209.html

Cheers,
Andre.

 virt/kvm/arm/vgic/vgic-init.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 80127ca9269f..8bce2f75e0c1 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -210,7 +210,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->intid = i;
irq->vcpu = NULL;
irq->target_vcpu = vcpu;
-   irq->targets = 1U << vcpu->vcpu_id;
kref_init(&irq->refcount);
if (vgic_irq_is_sgi(i)) {
/* SGIs */
@@ -221,10 +220,14 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->config = VGIC_CONFIG_LEVEL;
}
 
-   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
irq->group = 1;
-   else
+   /* The actual MPIDR is not initialised at this point. */
+   irq->mpidr = 0;
+   } else {
irq->group = 0;
+   irq->targets = 1U << vcpu->vcpu_id;
+   }
}
 
if (!irqchip_in_kernel(vcpu->kvm))
-- 
2.17.1


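The undefined behaviour being fixed is the 32-bit shift `1U << vcpu_id` once vcpu_id reaches 32. A minimal sketch of the corrected initialisation (names below are illustrative, not the vgic code itself):

```c
#include <assert.h>
#include <stdint.h>

enum vgic_model { VGIC_V2, VGIC_V3 };

/* Illustrative version of the fixed initialisation: the GICv2 target
 * mask is only computed for a vGICv2 guest, where the VCPU id is small
 * enough that the 32-bit shift cannot overflow; vGICv3 guests use
 * irq->mpidr for routing instead and leave targets at zero. */
static uint32_t init_targets(enum vgic_model model, unsigned int vcpu_id)
{
	if (model != VGIC_V2)
		return 0;		/* GICv3: affinity comes from mpidr */
	return 1U << vcpu_id;		/* bounded for GICv2 guests */
}
```

With the shift gated on the vGIC model, a 40-VCPU GICv3 guest never evaluates `1U << 40`, which is what UBSAN was flagging.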

Re: [RESEND PATCH] KVM: arm: VGIC: properly initialise private IRQ affinity

2019-08-21 Thread Julien Grall

Hi Andre,

On 21/08/2019 18:00, Andre Przywara wrote:

At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
--
[ 5659.462377] UBSAN: Undefined behaviour in 
virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
--
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.

Fix both issues by only initialising vgic_irq->targets for a vGICv2 guest,
and by initialising vgic_irq->mpidr for vGICv3 guests instead. We can't
use the actual MPIDR for that, as the VCPU's system register is not
initialised at this point yet. This is not really an issue, as ->mpidr
is just used for the debugfs output and the IROUTER MMIO register, which
does not exist in redistributors (dealing with SGIs and PPIs).

Signed-off-by: Andre Przywara 
Reported-by: Dave Martin 


Tested-by: Julien Grall 

Cheers,


---
Hi,

this came up here again, I think it fell through the cracks back in
March:
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/637209.html

Cheers,
Andre.

  virt/kvm/arm/vgic/vgic-init.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 80127ca9269f..8bce2f75e0c1 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -210,7 +210,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->intid = i;
irq->vcpu = NULL;
irq->target_vcpu = vcpu;
-   irq->targets = 1U << vcpu->vcpu_id;
kref_init(&irq->refcount);
if (vgic_irq_is_sgi(i)) {
/* SGIs */
@@ -221,10 +220,14 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->config = VGIC_CONFIG_LEVEL;
}
  
-		if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)

+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
irq->group = 1;
-   else
+   /* The actual MPIDR is not initialised at this point. */
+   irq->mpidr = 0;
+   } else {
irq->group = 0;
+   irq->targets = 1U << vcpu->vcpu_id;
+   }
}
  
  	if (!irqchip_in_kernel(vcpu->kvm))




--
Julien Grall


Re: [RESEND PATCH] KVM: arm: VGIC: properly initialise private IRQ affinity

2019-08-21 Thread Marc Zyngier
On 21/08/2019 18:01, Julien Grall wrote:
> Hi Andre,
> 
> On 21/08/2019 18:00, Andre Przywara wrote:
>> At the moment we initialise the target *mask* of a virtual IRQ to the
>> VCPU it belongs to, even though this mask is only defined for GICv2 and
>> quickly runs out of bits for many GICv3 guests.
>> This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
>> --
>> [ 5659.462377] UBSAN: Undefined behaviour in 
>> virt/kvm/arm/vgic/vgic-init.c:223:21
>> [ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
>> --
>> Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
>> dump is wrong, due to this very same problem.
>>
>> Fix both issues by only initialising vgic_irq->targets for a vGICv2 guest,
>> and by initialising vgic_irq->mpidr for vGICv3 guests instead. We can't
>> use the actual MPIDR for that, as the VCPU's system register is not
>> initialised at this point yet. This is not really an issue, as ->mpidr
>> is just used for the debugfs output and the IROUTER MMIO register, which
>> does not exist in redistributors (dealing with SGIs and PPIs).
>>
>> Signed-off-by: Andre Przywara 
>> Reported-by: Dave Martin 
> 
> Tested-by: Julien Grall 
Sorry for having dropped the ball on that one. Now applied to
kvmarm/next, with Julien's TB and a Cc: stable.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RESEND PATCH] KVM: arm: VGIC: properly initialise private IRQ affinity

2019-08-21 Thread Zenghui Yu




On 2019/8/22 1:00, Andre Przywara wrote:

At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
--
[ 5659.462377] UBSAN: Undefined behaviour in 
virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
--
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.

Fix both issues by only initialising vgic_irq->targets for a vGICv2 guest,
and by initialising vgic_irq->mpidr for vGICv3 guests instead. We can't
use the actual MPIDR for that, as the VCPU's system register is not
initialised at this point yet. This is not really an issue, as ->mpidr
is just used for the debugfs output and the IROUTER MMIO register, which
does not exist in redistributors (dealing with SGIs and PPIs).

Signed-off-by: Andre Przywara 
Reported-by: Dave Martin 
---
Hi,

this came up here again, I think it fell through the cracks back in
March:
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/637209.html

Cheers,
Andre.

  virt/kvm/arm/vgic/vgic-init.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 80127ca9269f..8bce2f75e0c1 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -210,7 +210,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->intid = i;
irq->vcpu = NULL;
irq->target_vcpu = vcpu;
-   irq->targets = 1U << vcpu->vcpu_id;
kref_init(&irq->refcount);
if (vgic_irq_is_sgi(i)) {
/* SGIs */
@@ -221,10 +220,14 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->config = VGIC_CONFIG_LEVEL;
}
  
-		if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)

+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {


I still think that if user-space create VCPUs before vGIC (like what
Qemu does), the actual vGIC model will be unknown here. The UBSAN
warning will still show up when booting a vGIC-v3 guest (with Qemu).


Thanks,
zenghui


irq->group = 1;
-   else
+   /* The actual MPIDR is not initialised at this point. */
+   irq->mpidr = 0;
+   } else {
irq->group = 0;
+   irq->targets = 1U << vcpu->vcpu_id;
+   }
}
  
  	if (!irqchip_in_kernel(vcpu->kvm))






[PATCH] arm64: KVM: Only skip MMIO insn once

2019-08-21 Thread Andrew Jones
If after an MMIO exit to userspace a VCPU is immediately run with an
immediate_exit request, such as when a signal is delivered or an MMIO
emulation completion is needed, then the VCPU completes the MMIO
emulation and immediately returns to userspace. As the exit_reason
does not get changed from KVM_EXIT_MMIO in these cases we have to
be careful not to complete the MMIO emulation again, when the VCPU is
eventually run again, because the emulation does an instruction skip
(and doing too many skips would be a waste of guest code :-) We need
to use additional VCPU state to track if the emulation is complete.
As luck would have it, we already have 'mmio_needed', which even
appears to be used in this way by other architectures already.

Fixes: 0d640732dbeb ("arm64: KVM: Skip MMIO insn after emulation")
Signed-off-by: Andrew Jones 
---
 virt/kvm/arm/arm.c  | 3 ++-
 virt/kvm/arm/mmio.c | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 35a069815baf..322cf9030bbe 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -669,7 +669,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (ret)
return ret;
 
-   if (run->exit_reason == KVM_EXIT_MMIO) {
+   if (vcpu->mmio_needed) {
+   vcpu->mmio_needed = 0;
ret = kvm_handle_mmio_return(vcpu, vcpu->run);
if (ret)
return ret;
diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
index a8a6a0c883f1..2d9b5e064ae0 100644
--- a/virt/kvm/arm/mmio.c
+++ b/virt/kvm/arm/mmio.c
@@ -201,6 +201,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
if (is_write)
memcpy(run->mmio.data, data_buf, len);
vcpu->stat.mmio_exit_user++;
+   vcpu->mmio_needed   = 1;
run->exit_reason= KVM_EXIT_MMIO;
return 0;
 }
-- 
2.18.1

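The state machine the patch restores — keying emulation completion off a dedicated flag rather than the sticky exit_reason — can be modelled in a few lines. `vcpu_state`, `mmio_exit_to_user` and `vcpu_run_entry` are illustrative names for this sketch, not kernel API:

```c
#include <assert.h>

/* Illustrative model of the fix: an in-flight MMIO emulation is tracked
 * by a dedicated mmio_needed flag, cleared on completion, so the
 * instruction skip runs exactly once per MMIO access even if the VCPU
 * is immediately re-run (e.g. after an immediate_exit request). */
struct vcpu_state {
	int mmio_needed;	/* set when an MMIO exit goes to userspace */
	int pc_skips;		/* counts instruction skips, for illustration */
};

static void mmio_exit_to_user(struct vcpu_state *v)
{
	v->mmio_needed = 1;	/* analogous to run->exit_reason = KVM_EXIT_MMIO */
}

static void vcpu_run_entry(struct vcpu_state *v)
{
	if (v->mmio_needed) {
		v->mmio_needed = 0;
		v->pc_skips++;	/* complete emulation: skip the insn once */
	}
}
```

Testing with the sticky exit_reason instead of the flag, the second run-entry would skip a second instruction, which is precisely the bug being fixed.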