Re: [PATCH v3 25/28] arm64/sve: Detect SVE and activate runtime support
On Tue, Oct 10, 2017 at 07:38:42PM +0100, Dave P Martin wrote: > This patch enables detection of hardware SVE support via the > cpufeatures framework, and reports its presence to the kernel and > userspace via the new ARM64_SVE cpucap and HWCAP_SVE hwcap > respectively. > > Userspace can also detect SVE using ID_AA64PFR0_EL1, using the > cpufeatures MRS emulation. > > When running on hardware that supports SVE, this enables runtime > kernel support for SVE, and allows user tasks to execute SVE > instructions and make use of the SVE-specific user/kernel > interface extensions implemented by this series. > > Signed-off-by: Dave Martin > Cc: Suzuki K Poulose Reviewed-by: Catalin Marinas ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v3 24/28] arm64/sve: KVM: Hide SVE from CPU features exposed to guests
On Tue, Oct 10, 2017 at 07:38:41PM +0100, Dave P Martin wrote: > KVM guests cannot currently use SVE, because SVE is always > configured to trap to EL2. > > However, a guest that sees SVE reported as present in > ID_AA64PFR0_EL1 may legitimately expect that SVE works and try to > use it. Instead of working, the guest will receive an injected > undef exception, which may cause the guest to oops or go into a > spin. > > To avoid misleading the guest into believing that SVE will work, > this patch masks out the SVE field from ID_AA64PFR0_EL1 when a > guest attempts to read this register. No support is explicitly > added for ID_AA64ZFR0_EL1 either, so that is still emulated as > reading as zero, which is consistent with SVE not being > implemented. > > This is a temporary measure, and will be removed in a later series > when full KVM support for SVE is implemented. > > Signed-off-by: Dave Martin > Reviewed-by: Alex Bennée > Cc: Marc Zyngier Acked-by: Catalin Marinas
Re: [PATCH v3 23/28] arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
On Tue, Oct 10, 2017 at 07:38:40PM +0100, Dave P Martin wrote: > When trapping forbidden attempts by a guest to use SVE, we want the > guest to see a trap consistent with SVE not being implemented. > > This patch injects an undefined instruction exception into the > guest in response to such an exception. > > Signed-off-by: Dave Martin > Reviewed-by: Alex Bennée Acked-by: Catalin Marinas
Re: [PATCH v3 21/28] arm64/sve: Add sysctl to set the default vector length for new processes
On Tue, Oct 10, 2017 at 07:38:38PM +0100, Dave P Martin wrote: > Because of the effect of SVE on the size of the signal frame, the > default vector length used for new processes involves a tradeoff > between performance of SVE-enabled software on the one hand, and > reliability of non-SVE-aware software on the other hand. > > For this reason, the best choice depends on the repertoire of > userspace software in use and is thus best left up to distro > maintainers, sysadmins and developers. > > If CONFIG_SYSCTL is enabled, this patch exposes the default vector > length in /proc/sys/abi/sve_default_vector_length, where boot > scripts or the adventurous can poke it. > > In common with other arm64 ABI sysctls, this control is currently > global: setting it requires CAP_SYS_ADMIN in the root user > namespace, but the value set is effective for subsequent execs in > all namespaces. The control only affects _new_ processes, however: > changing it does not affect the vector length of any existing > process. > > The intended usage model is that if userspace is known to be fully > SVE-tolerant (or a developer is curious to find out) then init > scripts can crank this up during startup. > > Signed-off-by: Dave Martin > Reviewed-by: Alex Bennée Reviewed-by: Catalin Marinas
Re: [PATCH v3 20/28] arm64/sve: Add prctl controls for userspace vector length management
On Tue, Oct 10, 2017 at 07:38:37PM +0100, Dave P Martin wrote: > This patch adds two arm64-specific prctls, to permit userspace to > control its vector length: > > * PR_SVE_SET_VL: set the thread's SVE vector length and vector >length inheritance mode. > > * PR_SVE_GET_VL: get the same information. > > Although these calls shadow instruction set features in the SVE > architecture, these prctls provide additional control: the vector > length inheritance mode is Linux-specific and nothing to do with > the architecture, and the architecture does not permit EL0 to set > its own vector length directly. Both can be used in portable tools > without requiring the use of SVE instructions. > > Signed-off-by: Dave Martin > Cc: Alex Bennée Reviewed-by: Catalin Marinas
Re: [PATCH v3 19/28] arm64/sve: ptrace and ELF coredump support
On Tue, Oct 10, 2017 at 07:38:36PM +0100, Dave P Martin wrote: > @@ -702,6 +737,211 @@ static int system_call_set(struct task_struct *target, > return ret; > } > > +#ifdef CONFIG_ARM64_SVE > + > +static void sve_init_header_from_task(struct user_sve_header *header, > + struct task_struct *target) > +{ > + unsigned int vq; > + > + memset(header, 0, sizeof(*header)); > + > + header->flags = test_tsk_thread_flag(target, TIF_SVE) ? > + SVE_PT_REGS_SVE : SVE_PT_REGS_FPSIMD; For PTRACE_SYSCALL, we may or may not have TIF_SVE depending on what happened with the target. Just a thought: shall we clear TIF_SVE (and sync it to fpsimd) in syscall_trace_enter()? > + if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT)) > + header->flags |= SVE_PT_VL_INHERIT; > + > + header->vl = target->thread.sve_vl; > + vq = sve_vq_from_vl(header->vl); > + > + header->max_vl = sve_max_vl; > + if (WARN_ON(!sve_vl_valid(sve_max_vl))) > + header->max_vl = header->vl; > + > + header->size = SVE_PT_SIZE(vq, header->flags); > + header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl), > + SVE_PT_REGS_SVE); > +} [...] 
> +static int sve_set(struct task_struct *target, > +const struct user_regset *regset, > +unsigned int pos, unsigned int count, > +const void *kbuf, const void __user *ubuf) > +{ > + int ret; > + struct user_sve_header header; > + unsigned int vq; > + unsigned long start, end; > + > + if (!system_supports_sve()) > + return -EINVAL; > + > + /* Header */ > + if (count < sizeof(header)) > + return -EINVAL; > + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header, > + 0, sizeof(header)); > + if (ret) > + goto out; > + > + /* > + * Apart from PT_SVE_REGS_MASK, all PT_SVE_* flags are consumed by > + * sve_set_vector_length(), which will also validate them for us: > + */ > + ret = sve_set_vector_length(target, header.vl, > + ((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16); > + if (ret) > + goto out; > + > + /* Actual VL set may be less than the user asked for: */ > + vq = sve_vq_from_vl(target->thread.sve_vl); > + > + /* Registers: FPSIMD-only case */ > + > + BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header)); > + if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD) { > + sve_sync_to_fpsimd(target); > + > + ret = __fpr_set(target, regset, pos, count, kbuf, ubuf, > + SVE_PT_FPSIMD_OFFSET); > + clear_tsk_thread_flag(target, TIF_SVE); > + goto out; > + } __fpr_set() already calls sve_sync_to_fpsimd(). Anyway, do you actually need this, since we are going to override the FPSIMD state anyway here? > + > + /* Otherwise: full SVE case */ > + > + /* > + * If setting a different VL from the requested VL and there is > + * register data, the data layout will be wrong: don't even > + * try to set the registers in this case. > + */ > + if (count && vq != sve_vq_from_vl(header.vl)) { > + ret = -EIO; > + goto out; > + } > + > + sve_alloc(target); > + fpsimd_sync_to_sve(target); Similarly here, it's a full SVE case, so we are going to override it anyway. 
> + set_tsk_thread_flag(target, TIF_SVE); This has the side-effect of enabling TIF_SVE even for PTRACE_SYSCALL which may be cleared in some circumstances. It may not be an issue though. -- Catalin
Re: [PATCH 02/37] KVM: arm64: Rework hyp_panic for VHE and non-VHE
On Thu, Oct 12, 2017 at 04:55:16PM +0100, Marc Zyngier wrote: > On 12/10/17 11:41, Christoffer Dall wrote: > > VHE actually doesn't rely on clearing the VTTBR when returning to the > > hsot kernel, and that is the current key mechanism of hyp_panic to > > host > > > figure out how to attempt to return to a state good enough to print a > > panic statement. > > > > Therefore, we split the hyp_panic function into two functions, a VHE and > > a non-VHE, keeping the non-VHE version intact, but changing the VHE > > behavior. > > > > The vttbr_el2 check on VHE doesn't really make that much sense, because > > the only situation where we can get here on VHE is when the hypervisor > > assembly code actually caleld into hyp_panic, which only happens when > > called > > > VBAR_EL2 has been set to the KVM exception vectors. On VHE, we can > > always safely disable the traps and restore the host registers at this > > point, so we simply do that unconditionally and call into the panic > > function directly. 
> > > > Signed-off-by: Christoffer Dall> > --- > > arch/arm64/kvm/hyp/switch.c | 45 > > +++-- > > 1 file changed, 23 insertions(+), 22 deletions(-) > > > > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > > index a0123ad..a50ddf3 100644 > > --- a/arch/arm64/kvm/hyp/switch.c > > +++ b/arch/arm64/kvm/hyp/switch.c > > @@ -394,10 +394,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) > > static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx > > ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n"; > > > > static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par, > > -struct kvm_vcpu *vcpu) > > +struct kvm_cpu_context > > *__host_ctxt) > > { > > + struct kvm_vcpu *vcpu; > > unsigned long str_va; > > > > + vcpu = __host_ctxt->__hyp_running_vcpu; > > + > > + if (read_sysreg(vttbr_el2)) { > > + __timer_disable_traps(vcpu); > > + __deactivate_traps(vcpu); > > + __deactivate_vm(vcpu); > > + __sysreg_restore_host_state(__host_ctxt); > > + } > > + > > /* > > * Force the panic string to be loaded from the literal pool, > > * making sure it is a kernel address and not a PC-relative > > @@ -411,40 +421,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 > > spsr, u64 elr, u64 par, > >read_sysreg(hpfar_el2), par, vcpu); > > } > > > > -static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par, > > - struct kvm_vcpu *vcpu) > > +static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par, > > +struct kvm_cpu_context *host_ctxt) > > { > > + struct kvm_vcpu *vcpu; > > + vcpu = host_ctxt->__hyp_running_vcpu; > > + > > + __deactivate_traps_vhe(); > > Is there a reason why we can't just call __deactivate_traps(), rather > than the VHE-specific subset? It doesn't really matter, as we're about > to panic, but still... > It doesn't really matter, especially as later patches will change this, but this patch would be slightly nicer keeping the __deactivate_traps. 
I don't mind changing that around in the next version. > > + __sysreg_restore_host_state(host_ctxt); > > + > > panic(__hyp_panic_string, > > spsr, elr, > > read_sysreg_el2(esr), read_sysreg_el2(far), > > read_sysreg(hpfar_el2), par, vcpu); > > } > > > > -static hyp_alternate_select(__hyp_call_panic, > > - __hyp_call_panic_nvhe, __hyp_call_panic_vhe, > > - ARM64_HAS_VIRT_HOST_EXTN); > > - > > void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt) > > { > > - struct kvm_vcpu *vcpu = NULL; > > - > > u64 spsr = read_sysreg_el2(spsr); > > u64 elr = read_sysreg_el2(elr); > > u64 par = read_sysreg(par_el1); > > > > - if (read_sysreg(vttbr_el2)) { > > - struct kvm_cpu_context *host_ctxt; > > - > > - host_ctxt = __host_ctxt; > > - vcpu = host_ctxt->__hyp_running_vcpu; > > - __timer_disable_traps(vcpu); > > - __deactivate_traps(vcpu); > > - __deactivate_vm(vcpu); > > - __sysreg_restore_host_state(host_ctxt); > > - } > > - > > - /* Call panic for real */ > > - __hyp_call_panic()(spsr, elr, par, vcpu); > > + if (!has_vhe()) > > + __hyp_call_panic_nvhe(spsr, elr, par, __host_ctxt); > > + else > > + __hyp_call_panic_vhe(spsr, elr, par, __host_ctxt); > > > > unreachable(); > > } > > > > Otherwise looks good. > Thanks, -Christoffer
Re: [PATCH 01/37] KVM: arm64: Avoid storing the vcpu pointer on the stack
On Thu, Oct 12, 2017 at 04:49:44PM +0100, Marc Zyngier wrote: > On 12/10/17 11:41, Christoffer Dall wrote: > > We already have the percpu area for the host cpu state, which points to > > the VCPU, so there's no need to store the VCPU pointer on the stack on > > every context switch. We can be a little more clever and just use > > tpidr_el2 for the percpu offset and load the VCPU pointer from the host > > context. > > > > This requires us to have a scratch register though, so we take the > > chance to rearrange some of the el1_sync code to only look at the > > vttbr_el2 to determine if this is a trap from the guest or an HVC from > > the host. We do add an extra check to call the panic code if the kernel > > is configured with debugging enabled and we saw a trap from the host > > which wasn't an HVC, indicating that we left some EL2 trap configured by > > mistake. > > > > Signed-off-by: Christoffer Dall> > --- > > arch/arm64/include/asm/kvm_asm.h | 20 > > arch/arm64/kernel/asm-offsets.c | 1 + > > arch/arm64/kvm/hyp/entry.S | 5 + > > arch/arm64/kvm/hyp/hyp-entry.S | 39 > > ++- > > arch/arm64/kvm/hyp/switch.c | 2 +- > > 5 files changed, 41 insertions(+), 26 deletions(-) > > > > diff --git a/arch/arm64/include/asm/kvm_asm.h > > b/arch/arm64/include/asm/kvm_asm.h > > index ab4d0a9..7e48a39 100644 > > --- a/arch/arm64/include/asm/kvm_asm.h > > +++ b/arch/arm64/include/asm/kvm_asm.h > > @@ -70,4 +70,24 @@ extern u32 __init_stage2_translation(void); > > > > #endif > > > > +#ifdef __ASSEMBLY__ > > +.macro get_host_ctxt reg, tmp > > + /* > > +* '=kvm_host_cpu_state' is a host VA from the constant pool, it may > > +* not be accessible by this address from EL2, hyp_panic() converts > > +* it with kern_hyp_va() before use. > > +*/ > > This really looks like a stale comment, as there is no hyp_panic > involved here anymore (thankfully!). > yeah, I suppose. 
> > + ldr \reg, =kvm_host_cpu_state > > + mrs \tmp, tpidr_el2 > > + add \reg, \reg, \tmp > > + kern_hyp_va \reg > > Here, we're trading a load from the stack for a load from the constant > pool. Can't we do something like: > > adr_l \reg, kvm_host_cpu_state > msr \tmp, tpidr_el2 > add \reg, \reg, \tmp > > and that's it? That's definitely what the compiler generates from C code... > This relies on the property that the kernel/hyp offset is > constant, and that it doesn't matter if we add the offset to a kernel VA > or a HYP VA... Completely untested of course! > You're the hyp VA expert. Is it valid to rely on that assumption? > > +.endm > > + > > +.macro get_vcpu vcpu, ctxt > > + ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU] > > + kern_hyp_va \vcpu > > +.endm > > + > > +#endif > > + > > #endif /* __ARM_KVM_ASM_H__ */ > > diff --git a/arch/arm64/kernel/asm-offsets.c > > b/arch/arm64/kernel/asm-offsets.c > > index 71bf088..612021d 100644 > > --- a/arch/arm64/kernel/asm-offsets.c > > +++ b/arch/arm64/kernel/asm-offsets.c > > @@ -135,6 +135,7 @@ int main(void) > >DEFINE(CPU_FP_REGS, offsetof(struct kvm_regs, fp_regs)); > >DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, > > arch.ctxt.sys_regs[FPEXC32_EL2])); > >DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, > > arch.host_cpu_context)); > > + DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, > > __hyp_running_vcpu)); > > #endif > > #ifdef CONFIG_CPU_PM > >DEFINE(CPU_SUSPEND_SZ, sizeof(struct cpu_suspend_ctx)); > > diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S > > index 9a8ab5d..76cd48f 100644 > > --- a/arch/arm64/kvm/hyp/entry.S > > +++ b/arch/arm64/kvm/hyp/entry.S > > @@ -62,9 +62,6 @@ ENTRY(__guest_enter) > > // Store the host regs > > save_callee_saved_regs x1 > > > > - // Store host_ctxt and vcpu for use at exit time > > - stp x1, x0, [sp, #-16]! 
> > - > > add x18, x0, #VCPU_CONTEXT > > > > // Restore guest regs x0-x17 > > @@ -119,7 +116,7 @@ ENTRY(__guest_exit) > > save_callee_saved_regs x1 > > > > // Restore the host_ctxt from the stack > > Stale comment again. > yeah... > > - ldr x2, [sp], #16 > > + get_host_ctxt x2, x3 > > > > // Now restore the host regs > > restore_callee_saved_regs x2 > > diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S > > index e4f37b9..2950f26 100644 > > --- a/arch/arm64/kvm/hyp/hyp-entry.S > > +++ b/arch/arm64/kvm/hyp/hyp-entry.S > > @@ -56,19 +56,16 @@ ENDPROC(__vhe_hyp_call) > > el1_sync: // Guest trapped into EL2 > > stp x0, x1, [sp, #-16]! > > > > -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN > > - mrs x1, esr_el2 > > -alternative_else > > - mrs x1, esr_el1 > > -alternative_endif > > - lsr x0, x1, #ESR_ELx_EC_SHIFT > > - > > - cmp x0,
[PATCH] arm64: KVM: set right LR register value for 32 bit guest when inject abort
When an exception is trapped to EL2, hardware uses ELR_ELx to hold the current fault instruction address. If KVM wants to inject an abort into a 32 bit guest, it needs to set the LR register for the guest to emulate this abort happening in the guest. Because the ARM32 architecture is pipelined, the LR value has an offset from the fault instruction address. The offsets applied to the link value for each exception are shown below; they should be added to the ARM32 link register (LR). Exception Offset, for PE state of: A32 / T32: Undefined Instruction +4 / +2; Prefetch Abort +4 / +4; Data Abort +8 / +8; IRQ or FIQ +4 / +4. Signed-off-by: Dongjiu Geng Signed-off-by: Haibin Zhang --- For example, for undefined instruction injection: 1. Guest OS calls the SMC (Secure Monitor Call) instruction at address 0xc025405c, then the guest traps to the hypervisor c0254050: e59d5028 ldr r5, [sp, #40] ; 0x28 c0254054: e3a03001 mov r3, #1 c0254058: e1a01003 mov r1, r3 c025405c: e1600070 smc 0 c0254060: e30a0270 movw r0, #41584 ; 0xa270 c0254064: e34c00bf movt r0, #49343 ; 0xc0bf 2. KVM injects an undefined abort to the guest 3. We will find the fault PC is 0xc0254058, not 0xc025405c. [ 12.348072] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM [ 12.349786] Modules linked in: [ 12.350563] CPU: 1 PID: 71 Comm: cat Not tainted 4.1.0-dirty #25 [ 12.352061] Hardware name: Generic DT based system [ 12.353275] task: d9d08000 ti: d9cfe000 task.ti: d9cfe000 [ 12.354637] PC is at proc_dointvec+0x20/0x60 [ 12.355717] LR is at proc_sys_call_handler+0xb0/0xc4 [ 12.356972] pc : []lr : []psr: a0060013 [ 12.356972] sp : d9cffe90 ip : c0254038 fp : 0001 [ 12.359824] r10: d9cfff80 r9 : 0004 r8 : [ 12.361132] r7 : bec21cb0 r6 : d9cffec4 r5 : d9cfff80 r4 : c0e82de0 [ 12.362766] r3 : 0001 r2 : bec21cb0 r1 : 0001 r0 : c0e82de0 [ 12.364400] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 12.366183] Control: 10c5383d Table: 59d3406a DAC: 0015 [ 12.367623] Process cat (pid: 71, stack limit = 0xd9cfe220) 4. 
After correcting the LR register, it will have the right value [ 125.763370] Internal error: Oops - undefined instruction: 0 [#2] SMP ARM [ 125.767010] Modules linked in: [ 125.768472] CPU: 1 PID: 74 Comm: cat Tainted: G D 4.1.0-dirty #25 [ 125.771854] Hardware name: Generic DT based system [ 125.774053] task: db0bb900 ti: d9d1 task.ti: d9d1 [ 125.776821] PC is at proc_dointvec+0x24/0x60 [ 125.778919] LR is at proc_sys_call_handler+0xb0/0xc4 [ 125.781269] pc : []lr : []psr: a0060013 [ 125.781269] sp : d9d11e90 ip : c0254038 fp : 0001 [ 125.786581] r10: d9d11f80 r9 : 0004 r8 : [ 125.789673] r7 : be92ccb0 r6 : d9d11ec4 r5 : d9d11f80 r4 : c0e82de0 [ 125.792828] r3 : 0001 r2 : be92ccb0 r1 : 0001 r0 : c0e82de0 [ 125.795890] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Other exception injections, such as Data/Prefetch abort, also need correcting. --- arch/arm64/kvm/inject_fault.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c index da6a8cf..da93508 100644 --- a/arch/arm64/kvm/inject_fault.c +++ b/arch/arm64/kvm/inject_fault.c @@ -33,12 +33,11 @@ #define LOWER_EL_AArch64_VECTOR0x400 #define LOWER_EL_AArch32_VECTOR0x600 -static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset) +static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset, + u32 return_offset) { unsigned long cpsr; unsigned long new_spsr_value = *vcpu_cpsr(vcpu); - bool is_thumb = (new_spsr_value & COMPAT_PSR_T_BIT); - u32 return_offset = (is_thumb) ? 4 : 0; u32 sctlr = vcpu_cp15(vcpu, c1_SCTLR); cpsr = mode | COMPAT_PSR_I_BIT; @@ -65,7 +64,11 @@ static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset) static void inject_undef32(struct kvm_vcpu *vcpu) { - prepare_fault32(vcpu, COMPAT_PSR_MODE_UND, 4); + unsigned long spsr_value = *vcpu_cpsr(vcpu); + bool is_thumb = (spsr_value & COMPAT_PSR_T_BIT); + u32 return_offset = (is_thumb) ? 
2 : 4; + + prepare_fault32(vcpu, COMPAT_PSR_MODE_UND, 4, return_offset); } /* @@ -75,21 +78,24 @@ static void inject_undef32(struct kvm_vcpu *vcpu) static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, unsigned long addr) { - u32 vect_offset; + u32 vect_offset, return_offset; u32 *far, *fsr; bool is_lpae; if (is_pabt) { vect_offset = 12; +
Re: [PATCH 04/37] KVM: arm/arm64: Get rid of vcpu->arch.irq_lines
On 12/10/17 11:41, Christoffer Dall wrote: > We currently have a separate read-modify-write of the HCR_EL2 on entry > to the guest for the sole purpose of setting the VF and VI bits, if set. > Since this is most rarely the case (only when using userspace IRQ chip > and interrupts are in flight), let's get rid of this operation and > instead modify the bits in the vcpu->arch.hcr[_el2] directly when > needed. > > Signed-off-by: Christoffer Dall> --- > arch/arm/include/asm/kvm_emulate.h | 9 ++--- > arch/arm/include/asm/kvm_host.h | 3 --- > arch/arm/kvm/emulate.c | 2 +- > arch/arm/kvm/hyp/switch.c| 2 +- > arch/arm64/include/asm/kvm_emulate.h | 9 ++--- > arch/arm64/include/asm/kvm_host.h| 3 --- > arch/arm64/kvm/hyp/switch.c | 6 -- > arch/arm64/kvm/inject_fault.c| 2 +- > virt/kvm/arm/arm.c | 11 ++- > virt/kvm/arm/mmu.c | 6 +++--- > 10 files changed, 16 insertions(+), 37 deletions(-) > > diff --git a/arch/arm/include/asm/kvm_emulate.h > b/arch/arm/include/asm/kvm_emulate.h > index 98089ff..34663a8 100644 > --- a/arch/arm/include/asm/kvm_emulate.h > +++ b/arch/arm/include/asm/kvm_emulate.h > @@ -62,14 +62,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) > vcpu->arch.hcr = HCR_GUEST_MASK; > } > > -static inline unsigned long vcpu_get_hcr(const struct kvm_vcpu *vcpu) > +static inline unsigned long *vcpu_hcr(const struct kvm_vcpu *vcpu) > { > - return vcpu->arch.hcr; > -} > - > -static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) > -{ > - vcpu->arch.hcr = hcr; > + return (unsigned long *)>arch.hcr; > } > > static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu) > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h > index 4a879f6..1100170 100644 > --- a/arch/arm/include/asm/kvm_host.h > +++ b/arch/arm/include/asm/kvm_host.h > @@ -153,9 +153,6 @@ struct kvm_vcpu_arch { > /* HYP trapping configuration */ > u32 hcr; > > - /* Interrupt related fields */ > - u32 irq_lines; /* IRQ and FIQ levels */ > - > /* 
Exception Information */ > struct kvm_vcpu_fault_info fault; > > diff --git a/arch/arm/kvm/emulate.c b/arch/arm/kvm/emulate.c > index 0064b86..4286a89 100644 > --- a/arch/arm/kvm/emulate.c > +++ b/arch/arm/kvm/emulate.c > @@ -313,5 +313,5 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long > addr) > */ > void kvm_inject_vabt(struct kvm_vcpu *vcpu) > { > - vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VA); > + *vcpu_hcr(vcpu) |= HCR_VA; > } > diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c > index 330c9ce..c3b9799 100644 > --- a/arch/arm/kvm/hyp/switch.c > +++ b/arch/arm/kvm/hyp/switch.c > @@ -43,7 +43,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu > *vcpu, u32 *fpexc_host) > isb(); > } > > - write_sysreg(vcpu->arch.hcr | vcpu->arch.irq_lines, HCR); > + write_sysreg(vcpu->arch.hcr, HCR); > /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ > write_sysreg(HSTR_T(15), HSTR); > write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR); > diff --git a/arch/arm64/include/asm/kvm_emulate.h > b/arch/arm64/include/asm/kvm_emulate.h > index e5df3fc..1fbfe96 100644 > --- a/arch/arm64/include/asm/kvm_emulate.h > +++ b/arch/arm64/include/asm/kvm_emulate.h > @@ -51,14 +51,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) > vcpu->arch.hcr_el2 &= ~HCR_RW; > } > > -static inline unsigned long vcpu_get_hcr(struct kvm_vcpu *vcpu) > +static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu) > { > - return vcpu->arch.hcr_el2; > -} > - > -static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) > -{ > - vcpu->arch.hcr_el2 = hcr; > + return (unsigned long *)>arch.hcr_el2; > } > > static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) > diff --git a/arch/arm64/include/asm/kvm_host.h > b/arch/arm64/include/asm/kvm_host.h > index 806ccef..27305e7 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -266,9 +266,6 @@ struct kvm_vcpu_arch { > /* IO related fields 
*/ > struct kvm_decode mmio_decode; > > - /* Interrupt related fields */ > - u64 irq_lines; /* IRQ and FIQ levels */ > - > /* Cache some mmu pages needed inside spinlock regions */ > struct kvm_mmu_memory_cache mmu_page_cache; > > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > index bcf1a79..7703d63 100644 > --- a/arch/arm64/kvm/hyp/switch.c > +++ b/arch/arm64/kvm/hyp/switch.c > @@ -168,12 +168,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu > *vcpu) > > static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) > { > - u64 val; > - > - val =
Re: [PATCH 03/37] KVM: arm64: Move HCR_INT_OVERRIDE to default HCR_EL2 guest flag
On 12/10/17 11:41, Christoffer Dall wrote: > From: Shih-Wei Li> > We always set the IMO and FMO bits in the HCR_EL2 when running the > guest, regardless if we use the vgic or not. By moving these flags to > HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of > HCR_EL2 in the world switch code, and we can also soon get rid of the > other one. > > Signed-off-by: Shih-Wei Li > Signed-off-by: Christoffer Dall > --- > arch/arm64/include/asm/kvm_arm.h | 4 ++-- > arch/arm64/kvm/hyp/switch.c | 3 --- > 2 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_arm.h > b/arch/arm64/include/asm/kvm_arm.h > index 61d694c..e67e279 100644 > --- a/arch/arm64/include/asm/kvm_arm.h > +++ b/arch/arm64/include/asm/kvm_arm.h > @@ -79,9 +79,9 @@ > */ > #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \ >HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \ > - HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW) > + HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | \ > + HCR_FMO | HCR_IMO) > #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) > -#define HCR_INT_OVERRIDE (HCR_FMO | HCR_IMO) > #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H) > > /* TCR_EL2 Registers bits */ > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > index a50ddf3..bcf1a79 100644 > --- a/arch/arm64/kvm/hyp/switch.c > +++ b/arch/arm64/kvm/hyp/switch.c > @@ -164,8 +164,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu > *vcpu) > __vgic_v3_save_state(vcpu); > else > __vgic_v2_save_state(vcpu); > - > - write_sysreg(read_sysreg(hcr_el2) & ~HCR_INT_OVERRIDE, hcr_el2); > } > > static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) > @@ -173,7 +171,6 @@ static void __hyp_text __vgic_restore_state(struct > kvm_vcpu *vcpu) > u64 val; > > val = read_sysreg(hcr_el2); > - val |= HCR_INT_OVERRIDE; > val |= vcpu->arch.irq_lines; > write_sysreg(val, hcr_el2); > > To expand on why this is actually safe: IMO/FMO control both 
taking the interrupts to EL2 and remapping ICC_*_EL1 to ICV_*_EL1 executed at EL1. As long as we ensure that these bits are clear when returning to the EL1 host, we're OK. Reviewed-by: Marc Zyngier M. -- Jazz is not dead. It just smells funny...
Re: [PATCH v3 13/28] arm64/sve: Signal handling support
On Wed, Oct 11, 2017 at 05:40:52PM +0100, Catalin Marinas wrote: > On Tue, Oct 10, 2017 at 07:38:30PM +0100, Dave P Martin wrote: > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > > index aabeaee..fa4ed34 100644 > > --- a/arch/arm64/kernel/fpsimd.c > > +++ b/arch/arm64/kernel/fpsimd.c > > @@ -310,6 +310,32 @@ static void fpsimd_to_sve(struct task_struct *task) > >sizeof(fst->vregs[i])); > > } > > > > +/* > > + * Transfer the SVE state in task->thread.sve_state to > > + * task->thread.fpsimd_state. > > + * > > + * Task can be a non-runnable task, or current. In the latter case, > > + * softirqs (and preemption) must be disabled. > > + * task->thread.sve_state must point to at least sve_state_size(task) > > + * bytes of allocated kernel memory. > > + * task->thread.sve_state must be up to date before calling this function. > > + */ > > +static void sve_to_fpsimd(struct task_struct *task) > > +{ > > + unsigned int vq; > > + void const *sst = task->thread.sve_state; > > + struct fpsimd_state *fst = &task->thread.fpsimd_state; > > + unsigned int i; > > + > > + if (!system_supports_sve()) > > + return; > > + > > + vq = sve_vq_from_vl(task->thread.sve_vl); > > + for (i = 0; i < 32; ++i) > > + memcpy(&fst->vregs[i], ZREG(sst, vq, i), > > + sizeof(fst->vregs[i])); > > +} > > Nit: could we actually just do an assignment with some pointer casting? > It looks like we invoke memcpy for every 16 bytes (same for > fpsimd_to_sve). I was uneasy about what the type of ZREG(sst, vq, i) ought to be. 
In any case, memcpy() is magic: my oldskool GCC (5.3.0) generates: 08084c70 : 08084c70: 1404b 08084c80 08084c74: d503201f nop 08084c78: d65f03c0 ret 08084c7c: d503201f nop 08084c80: f0007d61 adrp x1, 09033000 08084c84: f942a021 ldr x1, [x1,#1344] 08084c88: 36b001c1 tbz w1, #22, 08084cc0 08084c8c: b94ca805 ldr w5, [x0,#3240] 08084c90: 912a0001 add x1, x0, #0xa80 08084c94: 91320004 add x4, x0, #0xc80 08084c98: f9465006 ldr x6, [x0,#3232] 08084c9c: 121c6ca5 and w5, w5, #0xfff0 08084ca0: 5280 mov w0, #0x0 // #0 08084ca4: 8b2040c2 add x2, x6, w0, uxtw 08084ca8: 0b05 add w0, w0, w5 08084cac: a9400c42 ldp x2, x3, [x2] 08084cb0: a8810c22 stp x2, x3, [x1],#16 08084cb4: eb01009f cmp x4, x1 08084cb8: 5461 b.ne 08084ca4 08084cbc: d65f03c0 ret 08084cc0: d65f03c0 ret 08084cc4: d503201f nop Without volatile, I think assigning a single object and doing a memcpy() are equivalent to the compiler: which it actually uses depends solely on optimisation considerations. (But then I'm not a language lawyer ... not a professional one anyway). Are you concerned compilers may mess this up? Cheers ---Dave
Re: [PATCH v3 11/28] arm64/sve: Core task context handling
On Wed, Oct 11, 2017 at 05:15:58PM +0100, Catalin Marinas wrote: > On Tue, Oct 10, 2017 at 07:38:28PM +0100, Dave P Martin wrote: > > diff --git a/arch/arm64/include/asm/fpsimd.h > > b/arch/arm64/include/asm/fpsimd.h > > index 026a7c7..b1409de 100644 > > --- a/arch/arm64/include/asm/fpsimd.h > > +++ b/arch/arm64/include/asm/fpsimd.h > > @@ -72,6 +75,20 @@ extern void sve_load_state(void const *state, u32 const > > *pfpsr, > >unsigned long vq_minus_1); > > extern unsigned int sve_get_vl(void); > > > > +#ifdef CONFIG_ARM64_SVE > > + > > +extern size_t sve_state_size(struct task_struct const *task); > > + > > +extern void sve_alloc(struct task_struct *task); > > +extern void fpsimd_release_thread(struct task_struct *task); > > + > > +#else /* ! CONFIG_ARM64_SVE */ > > + > > +static void __maybe_unused sve_alloc(struct task_struct *task) { } > > +static void __maybe_unused fpsimd_release_thread(struct task_struct *task) > > { } > > Nitpick: usually we just add static inline functions here rather than > __maybe_unused. Fair enough -- come to think of it I've already converted some other instances of this at Ard's request. > > > diff --git a/arch/arm64/include/asm/processor.h > > b/arch/arm64/include/asm/processor.h > > index 29adab8..4831d28 100644 > > --- a/arch/arm64/include/asm/processor.h > > +++ b/arch/arm64/include/asm/processor.h > > @@ -39,6 +47,8 @@ > > #define FPEXC_IDF (1 << 7) > > > > /* > > + * (Note: in this discussion, statements about FPSIMD apply equally to > > SVE.) 
> > + * > > * In order to reduce the number of times the FPSIMD state is needlessly > > saved > > * and restored, we need to keep track of two things: > > * (a) for each task, we need to remember which CPU was the last one to > > have > > @@ -99,6 +109,287 @@ > > */ > > static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state); > > > > +static void sve_free(struct task_struct *task) > > +{ > > + kfree(task->thread.sve_state); > > + task->thread.sve_state = NULL; > > +} > > I think we need a WARN_ON if TIF_SVE is still set here (and the callers > making sure it is cleared). I haven't checked the code paths via > fpsimd_release_thread() but wondering what happens if we get an > interrupt between freeing the state and making the pointer NULL, with > some context switching in a preemptible kernel. Having a WARN_ON() here may be a decent way to sanity-check that we don't ever have sve_state NULL with TIF_SVE set. This is a lot more economical than putting a WARN_ON() at each dereference of sve_state (of which there are quite a few). sve_free() is also a slow path. Currently, there are two callsites: sve_set_vector_length(), where we test_and_clear_tsk_thread_flag(task, TIF_SVE) before calling sve_free(); and fpsimd_release_thread(), where we "don't care" because the thread is dying. Looking more closely though, is the release_thread() path preemptible? I can't see anything in the scheduler core to ensure this, nor any general reason why it should be needed. In which case preemption during thread exit after sve_free() could result in a NULL dereference in fpsimd_thread_switch(). So, I think my favoured approach is:

sve_release_thread()
{
	local_bh_disable();
	fpsimd_flush_task_state(current);
	clear_thread_flag(TIF_SVE);
	local_bh_enable();
	sve_free();
}

The local_bh stuff is cumbersome here, and could be replaced with barrier()s to force the order of fpsimd_flush_task_state() versus clearing TIF_SVE. Or should the barrier really be in fpsimd_flush_task_state()?
Disabling softirqs avoids the need to answer such questions... Then:

sve_free(task)
{
	WARN_ON(test_thread_flag(TIF_SVE));
	barrier();
	kfree(task->thread.sve_state);
	task->thread.sve_state = NULL;
}

I'm assuming here that kfree() can't be called safely from atomic context, but this is unclear. I would expect to be able to free GFP_ATOMIC memory from atomic context (though sve_state is GFP_KERNEL, so dunno). > Alternatively, always clear TIF_SVE here before freeing (also wondering > whether we should make sve_state NULL before the actual freeing but I > think TIF_SVE clearing should suffice). Could do. I feel that the current placement of the TIF_SVE clearing in sve_set_vector_length() feels "more natural", but this is a pretty flimsy argument. How strongly do you feel about this? [...] > > + * * TIF_SVE set: [...] > > + * * TIF_SVE clear: > > + * > > + *An attempt by the user task to execute an SVE instruction causes > > + *do_sve_acc() to be called, which does some preparation and then > > + *sets TIF_SVE. > > + * > > + *When stored, FPSIMD registers V0-V31 are encoded in > > + *task->fpsimd_state; bits [max : 128] for each of Z0-Z31 are > > + *logically zero but not stored anywhere; P0-P15 and FFR are not > > + *stored and have unspecified values from userspace's point of > > + *view. For
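The invariant being negotiated above — sve_state may only be freed once the TIF_SVE-style flag is clear, so no concurrent path can see the flag set with a NULL buffer — can be sketched as a userspace analogue. All names here are illustrative stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for task->thread: a flag (playing the role of
 * TIF_SVE) guarding an allocated state buffer (sve_state). */
struct thread_sketch {
    bool flag_sve;      /* stands in for TIF_SVE */
    void *sve_state;    /* stands in for task->thread.sve_state */
};

/* The free helper checks the flag first -- the userspace analogue of
 * the WARN_ON() in Dave's sketch -- then frees and NULLs the pointer. */
static void sve_free_checked(struct thread_sketch *t)
{
    assert(!t->flag_sve);       /* flag must already be clear */
    free(t->sve_state);
    t->sve_state = NULL;
}

/* Release path: clear the flag strictly before the free, so no observer
 * can ever see flag set together with a stale or NULL buffer. */
static void release_thread_sketch(struct thread_sketch *t)
{
    t->flag_sve = false;
    sve_free_checked(t);
}
```

This only models the ordering, not the preemption/softirq questions, which are what local_bh_disable() (or barriers) would address in the real code.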
Re: [PATCH 02/37] KVM: arm64: Rework hyp_panic for VHE and non-VHE
On 12/10/17 11:41, Christoffer Dall wrote: > VHE actually doesn't rely on clearing the VTTBR when returning to the > host kernel, and that is the current key mechanism of hyp_panic to > figure out how to attempt to return to a state good enough to print a > panic statement. > > Therefore, we split the hyp_panic function into two functions, a VHE and > a non-VHE, keeping the non-VHE version intact, but changing the VHE > behavior. > > The vttbr_el2 check on VHE doesn't really make that much sense, because > the only situation where we can get here on VHE is when the hypervisor > assembly code actually called into hyp_panic, which only happens when > VBAR_EL2 has been set to the KVM exception vectors. On VHE, we can > always safely disable the traps and restore the host registers at this > point, so we simply do that unconditionally and call into the panic > function directly. > > Signed-off-by: Christoffer Dall> --- > arch/arm64/kvm/hyp/switch.c | 45 > +++-- > 1 file changed, 23 insertions(+), 22 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > index a0123ad..a50ddf3 100644 > --- a/arch/arm64/kvm/hyp/switch.c > +++ b/arch/arm64/kvm/hyp/switch.c > @@ -394,10 +394,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) > static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx > ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n"; > > static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par, > - struct kvm_vcpu *vcpu) > + struct kvm_cpu_context > *__host_ctxt) > { > + struct kvm_vcpu *vcpu; > unsigned long str_va; > > + vcpu = __host_ctxt->__hyp_running_vcpu; > + > + if (read_sysreg(vttbr_el2)) { > + __timer_disable_traps(vcpu); > + __deactivate_traps(vcpu); > + __deactivate_vm(vcpu); > + __sysreg_restore_host_state(__host_ctxt); > + } > + > /* >* Force the panic string to be loaded from the literal pool, >* making sure it is a kernel address and not a PC-relative > @@ 
-411,40 +421,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, > u64 elr, u64 par, > read_sysreg(hpfar_el2), par, vcpu); > } > > -static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par, > - struct kvm_vcpu *vcpu) > +static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par, > + struct kvm_cpu_context *host_ctxt) > { > + struct kvm_vcpu *vcpu; > + vcpu = host_ctxt->__hyp_running_vcpu; > + > + __deactivate_traps_vhe(); Is there a reason why we can't just call __deactivate_traps(), rather than the VHE-specific subset? It doesn't really matter, as we're about to panic, but still... > + __sysreg_restore_host_state(host_ctxt); > + > panic(__hyp_panic_string, > spsr, elr, > read_sysreg_el2(esr), read_sysreg_el2(far), > read_sysreg(hpfar_el2), par, vcpu); > } > > -static hyp_alternate_select(__hyp_call_panic, > - __hyp_call_panic_nvhe, __hyp_call_panic_vhe, > - ARM64_HAS_VIRT_HOST_EXTN); > - > void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt) > { > - struct kvm_vcpu *vcpu = NULL; > - > u64 spsr = read_sysreg_el2(spsr); > u64 elr = read_sysreg_el2(elr); > u64 par = read_sysreg(par_el1); > > - if (read_sysreg(vttbr_el2)) { > - struct kvm_cpu_context *host_ctxt; > - > - host_ctxt = __host_ctxt; > - vcpu = host_ctxt->__hyp_running_vcpu; > - __timer_disable_traps(vcpu); > - __deactivate_traps(vcpu); > - __deactivate_vm(vcpu); > - __sysreg_restore_host_state(host_ctxt); > - } > - > - /* Call panic for real */ > - __hyp_call_panic()(spsr, elr, par, vcpu); > + if (!has_vhe()) > + __hyp_call_panic_nvhe(spsr, elr, par, __host_ctxt); > + else > + __hyp_call_panic_vhe(spsr, elr, par, __host_ctxt); > > unreachable(); > } > Otherwise looks good. M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 01/37] KVM: arm64: Avoid storing the vcpu pointer on the stack
On 12/10/17 11:41, Christoffer Dall wrote: > We already have the percpu area for the host cpu state, which points to > the VCPU, so there's no need to store the VCPU pointer on the stack on > every context switch. We can be a little more clever and just use > tpidr_el2 for the percpu offset and load the VCPU pointer from the host > context. > > This requires us to have a scratch register though, so we take the > chance to rearrange some of the el1_sync code to only look at the > vttbr_el2 to determine if this is a trap from the guest or an HVC from > the host. We do add an extra check to call the panic code if the kernel > is configured with debugging enabled and we saw a trap from the host > which wasn't an HVC, indicating that we left some EL2 trap configured by > mistake. > > Signed-off-by: Christoffer Dall> --- > arch/arm64/include/asm/kvm_asm.h | 20 > arch/arm64/kernel/asm-offsets.c | 1 + > arch/arm64/kvm/hyp/entry.S | 5 + > arch/arm64/kvm/hyp/hyp-entry.S | 39 ++- > arch/arm64/kvm/hyp/switch.c | 2 +- > 5 files changed, 41 insertions(+), 26 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_asm.h > b/arch/arm64/include/asm/kvm_asm.h > index ab4d0a9..7e48a39 100644 > --- a/arch/arm64/include/asm/kvm_asm.h > +++ b/arch/arm64/include/asm/kvm_asm.h > @@ -70,4 +70,24 @@ extern u32 __init_stage2_translation(void); > > #endif > > +#ifdef __ASSEMBLY__ > +.macro get_host_ctxt reg, tmp > + /* > + * '=kvm_host_cpu_state' is a host VA from the constant pool, it may > + * not be accessible by this address from EL2, hyp_panic() converts > + * it with kern_hyp_va() before use. > + */ This really looks like a stale comment, as there is no hyp_panic involved here anymore (thankfully!). > + ldr \reg, =kvm_host_cpu_state > + mrs \tmp, tpidr_el2 > + add \reg, \reg, \tmp > + kern_hyp_va \reg Here, we're trading a load from the stack for a load from the constant pool. 
Can't we do something like:

	adr_l	\reg, kvm_host_cpu_state
	mrs	\tmp, tpidr_el2
	add	\reg, \reg, \tmp

and that's it? This relies on the property that the kernel/hyp offset is constant, and that it doesn't matter if we add the offset to a kernel VA or a HYP VA... Completely untested of course! > +.endm > + > +.macro get_vcpu vcpu, ctxt > + ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU] > + kern_hyp_va \vcpu > +.endm > + > +#endif > + > #endif /* __ARM_KVM_ASM_H__ */ > diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c > index 71bf088..612021d 100644 > --- a/arch/arm64/kernel/asm-offsets.c > +++ b/arch/arm64/kernel/asm-offsets.c > @@ -135,6 +135,7 @@ int main(void) >DEFINE(CPU_FP_REGS,offsetof(struct kvm_regs, fp_regs)); >DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, > arch.ctxt.sys_regs[FPEXC32_EL2])); >DEFINE(VCPU_HOST_CONTEXT, offsetof(struct kvm_vcpu, > arch.host_cpu_context)); > + DEFINE(HOST_CONTEXT_VCPU, offsetof(struct kvm_cpu_context, > __hyp_running_vcpu)); > #endif > #ifdef CONFIG_CPU_PM >DEFINE(CPU_SUSPEND_SZ, sizeof(struct cpu_suspend_ctx)); > diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S > index 9a8ab5d..76cd48f 100644 > --- a/arch/arm64/kvm/hyp/entry.S > +++ b/arch/arm64/kvm/hyp/entry.S > @@ -62,9 +62,6 @@ ENTRY(__guest_enter) > // Store the host regs > save_callee_saved_regs x1 > > - // Store host_ctxt and vcpu for use at exit time > - stp x1, x0, [sp, #-16]! > - > add x18, x0, #VCPU_CONTEXT > > // Restore guest regs x0-x17 > @@ -119,7 +116,7 @@ ENTRY(__guest_exit) > save_callee_saved_regs x1 > > // Restore the host_ctxt from the stack Stale comment again. 
> - ldr x2, [sp], #16 > + get_host_ctxt x2, x3 > > // Now restore the host regs > restore_callee_saved_regs x2 > diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S > index e4f37b9..2950f26 100644 > --- a/arch/arm64/kvm/hyp/hyp-entry.S > +++ b/arch/arm64/kvm/hyp/hyp-entry.S > @@ -56,19 +56,16 @@ ENDPROC(__vhe_hyp_call) > el1_sync:// Guest trapped into EL2 > stp x0, x1, [sp, #-16]! > > -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN > - mrs x1, esr_el2 > -alternative_else > - mrs x1, esr_el1 > -alternative_endif > - lsr x0, x1, #ESR_ELx_EC_SHIFT > - > - cmp x0, #ESR_ELx_EC_HVC64 > - b.neel1_trap > - > mrs x1, vttbr_el2 // If vttbr is valid, the 64bit guest > cbnzx1, el1_trap// called HVC Comment is outdated. Any guest trap will take this path. > > +#ifdef CONFIG_DEBUG > + mrs x0, esr_el2 > + lsr x0, x0, #ESR_ELx_EC_SHIFT > + cmp x0, #ESR_ELx_EC_HVC64 > + b.ne__hyp_panic > +#endif > + > /* Here,
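Marc's adr_l suggestion earlier in this thread relies on the kernel-to-hyp displacement being a single constant: if every hyp VA is the kernel VA plus one fixed offset, then adding the per-CPU offset before or after the translation gives the same address. A tiny userspace sketch (all values made up) shows the commutativity:

```c
#include <assert.h>
#include <stdint.h>

/* Model kern_hyp_va as adding one constant displacement d to a kernel VA.
 * This is only an arithmetic model of the property being discussed. */
static uintptr_t hyp_va(uintptr_t kern_va, uintptr_t d)
{
    return kern_va + d;
}

/* "adr_l base; mrs off; add": translate the symbol first, then add the
 * per-CPU offset. */
static uintptr_t percpu_ptr_a(uintptr_t base, uintptr_t off, uintptr_t d)
{
    return hyp_va(base, d) + off;
}

/* The other order: add the per-CPU offset to the kernel VA, then
 * translate the result. */
static uintptr_t percpu_ptr_b(uintptr_t base, uintptr_t off, uintptr_t d)
{
    return hyp_va(base + off, d);
}
```

Since both reduce to base + off + d, the two orders are interchangeable, which is why the macro does not need a kern_hyp_va on any particular intermediate value.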
Re: [PATCH v3 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths
On 10/10/17 19:38, Dave Martin wrote: This patch uses the cpufeatures framework to determine common SVE capabilities and vector lengths, and configures the runtime SVE support code appropriately. ZCR_ELx is not really a feature register, but it is convenient to use it as a template for recording the maximum vector length supported by a CPU, using the LEN field. This field is similar to a feature field in that it is a contiguous bitfield for which we want to determine the minimum system-wide value. This patch adds ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate custom code to populate it. Finding the minimum supported value of the LEN field is left to the cpufeatures framework in the usual way. The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet, so for now we just require it to be zero. Note that much of this code is dormant and SVE still won't be used yet, since system_supports_sve() remains hardwired to false. Signed-off-by: Dave MartinCc: Alex Bennée Cc: Suzuki K Poulose --- Dropped Alex Bennée's Reviewed-by, since there is new logic in this patch. Changes since v2 Bug fixes: * Got rid of dynamic allocation of the shadow vector length map during secondary boot. Secondary CPU boot takes place in atomic context, and relying on GFP_ATOMIC here doesn't seem justified. Instead, the needed additional bitmap is allocated statically. Only one shadow map is needed, because CPUs don't boot concurrently. Requested by Alex Bennée: * Reflowed untidy comment above read_zcr_features() * Added comments to read_zcr_features() to explain what it's trying to do (which is otherwise not readily apparent). Requested by Catalin Marinas: * Moved disabling of the EL1 SVE trap to the cpufeatures C code. This allows addition of new assembler in __cpu_setup to be avoided. Miscellaneous: * Added comments explaining the intent, purpose and basic constraints for fpsimd.c helpers. ... 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 92a9502..c5acf38 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c ... @@ -670,6 +689,14 @@ void update_cpu_features(int cpu, info->reg_mvfr2, boot->reg_mvfr2); } + if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) { + taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu, + info->reg_zcr, boot->reg_zcr); + + if (!sys_caps_initialised) + sve_update_vq_map(); + } nit: I am not sure if we should also check if the "current" sanitised value of id_aa64pfr0 also supports SVE and skip the update if it doesn't. The code is fine as such without the check; it's just that we can avoid computing the map. It is in the CPU boot-up path and hence is not performance critical. So, either way we are fine. Reviewed-by: Suzuki K Poulose ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
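The "ZCR as pseudo feature register" idea from the commit message boils down to taking the minimum of a small contiguous bitfield across CPUs. A sketch, with helper names of our own invention — the real code delegates this to the cpufeatures framework's feature-register machinery, and only the LEN field position (low 4 bits of ZCR_ELx) follows the architecture:

```c
#include <assert.h>
#include <stdint.h>

/* ZCR_ELx.LEN occupies bits [3:0]. */
#define ZCR_LEN_MASK 0xfULL

/* Extract the LEN field from a (fake) per-CPU ZCR value. */
static uint64_t zcr_len(uint64_t zcr)
{
    return zcr & ZCR_LEN_MASK;
}

/* The sanitised system-wide value of a "lower is safer" field is the
 * minimum over all CPUs, starting from the architectural maximum. */
static uint64_t sanitise_zcr_len(const uint64_t *per_cpu_zcr, int ncpus)
{
    uint64_t min = ZCR_LEN_MASK;

    for (int i = 0; i < ncpus; i++)
        if (zcr_len(per_cpu_zcr[i]) < min)
            min = zcr_len(per_cpu_zcr[i]);
    return min;
}
```

Suzuki's nit then amounts to: skip this computation entirely for a CPU whose ID_AA64PFR0 says SVE is absent, which is an optimisation rather than a correctness fix.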
Re: [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 on guest exit
Hi Marc, On 11/10/17 11:37, Marc Zyngier wrote: > On Thu, Oct 05 2017 at 8:18:11 pm BST, James Morse> wrote: >> We expect to have firmware-first handling of RAS SErrors, with errors >> notified via an APEI method. For systems without firmware-first, add >> some minimal handling to KVM. >> >> There are two ways KVM can take an SError due to a guest, either may be a >> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO, >> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit. >> >> The current SError from EL2 code unmasks SError and tries to fence any >> pending SError into a single instruction window. It then leaves SError >> unmasked. >> >> With the v8.2 RAS Extensions we may take an SError for a 'corrected' >> error, but KVM is only able to handle SError from EL2 if they occur >> during this single instruction window... >> >> The RAS Extensions give us a new instruction to synchronise and >> consume SErrors. The RAS Extensions document (ARM DDI0587), >> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising >> SError interrupts generated by 'instructions, translation table walks, >> hardware updates to the translation tables, and instruction fetches on >> the same PE'. This makes ESB equivalent to KVMs existing >> 'dsb, mrs-daifclr, isb' sequence. >> >> Use the alternatives to synchronise and consume any SError using ESB >> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT >> in the exit_code so that we can restart the vcpu if it turns out this >> SError has no impact on the vcpu. 
>> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S >> index 12ee62d6d410..96caa5328b3a 100644 >> --- a/arch/arm64/kvm/hyp/entry.S >> +++ b/arch/arm64/kvm/hyp/entry.S >> @@ -124,6 +124,17 @@ ENTRY(__guest_exit) >> // Now restore the host regs >> restore_callee_saved_regs x2 >> >> +alternative_if ARM64_HAS_RAS_EXTN >> +// If we have the RAS extensions we can consume a pending error >> +// without an unmask-SError and isb. >> +esb >> +mrs_s x2, SYS_DISR_EL1 >> +str x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)] >> +cbz x2, 1f >> +msr_s SYS_DISR_EL1, xzr >> +orr x0, x0, #(1 << ARM_EXIT_WITH_SERROR_BIT) >> +1: ret >> +alternative_else >> // If we have a pending asynchronous abort, now is the >> // time to find out. From your VAXorcist book, page 666: >> // "Threaten me not, oh Evil one! For I speak with >> @@ -135,6 +146,8 @@ ENTRY(__guest_exit) >> >> dsb sy // Synchronize against in-flight ld/st >> msr daifclr, #4 // Unmask aborts >> +nop > > Oops. You've now introduced an instruction in what was supposed to be a > single instruction window (the isb). It means that we may fail to > identify the SError as having been generated by our synchronisation > mechanism, and we'll panic for no good reason. Gah! And I pointed this thing out in the commit message, >> +alternative_endif >> >> // This is our single instruction exception window. A pending >> // SError is guaranteed to occur at the earliest when we unmask and here it is in the diff. > Moving the nop up will solve this. Yes. I've fixed this for v4, along with Julien's suggestions. Thanks! James ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v3 08/20] arm64: entry.S: convert elX_irq
Hi Julien, On 11/10/17 18:13, Julien Thierry wrote: > On 05/10/17 20:18, James Morse wrote: >> Following our 'dai' order, irqs should be processed with debug and >> serror exceptions unmasked. >> > Add a helper to unmask these two, (and fiq for good measure). >> >> Signed-off-by: James Morse>> --- >> arch/arm64/include/asm/assembler.h | 4 >> arch/arm64/kernel/entry.S | 4 ++-- >> 2 files changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/arch/arm64/include/asm/assembler.h >> b/arch/arm64/include/asm/assembler.h >> index c2a37e2f733c..7ffb2a629dc9 100644 >> --- a/arch/arm64/include/asm/assembler.h >> +++ b/arch/arm64/include/asm/assembler.h >> @@ -54,6 +54,10 @@ >> msrdaif, \tmp >> .endm >> +.macro enable_da_f >> +msrdaifclr, #(8 | 4 | 1) >> +.endm >> + > > We use this in irq entries because we are inheriting daif + we want to disable > irqs while we process irqs right? In this case not inheriting, we only do that for synchronous exceptions because we can't mask them, keeping the flags 'the same' lets us ignore them. Here we are unconditionally unmasking things for handling interrupts. If we stick with this dai order we know its safe to unmask SError and enable Debug. > I don't know if it's worth adding a comment but I find it easier to think > about > it like this. If its not-obvious, there should be a comment: /* IRQ is the lowest priority flag, unconditionally unmask the rest. */ (I was expecting flames for the hangman style naming!) > Otherwise, for patches 3 to 8 (I don't have any comment on 3 to 7): > Reviewed-by: Julien Thierry Thanks! I'll post a v4 ~tomorrow with this and the renaming changes. James ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v3 01/20] arm64: explicitly mask all exceptions
Hi Julien, On 11/10/17 17:30, Julien Thierry wrote: > On 05/10/17 20:17, James Morse wrote: >> There are a few places where we want to mask all exceptions. Today we >> do this in a piecemeal fashion, typically we expect the caller to >> have masked irqs and the arch code masks debug exceptions, ignoring >> SError which is probably masked. >> >> Make it clear that 'mask all exceptions' is the intention by adding >> helpers to do exactly that. >> >> This will let us unmask SError without having to add 'oh and SError' >> to these paths. >> diff --git a/arch/arm64/include/asm/daifflags.h >> b/arch/arm64/include/asm/daifflags.h >> new file mode 100644 >> index ..fb40da8e1457 >> --- /dev/null >> +++ b/arch/arm64/include/asm/daifflags.h >> +#ifndef __ASM_DAIFFLAGS_H >> +#define __ASM_DAIFFLAGS_H >> + >> +#include >> +#include >> + >> +/* Mask/unmask/restore all exceptions, including interrupts. */ >> +static inline unsigned long local_mask_daif(void) >> +{ >> +unsigned long flags; >> +asm volatile( >> +"mrs%0, daif// local_mask_daif\n" >> +"msrdaifset, #0xf" >> +: "=r" (flags) >> +: >> +: "memory"); >> +trace_hardirqs_off(); >> +return flags; >> +} >> + > nit: > Can we call this local_daif_save? Sure, having a save version fits better with the C irqflags versions. mask/unmask is just being pedantic, irqs and SError aren't disabled by these flags, (but curiously debug is...). > (and maybe a version that only disables without saving?) irqflags.h has one, so yes, why not. > Also, I was wondering why use 'local' as prefix? Is there gonna be a similar > set > of function for arm that could be called by common code? (e.g. drivers?) Its a hangover from the C irqflags calls like arch_local_save_flags() etc. I dropped the 'arch' as these should only be called by arch code and 'daif' should prevent any name collisions. 
Drivers should never touch this stuff, if they do its likely to be because they want to defer bursting-into-flames until we get to userspace, where its harder to work out which driver to blame. > Otherwise: > Reviewed-by: Julien ThierryThanks! James ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v3 22/28] arm64/sve: KVM: Prevent guests from using SVE
On 12/10/17 12:04, Dave Martin wrote: > On Wed, Oct 11, 2017 at 05:28:06PM +0100, Marc Zyngier wrote: >> [+ Christoffer] >> >> On 10/10/17 19:38, Dave Martin wrote: >>> Until KVM has full SVE support, guests must not be allowed to >>> execute SVE instructions. >>> >>> This patch enables the necessary traps, and also ensures that the >>> traps are disabled again on exit from the guest so that the host >>> can still use SVE if it wants to. >>> >>> This patch introduces another instance of >>> __this_cpu_write(fpsimd_last_state, NULL), so this flush operation >>> is abstracted out as a separate helper fpsimd_flush_cpu_state(). >>> Other instances are ported appropriately. >>> >>> As a side effect of this refactoring, a this_cpu_write() in >>> fpsimd_cpu_pm_notifier() is changed to __this_cpu_write(). This >>> should be fine, since cpu_pm_enter() is supposed to be called only >>> with interrupts disabled. >>> >>> Signed-off-by: Dave Martin>>> Reviewed-by: Alex Bennée >>> Cc: Marc Zyngier >>> Cc: Ard Biesheuvel >>> --- > > [...] > >>> diff --git a/arch/arm64/include/asm/kvm_host.h >>> b/arch/arm64/include/asm/kvm_host.h >>> index e923b58..674912d 100644 >>> --- a/arch/arm64/include/asm/kvm_host.h >>> +++ b/arch/arm64/include/asm/kvm_host.h > > [...] > >>> @@ -384,4 +385,14 @@ static inline void __cpu_init_stage2(void) > > [...] > >>> +static inline void kvm_fpsimd_flush_cpu_state(void) >>> +{ >>> + if (system_supports_sve()) >>> + sve_flush_cpu_state(); >> >> Hmmm. How does this work if... > > !IS_ENABLED(CONFIG_ARM64_SVE) implies !system_supports_sve(), so > if CONFIG_ARM64_SVE is not set, the call is optimised away. > > [...] > >>> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c >>> index a9cb794..6ae3703 100644 >>> --- a/arch/arm64/kernel/fpsimd.c >>> +++ b/arch/arm64/kernel/fpsimd.c >>> @@ -1073,6 +1073,33 @@ void fpsimd_flush_task_state(struct task_struct *t) > > [...] 
> >>> +#ifdef CONFIG_ARM64_SVE >>> +void sve_flush_cpu_state(void) >>> +{ >>> + struct fpsimd_state *const fpstate = __this_cpu_read(fpsimd_last_state); >>> + struct task_struct *tsk; >>> + >>> + if (!fpstate) >>> + return; >>> + >>> + tsk = container_of(fpstate, struct task_struct, thread.fpsimd_state); >>> + if (test_tsk_thread_flag(tsk, TIF_SVE)) >>> + fpsimd_flush_cpu_state(); >>> +} >>> +#endif /* CONFIG_ARM64_SVE */ >> >> ... CONFIG_ARM64_SVE is not set? Fixing this should just be a matter of >> moving the #ifdef/#endif inside the function... > > Because sve_flush_cpu_state() is not in the same compilation unit it > can't be static, and that means the compiler won't remove it > automatically if it's unused -- hence the #ifdef. > > Because the call site is optimised away, there is no link failure. > > Don't we rely on this sort of thing all over the place? Dunno. It just feels weird. But if you are sure that it won't break, fine by me. I guess we'll find out pretty quickly how this fares, specially with older toolchains. M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v3 22/28] arm64/sve: KVM: Prevent guests from using SVE
On Wed, Oct 11, 2017 at 05:28:06PM +0100, Marc Zyngier wrote: > [+ Christoffer] > > On 10/10/17 19:38, Dave Martin wrote: > > Until KVM has full SVE support, guests must not be allowed to > > execute SVE instructions. > > > > This patch enables the necessary traps, and also ensures that the > > traps are disabled again on exit from the guest so that the host > > can still use SVE if it wants to. > > > > This patch introduces another instance of > > __this_cpu_write(fpsimd_last_state, NULL), so this flush operation > > is abstracted out as a separate helper fpsimd_flush_cpu_state(). > > Other instances are ported appropriately. > > > > As a side effect of this refactoring, a this_cpu_write() in > > fpsimd_cpu_pm_notifier() is changed to __this_cpu_write(). This > > should be fine, since cpu_pm_enter() is supposed to be called only > > with interrupts disabled. > > > > Signed-off-by: Dave Martin> > Reviewed-by: Alex Bennée > > Cc: Marc Zyngier > > Cc: Ard Biesheuvel > > --- [...] > > diff --git a/arch/arm64/include/asm/kvm_host.h > > b/arch/arm64/include/asm/kvm_host.h > > index e923b58..674912d 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h [...] > > @@ -384,4 +385,14 @@ static inline void __cpu_init_stage2(void) [...] > > +static inline void kvm_fpsimd_flush_cpu_state(void) > > +{ > > + if (system_supports_sve()) > > + sve_flush_cpu_state(); > > Hmmm. How does this work if... !IS_ENABLED(CONFIG_ARM64_SVE) implies !system_supports_sve(), so if CONFIG_ARM64_SVE is not set, the call is optimised away. [...] > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > > index a9cb794..6ae3703 100644 > > --- a/arch/arm64/kernel/fpsimd.c > > +++ b/arch/arm64/kernel/fpsimd.c > > @@ -1073,6 +1073,33 @@ void fpsimd_flush_task_state(struct task_struct *t) [...] 
> > +#ifdef CONFIG_ARM64_SVE > > +void sve_flush_cpu_state(void) > > +{ > > + struct fpsimd_state *const fpstate = __this_cpu_read(fpsimd_last_state); > > + struct task_struct *tsk; > > + > > + if (!fpstate) > > + return; > > + > > + tsk = container_of(fpstate, struct task_struct, thread.fpsimd_state); > > + if (test_tsk_thread_flag(tsk, TIF_SVE)) > > + fpsimd_flush_cpu_state(); > > +} > > +#endif /* CONFIG_ARM64_SVE */ > > ... CONFIG_ARM64_SVE is not set? Fixing this should just be a matter of > moving the #ifdef/#endif inside the function... Because sve_flush_cpu_state() is not in the same compilation unit it can't be static, and that means the compiler won't remove it automatically if it's unused -- hence the #ifdef. Because the call site is optimised away, there is no link failure. Don't we rely on this sort of thing all over the place? Cheers ---Dave ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
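The container_of() step that sve_flush_cpu_state() performs — recovering the task from a pointer to its embedded fpsimd_state — is the standard "subtract the member's offset" trick. A plain-C demonstration; the structures here are simplified stand-ins for task_struct and fpsimd_state:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal container_of(): step back from a member pointer to the start
 * of the enclosing structure using the member's compile-time offset. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct fpsimd_state. */
struct fpsimd_state_sketch {
    unsigned long long vregs[4];
};

/* Stand-in for task_struct with an embedded fpsimd_state. */
struct task_sketch {
    int pid;
    struct fpsimd_state_sketch fpsimd_state;
};

/* The operation sve_flush_cpu_state() performs on fpsimd_last_state:
 * given the member pointer, recover the owning task. */
static struct task_sketch *task_of(struct fpsimd_state_sketch *st)
{
    return container_of(st, struct task_sketch, fpsimd_state);
}
```

This is valid only when the pointer really does point at that member of that structure type, which is exactly the invariant the per-CPU fpsimd_last_state pointer maintains.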
Re: [PATCH v3 18/28] arm64/sve: Preserve SVE registers around EFI runtime service calls
On Tue, Oct 10, 2017 at 07:38:35PM +0100, Dave P Martin wrote: > The EFI runtime services ABI allows EFI to make free use of the > FPSIMD registers during EFI runtime service calls, subject to the > callee-save requirements of the AArch64 procedure call standard. > > However, the SVE architecture allows upper bits of the SVE vector > registers to be zeroed as a side-effect of FPSIMD V-register > writes. This means that the SVE vector registers must be saved in > their entirety in order to avoid data loss: non-SVE-aware EFI > implementations cannot restore them correctly. > > The non-IRQ case is already handled gracefully by > kernel_neon_begin(). For the IRQ case, this patch allocates a > suitable per-CPU stash buffer for the full SVE register state and > uses it to preserve the affected registers around EFI calls. It is > currently unclear how the EFI runtime services ABI will be > clarified with respect to SVE, so it is safest to assume that the > predicate registers and FFR must be saved and restored too. > > No attempt is made to restore the vector length after a call, > for now. It is deemed rather insane for EFI to change it, > and contemporary EFI implementations certainly won't. > > Signed-off-by: Dave Martin> Reviewed-by: Alex Bennée > Cc: Ard Biesheuvel Reviewed-by: Catalin Marinas ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 35/37] KVM: arm/arm64: Get rid of vgic_elrsr
There is really no need to store the vgic_elrsr on the VGIC data structures as the only need we have for the elrsr is to figure out if an LR is inactive when we save the VGIC state upon returning from the guest. We might as well store this in a temporary local variable. This also gets rid of the endianness conversion in the VGIC save function, which is completely unnecessary and would actually result in incorrect functionality on big-endian systems, because we are only using typed values and not converting pointers and reading through different types. Signed-off-by: Christoffer Dall--- include/kvm/arm_vgic.h| 2 -- virt/kvm/arm/hyp/vgic-v3-sr.c | 6 +++--- virt/kvm/arm/vgic/vgic-v2.c | 33 +++-- virt/kvm/arm/vgic/vgic-v3.c | 1 - 4 files changed, 10 insertions(+), 32 deletions(-) diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 34dba51..79c9e67 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -237,7 +237,6 @@ struct vgic_dist { struct vgic_v2_cpu_if { u32 vgic_hcr; u32 vgic_vmcr; - u64 vgic_elrsr; /* Saved only */ u32 vgic_apr; u32 vgic_lr[VGIC_V2_MAX_LRS]; }; @@ -246,7 +245,6 @@ struct vgic_v3_cpu_if { u32 vgic_hcr; u32 vgic_vmcr; u32 vgic_sre; /* Restored only, change ignored */ - u32 vgic_elrsr; /* Saved only */ u32 vgic_ap0r[4]; u32 vgic_ap1r[4]; u64 vgic_lr[VGIC_V3_MAX_LRS]; diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c index 91728fa..05548b2 100644 --- a/virt/kvm/arm/hyp/vgic-v3-sr.c +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c @@ -222,15 +222,16 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu) if (used_lrs) { int i; u32 nr_pre_bits; + u32 elrsr; - cpu_if->vgic_elrsr = read_gicreg(ICH_ELSR_EL2); + elrsr = read_gicreg(ICH_ELSR_EL2); write_gicreg(0, ICH_HCR_EL2); val = read_gicreg(ICH_VTR_EL2); nr_pre_bits = vtr_to_nr_pre_bits(val); for (i = 0; i < used_lrs; i++) { - if (cpu_if->vgic_elrsr & (1 << i)) + if (elrsr & (1 << i)) cpu_if->vgic_lr[i] &= ~ICH_LR_STATE; else cpu_if->vgic_lr[i] = 
__gic_v3_get_lr(i); @@ -261,7 +262,6 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu) if (static_branch_unlikely(&vgic_v3_cpuif_trap)) write_gicreg(0, ICH_HCR_EL2); - cpu_if->vgic_elrsr = 0x; cpu_if->vgic_ap0r[0] = 0; cpu_if->vgic_ap0r[1] = 0; cpu_if->vgic_ap0r[2] = 0; diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c index 259079b..df7e03b 100644 --- a/virt/kvm/arm/vgic/vgic-v2.c +++ b/virt/kvm/arm/vgic/vgic-v2.c @@ -247,7 +247,6 @@ void vgic_v2_enable(struct kvm_vcpu *vcpu) * anyway. */ vcpu->arch.vgic_cpu.vgic_v2.vgic_vmcr = 0; - vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0; /* Get the show on the road... */ vcpu->arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN; @@ -404,33 +403,19 @@ int vgic_v2_probe(const struct gic_kvm_info *info) return ret; } -static void save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base) -{ - struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; - int nr_lr = kvm_vgic_global_state.nr_lr; - u32 elrsr0, elrsr1; - - elrsr0 = readl_relaxed(base + GICH_ELRSR0); - if (unlikely(nr_lr > 32)) - elrsr1 = readl_relaxed(base + GICH_ELRSR1); - else - elrsr1 = 0; -#ifdef CONFIG_CPU_BIG_ENDIAN - cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; -#else - cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; -#endif -} - static void save_lrs(struct kvm_vcpu *vcpu, void __iomem *base) { struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; - int i; u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs; + u64 elrsr; + int i; + + elrsr = readl_relaxed(base + GICH_ELRSR0); + if (unlikely(used_lrs > 32)) + elrsr |= ((u64)readl_relaxed(base + GICH_ELRSR1)) << 32; for (i = 0; i < used_lrs; i++) { - if (cpu_if->vgic_elrsr & (1UL << i)) + if (elrsr & (1UL << i)) cpu_if->vgic_lr[i] &= ~GICH_LR_STATE; else cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4)); @@ -452,13 +437,9 @@ void vgic_v2_save_state(struct kvm_vcpu *vcpu) if (used_lrs) { cpu_if->vgic_apr = readl_relaxed(base +
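The replacement logic in save_lrs() — combine ELRSR0 (and ELRSR1 only when more than 32 LRs are in use) into one 64-bit mask with plain integer arithmetic, then test one bit per LR — can be checked in isolation. Register reads are faked with plain integers here; the helpers are ours, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two 32-bit ELRSR words into a 64-bit mask. Because this
 * works on integer values (not byte reinterpretation through pointers),
 * no endianness conversion is ever needed -- the point of the patch. */
static uint64_t combine_elrsr(uint32_t elrsr0, uint32_t elrsr1,
                              unsigned int used_lrs)
{
    uint64_t elrsr = elrsr0;

    if (used_lrs > 32)                     /* ELRSR1 only matters then */
        elrsr |= (uint64_t)elrsr1 << 32;
    return elrsr;
}

/* A set bit means the corresponding LR is inactive ("empty"), so its
 * saved state can simply be marked clean instead of being read back. */
static int lr_is_empty(uint64_t elrsr, unsigned int lr)
{
    return (elrsr >> lr) & 1;
}
```

This also illustrates why a stored per-vcpu copy is unnecessary: the mask is consumed immediately, bit by bit, inside the same save loop.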
[PATCH 25/37] KVM: arm64: Prepare to handle traps on remaining deferred EL1 sysregs
Handle accesses during traps to any remaining EL1 registers which can be deferred to vcpu_load and vcpu_put, by either accessing them directly on the physical CPU when the latest version is stored there, or by synchronizing the memory representation with the CPU state. Signed-off-by: Christoffer Dall--- arch/arm64/include/asm/kvm_emulate.h | 14 --- arch/arm64/kvm/inject_fault.c| 79 arch/arm64/kvm/sys_regs.c| 6 ++- 3 files changed, 76 insertions(+), 23 deletions(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 630dd60..69bb40d 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -66,11 +66,6 @@ static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc; } -static inline unsigned long *vcpu_elr_el1(const struct kvm_vcpu *vcpu) -{ - return (unsigned long *)&vcpu_gp_regs(vcpu)->elr_el1; -} - static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu) { return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate; @@ -120,15 +115,6 @@ static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num, vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val; } -/* Get vcpu SPSR for current mode */ -static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu) -{ - if (vcpu_mode_is_32bit(vcpu)) - return vcpu_spsr32(vcpu); - - return (unsigned long *)&vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1]; -} - static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu) { u32 mode; diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c index 45c7026..f4513fc 100644 --- a/arch/arm64/kvm/inject_fault.c +++ b/arch/arm64/kvm/inject_fault.c @@ -23,6 +23,7 @@ #include #include +#include #include #define PSTATE_FAULT_BITS_64 (PSR_MODE_EL1h | PSR_A_BIT | PSR_F_BIT | \ @@ -33,13 +34,55 @@ #define LOWER_EL_AArch64_VECTOR 0x400 #define LOWER_EL_AArch32_VECTOR 0x600 +static u64 vcpu_get_vbar_el1(struct kvm_vcpu *vcpu) +{ + unsigned long vbar; + + if 
(vcpu->arch.sysregs_loaded_on_cpu) + vbar = read_sysreg_el1(vbar); + else + vbar = vcpu_sys_reg(vcpu, VBAR_EL1); + + if (vcpu_el1_is_32bit(vcpu)) + return lower_32_bits(vbar); + return vbar; +} + +static void vcpu_set_elr_el1(struct kvm_vcpu *vcpu, u64 val) +{ + if (vcpu->arch.sysregs_loaded_on_cpu) + write_sysreg_el1(val, elr); + else + vcpu_gp_regs(vcpu)->elr_el1 = val; +} + +/* Set the SPSR for the current mode */ +static void vcpu_set_spsr(struct kvm_vcpu *vcpu, u64 val) +{ + if (vcpu_mode_is_32bit(vcpu)) + *vcpu_spsr32(vcpu) = val; + + if (vcpu->arch.sysregs_loaded_on_cpu) + write_sysreg_el1(val, spsr); + else + vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1] = val; +} + +static u32 vcpu_get_c1_sctlr(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.sysregs_loaded_on_cpu) + return lower_32_bits(read_sysreg_el1(sctlr)); + else + return vcpu_cp15(vcpu, c1_SCTLR); +} + static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset) { unsigned long cpsr; unsigned long new_spsr_value = *vcpu_cpsr(vcpu); bool is_thumb = (new_spsr_value & COMPAT_PSR_T_BIT); u32 return_offset = (is_thumb) ? 
4 : 0; - u32 sctlr = vcpu_cp15(vcpu, c1_SCTLR); + u32 sctlr = vcpu_get_c1_sctlr(vcpu); cpsr = mode | COMPAT_PSR_I_BIT; @@ -51,14 +94,14 @@ static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset) *vcpu_cpsr(vcpu) = cpsr; /* Note: These now point to the banked copies */ - *vcpu_spsr(vcpu) = new_spsr_value; + vcpu_set_spsr(vcpu, new_spsr_value); *vcpu_reg32(vcpu, 14) = *vcpu_pc(vcpu) + return_offset; /* Branch to exception vector */ if (sctlr & (1 << 13)) vect_offset += 0xffff0000; else /* always have security exceptions */ - vect_offset += vcpu_cp15(vcpu, c12_VBAR); + vect_offset += vcpu_get_vbar_el1(vcpu); *vcpu_pc(vcpu) = vect_offset; } @@ -79,6 +122,20 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, u32 *far, *fsr; bool is_lpae; + /* +* We are going to need the latest values of the following system +* registers: +* DFAR: mapped to FAR_EL1 +* IFAR: mapped to FAR_EL1 +* DFSR: mapped to ESR_EL1 +* TTBCR: mapped to TCR_EL1 +*/ + if (vcpu->arch.sysregs_loaded_on_cpu) { + vcpu->arch.ctxt.sys_regs[FAR_EL1] = read_sysreg_el1(far); + vcpu->arch.ctxt.sys_regs[ESR_EL1] = read_sysreg_el1(esr); +
[PATCH 17/37] KVM: arm64: Move userspace system registers into separate function
There's a semantic difference between the EL1 registers that control operation of a kernel running in EL1 and EL1 registers that only control userspace execution in EL0. Since we can defer saving/restoring the latter, move them into their own function. We also take this chance to rename the function saving/restoring the remaining system register to make it clear this function deals with the EL1 system registers. No functional change. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/sysreg-sr.c | 34 +++--- 1 file changed, 23 insertions(+), 11 deletions(-) diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index c4a3714..193c2b0 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -34,14 +34,18 @@ static void __hyp_text __sysreg_do_nothing(struct kvm_cpu_context *ctxt) { } static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt) { + ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1); + ctxt->gp_regs.regs.sp = read_sysreg(sp_el0); +} + +static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt) +{ ctxt->sys_regs[ACTLR_EL1] = read_sysreg(actlr_el1); ctxt->sys_regs[TPIDR_EL0] = read_sysreg(tpidr_el0); ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0); - ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1); - ctxt->gp_regs.regs.sp = read_sysreg(sp_el0); } -static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt) +static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt) { ctxt->sys_regs[MPIDR_EL1] = read_sysreg(vmpidr_el2); ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1); @@ -70,31 +74,37 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt) } static hyp_alternate_select(__sysreg_call_save_host_state, - __sysreg_save_state, __sysreg_do_nothing, + __sysreg_save_el1_state, __sysreg_do_nothing, ARM64_HAS_VIRT_HOST_EXTN); void __hyp_text __sysreg_save_host_state(struct kvm_cpu_context *ctxt) { 
__sysreg_call_save_host_state()(ctxt); __sysreg_save_common_state(ctxt); + __sysreg_save_user_state(ctxt); } void __hyp_text __sysreg_save_guest_state(struct kvm_cpu_context *ctxt) { - __sysreg_save_state(ctxt); + __sysreg_save_el1_state(ctxt); __sysreg_save_common_state(ctxt); + __sysreg_save_user_state(ctxt); } static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt) { - write_sysreg(ctxt->sys_regs[ACTLR_EL1], actlr_el1); - write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0); - write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0); write_sysreg(ctxt->sys_regs[MDSCR_EL1], mdscr_el1); write_sysreg(ctxt->gp_regs.regs.sp, sp_el0); } -static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt) +static void __hyp_text __sysreg_restore_user_state(struct kvm_cpu_context *ctxt) +{ + write_sysreg(ctxt->sys_regs[ACTLR_EL1], actlr_el1); + write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0); + write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0); +} + +static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) { write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2); write_sysreg(ctxt->sys_regs[CSSELR_EL1],csselr_el1); @@ -123,19 +133,21 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt) } static hyp_alternate_select(__sysreg_call_restore_host_state, - __sysreg_restore_state, __sysreg_do_nothing, + __sysreg_restore_el1_state, __sysreg_do_nothing, ARM64_HAS_VIRT_HOST_EXTN); void __hyp_text __sysreg_restore_host_state(struct kvm_cpu_context *ctxt) { __sysreg_call_restore_host_state()(ctxt); __sysreg_restore_common_state(ctxt); + __sysreg_restore_user_state(ctxt); } void __hyp_text __sysreg_restore_guest_state(struct kvm_cpu_context *ctxt) { - __sysreg_restore_state(ctxt); + __sysreg_restore_el1_state(ctxt); __sysreg_restore_common_state(ctxt); + __sysreg_restore_user_state(ctxt); } static void __hyp_text __fpsimd32_save_state(struct kvm_cpu_context *ctxt) -- 2.9.0 ___ kvmarm mailing list 
kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 18/37] KVM: arm64: Rewrite sysreg alternatives to static keys
As we are about to move calls around in the sysreg save/restore logic, let's first rewrite the alternative function callers, because it is going to make the next patches much easier to read. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/sysreg-sr.c | 17 - 1 file changed, 4 insertions(+), 13 deletions(-) diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index 193c2b0..b55c4ad 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -22,9 +22,6 @@ #include #include -/* Yes, this does nothing, on purpose */ -static void __hyp_text __sysreg_do_nothing(struct kvm_cpu_context *ctxt) { } - /* * Non-VHE: Both host and guest must save everything. * @@ -73,13 +70,10 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt) ctxt->gp_regs.regs.pstate = read_sysreg_el2(spsr); } -static hyp_alternate_select(__sysreg_call_save_host_state, - __sysreg_save_el1_state, __sysreg_do_nothing, - ARM64_HAS_VIRT_HOST_EXTN); - void __hyp_text __sysreg_save_host_state(struct kvm_cpu_context *ctxt) { - __sysreg_call_save_host_state()(ctxt); + if (!has_vhe()) + __sysreg_save_el1_state(ctxt); __sysreg_save_common_state(ctxt); __sysreg_save_user_state(ctxt); } @@ -132,13 +126,10 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr); } -static hyp_alternate_select(__sysreg_call_restore_host_state, - __sysreg_restore_el1_state, __sysreg_do_nothing, - ARM64_HAS_VIRT_HOST_EXTN); - void __hyp_text __sysreg_restore_host_state(struct kvm_cpu_context *ctxt) { - __sysreg_call_restore_host_state()(ctxt); + if (!has_vhe()) + __sysreg_restore_el1_state(ctxt); __sysreg_restore_common_state(ctxt); __sysreg_restore_user_state(ctxt); } -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 34/37] KVM: arm/arm64: Handle VGICv3 save/restore from the main VGIC code on VHE
Just like we can program the GICv2 hypervisor control interface directly from the core vgic code, we can do the same for the GICv3 hypervisor control interface on VHE systems. We do this by simply calling the save/restore functions when we have VHE and we can then get rid of the save/restore function calls from the VHE world switch function. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 3 --- virt/kvm/arm/vgic/vgic.c| 4 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 90da506..0cdf6ae 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -366,8 +366,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) activate_traps_vhe(vcpu); __activate_vm(vcpu); - __vgic_restore_state(vcpu); - sysreg_restore_guest_state_vhe(guest_ctxt); __debug_switch_to_guest(vcpu); @@ -380,7 +378,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) goto again; sysreg_save_guest_state_vhe(guest_ctxt); - __vgic_save_state(vcpu); deactivate_traps_vhe(vcpu); diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c index 790fd66..f0072f7 100644 --- a/virt/kvm/arm/vgic/vgic.c +++ b/virt/kvm/arm/vgic/vgic.c @@ -689,6 +689,8 @@ static inline void vgic_save_state(struct kvm_vcpu *vcpu) { if (!static_branch_unlikely(_vgic_global_state.gicv3_cpuif)) vgic_v2_save_state(vcpu); + else if (has_vhe()) + __vgic_v3_save_state(vcpu); } /* Sync back the hardware VGIC state into our emulation after a guest's run. */ @@ -711,6 +713,8 @@ static inline void vgic_restore_state(struct kvm_vcpu *vcpu) { if (!static_branch_unlikely(_vgic_global_state.gicv3_cpuif)) vgic_v2_restore_state(vcpu); + else if (has_vhe()) + __vgic_v3_restore_state(vcpu); } /* Flush our emulation state into the GIC hardware before entering the guest. */ -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 33/37] KVM: arm/arm64: Move arm64-only vgic-v2-sr.c file to arm64
The vgic-v2-sr.c file now only contains the logic to replay unaligned accesses to the virtual CPU interface on 16K and 64K page systems, which is only relevant on 64-bit platforms. Therefore move this file to the arm64 KVM tree, remove the compile directive from the 32-bit side makefile, and remove the ifdef in the C file. Signed-off-by: Christoffer Dall--- arch/arm/kvm/hyp/Makefile | 1 - arch/arm64/kvm/hyp/Makefile | 2 +- {virt/kvm/arm => arch/arm64/kvm}/hyp/vgic-v2-sr.c | 2 -- 3 files changed, 1 insertion(+), 4 deletions(-) rename {virt/kvm/arm => arch/arm64/kvm}/hyp/vgic-v2-sr.c (98%) diff --git a/arch/arm/kvm/hyp/Makefile b/arch/arm/kvm/hyp/Makefile index 8679405..4be5061 100644 --- a/arch/arm/kvm/hyp/Makefile +++ b/arch/arm/kvm/hyp/Makefile @@ -6,7 +6,6 @@ ccflags-y += -fno-stack-protector KVM=../../../../virt/kvm -obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile index 14c4e3b..b746fa5 100644 --- a/arch/arm64/kvm/hyp/Makefile +++ b/arch/arm64/kvm/hyp/Makefile @@ -6,10 +6,10 @@ ccflags-y += -fno-stack-protector KVM=../../../../virt/kvm -obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o +obj-$(CONFIG_KVM_ARM_HOST) += vgic-v2-sr.o obj-$(CONFIG_KVM_ARM_HOST) += sysreg-sr.o obj-$(CONFIG_KVM_ARM_HOST) += debug-sr.o obj-$(CONFIG_KVM_ARM_HOST) += entry.o diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c similarity index 98% rename from virt/kvm/arm/hyp/vgic-v2-sr.c rename to arch/arm64/kvm/hyp/vgic-v2-sr.c index b433257..fcd7b4e 100644 --- a/virt/kvm/arm/hyp/vgic-v2-sr.c +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c @@ -22,7 +22,6 @@ #include #include -#ifdef CONFIG_ARM64 /* * __vgic_v2_perform_cpuif_access -- perform a GICV access on 
behalf of the * guest. @@ -76,4 +75,3 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu) return 1; } -#endif -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 27/37] KVM: arm64: Defer saving/restoring system registers to vcpu load/put on VHE
Some system registers do not affect the host kernel's execution and can therefore be loaded when we are about to run a VCPU and we don't have to restore the host state to the hardware before the time when we are actually about to return to userspace or schedule out the VCPU thread. The EL1 system registers and the userspace state registers, which only affect EL0 execution, do not affect the host kernel's execution. The 32-bit system registers are not used by a VHE host kernel and therefore don't need to be saved/restored on every entry/exit to/from the guest, but can be deferred to vcpu_load and vcpu_put, respectively. We have already prepared the trap handling code which accesses any of these registers to directly access the registers on the physical CPU or to sync the registers when needed. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c| 6 -- arch/arm64/kvm/hyp/sysreg-sr.c | 46 ++ 2 files changed, 38 insertions(+), 14 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 6356bec..6a12504 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -337,11 +337,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) __vgic_restore_state(vcpu); - /* -* We must restore the 32-bit state before the sysregs, thanks -* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72). -*/ - __sysreg32_restore_state(vcpu); sysreg_restore_guest_state_vhe(guest_ctxt); __debug_switch_to_guest(vcpu); @@ -354,7 +349,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) goto again; sysreg_save_guest_state_vhe(guest_ctxt); - __sysreg32_save_state(vcpu); __vgic_save_state(vcpu); __deactivate_traps(vcpu); diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index 354ca02..6ff1ce5 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -25,8 +25,12 @@ /* * Non-VHE: Both host and guest must save everything. 
* - * VHE: Host must save tpidr*_el0, actlr_el1, mdscr_el1, sp_el0, - * and guest must save everything. + * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and pstate, + * which are handled as part of the el2 return state) on every switch. + * tpidr_el0, tpidrro_el0, and actlr_el1 only need to be switched when going + * to host userspace or a different VCPU. EL1 registers only need to be + * switched when potentially going to run a different VCPU. The latter two + * classes are handled as part of kvm_arch_vcpu_load and kvm_arch_vcpu_put. */ static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt) @@ -85,14 +89,11 @@ void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt) void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt) { __sysreg_save_common_state(ctxt); - __sysreg_save_user_state(ctxt); } void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt) { - __sysreg_save_el1_state(ctxt); __sysreg_save_common_state(ctxt); - __sysreg_save_user_state(ctxt); __sysreg_save_el2_return_state(ctxt); } @@ -153,14 +154,11 @@ void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt) void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt) { __sysreg_restore_common_state(ctxt); - __sysreg_restore_user_state(ctxt); } void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt) { - __sysreg_restore_el1_state(ctxt); __sysreg_restore_common_state(ctxt); - __sysreg_restore_user_state(ctxt); __sysreg_restore_el2_return_state(ctxt); } @@ -224,6 +222,26 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu) */ void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) { + struct kvm_cpu_context *host_ctxt = vcpu->arch.host_cpu_context; + struct kvm_cpu_context *guest_ctxt = >arch.ctxt; + + if (!has_vhe()) + return; + + __sysreg_save_user_state(host_ctxt); + + + /* +* Load guest EL1 and user state +* +* We must restore the 32-bit state before the sysregs, thanks +* to erratum #852523 
(Cortex-A57) or #853709 (Cortex-A72). +*/ + __sysreg32_restore_state(vcpu); + __sysreg_restore_user_state(guest_ctxt); + __sysreg_restore_el1_state(guest_ctxt); + + vcpu->arch.sysregs_loaded_on_cpu = true; } /** @@ -250,4 +268,16 @@ void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) __fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs); vcpu->arch.guest_vfp_loaded = 0; } + + if (!has_vhe()) + return; + + __sysreg_save_el1_state(guest_ctxt); + __sysreg_save_user_state(guest_ctxt); + __sysreg32_save_state(vcpu); + + /* Restore host user state */ +
[PATCH 28/37] KVM: arm64: Move common VHE/non-VHE trap config in separate functions
As we are about to be more lazy with some of the trap configuration register read/writes for VHE systems, move the logic that is currently shared between VHE and non-VHE into a separate function which can be called from either the world-switch path or from vcpu_load/vcpu_put. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 70 + 1 file changed, 39 insertions(+), 31 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 6a12504..c587416 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -23,6 +23,43 @@ #include #include +static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu) +{ + /* +* We are about to set CPTR_EL2.TFP to trap all floating point +* register accesses to EL2, however, the ARM ARM clearly states that +* traps are only taken to EL2 if the operation would not otherwise +* trap to EL1. Therefore, always make sure that for 32-bit guests, +* we set FPEXC.EN to prevent traps to EL1, when setting the TFP bit. +* If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to +* it will cause an exception. +*/ + if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd() && + !vcpu->arch.guest_vfp_loaded) { + write_sysreg(1 << 30, fpexc32_el2); + isb(); + } + write_sysreg(vcpu->arch.hcr_el2, hcr_el2); + + /* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */ + write_sysreg(1 << 15, hstr_el2); + /* +* Make sure we trap PMU access from EL0 to EL2. Also sanitize +* PMSELR_EL0 to make sure it never contains the cycle +* counter, which could make a PMXEVCNTR_EL0 access UNDEF at +* EL1 instead of being trapped to EL2. 
+*/ + write_sysreg(0, pmselr_el0); + write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); + write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); +} + +static void __hyp_text __deactivate_traps_common(void) +{ + write_sysreg(0, hstr_el2); + write_sysreg(0, pmuserenr_el0); +} + static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) { u64 val; @@ -57,35 +94,7 @@ static hyp_alternate_select(__activate_traps_arch, static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) { - u64 val; - - /* -* We are about to set CPTR_EL2.TFP to trap all floating point -* register accesses to EL2, however, the ARM ARM clearly states that -* traps are only taken to EL2 if the operation would not otherwise -* trap to EL1. Therefore, always make sure that for 32-bit guests, -* we set FPEXC.EN to prevent traps to EL1, when setting the TFP bit. -* If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to -* it will cause an exception. -*/ - val = vcpu->arch.hcr_el2; - if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd() && - !vcpu->arch.guest_vfp_loaded) { - write_sysreg(1 << 30, fpexc32_el2); - isb(); - } - write_sysreg(val, hcr_el2); - /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ - write_sysreg(1 << 15, hstr_el2); - /* -* Make sure we trap PMU access from EL0 to EL2. Also sanitize -* PMSELR_EL0 to make sure it never contains the cycle -* counter, which could make a PMXEVCNTR_EL0 access UNDEF at -* EL1 instead of being trapped to EL2. 
-*/ - write_sysreg(0, pmselr_el0); - write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); - write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); + __activate_traps_common(vcpu); __activate_traps_arch()(vcpu); } @@ -131,9 +140,8 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) if (vcpu->arch.hcr_el2 & HCR_VSE) vcpu->arch.hcr_el2 = read_sysreg(hcr_el2); + __deactivate_traps_common(); __deactivate_traps_arch()(); - write_sysreg(0, hstr_el2); - write_sysreg(0, pmuserenr_el0); } static inline void __hyp_text __activate_vm(struct kvm_vcpu *vcpu) -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 30/37] KVM: arm64: Configure c15, PMU, and debug register traps on cpu load/put for VHE
We do not have to change the c15 trap setting on each switch to/from the guest on VHE systems, because this setting only affects EL0. The PMU and debug trap configuration can also be done on vcpu load/put instead, because they don't affect how the host kernel can access the debug registers while executing KVM kernel code and KVM doesn't use floating point itself. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 35 ++- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 09be10f..13e137e 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -42,8 +42,6 @@ static void __hyp_text __activate_traps_fpsimd32(struct kvm_vcpu *vcpu) static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu) { - write_sysreg(vcpu->arch.hcr_el2, hcr_el2); - /* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */ write_sysreg(1 << 15, hstr_el2); /* @@ -63,12 +61,15 @@ static void __hyp_text __deactivate_traps_common(void) write_sysreg(0, pmuserenr_el0); } +/* Activate the traps we can during vcpu_load with VHE */ void activate_traps_vhe_load(struct kvm_vcpu *vcpu) { u64 val; + /* Make sure 32-bit guests trap VFP */ __activate_traps_fpsimd32(vcpu); + /* Trap VFP accesses on a VHE system */ val = read_sysreg(cpacr_el1); val |= CPACR_EL1_TTA; if (vcpu->arch.guest_vfp_loaded) @@ -76,11 +77,28 @@ void activate_traps_vhe_load(struct kvm_vcpu *vcpu) else val &= ~CPACR_EL1_FPEN; write_sysreg(val, cpacr_el1); + + /* Activate traps on impdef sysregs, PMU, and debug */ + __activate_traps_common(vcpu); } +/* Deactivate the traps we can during vcpu_put with VHE */ void deactivate_traps_vhe_put(void) { + u64 mdcr_el2; + + /* Re-enable host VFP access */ write_sysreg(CPACR_EL1_FPEN, cpacr_el1); + + /* Re-enable host access to impdef sysregs and the PMU */ + __deactivate_traps_common(); + + /* Re-enable host access to the debug regs */ + mdcr_el2 = read_sysreg(mdcr_el2); + 
mdcr_el2 &= MDCR_EL2_HPMN_MASK | + MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT | + MDCR_EL2_TPMS; + write_sysreg(mdcr_el2, mdcr_el2); } static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) @@ -92,8 +110,13 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) { u64 val; + /* Activate traps on impdef sysregs, PMU, and debug */ + __activate_traps_common(vcpu); + + /* Make sure 32-bit guests trap VFP */ __activate_traps_fpsimd32(vcpu); + /* Trap VFP accesses on a non-VHE system */ val = CPTR_EL2_DEFAULT; val |= CPTR_EL2_TTA; if (vcpu->arch.guest_vfp_loaded) @@ -109,20 +132,14 @@ static hyp_alternate_select(__activate_traps_arch, static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) { - __activate_traps_common(vcpu); __activate_traps_arch()(vcpu); + write_sysreg(vcpu->arch.hcr_el2, hcr_el2); } static void __hyp_text __deactivate_traps_vhe(void) { extern char vectors[]; /* kernel exception vectors */ - u64 mdcr_el2 = read_sysreg(mdcr_el2); - mdcr_el2 &= MDCR_EL2_HPMN_MASK | - MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT | - MDCR_EL2_TPMS; - - write_sysreg(mdcr_el2, mdcr_el2); write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); write_sysreg(vectors, vbar_el1); } -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 31/37] KVM: arm64: Separate activate_traps and deactivate_traps for VHE and non-VHE
To make the code more readable and to avoid the overhead of a function call, let's get rid of a pair of the alternative function selectors and explicitly call the VHE and non-VHE functions instead, telling the compiler to try to inline the static function if it can. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 78 + 1 file changed, 37 insertions(+), 41 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 13e137e..5692aa0 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -101,9 +101,27 @@ void deactivate_traps_vhe_put(void) write_sysreg(mdcr_el2, mdcr_el2); } -static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) +static inline void activate_traps_vhe(struct kvm_vcpu *vcpu) { - write_sysreg(__kvm_hyp_vector, vbar_el1); + write_sysreg(vcpu->arch.hcr_el2, hcr_el2); + write_sysreg_el2(__kvm_hyp_vector, vbar); +} + +static inline void deactivate_traps_vhe(struct kvm_vcpu *vcpu) +{ + extern char vectors[]; /* kernel exception vectors */ + + /* +* If we pended a virtual abort, preserve it until it gets +* cleared. See D1.14.3 (Virtual Interrupts) for details, but +* the crucial bit is "On taking a vSError interrupt, +* HCR_EL2.VSE is cleared to 0." 
+*/ + if (vcpu->arch.hcr_el2 & HCR_VSE) + vcpu->arch.hcr_el2 = read_sysreg(hcr_el2); + + write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); + write_sysreg(vectors, vbar_el1); } static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) @@ -124,44 +142,15 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) else val |= CPTR_EL2_TFP; write_sysreg(val, cptr_el2); -} -static hyp_alternate_select(__activate_traps_arch, - __activate_traps_nvhe, __activate_traps_vhe, - ARM64_HAS_VIRT_HOST_EXTN); - -static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) -{ - __activate_traps_arch()(vcpu); + /* Configure all other hypervisor traps and features */ write_sysreg(vcpu->arch.hcr_el2, hcr_el2); } -static void __hyp_text __deactivate_traps_vhe(void) -{ - extern char vectors[]; /* kernel exception vectors */ - - write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); - write_sysreg(vectors, vbar_el1); -} - -static void __hyp_text __deactivate_traps_nvhe(void) +static void __hyp_text __deactivate_traps_nvhe(struct kvm_vcpu *vcpu) { - u64 mdcr_el2 = read_sysreg(mdcr_el2); - - mdcr_el2 &= MDCR_EL2_HPMN_MASK; - mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT; - - write_sysreg(mdcr_el2, mdcr_el2); - write_sysreg(HCR_RW, hcr_el2); - write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); -} - -static hyp_alternate_select(__deactivate_traps_arch, - __deactivate_traps_nvhe, __deactivate_traps_vhe, - ARM64_HAS_VIRT_HOST_EXTN); + u64 mdcr_el2; -static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) -{ /* * If we pended a virtual abort, preserve it until it gets * cleared. 
See D1.14.3 (Virtual Interrupts) for details, but @@ -172,7 +161,14 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) vcpu->arch.hcr_el2 = read_sysreg(hcr_el2); __deactivate_traps_common(); - __deactivate_traps_arch()(); + + mdcr_el2 = read_sysreg(mdcr_el2); + mdcr_el2 &= MDCR_EL2_HPMN_MASK; + mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT; + + write_sysreg(mdcr_el2, mdcr_el2); + write_sysreg(HCR_RW, hcr_el2); + write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); } static inline void __hyp_text __activate_vm(struct kvm_vcpu *vcpu) @@ -371,7 +367,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) sysreg_save_host_state_vhe(host_ctxt); - __activate_traps(vcpu); + activate_traps_vhe(vcpu); __activate_vm(vcpu); __vgic_restore_state(vcpu); @@ -390,7 +386,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) sysreg_save_guest_state_vhe(guest_ctxt); __vgic_save_state(vcpu); - __deactivate_traps(vcpu); + deactivate_traps_vhe(vcpu); sysreg_restore_host_state_vhe(host_ctxt); @@ -418,7 +414,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) __sysreg_save_state_nvhe(host_ctxt); - __activate_traps(vcpu); + __activate_traps_nvhe(vcpu); __activate_vm(vcpu); __vgic_restore_state(vcpu); @@ -445,7 +441,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) __timer_disable_traps(vcpu); __vgic_save_state(vcpu); - __deactivate_traps(vcpu); + __deactivate_traps_nvhe(vcpu); __deactivate_vm(vcpu); __sysreg_restore_state_nvhe(host_ctxt); @@ -471,7
[PATCH 37/37] KVM: arm/arm64: Avoid VGICv3 save/restore on VHE with no IRQs
We can finally get completely rid of any calls to the VGICv3 save/restore functions when the AP lists are empty on VHE systems. This requires carefully factoring out trap configuration from saving and restoring state, and carefully choosing what to do on the VHE and non-VHE path. One of the challenges is that we cannot save/restore the VMCR lazily because we can only write the VMCR when ICC_SRE_EL1.SRE is cleared when emulating a GICv2-on-GICv3, since otherwise all Group-0 interrupts end up being delivered as FIQ. To solve this problem, and still provide fast performance in the fast path of exiting a VM when no interrupts are pending (which also optimized the latency for actually delivering virtual interrupts coming from physical interrupts), we orchestrate a dance of only doing the activate/deactivate traps in vgic load/put for VHE systems (which can have ICC_SRE_EL1.SRE cleared when running in the host), and doing the configuration on every round-trip on non-VHE systems. Signed-off-by: Christoffer Dall--- arch/arm/include/asm/kvm_hyp.h | 2 + arch/arm/kvm/hyp/switch.c| 8 ++- arch/arm64/include/asm/kvm_hyp.h | 2 + arch/arm64/kvm/hyp/switch.c | 8 ++- virt/kvm/arm/hyp/vgic-v3-sr.c| 116 +-- virt/kvm/arm/vgic/vgic-v3.c | 6 ++ virt/kvm/arm/vgic/vgic.c | 7 +-- 7 files changed, 101 insertions(+), 48 deletions(-) diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h index b3dd4f4..d01676e 100644 --- a/arch/arm/include/asm/kvm_hyp.h +++ b/arch/arm/include/asm/kvm_hyp.h @@ -109,6 +109,8 @@ void __sysreg_restore_state(struct kvm_cpu_context *ctxt); void __vgic_v3_save_state(struct kvm_vcpu *vcpu); void __vgic_v3_restore_state(struct kvm_vcpu *vcpu); +void __vgic_v3_activate_traps(struct kvm_vcpu *vcpu); +void __vgic_v3_deactivate_traps(struct kvm_vcpu *vcpu); void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu); void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu); diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c index 0d834f8..6b487c2 100644 
--- a/arch/arm/kvm/hyp/switch.c +++ b/arch/arm/kvm/hyp/switch.c @@ -89,14 +89,18 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu) static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) { - if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) + if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) { __vgic_v3_save_state(vcpu); + __vgic_v3_deactivate_traps(vcpu); + } } static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) { - if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) + if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) { + __vgic_v3_activate_traps(vcpu); __vgic_v3_restore_state(vcpu); + } } static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 62872ce..1c21a70 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -125,6 +125,8 @@ int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); void __vgic_v3_save_state(struct kvm_vcpu *vcpu); void __vgic_v3_restore_state(struct kvm_vcpu *vcpu); +void __vgic_v3_activate_traps(struct kvm_vcpu *vcpu); +void __vgic_v3_deactivate_traps(struct kvm_vcpu *vcpu); void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu); void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu); int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 0cdf6ae..42e6d56 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -184,14 +184,18 @@ static inline void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu) static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) { - if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) + if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) { __vgic_v3_save_state(vcpu); + __vgic_v3_deactivate_traps(vcpu); + } } static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) { - if 
(static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) + if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) { + __vgic_v3_activate_traps(vcpu); __vgic_v3_restore_state(vcpu); + } } static bool __hyp_text __true_value(void) diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c index ed5da75..34d71d2 100644 --- a/virt/kvm/arm/hyp/vgic-v3-sr.c +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c @@ -208,15 +208,15 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs; - u64 val; /* *
[PATCH 32/37] KVM: arm/arm64: Handle VGICv2 save/restore from the main VGIC code
We can program the GICv2 hypervisor control interface logic directly from the core vgic code and can instead do the save/restore directly from the flush/sync functions, which can lead to a number of future optimizations. Signed-off-by: Christoffer Dall--- arch/arm/kvm/hyp/switch.c| 4 -- arch/arm64/include/asm/kvm_hyp.h | 2 - arch/arm64/kvm/hyp/switch.c | 4 -- virt/kvm/arm/hyp/vgic-v2-sr.c| 83 virt/kvm/arm/vgic/vgic-init.c| 22 ++ virt/kvm/arm/vgic/vgic-v2.c | 92 virt/kvm/arm/vgic/vgic.c | 21 - virt/kvm/arm/vgic/vgic.h | 5 +++ 8 files changed, 130 insertions(+), 103 deletions(-) diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c index c3b9799..0d834f8 100644 --- a/arch/arm/kvm/hyp/switch.c +++ b/arch/arm/kvm/hyp/switch.c @@ -91,16 +91,12 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) { if (static_branch_unlikely(_vgic_global_state.gicv3_cpuif)) __vgic_v3_save_state(vcpu); - else - __vgic_v2_save_state(vcpu); } static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) { if (static_branch_unlikely(_vgic_global_state.gicv3_cpuif)) __vgic_v3_restore_state(vcpu); - else - __vgic_v2_restore_state(vcpu); } static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 28d5f3c..bd3fe64 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -121,8 +121,6 @@ typeof(orig) * __hyp_text fname(void) \ return val; \ } -void __vgic_v2_save_state(struct kvm_vcpu *vcpu); -void __vgic_v2_restore_state(struct kvm_vcpu *vcpu); int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); void __vgic_v3_save_state(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 5692aa0..90da506 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -186,16 +186,12 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) { if 
(static_branch_unlikely(_vgic_global_state.gicv3_cpuif)) __vgic_v3_save_state(vcpu); - else - __vgic_v2_save_state(vcpu); } static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) { if (static_branch_unlikely(_vgic_global_state.gicv3_cpuif)) __vgic_v3_restore_state(vcpu); - else - __vgic_v2_restore_state(vcpu); } static bool __hyp_text __true_value(void) diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c index a3f18d3..b433257 100644 --- a/virt/kvm/arm/hyp/vgic-v2-sr.c +++ b/virt/kvm/arm/hyp/vgic-v2-sr.c @@ -22,89 +22,6 @@ #include #include -static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base) -{ - struct vgic_v2_cpu_if *cpu_if = >arch.vgic_cpu.vgic_v2; - int nr_lr = (kern_hyp_va(_vgic_global_state))->nr_lr; - u32 elrsr0, elrsr1; - - elrsr0 = readl_relaxed(base + GICH_ELRSR0); - if (unlikely(nr_lr > 32)) - elrsr1 = readl_relaxed(base + GICH_ELRSR1); - else - elrsr1 = 0; - -#ifdef CONFIG_CPU_BIG_ENDIAN - cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; -#else - cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; -#endif -} - -static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base) -{ - struct vgic_v2_cpu_if *cpu_if = >arch.vgic_cpu.vgic_v2; - int i; - u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs; - - for (i = 0; i < used_lrs; i++) { - if (cpu_if->vgic_elrsr & (1UL << i)) - cpu_if->vgic_lr[i] &= ~GICH_LR_STATE; - else - cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4)); - - writel_relaxed(0, base + GICH_LR0 + (i * 4)); - } -} - -/* vcpu is already in the HYP VA space */ -void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu) -{ - struct kvm *kvm = kern_hyp_va(vcpu->kvm); - struct vgic_v2_cpu_if *cpu_if = >arch.vgic_cpu.vgic_v2; - struct vgic_dist *vgic = >arch.vgic; - void __iomem *base = kern_hyp_va(vgic->vctrl_base); - u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs; - - if (!base) - return; - - if (used_lrs) { - cpu_if->vgic_apr = readl_relaxed(base + GICH_APR); - - 
save_elrsr(vcpu, base); - save_lrs(vcpu, base); - - writel_relaxed(0, base + GICH_HCR); - } else { - cpu_if->vgic_elrsr = ~0UL; - cpu_if->vgic_apr = 0; - } -} - -/* vcpu is already in the HYP VA space */
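For reference, the endianness-dependent ELRSR0/ELRSR1 merge being moved out of save_elrsr() above is pure bit manipulation. A self-contained sketch in plain C, where the `big_endian` flag stands in for the CONFIG_CPU_BIG_ENDIAN compile-time check (names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two 32-bit ELRSR halves into one 64-bit empty-LR mask.
 * On big-endian kernels the halves are swapped, mirroring the
 * #ifdef CONFIG_CPU_BIG_ENDIAN branch in the code above. */
static uint64_t combine_elrsr(uint32_t elrsr0, uint32_t elrsr1, int big_endian)
{
    if (big_endian)
        return ((uint64_t)elrsr0 << 32) | elrsr1;
    return ((uint64_t)elrsr1 << 32) | elrsr0;
}
```

An LR whose bit is set in the combined mask is empty, which is why save_lrs() clears GICH_LR_STATE for those entries instead of reading the hardware LR.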
[PATCH 26/37] KVM: arm64: Prepare to handle traps on deferred AArch32 sysregs
Handle accesses to any AArch32 EL1 system registers where we can defer saving and restoring them to vcpu_load and vcpu_put, and which are stored in special EL2 registers only used to support 32-bit guests. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/inject_fault.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c index f4513fc..02990f5 100644 --- a/arch/arm64/kvm/inject_fault.c +++ b/arch/arm64/kvm/inject_fault.c @@ -59,9 +59,18 @@ static void vcpu_set_elr_el1(struct kvm_vcpu *vcpu, u64 val) /* Set the SPSR for the current mode */ static void vcpu_set_spsr(struct kvm_vcpu *vcpu, u64 val) { - if (vcpu_mode_is_32bit(vcpu)) + if (vcpu_mode_is_32bit(vcpu)) { + if (vcpu->arch.sysregs_loaded_on_cpu) + __sysreg32_save_state(vcpu); + *vcpu_spsr32(vcpu) = val; + if (vcpu->arch.sysregs_loaded_on_cpu) + __sysreg32_restore_state(vcpu); + + return; + } + if (vcpu->arch.sysregs_loaded_on_cpu) write_sysreg_el1(val, spsr); else @@ -129,11 +138,13 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, * IFAR: mapped to FAR_EL1 * DFSR: mapped to ESR_EL1 * TTBCR: mapped to TCR_EL1 +* IFSR: stored in IFSR32_EL2 */ if (vcpu->arch.sysregs_loaded_on_cpu) { vcpu->arch.ctxt.sys_regs[FAR_EL1] = read_sysreg_el1(far); vcpu->arch.ctxt.sys_regs[ESR_EL1] = read_sysreg_el1(esr); vcpu->arch.ctxt.sys_regs[TCR_EL1] = read_sysreg_el1(tcr); + vcpu->arch.ctxt.sys_regs[IFSR32_EL2] = read_sysreg(ifsr32_el2); } if (is_pabt) { @@ -161,6 +172,7 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, if (vcpu->arch.sysregs_loaded_on_cpu) { write_sysreg_el1(vcpu->arch.ctxt.sys_regs[FAR_EL1], far); write_sysreg_el1(vcpu->arch.ctxt.sys_regs[ESR_EL1], esr); + write_sysreg(vcpu->arch.ctxt.sys_regs[IFSR32_EL2], ifsr32_el2); } } -- 2.9.0
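The save/modify/restore bracket that vcpu_set_spsr() now performs for 32-bit guests can be sketched as a small model in plain C. The struct and helpers below are illustrative stand-ins for the vcpu state, __sysreg32_save_state() and __sysreg32_restore_state(), not kernel definitions:

```c
#include <assert.h>
#include <stdint.h>

struct v32_model {
    int loaded_on_cpu;     /* stands in for sysregs_loaded_on_cpu */
    uint64_t hw_spsr;      /* stands in for the live EL2 copy */
    uint64_t shadow_spsr;  /* stands in for the in-memory vcpu copy */
};

static void sysreg32_save(struct v32_model *v)    { v->shadow_spsr = v->hw_spsr; }
static void sysreg32_restore(struct v32_model *v) { v->hw_spsr = v->shadow_spsr; }

/* Mirror of the vcpu_set_spsr() 32-bit path: pull live state back
 * into memory, update it there, then push it back to the CPU. */
static void set_spsr32(struct v32_model *v, uint64_t val)
{
    if (v->loaded_on_cpu)
        sysreg32_save(v);
    v->shadow_spsr = val;
    if (v->loaded_on_cpu)
        sysreg32_restore(v);
}
```

When the registers are not loaded on the CPU, only the in-memory copy changes, which is exactly why the kernel code brackets the update with the conditional save/restore calls.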
[PATCH 20/37] KVM: arm64: Unify non-VHE host/guest sysreg save and restore functions
There is no need to have multiple identical functions with different names for saving host and guest state. When saving and restoring state for the host and guest, the state is the same for both contexts, and that's why we have the kvm_cpu_context structure. Delete one version and rename the other into simply save/restore. Signed-off-by: Christoffer Dall--- arch/arm64/include/asm/kvm_hyp.h | 6 ++ arch/arm64/kvm/hyp/switch.c | 10 +- arch/arm64/kvm/hyp/sysreg-sr.c | 18 ++ 3 files changed, 9 insertions(+), 25 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 998152d..3f54c55 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -132,10 +132,8 @@ int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu); void __timer_enable_traps(struct kvm_vcpu *vcpu); void __timer_disable_traps(struct kvm_vcpu *vcpu); -void __sysreg_save_host_state_nvhe(struct kvm_cpu_context *ctxt); -void __sysreg_restore_host_state_nvhe(struct kvm_cpu_context *ctxt); -void __sysreg_save_guest_state_nvhe(struct kvm_cpu_context *ctxt); -void __sysreg_restore_guest_state_nvhe(struct kvm_cpu_context *ctxt); +void __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt); +void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt); void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt); void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt); void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt); diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 7c4d430..6356bec 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -383,7 +383,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = >arch.ctxt; - __sysreg_save_host_state_nvhe(host_ctxt); + __sysreg_save_state_nvhe(host_ctxt); __activate_traps(vcpu); __activate_vm(vcpu); @@ -396,7 +396,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) * 
to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72). */ __sysreg32_restore_state(vcpu); - __sysreg_restore_guest_state_nvhe(guest_ctxt); + __sysreg_restore_state_nvhe(guest_ctxt); __debug_switch_to_guest(vcpu); /* Jump in the fire! */ @@ -407,7 +407,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) if (fixup_guest_exit(vcpu, _code)) goto again; - __sysreg_save_guest_state_nvhe(guest_ctxt); + __sysreg_save_state_nvhe(guest_ctxt); __sysreg32_save_state(vcpu); __timer_disable_traps(vcpu); __vgic_save_state(vcpu); @@ -415,7 +415,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) __deactivate_traps(vcpu); __deactivate_vm(vcpu); - __sysreg_restore_host_state_nvhe(host_ctxt); + __sysreg_restore_state_nvhe(host_ctxt); /* * This must come after restoring the host sysregs, since a non-VHE @@ -440,7 +440,7 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par, __timer_disable_traps(vcpu); __deactivate_traps(vcpu); __deactivate_vm(vcpu); - __sysreg_restore_host_state_nvhe(__host_ctxt); + __sysreg_restore_state_nvhe(__host_ctxt); } /* diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index d8c42de..f5c1b44 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -70,14 +70,7 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt) ctxt->gp_regs.regs.pstate = read_sysreg_el2(spsr); } -void __hyp_text __sysreg_save_host_state_nvhe(struct kvm_cpu_context *ctxt) -{ - __sysreg_save_el1_state(ctxt); - __sysreg_save_common_state(ctxt); - __sysreg_save_user_state(ctxt); -} - -void __hyp_text __sysreg_save_guest_state_nvhe(struct kvm_cpu_context *ctxt) +void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt) { __sysreg_save_el1_state(ctxt); __sysreg_save_common_state(ctxt); @@ -138,14 +131,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr); } -void __hyp_text 
__sysreg_restore_host_state_nvhe(struct kvm_cpu_context *ctxt) -{ - __sysreg_restore_el1_state(ctxt); - __sysreg_restore_common_state(ctxt); - __sysreg_restore_user_state(ctxt); -} - -void __hyp_text __sysreg_restore_guest_state_nvhe(struct kvm_cpu_context *ctxt) +void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt) { __sysreg_restore_el1_state(ctxt); __sysreg_restore_common_state(ctxt); -- 2.9.0
[PATCH 22/37] KVM: arm64: Change 32-bit handling of VM system registers
We currently handle 32-bit accesses to trapped VM system registers using the 32-bit index into the coproc array on the vcpu structure, which is a union of the coproc array and the sysreg array. Since all the 32-bit coproc indices are created to correspond to the architectural mapping between 64-bit system registers and 32-bit coprocessor registers, and because the AArch64 system registers are double the size of the AArch32 coprocessor registers, we can always find the system register entry that we must update by dividing the 32-bit coproc index by 2. This is going to make our lives much easier when we have to start accessing system registers that use deferred save/restore and might have to be read directly from the physical CPU. Signed-off-by: Christoffer Dall--- arch/arm64/include/asm/kvm_host.h | 8 arch/arm64/kvm/sys_regs.c | 20 +++- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 5e09eb9..9f5761f 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -289,14 +289,6 @@ struct kvm_vcpu_arch { #define vcpu_cp14(v,r) ((v)->arch.ctxt.copro[(r)]) #define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r)]) -#ifdef CONFIG_CPU_BIG_ENDIAN -#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r)) -#define vcpu_cp15_64_low(v,r) vcpu_cp15((v),(r) + 1) -#else -#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r) + 1) -#define vcpu_cp15_64_low(v,r) vcpu_cp15((v),(r)) -#endif - struct kvm_vm_stat { ulong remote_tlb_flush; }; diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index bb0e41b..dbe35fd 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -120,16 +120,26 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { bool was_enabled = vcpu_has_cache_enabled(vcpu); + u64 val; + int reg = r->reg; BUG_ON(!p->is_write); - if (!p->is_aarch32) { - vcpu_sys_reg(vcpu, r->reg) = p->regval; + /* See the 32bit 
mapping in kvm_host.h */ + if (p->is_aarch32) + reg = r->reg / 2; + + if (!p->is_aarch32 || !p->is_32bit) { + val = p->regval; } else { - if (!p->is_32bit) - vcpu_cp15_64_high(vcpu, r->reg) = upper_32_bits(p->regval); - vcpu_cp15_64_low(vcpu, r->reg) = lower_32_bits(p->regval); + val = vcpu_sys_reg(vcpu, reg); + if (r->reg % 2) + val = (p->regval << 32) | (u64)lower_32_bits(val); + else + val = ((u64)upper_32_bits(val) << 32) | + (u64)lower_32_bits(p->regval); } + vcpu_sys_reg(vcpu, reg) = val; kvm_toggle_cache(vcpu, was_enabled); return true; -- 2.9.0
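The divide-by-2 index mapping and the odd/even half-merge described above can be modeled in plain C (the function names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* Model of the index mapping: AArch64 system registers are twice the
 * size of AArch32 coprocessor registers, so the 64-bit sysreg slot
 * for a 32-bit coproc index is simply index / 2. */
static int coproc_to_sysreg(int coproc_idx)
{
    return coproc_idx / 2;
}

/* Mirror of the odd/even merge in access_vm_reg(): an odd coproc
 * index addresses the upper 32 bits of the 64-bit register, an even
 * index the lower 32 bits; the other half is preserved. */
static uint64_t merge_32bit_write(int coproc_idx, uint64_t old, uint32_t regval)
{
    if (coproc_idx % 2)
        return ((uint64_t)regval << 32) | (uint32_t)old;
    return ((old >> 32) << 32) | regval;
}
```

Both halves of a 64-bit register therefore map to the same sysreg slot, which is what makes the single `reg = r->reg / 2` line in the patch sufficient.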
[PATCH 36/37] KVM: arm/arm64: Move VGIC APR save/restore to vgic put/load
The APRs can only have bits set when the guest acknowledges an interrupt in the LR and can only have a bit cleared when the guest EOIs an interrupt in the LR. Therefore, if we have no LRs with any pending/active interrupts, the APR cannot change value and there is no need to clear it on every exit from the VM (hint: it will have already been cleared when we exited the guest the last time with the LRs all EOIed). The only case we need to take care of is when we migrate the VCPU away from a CPU or migrate a new VCPU onto a CPU, or when we return to userspace to capture the state of the VCPU for migration. To make sure this works, factor out the APR save/restore functionality into separate functions called from the VCPU (and by extension VGIC) put/load hooks. Signed-off-by: Christoffer Dall--- arch/arm/include/asm/kvm_hyp.h | 2 + arch/arm64/include/asm/kvm_hyp.h | 2 + virt/kvm/arm/hyp/vgic-v3-sr.c| 123 +-- virt/kvm/arm/vgic/vgic-v2.c | 7 +-- virt/kvm/arm/vgic/vgic-v3.c | 5 ++ 5 files changed, 77 insertions(+), 62 deletions(-) diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h index ab20ffa..b3dd4f4 100644 --- a/arch/arm/include/asm/kvm_hyp.h +++ b/arch/arm/include/asm/kvm_hyp.h @@ -109,6 +109,8 @@ void __sysreg_restore_state(struct kvm_cpu_context *ctxt); void __vgic_v3_save_state(struct kvm_vcpu *vcpu); void __vgic_v3_restore_state(struct kvm_vcpu *vcpu); +void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu); +void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu); asmlinkage void __vfp_save_state(struct vfp_hard_struct *vfp); asmlinkage void __vfp_restore_state(struct vfp_hard_struct *vfp); diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index bd3fe64..62872ce 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -125,6 +125,8 @@ int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); void __vgic_v3_save_state(struct kvm_vcpu *vcpu); void __vgic_v3_restore_state(struct kvm_vcpu 
*vcpu); +void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu); +void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu); int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu); void __timer_enable_traps(struct kvm_vcpu *vcpu); diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c index 05548b2..ed5da75 100644 --- a/virt/kvm/arm/hyp/vgic-v3-sr.c +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c @@ -221,14 +221,11 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu) if (used_lrs) { int i; - u32 nr_pre_bits; u32 elrsr; elrsr = read_gicreg(ICH_ELSR_EL2); write_gicreg(0, ICH_HCR_EL2); - val = read_gicreg(ICH_VTR_EL2); - nr_pre_bits = vtr_to_nr_pre_bits(val); for (i = 0; i < used_lrs; i++) { if (elrsr & (1 << i)) @@ -238,38 +235,9 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu) __gic_v3_set_lr(0, i); } - - switch (nr_pre_bits) { - case 7: - cpu_if->vgic_ap0r[3] = __vgic_v3_read_ap0rn(3); - cpu_if->vgic_ap0r[2] = __vgic_v3_read_ap0rn(2); - case 6: - cpu_if->vgic_ap0r[1] = __vgic_v3_read_ap0rn(1); - default: - cpu_if->vgic_ap0r[0] = __vgic_v3_read_ap0rn(0); - } - - switch (nr_pre_bits) { - case 7: - cpu_if->vgic_ap1r[3] = __vgic_v3_read_ap1rn(3); - cpu_if->vgic_ap1r[2] = __vgic_v3_read_ap1rn(2); - case 6: - cpu_if->vgic_ap1r[1] = __vgic_v3_read_ap1rn(1); - default: - cpu_if->vgic_ap1r[0] = __vgic_v3_read_ap1rn(0); - } } else { if (static_branch_unlikely(_v3_cpuif_trap)) write_gicreg(0, ICH_HCR_EL2); - - cpu_if->vgic_ap0r[0] = 0; - cpu_if->vgic_ap0r[1] = 0; - cpu_if->vgic_ap0r[2] = 0; - cpu_if->vgic_ap0r[3] = 0; - cpu_if->vgic_ap1r[0] = 0; - cpu_if->vgic_ap1r[1] = 0; - cpu_if->vgic_ap1r[2] = 0; - cpu_if->vgic_ap1r[3] = 0; } val = read_gicreg(ICC_SRE_EL2); @@ -286,8 +254,6 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *cpu_if = >arch.vgic_cpu.vgic_v3; u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs; - u64 val; - u32 nr_pre_bits; int i; /* @@ -305,32 +271,9 @@ void __hyp_text __vgic_v3_restore_state(struct 
kvm_vcpu *vcpu) write_gicreg(cpu_if->vgic_vmcr, ICH_VMCR_EL2); } - val = read_gicreg(ICH_VTR_EL2); -
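The preemption-bit-dependent AP list register save logic being moved into __vgic_v3_save_aprs() relies on deliberate switch fallthrough. A self-contained sketch, with hw[] standing in for the __vgic_v3_read_ap0rn()/__vgic_v3_read_ap1rn() accessors:

```c
#include <assert.h>
#include <stdint.h>

/* With 7 preemption bits the GIC implements four APnR registers,
 * with 6 it implements two, and otherwise only one; the cases fall
 * through on purpose, exactly like the kernel switch above. */
static void save_aprs(int nr_pre_bits, const uint32_t hw[4], uint32_t out[4])
{
    out[0] = out[1] = out[2] = out[3] = 0;
    switch (nr_pre_bits) {
    case 7:
        out[3] = hw[3];
        out[2] = hw[2];
        /* fall through */
    case 6:
        out[1] = hw[1];
        /* fall through */
    default:
        out[0] = hw[0];
    }
}
```

Because the APRs can only change while LRs hold active interrupts, running this only from the put/load hooks (rather than on every exit) is safe, which is the point of the patch.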
[PATCH 29/37] KVM: arm64: Configure FPSIMD traps on vcpu load/put for VHE
There is no need to enable/disable traps to FP registers on every switch to/from the VM, because the host kernel does not use this resource without calling vcpu_put. We can therefore move things around enough that we still always write FPEXC32_EL2 before programming CPTR_EL2 but only program these during vcpu load/put. Signed-off-by: Christoffer Dall--- arch/arm64/include/asm/kvm_hyp.h | 3 +++ arch/arm64/kvm/hyp/switch.c | 34 -- arch/arm64/kvm/hyp/sysreg-sr.c | 4 3 files changed, 31 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 3f54c55..28d5f3c 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -148,6 +148,9 @@ void __fpsimd_save_state(struct user_fpsimd_state *fp_regs); void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs); bool __fpsimd_enabled(void); +void activate_traps_vhe_load(struct kvm_vcpu *vcpu); +void deactivate_traps_vhe_put(void); + u64 __guest_enter(struct kvm_vcpu *vcpu, struct kvm_cpu_context *host_ctxt); void __noreturn __hyp_do_panic(unsigned long, ...); diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index c587416..09be10f 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -23,22 +23,25 @@ #include #include -static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu) +static void __hyp_text __activate_traps_fpsimd32(struct kvm_vcpu *vcpu) { /* -* We are about to set CPTR_EL2.TFP to trap all floating point -* register accesses to EL2, however, the ARM ARM clearly states that -* traps are only taken to EL2 if the operation would not otherwise -* trap to EL1. Therefore, always make sure that for 32-bit guests, -* we set FPEXC.EN to prevent traps to EL1, when setting the TFP bit. -* If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to -* it will cause an exception. 
+* We are about to trap all floating point register accesses to EL2, +* however, traps are only taken to EL2 if the operation would not +* otherwise trap to EL1. Therefore, always make sure that for 32-bit +* guests, we set FPEXC.EN to prevent traps to EL1, when setting the +* TFP bit. If FP/ASIMD is not implemented, FPEXC is UNDEFINED and +* any access to it will cause an exception. */ if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd() && !vcpu->arch.guest_vfp_loaded) { write_sysreg(1 << 30, fpexc32_el2); isb(); } +} + +static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu) +{ write_sysreg(vcpu->arch.hcr_el2, hcr_el2); /* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */ @@ -60,10 +63,12 @@ static void __hyp_text __deactivate_traps_common(void) write_sysreg(0, pmuserenr_el0); } -static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) +void activate_traps_vhe_load(struct kvm_vcpu *vcpu) { u64 val; + __activate_traps_fpsimd32(vcpu); + val = read_sysreg(cpacr_el1); val |= CPACR_EL1_TTA; if (vcpu->arch.guest_vfp_loaded) @@ -71,7 +76,15 @@ static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) else val &= ~CPACR_EL1_FPEN; write_sysreg(val, cpacr_el1); +} +void deactivate_traps_vhe_put(void) +{ + write_sysreg(CPACR_EL1_FPEN, cpacr_el1); +} + +static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) +{ write_sysreg(__kvm_hyp_vector, vbar_el1); } @@ -79,6 +92,8 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) { u64 val; + __activate_traps_fpsimd32(vcpu); + val = CPTR_EL2_DEFAULT; val |= CPTR_EL2_TTA; if (vcpu->arch.guest_vfp_loaded) @@ -109,7 +124,6 @@ static void __hyp_text __deactivate_traps_vhe(void) write_sysreg(mdcr_el2, mdcr_el2); write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); - write_sysreg(CPACR_EL1_FPEN, cpacr_el1); write_sysreg(vectors, vbar_el1); } diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index 6ff1ce5..54a4f55 100644 --- 
a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -242,6 +242,8 @@ void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) __sysreg_restore_el1_state(guest_ctxt); vcpu->arch.sysregs_loaded_on_cpu = true; + + activate_traps_vhe_load(vcpu); } /** @@ -272,6 +274,8 @@ void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) if (!has_vhe()) return; + deactivate_traps_vhe_put(); + __sysreg_save_el1_state(guest_ctxt); __sysreg_save_user_state(guest_ctxt); __sysreg32_save_state(vcpu); -- 2.9.0
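The CPACR_EL1 update performed by activate_traps_vhe_load() is plain bit manipulation. The sketch below uses the architectural bit positions (TTA is bit 28, FPEN bits [21:20]) rather than the kernel's header definitions, so treat the constants as illustrative:

```c
#include <assert.h>
#include <stdint.h>

#define CPACR_EL1_TTA  (1u << 28)   /* trap trace register accesses */
#define CPACR_EL1_FPEN (3u << 20)   /* don't trap FP/SIMD at EL0/EL1 */

/* Mirror of activate_traps_vhe_load(): always trap trace accesses,
 * and only leave FP/SIMD untrapped when the guest's FP state is
 * already loaded on the CPU. */
static uint32_t cpacr_on_load(uint32_t cpacr, int guest_vfp_loaded)
{
    cpacr |= CPACR_EL1_TTA;
    if (guest_vfp_loaded)
        cpacr |= CPACR_EL1_FPEN;
    else
        cpacr &= ~CPACR_EL1_FPEN;
    return cpacr;
}
```

deactivate_traps_vhe_put() then simply writes FPEN back, matching the `write_sysreg(CPACR_EL1_FPEN, cpacr_el1)` line above.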
[PATCH 24/37] KVM: arm64: Prepare to handle traps on deferred EL0 sysregs
We can trap access to ACTLR_EL1 which we can later defer to only save/restore during vcpu_load and vcpu_put, so let's read the value directly from the CPU when necessary. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/sys_regs_generic_v8.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/sys_regs_generic_v8.c b/arch/arm64/kvm/sys_regs_generic_v8.c index 969ade1..348f227 100644 --- a/arch/arm64/kvm/sys_regs_generic_v8.c +++ b/arch/arm64/kvm/sys_regs_generic_v8.c @@ -38,7 +38,10 @@ static bool access_actlr(struct kvm_vcpu *vcpu, if (p->is_write) return ignore_write(vcpu, p); - p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1); + if (vcpu->arch.sysregs_loaded_on_cpu) + p->regval = read_sysreg(actlr_el1); + else + p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1); return true; } -- 2.9.0
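The deferred-read pattern access_actlr() follows — read the hardware when the registers are live on the CPU, the in-memory shadow otherwise — can be modeled with a tiny stand-in (read_hw_actlr() fakes read_sysreg(actlr_el1); none of these names are the kernel's):

```c
#include <assert.h>
#include <stdint.h>

struct vcpu_model {
    int loaded_on_cpu;       /* stands in for sysregs_loaded_on_cpu */
    uint64_t shadow_actlr;   /* stands in for vcpu_sys_reg(v, ACTLR_EL1) */
};

static uint64_t read_hw_actlr(void)
{
    return 0xabcd;           /* pretend hardware value */
}

/* Mirror of the access_actlr() read path: the memory copy is stale
 * while the register is loaded on the physical CPU. */
static uint64_t get_actlr(const struct vcpu_model *v)
{
    if (v->loaded_on_cpu)
        return read_hw_actlr();
    return v->shadow_actlr;
}
```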
[PATCH 23/37] KVM: arm64: Prepare to handle traps on deferred VM sysregs
When we defer the save/restore of system registers to vcpu_load and vcpu_put, we need to take care of the emulation code that handles traps to these registers, since simply reading the memory array will return stale data. Therefore, introduce two functions to directly read/write the registers from the physical CPU when we're on a VHE system that has loaded the system registers onto the physical CPU. Signed-off-by: Christoffer Dall--- arch/arm64/include/asm/kvm_host.h | 4 +++ arch/arm64/kvm/sys_regs.c | 54 +-- 2 files changed, 56 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 9f5761f..dcded44 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -278,6 +278,10 @@ struct kvm_vcpu_arch { /* Detect first run of a vcpu */ bool has_run_once; + + /* True when deferrable sysregs are loaded on the physical CPU, +* see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */ + bool sysregs_loaded_on_cpu; }; #define vcpu_gp_regs(v)(&(v)->arch.ctxt.gp_regs) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index dbe35fd..f7887dd 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -110,8 +111,57 @@ static bool access_dcsw(struct kvm_vcpu *vcpu, return true; } +static u64 read_deferrable_vm_reg(struct kvm_vcpu *vcpu, int reg) +{ + if (vcpu->arch.sysregs_loaded_on_cpu) { + switch (reg) { + case SCTLR_EL1: return read_sysreg_el1(sctlr); + case TTBR0_EL1: return read_sysreg_el1(ttbr0); + case TTBR1_EL1: return read_sysreg_el1(ttbr1); + case TCR_EL1: return read_sysreg_el1(tcr); + case ESR_EL1: return read_sysreg_el1(esr); + case FAR_EL1: return read_sysreg_el1(far); + case AFSR0_EL1: return read_sysreg_el1(afsr0); + case AFSR1_EL1: return read_sysreg_el1(afsr1); + case MAIR_EL1: return read_sysreg_el1(mair); + case AMAIR_EL1: return read_sysreg_el1(amair); + case 
CONTEXTIDR_EL1:return read_sysreg_el1(contextidr); + case DACR32_EL2:return read_sysreg(dacr32_el2); + case IFSR32_EL2:return read_sysreg(ifsr32_el2); + default:BUG(); + } + } + + return vcpu_sys_reg(vcpu, reg); +} + +static void write_deferrable_vm_reg(struct kvm_vcpu *vcpu, int reg, u64 val) +{ + if (vcpu->arch.sysregs_loaded_on_cpu) { + switch (reg) { + case SCTLR_EL1: write_sysreg_el1(val, sctlr); return; + case TTBR0_EL1: write_sysreg_el1(val, ttbr0); return; + case TTBR1_EL1: write_sysreg_el1(val, ttbr1); return; + case TCR_EL1: write_sysreg_el1(val, tcr); return; + case ESR_EL1: write_sysreg_el1(val, esr); return; + case FAR_EL1: write_sysreg_el1(val, far); return; + case AFSR0_EL1: write_sysreg_el1(val, afsr0); return; + case AFSR1_EL1: write_sysreg_el1(val, afsr1); return; + case MAIR_EL1: write_sysreg_el1(val, mair); return; + case AMAIR_EL1: write_sysreg_el1(val, amair); return; + case CONTEXTIDR_EL1:write_sysreg_el1(val, contextidr); return; + case DACR32_EL2:write_sysreg(val, dacr32_el2); return; + case IFSR32_EL2:write_sysreg(val, ifsr32_el2); return; + default:BUG(); + } + } + + vcpu_sys_reg(vcpu, reg) = val; +} + /* * Generic accessor for VM registers. Only called as long as HCR_TVM * is set. If the guest enables the MMU, we stop trapping the VM * sys_regs and leave it in complete control of the caches. */ @@ -132,14 +182,14 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu, if (!p->is_aarch32 || !p->is_32bit) { val = p->regval; } else { - val = vcpu_sys_reg(vcpu, reg); + val = read_deferrable_vm_reg(vcpu, reg); if (r->reg % 2) val = (p->regval << 32) | (u64)lower_32_bits(val); else val = ((u64)upper_32_bits(val) << 32) | (u64)lower_32_bits(p->regval); } - vcpu_sys_reg(vcpu, reg) = val; + write_deferrable_vm_reg(vcpu, reg, val); kvm_toggle_cache(vcpu, was_enabled); return true; -- 2.9.0
[PATCH 12/37] KVM: arm64: Factor out fault info population and gic workarounds
The current world-switch function has functionality to detect a number of cases where we need to fixup some part of the exit condition and possibly run the guest again, before having restored the host state. This includes populating missing fault info, emulating GICv2 CPU interface accesses when mapped at unaligned addresses, and emulating the GICv3 CPU interface on systems that need that. As we are about to have an alternative switch function for VHE systems, but VHE systems still need the same early fixup logic, factor out this logic into a separate function that can be shared by both switch functions. No functional change. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 91 ++--- 1 file changed, 52 insertions(+), 39 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index e270cba..ed30af5 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -258,50 +258,24 @@ static void __hyp_text __skip_instr(struct kvm_vcpu *vcpu) write_sysreg_el2(*vcpu_pc(vcpu), elr); } -int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) +/* + * Return true when we were able to fixup the guest exit and should return to + * the guest, false when we should restore the host state and return to the + * main run loop. + */ +static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code) { - struct kvm_cpu_context *host_ctxt; - struct kvm_cpu_context *guest_ctxt; - u64 exit_code; - - vcpu = kern_hyp_va(vcpu); - - host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context); - host_ctxt->__hyp_running_vcpu = vcpu; - guest_ctxt = &vcpu->arch.ctxt; - - __sysreg_save_host_state(host_ctxt); - - __activate_traps(vcpu); - __activate_vm(vcpu); - - __vgic_restore_state(vcpu); - __timer_enable_traps(vcpu); - - /* -* We must restore the 32-bit state before the sysregs, thanks -* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72). 
-*/ - __sysreg32_restore_state(vcpu); - __sysreg_restore_guest_state(guest_ctxt); - __debug_switch_to_guest(vcpu); - - /* Jump in the fire! */ -again: - exit_code = __guest_enter(vcpu, host_ctxt); - /* And we're baaack! */ - /* * We're using the raw exception code in order to only process * the trap if no SError is pending. We will come back to the * same PC once the SError has been injected, and replay the * trapping instruction. */ - if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu)) - goto again; + if (*exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu)) + return true; if (static_branch_unlikely(&vgic_v2_cpuif_trap) && - exit_code == ARM_EXCEPTION_TRAP) { + *exit_code == ARM_EXCEPTION_TRAP) { bool valid; valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW && @@ -315,13 +289,13 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) if (ret == 1) { __skip_instr(vcpu); - goto again; + return true; } if (ret == -1) { /* Promote an illegal access to an SError */ __skip_instr(vcpu); - exit_code = ARM_EXCEPTION_EL1_SERROR; + *exit_code = ARM_EXCEPTION_EL1_SERROR; } /* 0 falls through to be handled out of EL2 */ @@ -329,19 +303,58 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) } if (static_branch_unlikely(&vgic_v3_cpuif_trap) && - exit_code == ARM_EXCEPTION_TRAP && + *exit_code == ARM_EXCEPTION_TRAP && (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 || kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) { int ret = __vgic_v3_perform_cpuif_access(vcpu); if (ret == 1) { __skip_instr(vcpu); - goto again; + return true; } /* 0 falls through to be handled out of EL2 */ } + return false; +} + +int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) +{ + struct kvm_cpu_context *host_ctxt; + struct kvm_cpu_context *guest_ctxt; + u64 exit_code; + + vcpu = kern_hyp_va(vcpu); + + host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context); + host_ctxt->__hyp_running_vcpu = vcpu; + guest_ctxt = &vcpu->arch.ctxt; + + __sysreg_save_host_state(host_ctxt); 
+ + __activate_traps(vcpu); + __activate_vm(vcpu); + + __vgic_restore_state(vcpu); +
[PATCH 13/37] KVM: arm64: Introduce VHE-specific kvm_vcpu_run
So far this is just a copy of the legacy non-VHE switch function, where we only change the existing calls to has_vhe() in both the original and new functions. Signed-off-by: Christoffer Dall--- arch/arm/include/asm/kvm_asm.h | 4 +++ arch/arm64/include/asm/kvm_asm.h | 2 ++ arch/arm64/kvm/hyp/switch.c | 57 virt/kvm/arm/arm.c | 5 +++- 4 files changed, 67 insertions(+), 1 deletion(-) diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h index 36dd296..1a7bc5f 100644 --- a/arch/arm/include/asm/kvm_asm.h +++ b/arch/arm/include/asm/kvm_asm.h @@ -70,8 +70,12 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu); extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high); +/* no VHE on 32-bit :( */ +static inline int kvm_vcpu_run(struct kvm_vcpu *vcpu) { return 0; } + extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu); + extern void __init_stage2_translation(void); extern u64 __vgic_v3_get_ich_vtr_el2(void); diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 7e48a39..2eb5b23 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -57,6 +57,8 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu); extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high); +extern int kvm_vcpu_run(struct kvm_vcpu *vcpu); + extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu); extern u64 __vgic_v3_get_ich_vtr_el2(void); diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index ed30af5..8a0f38f 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -319,6 +319,63 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code) return false; } +/* Switch to the guest for VHE systems running in EL2 */ +int kvm_vcpu_run(struct kvm_vcpu *vcpu) +{ + struct kvm_cpu_context *host_ctxt; + struct kvm_cpu_context *guest_ctxt; + u64 exit_code; + + vcpu = kern_hyp_va(vcpu); + + host_ctxt = 
kern_hyp_va(vcpu->arch.host_cpu_context); + host_ctxt->__hyp_running_vcpu = vcpu; + guest_ctxt = &vcpu->arch.ctxt; + + __sysreg_save_host_state(host_ctxt); + + __activate_traps(vcpu); + __activate_vm(vcpu); + + __vgic_restore_state(vcpu); + __timer_enable_traps(vcpu); + + /* +* We must restore the 32-bit state before the sysregs, thanks +* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72). +*/ + __sysreg32_restore_state(vcpu); + __sysreg_restore_guest_state(guest_ctxt); + __debug_switch_to_guest(vcpu); + + /* Jump in the fire! */ +again: + exit_code = __guest_enter(vcpu, host_ctxt); + /* And we're baaack! */ + + if (fixup_guest_exit(vcpu, &exit_code)) + goto again; + + __sysreg_save_guest_state(guest_ctxt); + __sysreg32_save_state(vcpu); + __timer_disable_traps(vcpu); + __vgic_save_state(vcpu); + + __deactivate_traps(vcpu); + __deactivate_vm(vcpu); + + __sysreg_restore_host_state(host_ctxt); + + /* +* This must come after restoring the host sysregs, since a non-VHE +* system may enable SPE here and make use of the TTBRs. +*/ + __debug_switch_to_host(vcpu); + + return exit_code; +} + +/* Switch to the guest for legacy non-VHE systems */ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) { struct kvm_cpu_context *host_ctxt; diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index cf121b2..b11647a 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -706,7 +706,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) trace_kvm_entry(*vcpu_pc(vcpu)); guest_enter_irqoff(); - ret = kvm_call_hyp(__kvm_vcpu_run, vcpu); + if (has_vhe()) + ret = kvm_vcpu_run(vcpu); + else + ret = kvm_call_hyp(__kvm_vcpu_run, vcpu); vcpu->mode = OUTSIDE_GUEST_MODE; vcpu->stat.exits++; -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
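The dispatch this patch adds to kvm_arch_vcpu_ioctl_run() can be illustrated with a small user-space sketch (all names here are hypothetical stand-ins for has_vhe(), kvm_vcpu_run() and the kvm_call_hyp() trampoline; the return values are arbitrary markers, not real exit codes):

```c
#include <stdbool.h>

/* Stand-in for the kernel's has_vhe() static key (hypothetical). */
bool vhe_enabled;

/* VHE: the run function is an ordinary direct call, since the whole
 * kernel already runs at EL2. */
int kvm_vcpu_run_vhe_sketch(int vcpu_id)  { return 0x1000 | vcpu_id; }

/* Non-VHE: reached via the kvm_call_hyp() hypercall trampoline in the
 * real kernel; modeled as a plain function here. */
int kvm_vcpu_run_nvhe_sketch(int vcpu_id) { return 0x2000 | vcpu_id; }

/* Mirrors the hunk in kvm_arch_vcpu_ioctl_run():
 *   if (has_vhe()) ret = kvm_vcpu_run(vcpu);
 *   else           ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);  */
int vcpu_run_dispatch(int vcpu_id)
{
    return vhe_enabled ? kvm_vcpu_run_vhe_sketch(vcpu_id)
                       : kvm_vcpu_run_nvhe_sketch(vcpu_id);
}
```

The point of the split is that the VHE path can later drop hyp-specific work (address translation, EL2 trap setup) without touching the non-VHE path.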
[PATCH 15/37] KVM: arm64: Don't deactivate VM on VHE systems
There is no need to reset the VTTBR to zero when exiting the guest on VHE systems. VHE systems don't use stage 2 translations for the EL2&0 translation regime used by the host. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index b72dc66..2cedf12 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -136,13 +136,13 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) write_sysreg(0, pmuserenr_el0); } -static void __hyp_text __activate_vm(struct kvm_vcpu *vcpu) +static inline void __hyp_text __activate_vm(struct kvm_vcpu *vcpu) { struct kvm *kvm = kern_hyp_va(vcpu->kvm); write_sysreg(kvm->arch.vttbr, vttbr_el2); } -static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu) +static inline void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu) { write_sysreg(0, vttbr_el2); } @@ -360,7 +360,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) __vgic_save_state(vcpu); __deactivate_traps(vcpu); - __deactivate_vm(vcpu); __sysreg_restore_host_state(host_ctxt); -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 19/37] KVM: arm64: Introduce separate VHE/non-VHE sysreg save/restore functions
We are about to handle system registers quite differently between VHE and non-VHE systems. In preparation for that, we need to split some of the handling functions between VHE and non-VHE functionality. For now, we simply copy the non-VHE functions, but we do change the use of static keys for VHE and non-VHE functionality now that we have separate functions. Signed-off-by: Christoffer Dall--- arch/arm64/include/asm/kvm_hyp.h | 12 arch/arm64/kvm/hyp/switch.c | 20 ++-- arch/arm64/kvm/hyp/sysreg-sr.c | 40 3 files changed, 50 insertions(+), 22 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index a0e5a70..998152d 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -132,10 +132,14 @@ int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu); void __timer_enable_traps(struct kvm_vcpu *vcpu); void __timer_disable_traps(struct kvm_vcpu *vcpu); -void __sysreg_save_host_state(struct kvm_cpu_context *ctxt); -void __sysreg_restore_host_state(struct kvm_cpu_context *ctxt); -void __sysreg_save_guest_state(struct kvm_cpu_context *ctxt); -void __sysreg_restore_guest_state(struct kvm_cpu_context *ctxt); +void __sysreg_save_host_state_nvhe(struct kvm_cpu_context *ctxt); +void __sysreg_restore_host_state_nvhe(struct kvm_cpu_context *ctxt); +void __sysreg_save_guest_state_nvhe(struct kvm_cpu_context *ctxt); +void __sysreg_restore_guest_state_nvhe(struct kvm_cpu_context *ctxt); +void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt); +void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt); +void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt); +void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt); void __sysreg32_save_state(struct kvm_vcpu *vcpu); void __sysreg32_restore_state(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index b98b73b..7c4d430 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ 
b/arch/arm64/kvm/hyp/switch.c @@ -330,7 +330,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; - __sysreg_save_host_state(host_ctxt); + sysreg_save_host_state_vhe(host_ctxt); __activate_traps(vcpu); __activate_vm(vcpu); @@ -342,7 +342,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) * to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72). */ __sysreg32_restore_state(vcpu); - __sysreg_restore_guest_state(guest_ctxt); + sysreg_restore_guest_state_vhe(guest_ctxt); __debug_switch_to_guest(vcpu); /* Jump in the fire! */ @@ -353,13 +353,13 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) if (fixup_guest_exit(vcpu, &exit_code)) goto again; - __sysreg_save_guest_state(guest_ctxt); + sysreg_save_guest_state_vhe(guest_ctxt); __sysreg32_save_state(vcpu); __vgic_save_state(vcpu); __deactivate_traps(vcpu); - __sysreg_restore_host_state(host_ctxt); + sysreg_restore_host_state_vhe(host_ctxt); /* * This must come after restoring the host sysregs, since a non-VHE @@ -383,7 +383,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; - __sysreg_save_host_state(host_ctxt); + __sysreg_save_host_state_nvhe(host_ctxt); __activate_traps(vcpu); __activate_vm(vcpu); @@ -396,7 +396,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) * to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72). */ __sysreg32_restore_state(vcpu); - __sysreg_restore_guest_state(guest_ctxt); + __sysreg_restore_guest_state_nvhe(guest_ctxt); __debug_switch_to_guest(vcpu); /* Jump in the fire! 
*/ @@ -407,7 +407,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) if (fixup_guest_exit(vcpu, &exit_code)) goto again; - __sysreg_save_guest_state(guest_ctxt); + __sysreg_save_guest_state_nvhe(guest_ctxt); __sysreg32_save_state(vcpu); __timer_disable_traps(vcpu); __vgic_save_state(vcpu); @@ -415,7 +415,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) __deactivate_traps(vcpu); __deactivate_vm(vcpu); - __sysreg_restore_host_state(host_ctxt); + __sysreg_restore_host_state_nvhe(host_ctxt); /* * This must come after restoring the host sysregs, since a non-VHE @@ -440,7 +440,7 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par, __timer_disable_traps(vcpu); __deactivate_traps(vcpu); __deactivate_vm(vcpu); - __sysreg_restore_host_state(__host_ctxt); +
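The shape of this split — two public entry points per world, sharing private helpers so they can later diverge — can be sketched in plain C (hypothetical names and a toy context struct; the real functions of course move hardware register state, not integers):

```c
/* Toy register context; the fields stand in for groups of sysregs. */
struct ctx_sketch { int common; int el1; };

/* Shared private helper used by both variants, like the
 * __sysreg_save_common_state() helpers in sysreg-sr.c. */
static void save_common_sketch(struct ctx_sketch *c, int v)
{
    c->common = v;
}

/* Non-VHE variant: saves everything eagerly at world switch. */
void sysreg_save_nvhe_sketch(struct ctx_sketch *c, int common, int el1)
{
    save_common_sketch(c, common);
    c->el1 = el1;
}

/* VHE variant: starts as a copy, but having its own entry point lets
 * later patches defer the EL1 part (e.g. to vcpu_put) without touching
 * the non-VHE path. Here it only touches the common state. */
void sysreg_save_vhe_sketch(struct ctx_sketch *c, int common)
{
    save_common_sketch(c, common);
}
```

This also explains why the VHE variants lose the `__hyp_text` prefix in the patch: they are ordinary kernel code, callable from anywhere, while the `_nvhe` variants must live in the hyp text section.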
[PATCH 21/37] KVM: arm64: Don't save the host ELR_EL2 and SPSR_EL2 on VHE systems
On non-VHE systems we need to save the ELR_EL2 and SPSR_EL2 so that we can return to the host in EL1 in the same state and location where we issued a hypercall to EL2, but these registers don't contain anything important on VHE, because all of the host runs in EL2. Therefore, exclude them when saving/restoring the host state. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/sysreg-sr.c | 13 + 1 file changed, 13 insertions(+) diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index f5c1b44..354ca02 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -66,6 +66,10 @@ static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt) ctxt->gp_regs.sp_el1= read_sysreg(sp_el1); ctxt->gp_regs.elr_el1 = read_sysreg_el1(elr); ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr); +} + +static void __hyp_text __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt) +{ ctxt->gp_regs.regs.pc = read_sysreg_el2(elr); ctxt->gp_regs.regs.pstate = read_sysreg_el2(spsr); } @@ -75,6 +79,7 @@ void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt) __sysreg_save_el1_state(ctxt); __sysreg_save_common_state(ctxt); __sysreg_save_user_state(ctxt); + __sysreg_save_el2_return_state(ctxt); } void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt) @@ -88,6 +93,7 @@ void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt) __sysreg_save_el1_state(ctxt); __sysreg_save_common_state(ctxt); __sysreg_save_user_state(ctxt); + __sysreg_save_el2_return_state(ctxt); } static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt) @@ -127,6 +133,11 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) write_sysreg(ctxt->gp_regs.sp_el1, sp_el1); write_sysreg_el1(ctxt->gp_regs.elr_el1, elr); write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr); +} + +static void __hyp_text +__sysreg_restore_el2_return_state(struct kvm_cpu_context *ctxt) +{ 
write_sysreg_el2(ctxt->gp_regs.regs.pc, elr); write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr); } @@ -136,6 +147,7 @@ void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt) __sysreg_restore_el1_state(ctxt); __sysreg_restore_common_state(ctxt); __sysreg_restore_user_state(ctxt); + __sysreg_restore_el2_return_state(ctxt); } void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt) @@ -149,6 +161,7 @@ void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt) __sysreg_restore_el1_state(ctxt); __sysreg_restore_common_state(ctxt); __sysreg_restore_user_state(ctxt); + __sysreg_restore_el2_return_state(ctxt); } static void __hyp_text __fpsimd32_save_state(struct kvm_cpu_context *ctxt) -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 10/37] KVM: arm64: Slightly improve debug save/restore functions
The debug save/restore functions can be improved by using the has_vhe() static key instead of the instruction alternative. Using the static key uses the same paradigm as we're going to use elsewhere, it makes the code more readable, and it generates slightly better code (no stack setups and function calls unless necessary). We also use a static key on the restore path, because it will be marginally faster than loading a value from memory. Finally, we don't have to conditionally clear the debug dirty flag if it's set, we can just clear it. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/debug-sr.c | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c index 0fc0758..a2291b6 100644 --- a/arch/arm64/kvm/hyp/debug-sr.c +++ b/arch/arm64/kvm/hyp/debug-sr.c @@ -75,11 +75,6 @@ #define psb_csync()asm volatile("hint #17") -static void __hyp_text __debug_save_spe_vhe(u64 *pmscr_el1) -{ - /* The vcpu can run, but it can't hide. */ -} - static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1) { u64 reg; @@ -109,10 +104,6 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1) dsb(nsh); } -static hyp_alternate_select(__debug_save_spe, - __debug_save_spe_nvhe, __debug_save_spe_vhe, - ARM64_HAS_VIRT_HOST_EXTN); - static void __hyp_text __debug_restore_spe(u64 pmscr_el1) { if (!pmscr_el1) @@ -174,17 +165,22 @@ void __hyp_text __debug_cond_save_host_state(struct kvm_vcpu *vcpu) { __debug_save_state(vcpu, &vcpu->arch.host_debug_state.regs, kern_hyp_va(vcpu->arch.host_cpu_context)); - __debug_save_spe()(&vcpu->arch.host_debug_state.pmscr_el1); + + /* Non-VHE: Disable and flush SPE data generation +* VHE: The vcpu can run, but it can't hide. 
*/ + if (!has_vhe()) + __debug_save_spe_nvhe(&vcpu->arch.host_debug_state.pmscr_el1); } void __hyp_text __debug_cond_restore_host_state(struct kvm_vcpu *vcpu) { - __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1); + if (!has_vhe()) + __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1); + __debug_restore_state(vcpu, &vcpu->arch.host_debug_state.regs, kern_hyp_va(vcpu->arch.host_cpu_context)); - if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY) - vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY; + vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY; } u32 __hyp_text __kvm_get_mdcr_el2(void) -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
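Both simplifications in this patch are easy to demonstrate outside the kernel. The sketch below (hypothetical names; the flag values are made up) shows the function-pointer alternative replaced by a plain branch on a has_vhe()-style boolean, and the unconditional flag clear, which is a no-op when the bit is already clear:

```c
#include <stdbool.h>

bool vhe_sketch;           /* stand-in for has_vhe() */
int spe_save_calls;        /* counts entries into the non-VHE SPE path */

#define DEBUG_DIRTY_SKETCH 0x1u   /* illustrative flag bit */

static void debug_save_spe_nvhe_sketch(void) { spe_save_calls++; }

/* Replaces hyp_alternate_select(__debug_save_spe, ...): instead of an
 * indirect call through a selected function pointer, just branch. */
void debug_cond_save_sketch(void)
{
    if (!vhe_sketch)
        debug_save_spe_nvhe_sketch();
}

/* "We don't have to conditionally clear the debug dirty flag":
 * flags &= ~bit gives the same result whether or not bit was set. */
unsigned clear_dirty_sketch(unsigned flags)
{
    return flags & ~DEBUG_DIRTY_SKETCH;
}
```

In the kernel the branch compiles down to a static-key jump label, so the common case costs a single patched NOP rather than a load plus indirect call.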
[PATCH 16/37] KVM: arm64: Remove noop calls to timer save/restore from VHE switch
The VHE switch function calls __timer_enable_traps and __timer_disable_traps which don't do anything on VHE systems. Therefore, simply remove these calls from the VHE switch function and make the functions non-conditional as they are now only called from the non-VHE switch path. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 2 -- virt/kvm/arm/hyp/timer-sr.c | 10 ++ 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 2cedf12..b98b73b 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -336,7 +336,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) __activate_vm(vcpu); __vgic_restore_state(vcpu); - __timer_enable_traps(vcpu); /* * We must restore the 32-bit state before the sysregs, thanks @@ -356,7 +355,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) __sysreg_save_guest_state(guest_ctxt); __sysreg32_save_state(vcpu); - __timer_disable_traps(vcpu); __vgic_save_state(vcpu); __deactivate_traps(vcpu); diff --git a/virt/kvm/arm/hyp/timer-sr.c b/virt/kvm/arm/hyp/timer-sr.c index f398616..82c217e 100644 --- a/virt/kvm/arm/hyp/timer-sr.c +++ b/virt/kvm/arm/hyp/timer-sr.c @@ -53,16 +53,10 @@ void __hyp_text disable_el1_phys_timer_access(void) void __hyp_text __timer_disable_traps(struct kvm_vcpu *vcpu) { - /* -* We don't need to do this for VHE since the host kernel runs in EL2 -* with HCR_EL2.TGE ==1, which makes those bits have no impact. -*/ - if (!has_vhe()) - enable_el1_phys_timer_access(); + enable_el1_phys_timer_access(); } void __hyp_text __timer_enable_traps(struct kvm_vcpu *vcpu) { - if (!has_vhe()) - disable_el1_phys_timer_access(); + disable_el1_phys_timer_access(); } -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 14/37] KVM: arm64: Remove kern_hyp_va() use in VHE switch function
VHE kernels run completely in EL2 and therefore don't have a notion of kernel and hyp addresses; they are all just kernel addresses. Therefore don't call kern_hyp_va() in the VHE switch function. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/hyp/switch.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 8a0f38f..b72dc66 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -326,9 +326,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu) struct kvm_cpu_context *guest_ctxt; u64 exit_code; - vcpu = kern_hyp_va(vcpu); - - host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context); + host_ctxt = vcpu->arch.host_cpu_context; host_ctxt->__hyp_running_vcpu = vcpu; guest_ctxt = &vcpu->arch.ctxt; -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
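A minimal model of why the calls can be dropped (this is only an illustration — the offset and the XOR are made up; the real kern_hyp_va() applies a mask/tag computed at boot):

```c
#include <stdint.h>

/* Hypothetical model of kern_hyp_va(): on non-VHE, kernel virtual
 * addresses must be converted into the separate hyp mapping (modeled
 * here as XOR-ing in a fake offset); on VHE the kernel itself runs at
 * EL2, so the conversion is the identity and can be removed. */
uintptr_t kern_hyp_va_sketch(uintptr_t kaddr, int is_vhe)
{
    const uintptr_t FAKE_HYP_OFFSET = 0x40000000u; /* made-up constant */
    return is_vhe ? kaddr : (kaddr ^ FAKE_HYP_OFFSET);
}
```

Because the VHE switch function now has its own copy of the code (patch 13), the identity conversion can be deleted there while the non-VHE path keeps it.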
[PATCH 09/37] KVM: arm64: Move debug dirty flag calculation out of world switch
There is no need to figure out inside the world-switch if we should save/restore the debug registers or not, we might as well do that in the higher level debug setup code, making it easier to optimize down the line. Signed-off-by: Christoffer Dall--- arch/arm64/kvm/debug.c| 9 + arch/arm64/kvm/hyp/debug-sr.c | 6 -- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index dbadfaf..62550de19 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -193,6 +193,15 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) if (trap_debug) vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA; + /* +* If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY is set, perform +* a full save/restore cycle. +*/ + if ((vcpu_sys_reg(vcpu, MDSCR_EL1) & DBG_MDSCR_KDE) || + (vcpu_sys_reg(vcpu, MDSCR_EL1) & DBG_MDSCR_MDE)) + vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY; + + trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2); trace_kvm_arm_set_dreg32("MDSCR_EL1", vcpu_sys_reg(vcpu, MDSCR_EL1)); } diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c index f5154ed..0fc0758 100644 --- a/arch/arm64/kvm/hyp/debug-sr.c +++ b/arch/arm64/kvm/hyp/debug-sr.c @@ -172,12 +172,6 @@ void __hyp_text __debug_restore_state(struct kvm_vcpu *vcpu, void __hyp_text __debug_cond_save_host_state(struct kvm_vcpu *vcpu) { - /* If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY is set, perform -* a full save/restore cycle. */ - if ((vcpu->arch.ctxt.sys_regs[MDSCR_EL1] & DBG_MDSCR_KDE) || - (vcpu->arch.ctxt.sys_regs[MDSCR_EL1] & DBG_MDSCR_MDE)) - vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY; - __debug_save_state(vcpu, &vcpu->arch.host_debug_state.regs, kern_hyp_va(vcpu->arch.host_cpu_context)); __debug_save_spe()(&vcpu->arch.host_debug_state.pmscr_el1); -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
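The idea — derive the "needs save/restore" decision once in setup code, then only test one precomputed bit in the hot world-switch path — looks like this in isolation (bit positions and names are illustrative, not the real MDSCR_EL1 layout):

```c
/* Illustrative bit positions, NOT the architectural MDSCR_EL1 encoding. */
enum {
    KDE_BIT_SKETCH   = 1u << 13,
    MDE_BIT_SKETCH   = 1u << 15,
    DEBUG_DIRTY_FLAG = 1u << 0,
};

/* Runs once per entry in the higher-level setup code
 * (kvm_arm_setup_debug in the patch). */
unsigned setup_debug_flags_sketch(unsigned mdscr, unsigned flags)
{
    if (mdscr & (KDE_BIT_SKETCH | MDE_BIT_SKETCH))
        flags |= DEBUG_DIRTY_FLAG;
    return flags;
}

/* The world switch then only needs a single bit test. */
int need_debug_save_sketch(unsigned flags)
{
    return (flags & DEBUG_DIRTY_FLAG) != 0;
}
```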
[PATCH 06/37] KVM: arm/arm64: Only load/put VCPU state for KVM_RUN
We only want to do things like invalidating TLBs or load timer state onto the physical CPU if we really intend to run the VCPU, not if we are simply retrieving some in-kernel value from userspace, for example. Signed-off-by: Christoffer Dall--- virt/kvm/arm/arm.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index 6e9513e..d495453 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -338,6 +338,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { int *last_ran; + if (vcpu->ioctl != KVM_RUN) + return; + last_ran = this_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran); /* @@ -359,6 +362,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) { + if (vcpu->ioctl != KVM_RUN) + return; + kvm_timer_vcpu_put(vcpu); kvm_vgic_put(vcpu); -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
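The guard added to kvm_arch_vcpu_load()/put() is just an early return keyed on the recorded ioctl; a self-contained sketch (KVM_RUN's value and the struct are stand-ins for illustration only):

```c
#define KVM_RUN_SKETCH 0xAE80u   /* made-up ioctl number for this sketch */

struct vcpu_sketch {
    unsigned ioctl;   /* ioctl currently executing, per patch 05 */
    int loaded;       /* models TLB/timer/vgic state being loaded */
};

/* Mirrors the early return in kvm_arch_vcpu_load(): skip the expensive
 * state movement unless we are actually about to run the vcpu. */
void vcpu_load_sketch(struct vcpu_sketch *v)
{
    if (v->ioctl != KVM_RUN_SKETCH)
        return;
    v->loaded = 1;
}
```

Non-run ioctls such as KVM_SET_ONE_REG then pay only for taking the vcpu mutex, not for moving hardware state onto the physical CPU.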
[PATCH 05/37] KVM: Record the executing ioctl number on the vcpu struct
Some architectures may decide to do different things during kvm_arch_vcpu_load depending on the ioctl being executed. For example, arm64 is about to do significant work in vcpu load/put when running a vcpu, but not when doing things like KVM_SET_ONE_REG or KVM_SET_MP_STATE. Therefore, store the ioctl number that we are executing on the VCPU during the first vcpu_load() which succeeds in getting the vcpu->mutex and set the ioctl number to 0 when exiting kvm_vcpu_ioctl() after successfully loading the vcpu. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Christoffer Dall --- arch/x86/kvm/vmx.c | 2 +- arch/x86/kvm/x86.c | 8 include/linux/kvm_host.h | 3 ++- virt/kvm/kvm_main.c | 6 -- 4 files changed, 11 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 6970249..d729889 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -9493,7 +9493,7 @@ static void vmx_free_vcpu_nested(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); int r; - r = vcpu_load(vcpu); + r = vcpu_load(vcpu, vcpu->ioctl); BUG_ON(r); vmx_switch_vmcs(vcpu, &vmx->vmcs01); free_nested(vmx); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cd17b7d..68d9c95 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7723,7 +7723,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) int r; kvm_vcpu_mtrr_init(vcpu); - r = vcpu_load(vcpu); + r = vcpu_load(vcpu, vcpu->ioctl); if (r) return r; kvm_vcpu_reset(vcpu, false); @@ -7739,7 +7739,7 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) kvm_hv_vcpu_postcreate(vcpu); - if (vcpu_load(vcpu)) + if (vcpu_load(vcpu, vcpu->ioctl)) return; msr.data = 0x0; msr.index = MSR_IA32_TSC; @@ -7759,7 +7759,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) int r; vcpu->arch.apf.msr_val = 0; - r = vcpu_load(vcpu); + r = vcpu_load(vcpu, vcpu->ioctl); BUG_ON(r); kvm_mmu_unload(vcpu); vcpu_put(vcpu); @@ -8116,7 +8116,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) static void 
kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu) { int r; - r = vcpu_load(vcpu); + r = vcpu_load(vcpu, vcpu->ioctl); BUG_ON(r); kvm_mmu_unload(vcpu); vcpu_put(vcpu); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 6882538..da0acc0 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -274,6 +274,7 @@ struct kvm_vcpu { bool preempted; struct kvm_vcpu_arch arch; struct dentry *debugfs_dentry; + unsigned int ioctl; /* ioctl currently executing or 0 */ }; static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) @@ -533,7 +534,7 @@ static inline int kvm_vcpu_get_idx(struct kvm_vcpu *vcpu) int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id); void kvm_vcpu_uninit(struct kvm_vcpu *vcpu); -int __must_check vcpu_load(struct kvm_vcpu *vcpu); +int __must_check vcpu_load(struct kvm_vcpu *vcpu, unsigned int ioctl); void vcpu_put(struct kvm_vcpu *vcpu); #ifdef __KVM_HAVE_IOAPIC diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 9deb5a2..1911ef0 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -147,12 +147,13 @@ bool kvm_is_reserved_pfn(kvm_pfn_t pfn) /* * Switches to specified vcpu, until a matching vcpu_put() */ -int vcpu_load(struct kvm_vcpu *vcpu) +int vcpu_load(struct kvm_vcpu *vcpu, unsigned int ioctl) { int cpu; if (mutex_lock_killable(&vcpu->mutex)) return -EINTR; + vcpu->ioctl = ioctl; cpu = get_cpu(); preempt_notifier_register(&vcpu->preempt_notifier); kvm_arch_vcpu_load(vcpu, cpu); @@ -2529,7 +2530,7 @@ static long kvm_vcpu_ioctl(struct file *filp, #endif - r = vcpu_load(vcpu); + r = vcpu_load(vcpu, ioctl); if (r) return r; switch (ioctl) { @@ -2704,6 +2705,7 @@ static long kvm_vcpu_ioctl(struct file *filp, } out: vcpu_put(vcpu); + vcpu->ioctl = 0; kfree(fpu); kfree(kvm_sregs); return r; -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
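The lifetime of the recorded ioctl number — set under the vcpu mutex in vcpu_load(), cleared on the way out of kvm_vcpu_ioctl() — can be modeled like this (hypothetical struct; the real code takes a mutex where this sketch flips a flag):

```c
struct vcpu_rec_sketch {
    unsigned ioctl;   /* ioctl currently executing, or 0 */
    int locked;       /* models holding vcpu->mutex */
};

/* Mirrors vcpu_load(vcpu, ioctl): record the ioctl only after the
 * lock is held, so kvm_arch_vcpu_load() can consult it safely. */
int vcpu_load_record_sketch(struct vcpu_rec_sketch *v, unsigned ioctl)
{
    v->locked = 1;        /* mutex_lock_killable(&vcpu->mutex) */
    v->ioctl = ioctl;
    return 0;
}

/* Mirrors the exit path of kvm_vcpu_ioctl(): vcpu_put(), then clear
 * the recorded ioctl so a stale value is never observed later. */
void vcpu_ioctl_done_sketch(struct vcpu_rec_sketch *v)
{
    v->locked = 0;        /* vcpu_put() */
    v->ioctl = 0;
}
```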
[PATCH 07/37] KVM: arm/arm64: Add kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs
As we are about to move a bunch of save/restore logic for VHE kernels to the load and put functions, we need some infrastructure to do this. Signed-off-by: Christoffer Dall--- arch/arm/include/asm/kvm_host.h | 3 +++ arch/arm64/include/asm/kvm_host.h | 3 +++ arch/arm64/kvm/hyp/sysreg-sr.c| 27 +++ virt/kvm/arm/arm.c| 2 ++ 4 files changed, 35 insertions(+) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 1100170..13f8165 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -290,4 +290,7 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr); +static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {} +static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {} + #endif /* __ARM_KVM_HOST_H__ */ diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 27305e7..7d3bfa7 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -383,4 +383,7 @@ static inline void __cpu_init_stage2(void) "PARange is %d bits, unsupported configuration!", parange); } +void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu); +void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu); + #endif /* __ARM64_KVM_HOST_H__ */ diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index c54cc2a..b7438c8 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -183,3 +183,30 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu) if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY) write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2); } + +/** + * kvm_vcpu_load_sysregs - Load guest system register to physical CPU + * + * @vcpu: The VCPU pointer + * + * If the kernel runs in EL2 then load the system register state for the VCPU + * for EL1 onto the physical CPU so that we can go back and forth between the + * VM and the hypervisor 
without switching all this state around. + */ +void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) +{ +} + +/** + * kvm_vcpu_put_sysregs - Restore host system register state to physical CPU + * + * @vcpu: The VCPU pointer + * + * If the kernel runs in EL2 and the physical register state belongs to the + * VCPU, then restore the system register state for the host for EL1 onto the + * physical CPU so that we can run userspace and other threads on this + * physical CPU. + */ +void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) +{ +} diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index d495453..cf121b2 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -358,6 +358,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) kvm_arm_set_running_vcpu(vcpu); kvm_vgic_load(vcpu); kvm_timer_vcpu_load(vcpu); + kvm_vcpu_load_sysregs(vcpu); } void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) @@ -365,6 +366,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) if (vcpu->ioctl != KVM_RUN) return; + kvm_vcpu_put_sysregs(vcpu); kvm_timer_vcpu_put(vcpu); kvm_vgic_put(vcpu); -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 04/37] KVM: arm/arm64: Get rid of vcpu->arch.irq_lines
We currently have a separate read-modify-write of the HCR_EL2 on entry to the guest for the sole purpose of setting the VF and VI bits, if set. Since this is most rarely the case (only when using userspace IRQ chip and interrupts are in flight), let's get rid of this operation and instead modify the bits in the vcpu->arch.hcr[_el2] directly when needed. Signed-off-by: Christoffer Dall--- arch/arm/include/asm/kvm_emulate.h | 9 ++--- arch/arm/include/asm/kvm_host.h | 3 --- arch/arm/kvm/emulate.c | 2 +- arch/arm/kvm/hyp/switch.c| 2 +- arch/arm64/include/asm/kvm_emulate.h | 9 ++--- arch/arm64/include/asm/kvm_host.h| 3 --- arch/arm64/kvm/hyp/switch.c | 6 -- arch/arm64/kvm/inject_fault.c| 2 +- virt/kvm/arm/arm.c | 11 ++- virt/kvm/arm/mmu.c | 6 +++--- 10 files changed, 16 insertions(+), 37 deletions(-) diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h index 98089ff..34663a8 100644 --- a/arch/arm/include/asm/kvm_emulate.h +++ b/arch/arm/include/asm/kvm_emulate.h @@ -62,14 +62,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) vcpu->arch.hcr = HCR_GUEST_MASK; } -static inline unsigned long vcpu_get_hcr(const struct kvm_vcpu *vcpu) +static inline unsigned long *vcpu_hcr(const struct kvm_vcpu *vcpu) { - return vcpu->arch.hcr; -} - -static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) -{ - vcpu->arch.hcr = hcr; + return (unsigned long *)&vcpu->arch.hcr; } static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 4a879f6..1100170 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -153,9 +153,6 @@ struct kvm_vcpu_arch { /* HYP trapping configuration */ u32 hcr; - /* Interrupt related fields */ - u32 irq_lines; /* IRQ and FIQ levels */ - /* Exception Information */ struct kvm_vcpu_fault_info fault; diff --git a/arch/arm/kvm/emulate.c b/arch/arm/kvm/emulate.c index 0064b86..4286a89 100644 --- 
a/arch/arm/kvm/emulate.c +++ b/arch/arm/kvm/emulate.c @@ -313,5 +313,5 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr) */ void kvm_inject_vabt(struct kvm_vcpu *vcpu) { - vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VA); + *vcpu_hcr(vcpu) |= HCR_VA; } diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c index 330c9ce..c3b9799 100644 --- a/arch/arm/kvm/hyp/switch.c +++ b/arch/arm/kvm/hyp/switch.c @@ -43,7 +43,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu, u32 *fpexc_host) isb(); } - write_sysreg(vcpu->arch.hcr | vcpu->arch.irq_lines, HCR); + write_sysreg(vcpu->arch.hcr, HCR); /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ write_sysreg(HSTR_T(15), HSTR); write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR); diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index e5df3fc..1fbfe96 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -51,14 +51,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) vcpu->arch.hcr_el2 &= ~HCR_RW; } -static inline unsigned long vcpu_get_hcr(struct kvm_vcpu *vcpu) +static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu) { - return vcpu->arch.hcr_el2; -} - -static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) -{ - vcpu->arch.hcr_el2 = hcr; + return (unsigned long *)&vcpu->arch.hcr_el2; } static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 806ccef..27305e7 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -266,9 +266,6 @@ struct kvm_vcpu_arch { /* IO related fields */ struct kvm_decode mmio_decode; - /* Interrupt related fields */ - u64 irq_lines; /* IRQ and FIQ levels */ - /* Cache some mmu pages needed inside spinlock regions */ struct kvm_mmu_memory_cache mmu_page_cache; diff --git a/arch/arm64/kvm/hyp/switch.c 
b/arch/arm64/kvm/hyp/switch.c index bcf1a79..7703d63 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -168,12 +168,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) { - u64 val; - - val = read_sysreg(hcr_el2); - val |= vcpu->arch.irq_lines; - write_sysreg(val, hcr_el2); - if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) __vgic_v3_restore_state(vcpu); else diff --git
[PATCH 01/37] KVM: arm64: Avoid storing the vcpu pointer on the stack
We already have the percpu area for the host cpu state, which points to the VCPU, so there's no need to store the VCPU pointer on the stack on every context switch. We can be a little more clever and just use tpidr_el2 for the percpu offset and load the VCPU pointer from the host context. This requires us to have a scratch register though, so we take the chance to rearrange some of the el1_sync code to only look at the vttbr_el2 to determine if this is a trap from the guest or an HVC from the host. We do add an extra check to call the panic code if the kernel is configured with debugging enabled and we saw a trap from the host which wasn't an HVC, indicating that we left some EL2 trap configured by mistake.

Signed-off-by: Christoffer Dall
---
 arch/arm64/include/asm/kvm_asm.h | 20 arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kvm/hyp/entry.S | 5 + arch/arm64/kvm/hyp/hyp-entry.S | 39 ++- arch/arm64/kvm/hyp/switch.c | 2 +- 5 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index ab4d0a9..7e48a39 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -70,4 +70,24 @@ extern u32 __init_stage2_translation(void); #endif +#ifdef __ASSEMBLY__ +.macro get_host_ctxt reg, tmp + /* +* '=kvm_host_cpu_state' is a host VA from the constant pool, it may +* not be accessible by this address from EL2, hyp_panic() converts +* it with kern_hyp_va() before use. 
+*/ + ldr \reg, =kvm_host_cpu_state + mrs \tmp, tpidr_el2 + add \reg, \reg, \tmp + kern_hyp_va \reg +.endm + +.macro get_vcpu vcpu, ctxt + ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU] + kern_hyp_va \vcpu +.endm + +#endif + #endif /* __ARM_KVM_ASM_H__ */ diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 71bf088..612021d 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -135,6 +135,7 @@ int main(void) DEFINE(CPU_FP_REGS, offsetof(struct kvm_regs, fp_regs)); DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2])); DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, arch.host_cpu_context)); + DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, __hyp_running_vcpu)); #endif #ifdef CONFIG_CPU_PM DEFINE(CPU_SUSPEND_SZ, sizeof(struct cpu_suspend_ctx)); diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index 9a8ab5d..76cd48f 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -62,9 +62,6 @@ ENTRY(__guest_enter) // Store the host regs save_callee_saved_regs x1 - // Store host_ctxt and vcpu for use at exit time - stp x1, x0, [sp, #-16]! - add x18, x0, #VCPU_CONTEXT // Restore guest regs x0-x17 @@ -119,7 +116,7 @@ ENTRY(__guest_exit) save_callee_saved_regs x1 // Restore the host_ctxt from the stack - ldr x2, [sp], #16 + get_host_ctxt x2, x3 // Now restore the host regs restore_callee_saved_regs x2 diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S index e4f37b9..2950f26 100644 --- a/arch/arm64/kvm/hyp/hyp-entry.S +++ b/arch/arm64/kvm/hyp/hyp-entry.S @@ -56,19 +56,16 @@ ENDPROC(__vhe_hyp_call) el1_sync: // Guest trapped into EL2 stp x0, x1, [sp, #-16]! 
-alternative_if_not ARM64_HAS_VIRT_HOST_EXTN - mrs x1, esr_el2 -alternative_else - mrs x1, esr_el1 -alternative_endif - lsr x0, x1, #ESR_ELx_EC_SHIFT - - cmp x0, #ESR_ELx_EC_HVC64 - b.neel1_trap - mrs x1, vttbr_el2 // If vttbr is valid, the 64bit guest cbnzx1, el1_trap// called HVC +#ifdef CONFIG_DEBUG + mrs x0, esr_el2 + lsr x0, x0, #ESR_ELx_EC_SHIFT + cmp x0, #ESR_ELx_EC_HVC64 + b.ne__hyp_panic +#endif + /* Here, we're pretty sure the host called HVC. */ ldp x0, x1, [sp], #16 @@ -101,10 +98,15 @@ alternative_endif eret el1_trap: + get_host_ctxt x0, x1 + get_vcpux1, x0 + + mrs x0, esr_el2 + lsr x0, x0, #ESR_ELx_EC_SHIFT /* * x0: ESR_EC +* x1: vcpu pointer */ - ldr x1, [sp, #16 + 8] // vcpu stored by __guest_enter /* * We trap the first access to the FP/SIMD to save the host context @@ -122,13 +124,15 @@ alternative_else_nop_endif el1_irq: stp x0, x1, [sp, #-16]! - ldr x1, [sp, #16 + 8] + get_host_ctxt x0, x1 + get_vcpux1, x0 mov x0, #ARM_EXCEPTION_IRQ b __guest_exit el1_error:
[PATCH 03/37] KVM: arm64: Move HCR_INT_OVERRIDE to default HCR_EL2 guest flag
From: Shih-Wei Li

We always set the IMO and FMO bits in the HCR_EL2 when running the guest, regardless of whether we use the vgic or not. By moving these flags to HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of HCR_EL2 in the world switch code, and we can also soon get rid of the other one.

Signed-off-by: Shih-Wei Li
Signed-off-by: Christoffer Dall
---
 arch/arm64/include/asm/kvm_arm.h | 4 ++-- arch/arm64/kvm/hyp/switch.c | 3 --- 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 61d694c..e67e279 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -79,9 +79,9 @@ */ #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \ HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \ -HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW) +HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | \ +HCR_FMO | HCR_IMO) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) -#define HCR_INT_OVERRIDE (HCR_FMO | HCR_IMO) #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H) /* TCR_EL2 Registers bits */ diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index a50ddf3..bcf1a79 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -164,8 +164,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu) __vgic_v3_save_state(vcpu); else __vgic_v2_save_state(vcpu); - - write_sysreg(read_sysreg(hcr_el2) & ~HCR_INT_OVERRIDE, hcr_el2); } static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) @@ -173,7 +171,6 @@ static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) u64 val; val = read_sysreg(hcr_el2); - val |= HCR_INT_OVERRIDE; val |= vcpu->arch.irq_lines; write_sysreg(val, hcr_el2); -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 02/37] KVM: arm64: Rework hyp_panic for VHE and non-VHE
VHE actually doesn't rely on clearing the VTTBR when returning to the host kernel, yet that is currently the key mechanism hyp_panic uses to figure out how to attempt to return to a state good enough to print a panic statement.

Therefore, we split the hyp_panic function into two functions, a VHE and a non-VHE version, keeping the non-VHE version intact, but changing the VHE behavior.

The vttbr_el2 check on VHE doesn't really make that much sense, because the only situation where we can get here on VHE is when the hypervisor assembly code actually called into hyp_panic, which only happens when VBAR_EL2 has been set to the KVM exception vectors. On VHE, we can always safely disable the traps and restore the host registers at this point, so we simply do that unconditionally and call into the panic function directly.

Signed-off-by: Christoffer Dall
---
 arch/arm64/kvm/hyp/switch.c | 45 +++-- 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index a0123ad..a50ddf3 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -394,10 +394,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n"; static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par, -struct kvm_vcpu *vcpu) +struct kvm_cpu_context *__host_ctxt) { + struct kvm_vcpu *vcpu; unsigned long str_va; + vcpu = __host_ctxt->__hyp_running_vcpu; + + if (read_sysreg(vttbr_el2)) { + __timer_disable_traps(vcpu); + __deactivate_traps(vcpu); + __deactivate_vm(vcpu); + __sysreg_restore_host_state(__host_ctxt); + } + /* * Force the panic string to be loaded from the literal pool, * making sure it is a kernel address and not a PC-relative @@ -411,40 +421,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par, read_sysreg(hpfar_el2), par, vcpu); } -static void __hyp_text 
__hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par, - struct kvm_vcpu *vcpu) +static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par, +struct kvm_cpu_context *host_ctxt) { + struct kvm_vcpu *vcpu; + vcpu = host_ctxt->__hyp_running_vcpu; + + __deactivate_traps_vhe(); + __sysreg_restore_host_state(host_ctxt); + panic(__hyp_panic_string, spsr, elr, read_sysreg_el2(esr), read_sysreg_el2(far), read_sysreg(hpfar_el2), par, vcpu); } -static hyp_alternate_select(__hyp_call_panic, - __hyp_call_panic_nvhe, __hyp_call_panic_vhe, - ARM64_HAS_VIRT_HOST_EXTN); - void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt) { - struct kvm_vcpu *vcpu = NULL; - u64 spsr = read_sysreg_el2(spsr); u64 elr = read_sysreg_el2(elr); u64 par = read_sysreg(par_el1); - if (read_sysreg(vttbr_el2)) { - struct kvm_cpu_context *host_ctxt; - - host_ctxt = __host_ctxt; - vcpu = host_ctxt->__hyp_running_vcpu; - __timer_disable_traps(vcpu); - __deactivate_traps(vcpu); - __deactivate_vm(vcpu); - __sysreg_restore_host_state(host_ctxt); - } - - /* Call panic for real */ - __hyp_call_panic()(spsr, elr, par, vcpu); + if (!has_vhe()) + __hyp_call_panic_nvhe(spsr, elr, par, __host_ctxt); + else + __hyp_call_panic_vhe(spsr, elr, par, __host_ctxt); unreachable(); } -- 2.9.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 08/37] KVM: arm64: Defer restoring host VFP state to vcpu_put
Avoid saving the guest VFP registers and restoring the host VFP registers on every exit from the VM. Only when we're about to run userspace or other threads in the kernel do we really have to switch the state back to the host state.

We still initially configure the VFP registers to trap when entering the VM, but the difference is that we now leave the guest state in the hardware registers while running the VM.

Signed-off-by: Christoffer Dall
---
 arch/arm64/include/asm/kvm_emulate.h | 5 arch/arm64/include/asm/kvm_host.h| 3 +++ arch/arm64/kernel/asm-offsets.c | 1 + arch/arm64/kvm/hyp/entry.S | 3 +++ arch/arm64/kvm/hyp/switch.c | 47 +++- arch/arm64/kvm/hyp/sysreg-sr.c | 21 +--- 6 files changed, 44 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 1fbfe96..630dd60 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -56,6 +56,11 @@ static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu) return (unsigned long *)&vcpu->arch.hcr_el2; } +static inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu) +{ + return (!(vcpu->arch.hcr_el2 & HCR_RW)); +} + static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) { return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc; diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 7d3bfa7..5e09eb9 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -210,6 +210,9 @@ struct kvm_vcpu_arch { /* Guest debug state */ u64 debug_flags; + /* 1 if the guest VFP state is loaded into the hardware */ + u64 guest_vfp_loaded; + /* * We maintain more than a single set of debug registers to support * debugging the guest from the host and to maintain separate host and diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 612021d..9946732 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -133,6 +133,7 @@ 
int main(void) DEFINE(CPU_GP_REGS, offsetof(struct kvm_cpu_context, gp_regs)); DEFINE(CPU_USER_PT_REGS, offsetof(struct kvm_regs, regs)); DEFINE(CPU_FP_REGS, offsetof(struct kvm_regs, fp_regs)); + DEFINE(VCPU_GUEST_VFP_LOADED,offsetof(struct kvm_vcpu, arch.guest_vfp_loaded)); DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2])); DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, arch.host_cpu_context)); DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, __hyp_running_vcpu)); diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index 76cd48f..b3e7191 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -185,6 +185,9 @@ alternative_endif add x0, x2, #CPU_GP_REG_OFFSET(CPU_FP_REGS) bl __fpsimd_restore_state + mov x0, #1 + str x0, [x3, #VCPU_GUEST_VFP_LOADED] + // Skip restoring fpexc32 for AArch64 guests mrs x1, hcr_el2 tbnzx1, #HCR_RW_SHIFT, 1f diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 7703d63..ef05c59 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -23,43 +23,31 @@ #include #include -static bool __hyp_text __fpsimd_enabled_nvhe(void) -{ - return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP); -} - -static bool __hyp_text __fpsimd_enabled_vhe(void) -{ - return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN); -} - -static hyp_alternate_select(__fpsimd_is_enabled, - __fpsimd_enabled_nvhe, __fpsimd_enabled_vhe, - ARM64_HAS_VIRT_HOST_EXTN); - -bool __hyp_text __fpsimd_enabled(void) -{ - return __fpsimd_is_enabled()(); -} - -static void __hyp_text __activate_traps_vhe(void) +static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu) { u64 val; val = read_sysreg(cpacr_el1); val |= CPACR_EL1_TTA; - val &= ~CPACR_EL1_FPEN; + if (vcpu->arch.guest_vfp_loaded) + val |= CPACR_EL1_FPEN; + else + val &= ~CPACR_EL1_FPEN; write_sysreg(val, cpacr_el1); write_sysreg(__kvm_hyp_vector, vbar_el1); } -static void __hyp_text __activate_traps_nvhe(void) 
+static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) { u64 val; val = CPTR_EL2_DEFAULT; - val |= CPTR_EL2_TTA | CPTR_EL2_TFP; + val |= CPTR_EL2_TTA; + if (vcpu->arch.guest_vfp_loaded) + val &= ~CPTR_EL2_TFP; + else + val |= CPTR_EL2_TFP; write_sysreg(val, cptr_el2); } @@ -81,7 +69,8 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) * it will cause
[PATCH 00/37] Optimize KVM/ARM for VHE systems
This series redesigns parts of KVM/ARM to optimize the performance on VHE systems. The general approach is to try to do as little work as possible when transitioning between the VM and the hypervisor. This has the benefit of lower latency when waiting for interrupts and delivering virtual interrupts, and reduces the overhead of emulating behavior and I/O in the host kernel.

Patches 01 through 04 are not VHE specific, but rework parts of KVM/ARM that can be generally improved. We then add infrastructure to move more logic into vcpu_load and vcpu_put, and we improve handling of VFP and debug registers.

We then introduce a new world-switch function for VHE systems, which we can tweak and optimize for VHE systems. To do that, we rework a lot of the system register save/restore handling and emulation code that may need access to system registers, so that we can defer as many system register save/restore operations as possible to vcpu_load and vcpu_put, and move this logic out of the VHE world switch function.

We then optimize the configuration of traps. On non-VHE systems, both the host and VM kernels run in EL1, but because the host kernel should have full access to the underlying hardware while the VM kernel should not, we essentially make the host kernel more privileged than the VM kernel despite them both running at the same privilege level by enabling VE traps when entering the VM and disabling those traps when exiting the VM. On VHE systems, the host kernel runs in EL2 and has full access to the hardware (as much as allowed by secure side software), and is unaffected by the trap configuration. That means we can configure the traps for VMs running in EL1 once, and don't have to switch them on and off for every entry/exit to/from the VM. 
Finally, we improve our VGIC handling by moving all save/restore logic out of the VHE world-switch, and we make it possible to truly only evaluate if the AP list is empty and not do *any* VGIC work if that is the case, and only do the minimal amount of work required in the course of the VGIC processing when we have virtual interrupts in flight.

The patches are based on v4.14-rc2 plus the timer optimization series [1] and the patches that use the TPIDR_EL2 for per-cpu offsets in the host kernel and stop storing the VCPU pointer in TPIDR_EL2 from the SDEI series [2].

I've given the patches a fair amount of testing on Thunder-X, Mustang, Seattle, and TC2 (32-bit) for non-VHE testing, and tested VHE functionality on the Foundation model, running both 64-bit VMs and 32-bit VMs side-by-side and using both GICv3-on-GICv3 and GICv2-on-GICv3.

The patches are also available in the vhe-optimize branch on my kernel.org repository [3].

Thanks,
-Christoffer

[1]: https://lists.cs.columbia.edu/pipermail/kvmarm/2017-September/027245.html
[2]: https://lists.cs.columbia.edu/pipermail/kvmarm/2017-September/027230.html
[3]: git://git.kernel.org/pub/scm/linux/kernel/git/cdall/linux.git vhe-optimize

Christoffer Dall (36):
  KVM: arm64: Avoid storing the vcpu pointer on the stack
  KVM: arm64: Rework hyp_panic for VHE and non-VHE
  KVM: arm/arm64: Get rid of vcpu->arch.irq_lines
  KVM: Record the executing ioctl number on the vcpu struct
  KVM: arm/arm64: Only load/put VCPU state for KVM_RUN
  KVM: arm/arm64: Add kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs
  KVM: arm64: Defer restoring host VFP state to vcpu_put
  KVM: arm64: Move debug dirty flag calculation out of world switch
  KVM: arm64: Slightly improve debug save/restore functions
  KVM: arm64: Improve debug register save/restore flow
  KVM: arm64: Factor out fault info population and gic workarounds
  KVM: arm64: Introduce VHE-specific kvm_vcpu_run
  KVM: arm64: Remove kern_hyp_va() use in VHE switch function
  KVM: arm64: Don't deactivate VM on VHE systems
  KVM: arm64: Remove noop calls to timer save/restore from VHE switch
  KVM: arm64: Move userspace system registers into separate function
  KVM: arm64: Rewrite sysreg alternatives to static keys
  KVM: arm64: Introduce separate VHE/non-VHE sysreg save/restore functions
  KVM: arm64: Unify non-VHE host/guest sysreg save and restore functions
  KVM: arm64: Don't save the host ELR_EL2 and SPSR_EL2 on VHE systems
  KVM: arm64: Change 32-bit handling of VM system registers
  KVM: arm64: Prepare to handle traps on deferred VM sysregs
  KVM: arm64: Prepare to handle traps on deferred EL0 sysregs
  KVM: arm64: Prepare to handle traps on remaining deferred EL1 sysregs
  KVM: arm64: Prepare to handle traps on deferred AArch32 sysregs
  KVM: arm64: Defer saving/restoring system registers to vcpu load/put on VHE
  KVM: arm64: Move common VHE/non-VHE trap config in separate functions
  KVM: arm64: Configure FPSIMD traps on vcpu load/put for VHE
  KVM: arm64: Configure c15, PMU, and debug register traps on cpu load/put for VHE
  KVM: arm64: Separate activate_traps and deactive_traps for VHE and non-VHE
  KVM: arm/arm64: Handle VGICv2 save/restore from the main VGIC code
  KVM: