Re: [PATCH v3 25/28] arm64/sve: Detect SVE and activate runtime support

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:42PM +0100, Dave P Martin wrote:
> This patch enables detection of hardware SVE support via the
> cpufeatures framework, and reports its presence to the kernel and
> userspace via the new ARM64_SVE cpucap and HWCAP_SVE hwcap
> respectively.
> 
> Userspace can also detect SVE using ID_AA64PFR0_EL1, using the
> cpufeatures MRS emulation.
> 
> When running on hardware that supports SVE, this enables runtime
> kernel support for SVE, and allows user tasks to execute SVE
> instructions and make use of the SVE-specific user/kernel
> interface extensions implemented by this series.
> 
> Signed-off-by: Dave Martin 
> Cc: Suzuki K Poulose 

Reviewed-by: Catalin Marinas 


Re: [PATCH v3 24/28] arm64/sve: KVM: Hide SVE from CPU features exposed to guests

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:41PM +0100, Dave P Martin wrote:
> KVM guests cannot currently use SVE, because SVE is always
> configured to trap to EL2.
> 
> However, a guest that sees SVE reported as present in
> ID_AA64PFR0_EL1 may legitimately expect that SVE works and try to
> use it.  Instead of working, the guest will receive an injected
> undef exception, which may cause the guest to oops or go into a
> spin.
> 
> To avoid misleading the guest into believing that SVE will work,
> this patch masks out the SVE field from ID_AA64PFR0_EL1 when a
> guest attempts to read this register.  No support is explicitly
> added for ID_AA64ZFR0_EL1 either, so that is still emulated as
> reading as zero, which is consistent with SVE not being
> implemented.
> 
> This is a temporary measure, and will be removed in a later series
> when full KVM support for SVE is implemented.
> 
> Signed-off-by: Dave Martin 
> Reviewed-by: Alex Bennée 
> Cc: Marc Zyngier 

Acked-by: Catalin Marinas 
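
For illustration, the masking described above amounts to clearing the SVE
field when emulating guest reads of ID_AA64PFR0_EL1.  A minimal sketch,
assuming the ID_AA64PFR0_SVE_SHIFT definition added by this series (not the
literal patch hunk):

	/* Hide SVE from the guest's view of ID_AA64PFR0_EL1 */
	val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);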


Re: [PATCH v3 23/28] arm64/sve: KVM: Treat guest SVE use as undefined instruction execution

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:40PM +0100, Dave P Martin wrote:
> When trapping forbidden attempts by a guest to use SVE, we want the
> guest to see a trap consistent with SVE not being implemented.
> 
> This patch injects an undefined instruction exception into the
> guest in response to such an exception.
> 
> Signed-off-by: Dave Martin 
> Reviewed-by: Alex Bennée 

Acked-by: Catalin Marinas 


Re: [PATCH v3 21/28] arm64/sve: Add sysctl to set the default vector length for new processes

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:38PM +0100, Dave P Martin wrote:
> Because of the effect of SVE on the size of the signal frame, the
> default vector length used for new processes involves a tradeoff
> between performance of SVE-enabled software on the one hand, and
> reliability of non-SVE-aware software on the other hand.
> 
> For this reason, the best choice depends on the repertoire of
> userspace software in use and is thus best left up to distro
> maintainers, sysadmins and developers.
> 
> If CONFIG_SYSCTL is enabled, this patch exposes the default vector
> length in /proc/sys/abi/sve_default_vector_length, where boot
> scripts or the adventurous can poke it.
> 
> In common with other arm64 ABI sysctls, this control is currently
> global: setting it requires CAP_SYS_ADMIN in the root user
> namespace, but the value set is effective for subsequent execs in
> all namespaces.  The control only affects _new_ processes, however:
> changing it does not affect the vector length of any existing
> process.
> 
> The intended usage model is that if userspace is known to be fully
> SVE-tolerant (or a developer is curious to find out) then init
> scripts can crank this up during startup.
> 
> Signed-off-by: Dave Martin 
> Reviewed-by: Alex Bennée 

Reviewed-by: Catalin Marinas 
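
For illustration, a boot script or privileged process can raise the default
vector length through this sysctl.  A minimal C sketch (the value 64 bytes is
just an example; the sysctl path is as described above):

	#include <stdio.h>

	/* Illustrative only: raise the default SVE vector length (in bytes)
	 * for subsequently exec'd processes.  Requires CAP_SYS_ADMIN in the
	 * root user namespace; existing processes are unaffected. */
	int main(void)
	{
		FILE *f = fopen("/proc/sys/abi/sve_default_vector_length", "w");

		if (!f)
			return 1;
		fprintf(f, "%d\n", 64);
		return fclose(f) ? 1 : 0;
	}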


Re: [PATCH v3 20/28] arm64/sve: Add prctl controls for userspace vector length management

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:37PM +0100, Dave P Martin wrote:
> This patch adds two arm64-specific prctls, to permit userspace to
> control its vector length:
> 
>  * PR_SVE_SET_VL: set the thread's SVE vector length and vector
>length inheritance mode.
> 
>  * PR_SVE_GET_VL: get the same information.
> 
> Although these calls shadow instruction set features in the SVE
> architecture, these prctls provide additional control: the vector
> length inheritance mode is Linux-specific and nothing to do with
> the architecture, and the architecture does not permit EL0 to set
> its own vector length directly.  Both can be used in portable tools
> without requiring the use of SVE instructions.
> 
> Signed-off-by: Dave Martin 
> Cc: Alex Bennée 

Reviewed-by: Catalin Marinas 
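
For illustration, a minimal usage sketch for the prctls described above.  The
constant names (PR_SVE_SET_VL, PR_SVE_GET_VL, PR_SVE_VL_INHERIT) and the
argument encoding are taken from this series' uapi headers, so treat the
details as illustrative rather than definitive:

	#include <stdio.h>
	#include <sys/prctl.h>

	int main(void)
	{
		/* Request a 32-byte (256-bit) vector length, inherited
		 * across exec as described above. */
		if (prctl(PR_SVE_SET_VL, 32 | PR_SVE_VL_INHERIT) < 0)
			perror("PR_SVE_SET_VL");

		/* The return value packs the flags and the VL actually set,
		 * which may be less than the VL requested. */
		printf("PR_SVE_GET_VL returned %d\n", prctl(PR_SVE_GET_VL));
		return 0;
	}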


Re: [PATCH v3 19/28] arm64/sve: ptrace and ELF coredump support

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:36PM +0100, Dave P Martin wrote:
> @@ -702,6 +737,211 @@ static int system_call_set(struct task_struct *target,
>   return ret;
>  }
>  
> +#ifdef CONFIG_ARM64_SVE
> +
> +static void sve_init_header_from_task(struct user_sve_header *header,
> +   struct task_struct *target)
> +{
> + unsigned int vq;
> +
> + memset(header, 0, sizeof(*header));
> +
> + header->flags = test_tsk_thread_flag(target, TIF_SVE) ?
> + SVE_PT_REGS_SVE : SVE_PT_REGS_FPSIMD;

For PTRACE_SYSCALL, we may or may not have TIF_SVE depending on what
happened with the target. Just a thought: shall we clear TIF_SVE (and
sync it to fpsimd) in syscall_trace_enter()?
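
Purely illustrative, something along these lines in syscall_trace_enter(),
using the sve_sync_to_fpsimd() helper added by this series (and assuming the
task's register state has already been saved to the thread struct):

	if (test_thread_flag(TIF_SVE)) {
		sve_sync_to_fpsimd(current);	/* keep the FPSIMD view current */
		clear_thread_flag(TIF_SVE);
	}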

> + if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
> + header->flags |= SVE_PT_VL_INHERIT;
> +
> + header->vl = target->thread.sve_vl;
> + vq = sve_vq_from_vl(header->vl);
> +
> + header->max_vl = sve_max_vl;
> + if (WARN_ON(!sve_vl_valid(sve_max_vl)))
> + header->max_vl = header->vl;
> +
> + header->size = SVE_PT_SIZE(vq, header->flags);
> + header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl),
> +   SVE_PT_REGS_SVE);
> +}
[...]
> +static int sve_set(struct task_struct *target,
> +const struct user_regset *regset,
> +unsigned int pos, unsigned int count,
> +const void *kbuf, const void __user *ubuf)
> +{
> + int ret;
> + struct user_sve_header header;
> + unsigned int vq;
> + unsigned long start, end;
> +
> + if (!system_supports_sve())
> + return -EINVAL;
> +
> + /* Header */
> + if (count < sizeof(header))
> + return -EINVAL;
> + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header,
> +  0, sizeof(header));
> + if (ret)
> + goto out;
> +
> + /*
> +  * Apart from PT_SVE_REGS_MASK, all PT_SVE_* flags are consumed by
> +  * sve_set_vector_length(), which will also validate them for us:
> +  */
> + ret = sve_set_vector_length(target, header.vl,
> + ((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16);
> + if (ret)
> + goto out;
> +
> + /* Actual VL set may be less than the user asked for: */
> + vq = sve_vq_from_vl(target->thread.sve_vl);
> +
> + /* Registers: FPSIMD-only case */
> +
> + BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header));
> + if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD) {
> + sve_sync_to_fpsimd(target);
> +
> + ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
> + SVE_PT_FPSIMD_OFFSET);
> + clear_tsk_thread_flag(target, TIF_SVE);
> + goto out;
> + }

__fpr_set() already calls sve_sync_to_fpsimd(). Anyway, do you actually
need this, since we are going to override the FPSIMD state here anyway?

> +
> + /* Otherwise: full SVE case */
> +
> + /*
> +  * If setting a different VL from the requested VL and there is
> +  * register data, the data layout will be wrong: don't even
> +  * try to set the registers in this case.
> +  */
> + if (count && vq != sve_vq_from_vl(header.vl)) {
> + ret = -EIO;
> + goto out;
> + }
> +
> + sve_alloc(target);
> + fpsimd_sync_to_sve(target);

Similarly here, it's a full SVE case, so we are going to override it
anyway.

> + set_tsk_thread_flag(target, TIF_SVE);

This has the side-effect of enabling TIF_SVE even for PTRACE_SYSCALL
which may be cleared in some circumstances. It may not be an issue
though.

-- 
Catalin


Re: [PATCH 02/37] KVM: arm64: Rework hyp_panic for VHE and non-VHE

2017-10-12 Thread Christoffer Dall
On Thu, Oct 12, 2017 at 04:55:16PM +0100, Marc Zyngier wrote:
> On 12/10/17 11:41, Christoffer Dall wrote:
> > VHE actually doesn't rely on clearing the VTTBR when returning to the
> > hsot kernel, and that is the current key mechanism of hyp_panic to
> 
> host
> 
> > figure out how to attempt to return to a state good enough to print a
> > panic statement.
> > 
> > Therefore, we split the hyp_panic function into two functions, a VHE and
> > a non-VHE, keeping the non-VHE version intact, but changing the VHE
> > behavior.
> > 
> > The vttbr_el2 check on VHE doesn't really make that much sense, because
> > the only situation where we can get here on VHE is when the hypervisor
> > assembly code actually caleld into hyp_panic, which only happens when
> 
> called
> 
> > VBAR_EL2 has been set to the KVM exception vectors.  On VHE, we can
> > always safely disable the traps and restore the host registers at this
> > point, so we simply do that unconditionally and call into the panic
> > function directly.
> > 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  arch/arm64/kvm/hyp/switch.c | 45 
> > +++--
> >  1 file changed, 23 insertions(+), 22 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> > index a0123ad..a50ddf3 100644
> > --- a/arch/arm64/kvm/hyp/switch.c
> > +++ b/arch/arm64/kvm/hyp/switch.c
> > @@ -394,10 +394,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> >  static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx 
> > ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";
> >  
> >  static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,
> > -struct kvm_vcpu *vcpu)
> > +struct kvm_cpu_context 
> > *__host_ctxt)
> >  {
> > +   struct kvm_vcpu *vcpu;
> > unsigned long str_va;
> >  
> > +   vcpu = __host_ctxt->__hyp_running_vcpu;
> > +
> > +   if (read_sysreg(vttbr_el2)) {
> > +   __timer_disable_traps(vcpu);
> > +   __deactivate_traps(vcpu);
> > +   __deactivate_vm(vcpu);
> > +   __sysreg_restore_host_state(__host_ctxt);
> > +   }
> > +
> > /*
> >  * Force the panic string to be loaded from the literal pool,
> >  * making sure it is a kernel address and not a PC-relative
> > @@ -411,40 +421,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 
> > spsr, u64 elr, u64 par,
> >read_sysreg(hpfar_el2), par, vcpu);
> >  }
> >  
> > -static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
> > -   struct kvm_vcpu *vcpu)
> > +static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
> > +struct kvm_cpu_context *host_ctxt)
> >  {
> > +   struct kvm_vcpu *vcpu;
> > +   vcpu = host_ctxt->__hyp_running_vcpu;
> > +
> > +   __deactivate_traps_vhe();
> 
> Is there a reason why we can't just call __deactivate_traps(), rather
> than the VHE-specific subset? It doesn't really matter, as we're about
> to panic, but still...
> 

It doesn't really matter, especially as later patches will change this,
but this patch would be slightly nicer keeping the __deactivate_traps.
I don't mind changing that around in the next version.


> > +   __sysreg_restore_host_state(host_ctxt);
> > +
> > panic(__hyp_panic_string,
> >   spsr,  elr,
> >   read_sysreg_el2(esr),   read_sysreg_el2(far),
> >   read_sysreg(hpfar_el2), par, vcpu);
> >  }
> >  
> > -static hyp_alternate_select(__hyp_call_panic,
> > -   __hyp_call_panic_nvhe, __hyp_call_panic_vhe,
> > -   ARM64_HAS_VIRT_HOST_EXTN);
> > -
> >  void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt)
> >  {
> > -   struct kvm_vcpu *vcpu = NULL;
> > -
> > u64 spsr = read_sysreg_el2(spsr);
> > u64 elr = read_sysreg_el2(elr);
> > u64 par = read_sysreg(par_el1);
> >  
> > -   if (read_sysreg(vttbr_el2)) {
> > -   struct kvm_cpu_context *host_ctxt;
> > -
> > -   host_ctxt = __host_ctxt;
> > -   vcpu = host_ctxt->__hyp_running_vcpu;
> > -   __timer_disable_traps(vcpu);
> > -   __deactivate_traps(vcpu);
> > -   __deactivate_vm(vcpu);
> > -   __sysreg_restore_host_state(host_ctxt);
> > -   }
> > -
> > -   /* Call panic for real */
> > -   __hyp_call_panic()(spsr, elr, par, vcpu);
> > +   if (!has_vhe())
> > +   __hyp_call_panic_nvhe(spsr, elr, par, __host_ctxt);
> > +   else
> > +   __hyp_call_panic_vhe(spsr, elr, par, __host_ctxt);
> >  
> > unreachable();
> >  }
> > 
> 
> Otherwise looks good.
> 

Thanks,
-Christoffer


Re: [PATCH 01/37] KVM: arm64: Avoid storing the vcpu pointer on the stack

2017-10-12 Thread Christoffer Dall
On Thu, Oct 12, 2017 at 04:49:44PM +0100, Marc Zyngier wrote:
> On 12/10/17 11:41, Christoffer Dall wrote:
> > We already have the percpu area for the host cpu state, which points to
> > the VCPU, so there's no need to store the VCPU pointer on the stack on
> > every context switch.  We can be a little more clever and just use
> > tpidr_el2 for the percpu offset and load the VCPU pointer from the host
> > context.
> > 
> > This requires us to have a scratch register though, so we take the
> > chance to rearrange some of the el1_sync code to only look at the
> > vttbr_el2 to determine if this is a trap from the guest or an HVC from
> > the host.  We do add an extra check to call the panic code if the kernel
> > is configured with debugging enabled and we saw a trap from the host
> > which wasn't an HVC, indicating that we left some EL2 trap configured by
> > mistake.
> > 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  arch/arm64/include/asm/kvm_asm.h | 20 
> >  arch/arm64/kernel/asm-offsets.c  |  1 +
> >  arch/arm64/kvm/hyp/entry.S   |  5 +
> >  arch/arm64/kvm/hyp/hyp-entry.S   | 39 
> > ++-
> >  arch/arm64/kvm/hyp/switch.c  |  2 +-
> >  5 files changed, 41 insertions(+), 26 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_asm.h 
> > b/arch/arm64/include/asm/kvm_asm.h
> > index ab4d0a9..7e48a39 100644
> > --- a/arch/arm64/include/asm/kvm_asm.h
> > +++ b/arch/arm64/include/asm/kvm_asm.h
> > @@ -70,4 +70,24 @@ extern u32 __init_stage2_translation(void);
> >  
> >  #endif
> >  
> > +#ifdef __ASSEMBLY__
> > +.macro get_host_ctxt reg, tmp
> > +   /*
> > +* '=kvm_host_cpu_state' is a host VA from the constant pool, it may
> > +* not be accessible by this address from EL2, hyp_panic() converts
> > +* it with kern_hyp_va() before use.
> > +*/
> 
> This really looks like a stale comment, as there is no hyp_panic
> involved here anymore (thankfully!).
> 

yeah, I suppose.

> > +   ldr \reg, =kvm_host_cpu_state
> > +   mrs \tmp, tpidr_el2
> > +   add \reg, \reg, \tmp
> > +   kern_hyp_va \reg
> 
> Here, we're trading a load from the stack for a load from the constant
> pool. Can't we do something like:
> 
>   adr_l   \reg, kvm_host_cpu_state
>   msr \tmp, tpidr_el2
>   add \reg, \reg, \tmp
> 
> and that's it? 

That's definitely what the compiler generates from C code...

> This relies on the property that the kernel/hyp offset is
> constant, and that it doesn't matter if we add the offset to a kernel VA
> or a HYP VA... Completely untested of course!
> 

You're the hyp VA expert.  Is it valid to rely on that assumption?

> > +.endm
> > +
> > +.macro get_vcpu vcpu, ctxt
> > +   ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
> > +   kern_hyp_va \vcpu
> > +.endm
> > +
> > +#endif
> > +
> >  #endif /* __ARM_KVM_ASM_H__ */
> > diff --git a/arch/arm64/kernel/asm-offsets.c 
> > b/arch/arm64/kernel/asm-offsets.c
> > index 71bf088..612021d 100644
> > --- a/arch/arm64/kernel/asm-offsets.c
> > +++ b/arch/arm64/kernel/asm-offsets.c
> > @@ -135,6 +135,7 @@ int main(void)
> >DEFINE(CPU_FP_REGS,  offsetof(struct kvm_regs, fp_regs));
> >DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, 
> > arch.ctxt.sys_regs[FPEXC32_EL2]));
> >DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, 
> > arch.host_cpu_context));
> > +  DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, 
> > __hyp_running_vcpu));
> >  #endif
> >  #ifdef CONFIG_CPU_PM
> >DEFINE(CPU_SUSPEND_SZ,   sizeof(struct cpu_suspend_ctx));
> > diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> > index 9a8ab5d..76cd48f 100644
> > --- a/arch/arm64/kvm/hyp/entry.S
> > +++ b/arch/arm64/kvm/hyp/entry.S
> > @@ -62,9 +62,6 @@ ENTRY(__guest_enter)
> > // Store the host regs
> > save_callee_saved_regs x1
> >  
> > -   // Store host_ctxt and vcpu for use at exit time
> > -   stp x1, x0, [sp, #-16]!
> > -
> > add x18, x0, #VCPU_CONTEXT
> >  
> > // Restore guest regs x0-x17
> > @@ -119,7 +116,7 @@ ENTRY(__guest_exit)
> > save_callee_saved_regs x1
> >  
> > // Restore the host_ctxt from the stack
> 
> Stale comment again.
> 

yeah...

> > -   ldr x2, [sp], #16
> > +   get_host_ctxt   x2, x3
> >  
> > // Now restore the host regs
> > restore_callee_saved_regs x2
> > diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
> > index e4f37b9..2950f26 100644
> > --- a/arch/arm64/kvm/hyp/hyp-entry.S
> > +++ b/arch/arm64/kvm/hyp/hyp-entry.S
> > @@ -56,19 +56,16 @@ ENDPROC(__vhe_hyp_call)
> >  el1_sync:  // Guest trapped into EL2
> > stp x0, x1, [sp, #-16]!
> >  
> > -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> > -   mrs x1, esr_el2
> > -alternative_else
> > -   mrs x1, esr_el1
> > -alternative_endif
> > -   lsr x0, x1, #ESR_ELx_EC_SHIFT
> > -
> > -   cmp x0, 

[PATCH] arm64: KVM: set right LR register value for 32 bit guest when inject abort

2017-10-12 Thread Dongjiu Geng
When an exception is trapped to EL2, hardware uses ELR_ELx to hold
the faulting instruction address. If KVM wants to inject an abort
into a 32-bit guest, it needs to set the guest's LR register so
that the abort appears to have happened in the guest. Because of
the ARM32 pipeline, the LR value carries an offset from the
faulting instruction address.

The offsets applied to the link value for each exception type are
shown below; they need to be added to the ARM32 link register (LR).

Exception               Offset, for PE state of:
                        A32     T32
Undefined Instruction   +4      +2
Prefetch Abort          +4      +4
Data Abort              +8      +8
IRQ or FIQ              +4      +4
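
As a sketch (matching the table above and the hunks below; the variable names
are illustrative), the return-offset selection boils down to:

	/* Undefined Instruction: depends on the guest's A32/T32 state */
	u32 undef_return_offset = (spsr & COMPAT_PSR_T_BIT) ? 2 : 4;

	/* Prefetch Abort: +4, Data Abort: +8, for both A32 and T32 */
	u32 abt_return_offset = is_pabt ? 4 : 8;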

Signed-off-by: Dongjiu Geng 
Signed-off-by: Haibin Zhang 

---
For example, for undefined instruction injection:

1. The guest OS executes an SMC (Secure Monitor Call) instruction at the
address 0xc025405c, then the guest traps to the hypervisor:

c0254050:   e59d5028ldr r5, [sp, #40]   ; 0x28
c0254054:   e3a03001mov r3, #1
c0254058:   e1a01003mov r1, r3
c025405c:   e1600070smc 0
c0254060:   e30a0270movwr0, #41584  ; 0xa270
c0254064:   e34c00bfmovtr0, #49343  ; 0xc0bf

2. KVM injects an undefined instruction abort into the guest.
3. The guest reports the fault PC as 0xc0254058, not 0xc025405c:

[   12.348072] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
[   12.349786] Modules linked in:
[   12.350563] CPU: 1 PID: 71 Comm: cat Not tainted 4.1.0-dirty #25
[   12.352061] Hardware name: Generic DT based system
[   12.353275] task: d9d08000 ti: d9cfe000 task.ti: d9cfe000
[   12.354637] PC is at proc_dointvec+0x20/0x60
[   12.355717] LR is at proc_sys_call_handler+0xb0/0xc4
[   12.356972] pc : []lr : []psr: a0060013
[   12.356972] sp : d9cffe90  ip : c0254038  fp : 0001
[   12.359824] r10: d9cfff80  r9 : 0004  r8 : 
[   12.361132] r7 : bec21cb0  r6 : d9cffec4  r5 : d9cfff80  r4 : c0e82de0
[   12.362766] r3 : 0001  r2 : bec21cb0  r1 : 0001  r0 : c0e82de0
[   12.364400] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   12.366183] Control: 10c5383d  Table: 59d3406a  DAC: 0015
[   12.367623] Process cat (pid: 71, stack limit = 0xd9cfe220)

4. After correcting the LR register, it has the right value:

[  125.763370] Internal error: Oops - undefined instruction: 0 [#2] SMP ARM
[  125.767010] Modules linked in:
[  125.768472] CPU: 1 PID: 74 Comm: cat Tainted: G  D 4.1.0-dirty 
#25
[  125.771854] Hardware name: Generic DT based system
[  125.774053] task: db0bb900 ti: d9d1 task.ti: d9d1
[  125.776821] PC is at proc_dointvec+0x24/0x60
[  125.778919] LR is at proc_sys_call_handler+0xb0/0xc4
[  125.781269] pc : []lr : []psr: a0060013
[  125.781269] sp : d9d11e90  ip : c0254038  fp : 0001
[  125.786581] r10: d9d11f80  r9 : 0004  r8 : 
[  125.789673] r7 : be92ccb0  r6 : d9d11ec4  r5 : d9d11f80  r4 : c0e82de0
[  125.792828] r3 : 0001  r2 : be92ccb0  r1 : 0001  r0 : c0e82de0
[  125.795890] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user

Other exception injections, such as Data/Prefetch aborts, also need this
correction.
---
 arch/arm64/kvm/inject_fault.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index da6a8cf..da93508 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -33,12 +33,11 @@
 #define LOWER_EL_AArch64_VECTOR0x400
 #define LOWER_EL_AArch32_VECTOR0x600
 
-static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
+static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset,
+   u32 return_offset)
 {
unsigned long cpsr;
unsigned long new_spsr_value = *vcpu_cpsr(vcpu);
-   bool is_thumb = (new_spsr_value & COMPAT_PSR_T_BIT);
-   u32 return_offset = (is_thumb) ? 4 : 0;
u32 sctlr = vcpu_cp15(vcpu, c1_SCTLR);
 
cpsr = mode | COMPAT_PSR_I_BIT;
@@ -65,7 +64,11 @@ static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, 
u32 vect_offset)
 
 static void inject_undef32(struct kvm_vcpu *vcpu)
 {
-   prepare_fault32(vcpu, COMPAT_PSR_MODE_UND, 4);
+   unsigned long spsr_value = *vcpu_cpsr(vcpu);
+   bool is_thumb = (spsr_value & COMPAT_PSR_T_BIT);
+   u32 return_offset = (is_thumb) ? 2 : 4;
+
+   prepare_fault32(vcpu, COMPAT_PSR_MODE_UND, 4, return_offset);
 }
 
 /*
@@ -75,21 +78,24 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
 static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt,
 unsigned long addr)
 {
-   u32 vect_offset;
+   u32 vect_offset, return_offset;
u32 *far, *fsr;
bool is_lpae;
 
if (is_pabt) {
vect_offset = 12;
+

Re: [PATCH 04/37] KVM: arm/arm64: Get rid of vcpu->arch.irq_lines

2017-10-12 Thread Marc Zyngier
On 12/10/17 11:41, Christoffer Dall wrote:
> We currently have a separate read-modify-write of the HCR_EL2 on entry
> to the guest for the sole purpose of setting the VF and VI bits, if set.
> Since this is most rarely the case (only when using userspace IRQ chip
> and interrupts are in flight), let's get rid of this operation and
> instead modify the bits in the vcpu->arch.hcr[_el2] directly when
> needed.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm/include/asm/kvm_emulate.h   |  9 ++---
>  arch/arm/include/asm/kvm_host.h  |  3 ---
>  arch/arm/kvm/emulate.c   |  2 +-
>  arch/arm/kvm/hyp/switch.c|  2 +-
>  arch/arm64/include/asm/kvm_emulate.h |  9 ++---
>  arch/arm64/include/asm/kvm_host.h|  3 ---
>  arch/arm64/kvm/hyp/switch.c  |  6 --
>  arch/arm64/kvm/inject_fault.c|  2 +-
>  virt/kvm/arm/arm.c   | 11 ++-
>  virt/kvm/arm/mmu.c   |  6 +++---
>  10 files changed, 16 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_emulate.h 
> b/arch/arm/include/asm/kvm_emulate.h
> index 98089ff..34663a8 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -62,14 +62,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
>   vcpu->arch.hcr = HCR_GUEST_MASK;
>  }
>  
> -static inline unsigned long vcpu_get_hcr(const struct kvm_vcpu *vcpu)
> +static inline unsigned long *vcpu_hcr(const struct kvm_vcpu *vcpu)
>  {
> - return vcpu->arch.hcr;
> -}
> -
> -static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
> -{
> - vcpu->arch.hcr = hcr;
> + return (unsigned long *)&vcpu->arch.hcr;
>  }
>  
>  static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 4a879f6..1100170 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -153,9 +153,6 @@ struct kvm_vcpu_arch {
>   /* HYP trapping configuration */
>   u32 hcr;
>  
> - /* Interrupt related fields */
> - u32 irq_lines;  /* IRQ and FIQ levels */
> -
>   /* Exception Information */
>   struct kvm_vcpu_fault_info fault;
>  
> diff --git a/arch/arm/kvm/emulate.c b/arch/arm/kvm/emulate.c
> index 0064b86..4286a89 100644
> --- a/arch/arm/kvm/emulate.c
> +++ b/arch/arm/kvm/emulate.c
> @@ -313,5 +313,5 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long 
> addr)
>   */
>  void kvm_inject_vabt(struct kvm_vcpu *vcpu)
>  {
> - vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VA);
> + *vcpu_hcr(vcpu) |= HCR_VA;
>  }
> diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c
> index 330c9ce..c3b9799 100644
> --- a/arch/arm/kvm/hyp/switch.c
> +++ b/arch/arm/kvm/hyp/switch.c
> @@ -43,7 +43,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu 
> *vcpu, u32 *fpexc_host)
>   isb();
>   }
>  
> - write_sysreg(vcpu->arch.hcr | vcpu->arch.irq_lines, HCR);
> + write_sysreg(vcpu->arch.hcr, HCR);
>   /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
>   write_sysreg(HSTR_T(15), HSTR);
>   write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR);
> diff --git a/arch/arm64/include/asm/kvm_emulate.h 
> b/arch/arm64/include/asm/kvm_emulate.h
> index e5df3fc..1fbfe96 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -51,14 +51,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
>   vcpu->arch.hcr_el2 &= ~HCR_RW;
>  }
>  
> -static inline unsigned long vcpu_get_hcr(struct kvm_vcpu *vcpu)
> +static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
>  {
> - return vcpu->arch.hcr_el2;
> -}
> -
> -static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
> -{
> - vcpu->arch.hcr_el2 = hcr;
> + return (unsigned long *)&vcpu->arch.hcr_el2;
>  }
>  
>  static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 806ccef..27305e7 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -266,9 +266,6 @@ struct kvm_vcpu_arch {
>   /* IO related fields */
>   struct kvm_decode mmio_decode;
>  
> - /* Interrupt related fields */
> - u64 irq_lines;  /* IRQ and FIQ levels */
> -
>   /* Cache some mmu pages needed inside spinlock regions */
>   struct kvm_mmu_memory_cache mmu_page_cache;
>  
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index bcf1a79..7703d63 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -168,12 +168,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu 
> *vcpu)
>  
>  static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
>  {
> - u64 val;
> -
> - val = 

Re: [PATCH 03/37] KVM: arm64: Move HCR_INT_OVERRIDE to default HCR_EL2 guest flag

2017-10-12 Thread Marc Zyngier
On 12/10/17 11:41, Christoffer Dall wrote:
> From: Shih-Wei Li 
> 
> We always set the IMO and FMO bits in the HCR_EL2 when running the
> guest, regardless if we use the vgic or not.  By moving these flags to
> HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of
> HCR_EL2 in the world switch code, and we can also soon get rid of the
> other one.
> 
> Signed-off-by: Shih-Wei Li 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm64/include/asm/kvm_arm.h | 4 ++--
>  arch/arm64/kvm/hyp/switch.c  | 3 ---
>  2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h 
> b/arch/arm64/include/asm/kvm_arm.h
> index 61d694c..e67e279 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -79,9 +79,9 @@
>   */
>  #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
>HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
> -  HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW)
> +  HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | \
> +  HCR_FMO | HCR_IMO)
>  #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
> -#define HCR_INT_OVERRIDE   (HCR_FMO | HCR_IMO)
>  #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
>  
>  /* TCR_EL2 Registers bits */
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index a50ddf3..bcf1a79 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -164,8 +164,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu 
> *vcpu)
>   __vgic_v3_save_state(vcpu);
>   else
>   __vgic_v2_save_state(vcpu);
> -
> - write_sysreg(read_sysreg(hcr_el2) & ~HCR_INT_OVERRIDE, hcr_el2);
>  }
>  
>  static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
> @@ -173,7 +171,6 @@ static void __hyp_text __vgic_restore_state(struct 
> kvm_vcpu *vcpu)
>   u64 val;
>  
>   val = read_sysreg(hcr_el2);
> - val |=  HCR_INT_OVERRIDE;
>   val |= vcpu->arch.irq_lines;
>   write_sysreg(val, hcr_el2);
>  
> 

To expand on why this is actually safe: IMO/FMO control both taking the
interrupts to EL2 and remapping ICC_*_EL1 to ICV_*_EL1 executed at EL1.
As long as we ensure that these bits are clear when returning to the EL1
host, we're OK.

Reviewed-by: Marc Zyngier 

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v3 13/28] arm64/sve: Signal handling support

2017-10-12 Thread Dave Martin
On Wed, Oct 11, 2017 at 05:40:52PM +0100, Catalin Marinas wrote:
> On Tue, Oct 10, 2017 at 07:38:30PM +0100, Dave P Martin wrote:
> > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> > index aabeaee..fa4ed34 100644
> > --- a/arch/arm64/kernel/fpsimd.c
> > +++ b/arch/arm64/kernel/fpsimd.c
> > @@ -310,6 +310,32 @@ static void fpsimd_to_sve(struct task_struct *task)
> >sizeof(fst->vregs[i]));
> >  }
> >  
> > +/*
> > + * Transfer the SVE state in task->thread.sve_state to
> > + * task->thread.fpsimd_state.
> > + *
> > + * Task can be a non-runnable task, or current.  In the latter case,
> > + * softirqs (and preemption) must be disabled.
> > + * task->thread.sve_state must point to at least sve_state_size(task)
> > + * bytes of allocated kernel memory.
> > + * task->thread.sve_state must be up to date before calling this function.
> > + */
> > +static void sve_to_fpsimd(struct task_struct *task)
> > +{
> > +   unsigned int vq;
> > +   void const *sst = task->thread.sve_state;
> > +   struct fpsimd_state *fst = &task->thread.fpsimd_state;
> > +   unsigned int i;
> > +
> > +   if (!system_supports_sve())
> > +   return;
> > +
> > +   vq = sve_vq_from_vl(task->thread.sve_vl);
> > +   for (i = 0; i < 32; ++i)
> > +   memcpy(&fst->vregs[i], ZREG(sst, vq, i),
> > +  sizeof(fst->vregs[i]));
> > +}
> 
> Nit: could we actually just do an assignment with some pointer casting?
> It looks like we invoke memcpy for every 16 bytes (same for
> fpsimd_to_sve).

I was uneasy about what the type of ZREG(sst, vq, i) ought to be.
In any case, memcpy() is magic: my oldskool GCC (5.3.0) generates:

08084c70 :
08084c70:   1404b   08084c80 

08084c74:   d503201fnop
08084c78:   d65f03c0ret
08084c7c:   d503201fnop
08084c80:   f0007d61adrpx1, 09033000 

08084c84:   f942a021ldr x1, [x1,#1344]
08084c88:   36b001c1tbz w1, #22, 08084cc0 

08084c8c:   b94ca805ldr w5, [x0,#3240]
08084c90:   912a0001add x1, x0, #0xa80
08084c94:   91320004add x4, x0, #0xc80
08084c98:   f9465006ldr x6, [x0,#3232]
08084c9c:   121c6ca5and w5, w5, #0xfff0
08084ca0:   5280mov w0, #0x0
// #0
08084ca4:   8b2040c2add x2, x6, w0, uxtw
08084ca8:   0b05add w0, w0, w5
08084cac:   a9400c42ldp x2, x3, [x2]
08084cb0:   a8810c22stp x2, x3, [x1],#16
08084cb4:   eb01009fcmp x4, x1
08084cb8:   5461b.ne08084ca4 

08084cbc:   d65f03c0ret
08084cc0:   d65f03c0ret
08084cc4:   d503201fnop


Without volatile, I think assigning a single object and doing a memcpy()
are equivalent to the compiler: which it actually uses depends solely on
optimisation considerations.

(But then I'm not a language lawyer ... not a professional one anyway).


Are you concerned compilers may mess this up?
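
For reference, the alternative being discussed would look roughly like this,
assuming the vregs are __uint128_t as in struct fpsimd_state (illustrative
only):

	/* current form: one memcpy() per 16-byte Vn register */
	memcpy(&fst->vregs[i], ZREG(sst, vq, i), sizeof(fst->vregs[i]));

	/* pointer-cast assignment form */
	fst->vregs[i] = *(__uint128_t const *)ZREG(sst, vq, i);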

Cheers
---Dave


Re: [PATCH v3 11/28] arm64/sve: Core task context handling

2017-10-12 Thread Dave Martin
On Wed, Oct 11, 2017 at 05:15:58PM +0100, Catalin Marinas wrote:
> On Tue, Oct 10, 2017 at 07:38:28PM +0100, Dave P Martin wrote:
> > diff --git a/arch/arm64/include/asm/fpsimd.h 
> > b/arch/arm64/include/asm/fpsimd.h
> > index 026a7c7..b1409de 100644
> > --- a/arch/arm64/include/asm/fpsimd.h
> > +++ b/arch/arm64/include/asm/fpsimd.h
> > @@ -72,6 +75,20 @@ extern void sve_load_state(void const *state, u32 const 
> > *pfpsr,
> >unsigned long vq_minus_1);
> >  extern unsigned int sve_get_vl(void);
> >  
> > +#ifdef CONFIG_ARM64_SVE
> > +
> > +extern size_t sve_state_size(struct task_struct const *task);
> > +
> > +extern void sve_alloc(struct task_struct *task);
> > +extern void fpsimd_release_thread(struct task_struct *task);
> > +
> > +#else /* ! CONFIG_ARM64_SVE */
> > +
> > +static void __maybe_unused sve_alloc(struct task_struct *task) { }
> > +static void __maybe_unused fpsimd_release_thread(struct task_struct *task) 
> > { }
> 
> Nitpick: usually we just add static inline functions here rather than
> __maybe_unused.

Fair enough -- come to think of it I've already converted some other
instances of this at Ard's request.

> 
> > diff --git a/arch/arm64/include/asm/processor.h 
> > b/arch/arm64/include/asm/processor.h
> > index 29adab8..4831d28 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -39,6 +47,8 @@
> >  #define FPEXC_IDF  (1 << 7)
> >  
> >  /*
> > + * (Note: in this discussion, statements about FPSIMD apply equally to 
> > SVE.)
> > + *
> >   * In order to reduce the number of times the FPSIMD state is needlessly 
> > saved
> >   * and restored, we need to keep track of two things:
> >   * (a) for each task, we need to remember which CPU was the last one to 
> > have
> > @@ -99,6 +109,287 @@
> >   */
> >  static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state);
> >  
> > +static void sve_free(struct task_struct *task)
> > +{
> > +   kfree(task->thread.sve_state);
> > +   task->thread.sve_state = NULL;
> > +}
> 
> I think we need a WARN_ON if TIF_SVE is still set here (and the callers
> making sure it is cleared). I haven't checked the code paths via
> fpsimd_release_thread() but wondering what happens if we get an
> interrupt between freeing the state and making the pointer NULL, with
> some context switching in a preemptible kernel.

Having a WARN_ON() here may be a decent way to sanity-check that we
don't ever have sve_state NULL with TIF_SVE set.  This is a lot more
economical than putting a WARN_ON() at each dereference of sve_state
(of which there are quite a few).  sve_free() is also a slow path.

Currently, there are two callsites: sve_set_vector_length(), where we
test_and_clear_tsk_thread_flags(task, TIF_SVE) before calling sve_free();
and fpsimd_release_thread() where we "don't care" because the thread
is dying.

Looking more closely though, is the release_thread() path preemptible?
I can't see anything in the scheduler core to ensure this, nor any
general reason why it should be needed.

In which case preemption during thread exit after sve_free() could
result in a NULL deference in fpsimd_thread_switch().


So, I think my favoured approach is:

sve_release_thread()
{
local_bh_disable();
fpsimd_flush_task_state(current);
clear_thread_flag(TIF_SVE);
local_bh_enable();

sve_free();
}

The local_bh stuff is cumbersome here, and could be replaced with
barrier()s to force the order of fpsimd_flush_task_state() versus
clearing TIF_SVE.  Or should the barrier really be in
fpsimd_flush_task_state()?  Disabling softirqs avoids the need to answer
such questions...


Then:

sve_free(task)
{
WARN_ON(test_thread_flag(TIF_SVE));

barrier();
kfree(task->thread.sve_state);
task->thread.sve_state = NULL;
}

I'm assuming here that kfree() can't be called safely from atomic
context, but this is unclear.  I would expect to be able to free
GFP_ATOMIC memory from atomic context (though sve_state is GFP_KERNEL,
so dunno).

> Alternatively, always clear TIF_SVE here before freeing (also wondering
> whether we should make sve_state NULL before the actual freeing but I
> think TIF_SVE clearing should suffice).

Could do.  I feel that the current placement of the TIF_SVE clearing in
sve_set_vector_length() feels "more natural", but this is a pretty
flimsy argument.  How strongly do you feel about this?

[...]

> > + *  * TIF_SVE set:

[...]

> > + *  * TIF_SVE clear:
> > + *
> > + *An attempt by the user task to execute an SVE instruction causes
> > + *do_sve_acc() to be called, which does some preparation and then
> > + *sets TIF_SVE.
> > + *
> > + *When stored, FPSIMD registers V0-V31 are encoded in
> > + *task->fpsimd_state; bits [max : 128] for each of Z0-Z31 are
> > + *logically zero but not stored anywhere; P0-P15 and FFR are not
> > + *stored and have unspecified values from userspace's point of
> > + *view.  For 

Re: [PATCH 02/37] KVM: arm64: Rework hyp_panic for VHE and non-VHE

2017-10-12 Thread Marc Zyngier
On 12/10/17 11:41, Christoffer Dall wrote:
> VHE actually doesn't rely on clearing the VTTBR when returning to the
> hsot kernel, and that is the current key mechanism of hyp_panic to

host

> figure out how to attempt to return to a state good enough to print a
> panic statement.
> 
> Therefore, we split the hyp_panic function into two functions, a VHE and
> a non-VHE, keeping the non-VHE version intact, but changing the VHE
> behavior.
> 
> The vttbr_el2 check on VHE doesn't really make that much sense, because
> the only situation where we can get here on VHE is when the hypervisor
> assembly code actually caleld into hyp_panic, which only happens when

called

> VBAR_EL2 has been set to the KVM exception vectors.  On VHE, we can
> always safely disable the traps and restore the host registers at this
> point, so we simply do that unconditionally and call into the panic
> function directly.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm64/kvm/hyp/switch.c | 45 
> +++--
>  1 file changed, 23 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index a0123ad..a50ddf3 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -394,10 +394,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>  static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx 
> ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";
>  
>  static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,
> -  struct kvm_vcpu *vcpu)
> +  struct kvm_cpu_context 
> *__host_ctxt)
>  {
> + struct kvm_vcpu *vcpu;
>   unsigned long str_va;
>  
> + vcpu = __host_ctxt->__hyp_running_vcpu;
> +
> + if (read_sysreg(vttbr_el2)) {
> + __timer_disable_traps(vcpu);
> + __deactivate_traps(vcpu);
> + __deactivate_vm(vcpu);
> + __sysreg_restore_host_state(__host_ctxt);
> + }
> +
>   /*
>* Force the panic string to be loaded from the literal pool,
>* making sure it is a kernel address and not a PC-relative
> @@ -411,40 +421,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, 
> u64 elr, u64 par,
>  read_sysreg(hpfar_el2), par, vcpu);
>  }
>  
> -static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
> - struct kvm_vcpu *vcpu)
> +static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
> +  struct kvm_cpu_context *host_ctxt)
>  {
> + struct kvm_vcpu *vcpu;
> + vcpu = host_ctxt->__hyp_running_vcpu;
> +
> + __deactivate_traps_vhe();

Is there a reason why we can't just call __deactivate_traps(), rather
than the VHE-specific subset? It doesn't really matter, as we're about
to panic, but still...

> + __sysreg_restore_host_state(host_ctxt);
> +
>   panic(__hyp_panic_string,
> spsr,  elr,
> read_sysreg_el2(esr),   read_sysreg_el2(far),
> read_sysreg(hpfar_el2), par, vcpu);
>  }
>  
> -static hyp_alternate_select(__hyp_call_panic,
> - __hyp_call_panic_nvhe, __hyp_call_panic_vhe,
> - ARM64_HAS_VIRT_HOST_EXTN);
> -
>  void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt)
>  {
> - struct kvm_vcpu *vcpu = NULL;
> -
>   u64 spsr = read_sysreg_el2(spsr);
>   u64 elr = read_sysreg_el2(elr);
>   u64 par = read_sysreg(par_el1);
>  
> - if (read_sysreg(vttbr_el2)) {
> - struct kvm_cpu_context *host_ctxt;
> -
> - host_ctxt = __host_ctxt;
> - vcpu = host_ctxt->__hyp_running_vcpu;
> - __timer_disable_traps(vcpu);
> - __deactivate_traps(vcpu);
> - __deactivate_vm(vcpu);
> - __sysreg_restore_host_state(host_ctxt);
> - }
> -
> - /* Call panic for real */
> - __hyp_call_panic()(spsr, elr, par, vcpu);
> + if (!has_vhe())
> + __hyp_call_panic_nvhe(spsr, elr, par, __host_ctxt);
> + else
> + __hyp_call_panic_vhe(spsr, elr, par, __host_ctxt);
>  
>   unreachable();
>  }
> 

Otherwise looks good.

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 01/37] KVM: arm64: Avoid storing the vcpu pointer on the stack

2017-10-12 Thread Marc Zyngier
On 12/10/17 11:41, Christoffer Dall wrote:
> We already have the percpu area for the host cpu state, which points to
> the VCPU, so there's no need to store the VCPU pointer on the stack on
> every context switch.  We can be a little more clever and just use
> tpidr_el2 for the percpu offset and load the VCPU pointer from the host
> context.
> 
> This requires us to have a scratch register though, so we take the
> chance to rearrange some of the el1_sync code to only look at the
> vttbr_el2 to determine if this is a trap from the guest or an HVC from
> the host.  We do add an extra check to call the panic code if the kernel
> is configured with debugging enabled and we saw a trap from the host
> which wasn't an HVC, indicating that we left some EL2 trap configured by
> mistake.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm64/include/asm/kvm_asm.h | 20 
>  arch/arm64/kernel/asm-offsets.c  |  1 +
>  arch/arm64/kvm/hyp/entry.S   |  5 +
>  arch/arm64/kvm/hyp/hyp-entry.S   | 39 ++-
>  arch/arm64/kvm/hyp/switch.c  |  2 +-
>  5 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h 
> b/arch/arm64/include/asm/kvm_asm.h
> index ab4d0a9..7e48a39 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -70,4 +70,24 @@ extern u32 __init_stage2_translation(void);
>  
>  #endif
>  
> +#ifdef __ASSEMBLY__
> +.macro get_host_ctxt reg, tmp
> + /*
> +  * '=kvm_host_cpu_state' is a host VA from the constant pool, it may
> +  * not be accessible by this address from EL2, hyp_panic() converts
> +  * it with kern_hyp_va() before use.
> +  */

This really looks like a stale comment, as there is no hyp_panic
involved here anymore (thankfully!).

> + ldr \reg, =kvm_host_cpu_state
> + mrs \tmp, tpidr_el2
> + add \reg, \reg, \tmp
> + kern_hyp_va \reg

Here, we're trading a load from the stack for a load from the constant
pool. Can't we do something like:

adr_l   \reg, kvm_host_cpu_state
msr \tmp, tpidr_el2
add \reg, \reg, \tmp

and that's it? This relies on the property that the kernel/hyp offset is
constant, and that it doesn't matter if we add the offset to a kernel VA
or a HYP VA... Completely untested of course!

> +.endm
> +
> +.macro get_vcpu vcpu, ctxt
> + ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
> + kern_hyp_va \vcpu
> +.endm
> +
> +#endif
> +
>  #endif /* __ARM_KVM_ASM_H__ */
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 71bf088..612021d 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -135,6 +135,7 @@ int main(void)
>DEFINE(CPU_FP_REGS,offsetof(struct kvm_regs, fp_regs));
>DEFINE(VCPU_FPEXC32_EL2,   offsetof(struct kvm_vcpu, 
> arch.ctxt.sys_regs[FPEXC32_EL2]));
>DEFINE(VCPU_HOST_CONTEXT,  offsetof(struct kvm_vcpu, 
> arch.host_cpu_context));
> +  DEFINE(HOST_CONTEXT_VCPU,  offsetof(struct kvm_cpu_context, 
> __hyp_running_vcpu));
>  #endif
>  #ifdef CONFIG_CPU_PM
>DEFINE(CPU_SUSPEND_SZ, sizeof(struct cpu_suspend_ctx));
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index 9a8ab5d..76cd48f 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -62,9 +62,6 @@ ENTRY(__guest_enter)
>   // Store the host regs
>   save_callee_saved_regs x1
>  
> - // Store host_ctxt and vcpu for use at exit time
> - stp x1, x0, [sp, #-16]!
> -
>   add x18, x0, #VCPU_CONTEXT
>  
>   // Restore guest regs x0-x17
> @@ -119,7 +116,7 @@ ENTRY(__guest_exit)
>   save_callee_saved_regs x1
>  
>   // Restore the host_ctxt from the stack

Stale comment again.

> - ldr x2, [sp], #16
> + get_host_ctxt   x2, x3
>  
>   // Now restore the host regs
>   restore_callee_saved_regs x2
> diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
> index e4f37b9..2950f26 100644
> --- a/arch/arm64/kvm/hyp/hyp-entry.S
> +++ b/arch/arm64/kvm/hyp/hyp-entry.S
> @@ -56,19 +56,16 @@ ENDPROC(__vhe_hyp_call)
>  el1_sync:// Guest trapped into EL2
>   stp x0, x1, [sp, #-16]!
>  
> -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> - mrs x1, esr_el2
> -alternative_else
> - mrs x1, esr_el1
> -alternative_endif
> - lsr x0, x1, #ESR_ELx_EC_SHIFT
> -
> - cmp x0, #ESR_ELx_EC_HVC64
> - b.neel1_trap
> -
>   mrs x1, vttbr_el2   // If vttbr is valid, the 64bit guest
>   cbnz x1, el1_trap   // called HVC

Comment is outdated. Any guest trap will take this path.

>  
> +#ifdef CONFIG_DEBUG
> + mrs x0, esr_el2
> + lsr x0, x0, #ESR_ELx_EC_SHIFT
> + cmp x0, #ESR_ELx_EC_HVC64
> + b.ne__hyp_panic
> +#endif
> +
>   /* Here, 

Re: [PATCH v3 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths

2017-10-12 Thread Suzuki K Poulose

On 10/10/17 19:38, Dave Martin wrote:

This patch uses the cpufeatures framework to determine common SVE
capabilities and vector lengths, and configures the runtime SVE
support code appropriately.

ZCR_ELx is not really a feature register, but it is convenient to
use it as a template for recording the maximum vector length
supported by a CPU, using the LEN field.  This field is similar to
a feature field in that it is a contiguous bitfield for which we
want to determine the minimum system-wide value.  This patch adds
ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
custom code to populate it.  Finding the minimum supported value of
the LEN field is left to the cpufeatures framework in the usual
way.

The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
so for now we just require it to be zero.

Note that much of this code is dormant and SVE still won't be used
yet, since system_supports_sve() remains hardwired to false.

Signed-off-by: Dave Martin 
Cc: Alex Bennée 
Cc: Suzuki K Poulose 

---

Dropped Alex Bennée's Reviewed-by, since there is new logic in this
patch.

Changes since v2


Bug fixes:

  * Got rid of dynamic allocation of the shadow vector length map during
secondary boot.  Secondary CPU boot takes place in atomic context,
and relying on GFP_ATOMIC here doesn't seem justified.

Instead, the needed additional bitmap is allocated statically.  Only
one shadow map is needed, because CPUs don't boot concurrently.

Requested by Alex Bennée:

  * Reflowed untidy comment above read_zcr_features()

  * Added comments to read_zcr_features() to explain what it's trying to do
(which is otherwise not readily apparent).

Requested by Catalin Marinas:

  * Moved disabling of the EL1 SVE trap to the cpufeatures C code.
This allows addition of new assembler in __cpu_setup to be
avoided.

Miscellaneous:

  * Added comments explaining the intent, purpose and basic constraints
for fpsimd.c helpers.


...

  

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 92a9502..c5acf38 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c


...


@@ -670,6 +689,14 @@ void update_cpu_features(int cpu,
info->reg_mvfr2, boot->reg_mvfr2);
}
  
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {

+   taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
+   info->reg_zcr, boot->reg_zcr);
+
+   if (!sys_caps_initialised)
+   sve_update_vq_map();
+   }


nit: I am not sure if we should also check whether the "current" sanitised
value of id_aa64pfr0 supports SVE and skip the update if it doesn't. The code
is fine as it is without the check; it's just that we could avoid computing
the map. It is in the CPU boot-up path and hence is not performance critical.
So, either way we are fine.
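
A rough sketch of the suggested variant (illustrative only; it reuses the
existing read_sanitised_ftr_reg() helper):

	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
		taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
					info->reg_zcr, boot->reg_zcr);

		/* Skip updating the map if the sanitised system-wide
		 * view no longer reports SVE: */
		if (!sys_caps_initialised &&
		    id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1)))
			sve_update_vq_map();
	}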

Reviewed-by: Suzuki K Poulose 


Re: [PATCH v3 19/20] KVM: arm64: Handle RAS SErrors from EL2 on guest exit

2017-10-12 Thread James Morse
Hi Marc,

On 11/10/17 11:37, Marc Zyngier wrote:
> On Thu, Oct 05 2017 at  8:18:11 pm BST, James Morse  
> wrote:
>> We expect to have firmware-first handling of RAS SErrors, with errors
>> notified via an APEI method. For systems without firmware-first, add
>> some minimal handling to KVM.
>>
>> There are two ways KVM can take an SError due to a guest, either may be a
>> RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
>> or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
>>
>> The current SError from EL2 code unmasks SError and tries to fence any
>> pending SError into a single instruction window. It then leaves SError
>> unmasked.
>>
>> With the v8.2 RAS Extensions we may take an SError for a 'corrected'
>> error, but KVM is only able to handle SError from EL2 if they occur
>> during this single instruction window...
>>
>> The RAS Extensions give us a new instruction to synchronise and
>> consume SErrors. The RAS Extensions document (ARM DDI0587),
>> '2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
>> SError interrupts generated by 'instructions, translation table walks,
>> hardware updates to the translation tables, and instruction fetches on
>> the same PE'. This makes ESB equivalent to KVMs existing
>> 'dsb, mrs-daifclr, isb' sequence.
>>
>> Use the alternatives to synchronise and consume any SError using ESB
>> instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
>> in the exit_code so that we can restart the vcpu if it turns out this
>> SError has no impact on the vcpu.

>> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
>> index 12ee62d6d410..96caa5328b3a 100644
>> --- a/arch/arm64/kvm/hyp/entry.S
>> +++ b/arch/arm64/kvm/hyp/entry.S
>> @@ -124,6 +124,17 @@ ENTRY(__guest_exit)
>>  // Now restore the host regs
>>  restore_callee_saved_regs x2
>>  
>> +alternative_if ARM64_HAS_RAS_EXTN
>> +// If we have the RAS extensions we can consume a pending error
>> +// without an unmask-SError and isb.
>> +esb
>> +mrs_s   x2, SYS_DISR_EL1
>> +str x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
>> +cbz x2, 1f
>> +msr_s   SYS_DISR_EL1, xzr
>> +orr x0, x0, #(1 << ARM_EXIT_WITH_SERROR_BIT)
>> +1:  ret
>> +alternative_else
>>  // If we have a pending asynchronous abort, now is the
>>  // time to find out. From your VAXorcist book, page 666:
>>  // "Threaten me not, oh Evil one!  For I speak with
>> @@ -135,6 +146,8 @@ ENTRY(__guest_exit)
>>  
>>  dsb sy  // Synchronize against in-flight ld/st
>>  msr daifclr, #4 // Unmask aborts
>> +nop
> 
> Oops. You've now introduced an instruction in what was supposed to be a
> single instruction window (the isb). It means that we may fail to
> identify the Serror as having been generated by our synchronisation
> mechanism, and we'll panic for no good reason.

Gah! and I pointed this thing out in the commit message,

>> +alternative_endif
>>  
>>  // This is our single instruction exception window. A pending
>>  // SError is guaranteed to occur at the earliest when we unmask

and here it is in the diff.


> Moving the nop up will solve this.

Yes.

I've fixed this for v4, along with Julien's suggestions.


Thanks!

James


Re: [PATCH v3 08/20] arm64: entry.S: convert elX_irq

2017-10-12 Thread James Morse
Hi Julien,

On 11/10/17 18:13, Julien Thierry wrote:
> On 05/10/17 20:18, James Morse wrote:
>> Following our 'dai' order, irqs should be processed with debug and
>> serror exceptions unmasked.
>>  > Add a helper to unmask these two, (and fiq for good measure).
>>
>> Signed-off-by: James Morse 
>> ---
>>   arch/arm64/include/asm/assembler.h | 4 
>>   arch/arm64/kernel/entry.S  | 4 ++--
>>   2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h
>> b/arch/arm64/include/asm/assembler.h
>> index c2a37e2f733c..7ffb2a629dc9 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -54,6 +54,10 @@
>>   msr daif, \tmp
>>   .endm
>>   +.macro enable_da_f
>> +msr daifclr, #(8 | 4 | 1)
>> +.endm
>> +
> 
> We use this in irq entries because we are inheriting daif + we want to disable
> irqs while we process irqs right?

In this case not inheriting, we only do that for synchronous exceptions because
we can't mask them, keeping the flags 'the same' lets us ignore them.

Here we are unconditionally unmasking things for handling interrupts. If we
stick with this dai order we know its safe to unmask SError and enable Debug.


> I don't know if it's worth adding a comment but I find it easier to think 
> about
> it like this.

If its not-obvious, there should be a comment:
/* IRQ is the lowest priority flag, unconditionally unmask the rest. */


(I was expecting flames for the hangman style naming!)

> Otherwise, for patches 3 to 8 (I don't have any comment on 3 to 7):
> Reviewed-by: Julien Thierry 

Thanks!

I'll post a v4 ~tomorrow with this and the renaming changes.



James



Re: [PATCH v3 01/20] arm64: explicitly mask all exceptions

2017-10-12 Thread James Morse
Hi Julien,

On 11/10/17 17:30, Julien Thierry wrote:
> On 05/10/17 20:17, James Morse wrote:
>> There are a few places where we want to mask all exceptions. Today we
>> do this in a piecemeal fashion, typically we expect the caller to
>> have masked irqs and the arch code masks debug exceptions, ignoring
>> SError which is probably masked.
>>
>> Make it clear that 'mask all exceptions' is the intention by adding
>> helpers to do exactly that.
>>
>> This will let us unmask SError without having to add 'oh and SError'
>> to these paths.

>> diff --git a/arch/arm64/include/asm/daifflags.h
>> b/arch/arm64/include/asm/daifflags.h
>> new file mode 100644
>> index ..fb40da8e1457
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/daifflags.h

>> +#ifndef __ASM_DAIFFLAGS_H
>> +#define __ASM_DAIFFLAGS_H
>> +
>> +#include 
>> +#include 
>> +
>> +/* Mask/unmask/restore all exceptions, including interrupts. */
>> +static inline unsigned long local_mask_daif(void)
>> +{
>> +unsigned long flags;
>> +asm volatile(
>> +"mrs%0, daif// local_mask_daif\n"
>> +"msrdaifset, #0xf"
>> +: "=r" (flags)
>> +:
>> +: "memory");
>> +trace_hardirqs_off();
>> +return flags;
>> +}
>> +
> nit:
> Can we call this local_daif_save? 

Sure, having a save version fits better with the C irqflags versions.

mask/unmask is just being pedantic, irqs and SError aren't disabled by these
flags, (but curiously debug is...).


> (and maybe a version that only disables without saving?)

irqflags.h has one, so yes, why not.
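
Something like this, perhaps (name and placement illustrative only):

	/* Mask all exceptions without saving the previous flags. */
	static inline void local_daif_mask(void)
	{
		asm volatile("msr daifset, #0xf" ::: "memory");
		trace_hardirqs_off();
	}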


> Also, I was wondering why use 'local' as prefix? Is there gonna be a similar 
> set
> of function for arm that could be called by common code? (e.g. drivers?)

It's a hangover from the C irqflags calls like arch_local_save_flags() etc. I
dropped the 'arch' as these should only be called by arch code and 'daif' should
prevent any name collisions.

Drivers should never touch this stuff; if they do, it's likely to be because they
want to defer bursting-into-flames until we get to userspace, where it's harder
to work out which driver to blame.


> Otherwise:
> Reviewed-by: Julien Thierry 

Thanks!


James
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 22/28] arm64/sve: KVM: Prevent guests from using SVE

2017-10-12 Thread Marc Zyngier
On 12/10/17 12:04, Dave Martin wrote:
> On Wed, Oct 11, 2017 at 05:28:06PM +0100, Marc Zyngier wrote:
>> [+ Christoffer]
>>
>> On 10/10/17 19:38, Dave Martin wrote:
>>> Until KVM has full SVE support, guests must not be allowed to
>>> execute SVE instructions.
>>>
>>> This patch enables the necessary traps, and also ensures that the
>>> traps are disabled again on exit from the guest so that the host
>>> can still use SVE if it wants to.
>>>
>>> This patch introduces another instance of
>>> __this_cpu_write(fpsimd_last_state, NULL), so this flush operation
>>> is abstracted out as a separate helper fpsimd_flush_cpu_state().
>>> Other instances are ported appropriately.
>>>
>>> As a side effect of this refactoring, a this_cpu_write() in
>>> fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
>>> should be fine, since cpu_pm_enter() is supposed to be called only
>>> with interrupts disabled.
>>>
>>> Signed-off-by: Dave Martin 
>>> Reviewed-by: Alex Bennée 
>>> Cc: Marc Zyngier 
>>> Cc: Ard Biesheuvel 
>>> ---
> 
> [...]
> 
>>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>>> b/arch/arm64/include/asm/kvm_host.h
>>> index e923b58..674912d 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
> 
> [...]
> 
>>> @@ -384,4 +385,14 @@ static inline void __cpu_init_stage2(void)
> 
> [...]
> 
>>> +static inline void kvm_fpsimd_flush_cpu_state(void)
>>> +{
>>> +   if (system_supports_sve())
>>> +   sve_flush_cpu_state();
>>
>> Hmmm. How does this work if...
> 
> !IS_ENABLED(CONFIG_ARM64_SVE) implies !system_supports_sve(), so
> if CONFIG_ARM64_SVE is not set, the call is optimised away.
> 
> [...]
> 
>>> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
>>> index a9cb794..6ae3703 100644
>>> --- a/arch/arm64/kernel/fpsimd.c
>>> +++ b/arch/arm64/kernel/fpsimd.c
>>> @@ -1073,6 +1073,33 @@ void fpsimd_flush_task_state(struct task_struct *t)
> 
> [...]
> 
>>> +#ifdef CONFIG_ARM64_SVE
>>> +void sve_flush_cpu_state(void)
>>> +{
>>> +   struct fpsimd_state *const fpstate = __this_cpu_read(fpsimd_last_state);
>>> +   struct task_struct *tsk;
>>> +
>>> +   if (!fpstate)
>>> +   return;
>>> +
>>> +   tsk = container_of(fpstate, struct task_struct, thread.fpsimd_state);
>>> +   if (test_tsk_thread_flag(tsk, TIF_SVE))
>>> +   fpsimd_flush_cpu_state();
>>> +}
>>> +#endif /* CONFIG_ARM64_SVE */
>>
>> ... CONFIG_ARM64_SVE is not set? Fixing this should just be a matter of
>> moving the #ifdef/#endif inside the function...
> 
> Because sve_flush_cpu_state() is not in the same compilation unit it
> can't be static, and that means the compiler won't remove it
> automatically if it's unused -- hence the #ifdef.
> 
> Because the call site is optimised away, there is no link failure.
> 
> Don't we rely on this sort of thing all over the place?
Dunno. It just feels weird. But if you are sure that it won't break,
fine by me. I guess we'll find out pretty quickly how this fares,
especially with older toolchains.

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 22/28] arm64/sve: KVM: Prevent guests from using SVE

2017-10-12 Thread Dave Martin
On Wed, Oct 11, 2017 at 05:28:06PM +0100, Marc Zyngier wrote:
> [+ Christoffer]
> 
> On 10/10/17 19:38, Dave Martin wrote:
> > Until KVM has full SVE support, guests must not be allowed to
> > execute SVE instructions.
> > 
> > This patch enables the necessary traps, and also ensures that the
> > traps are disabled again on exit from the guest so that the host
> > can still use SVE if it wants to.
> > 
> > This patch introduces another instance of
> > __this_cpu_write(fpsimd_last_state, NULL), so this flush operation
> > is abstracted out as a separate helper fpsimd_flush_cpu_state().
> > Other instances are ported appropriately.
> > 
> > As a side effect of this refactoring, a this_cpu_write() in
> > fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
> > should be fine, since cpu_pm_enter() is supposed to be called only
> > with interrupts disabled.
> > 
> > Signed-off-by: Dave Martin 
> > Reviewed-by: Alex Bennée 
> > Cc: Marc Zyngier 
> > Cc: Ard Biesheuvel 
> > ---

[...]

> > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > b/arch/arm64/include/asm/kvm_host.h
> > index e923b58..674912d 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h

[...]

> > @@ -384,4 +385,14 @@ static inline void __cpu_init_stage2(void)

[...]

> > +static inline void kvm_fpsimd_flush_cpu_state(void)
> > +{
> > +   if (system_supports_sve())
> > +   sve_flush_cpu_state();
> 
> Hmmm. How does this work if...

!IS_ENABLED(CONFIG_ARM64_SVE) implies !system_supports_sve(), so
if CONFIG_ARM64_SVE is not set, the call is optimised away.

[...]

> > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> > index a9cb794..6ae3703 100644
> > --- a/arch/arm64/kernel/fpsimd.c
> > +++ b/arch/arm64/kernel/fpsimd.c
> > @@ -1073,6 +1073,33 @@ void fpsimd_flush_task_state(struct task_struct *t)

[...]

> > +#ifdef CONFIG_ARM64_SVE
> > +void sve_flush_cpu_state(void)
> > +{
> > +   struct fpsimd_state *const fpstate = __this_cpu_read(fpsimd_last_state);
> > +   struct task_struct *tsk;
> > +
> > +   if (!fpstate)
> > +   return;
> > +
> > +   tsk = container_of(fpstate, struct task_struct, thread.fpsimd_state);
> > +   if (test_tsk_thread_flag(tsk, TIF_SVE))
> > +   fpsimd_flush_cpu_state();
> > +}
> > +#endif /* CONFIG_ARM64_SVE */
> 
> ... CONFIG_ARM64_SVE is not set? Fixing this should just be a matter of
> moving the #ifdef/#endif inside the function...

Because sve_flush_cpu_state() is not in the same compilation unit it
can't be static, and that means the compiler won't remove it
automatically if it's unused -- hence the #ifdef.

Because the call site is optimised away, there is no link failure.

Don't we rely on this sort of thing all over the place?
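
For reference, the pattern we're relying on is roughly the following (a minimal
sketch with made-up foo_* names, not the actual SVE code):

	/* foo.h: declared unconditionally, but only defined when CONFIG_FOO=y */
	void foo_flush_cpu_state(void);

	/* a header built into every configuration */
	static inline void bar_flush(void)
	{
		/*
		 * system_supports_foo() is constant-false when CONFIG_FOO is
		 * not set, so the compiler drops the call below and the
		 * missing definition never causes a link failure.
		 */
		if (system_supports_foo())
			foo_flush_cpu_state();
	}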

Cheers
---Dave
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 18/28] arm64/sve: Preserve SVE registers around EFI runtime service calls

2017-10-12 Thread Catalin Marinas
On Tue, Oct 10, 2017 at 07:38:35PM +0100, Dave P Martin wrote:
> The EFI runtime services ABI allows EFI to make free use of the
> FPSIMD registers during EFI runtime service calls, subject to the
> callee-save requirements of the AArch64 procedure call standard.
> 
> However, the SVE architecture allows upper bits of the SVE vector
> registers to be zeroed as a side-effect of FPSIMD V-register
> writes.  This means that the SVE vector registers must be saved in
> their entirety in order to avoid data loss: non-SVE-aware EFI
> implementations cannot restore them correctly.
> 
> The non-IRQ case is already handled gracefully by
> kernel_neon_begin().  For the IRQ case, this patch allocates a
> suitable per-CPU stash buffer for the full SVE register state and
> uses it to preserve the affected registers around EFI calls.  It is
> currently unclear how the EFI runtime services ABI will be
> clarified with respect to SVE, so it is safest to assume that the
> predicate registers and FFR must be saved and restored too.
> 
> No attempt is made to restore the vector length after
> a call, for now.  It is deemed rather insane for EFI to change it,
> and contemporary EFI implementations certainly won't.
> 
> Signed-off-by: Dave Martin 
> Reviewed-by: Alex Bennée 
> Cc: Ard Biesheuvel 

Reviewed-by: Catalin Marinas 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 35/37] KVM: arm/arm64: Get rid of vgic_elrsr

2017-10-12 Thread Christoffer Dall
There is really no need to store the vgic_elrsr on the VGIC data
structures, as the only need we have for the elrsr is to figure out if an
LR is inactive when we save the VGIC state upon returning from the
guest.  We might as well store this in a temporary local variable.

This also gets rid of the endianness conversion in the VGIC save
function, which is completely unnecessary and would actually result in
incorrect functionality on big-endian systems, because we are only using
typed values here and not converting pointers to read different types.

Signed-off-by: Christoffer Dall 
---
 include/kvm/arm_vgic.h|  2 --
 virt/kvm/arm/hyp/vgic-v3-sr.c |  6 +++---
 virt/kvm/arm/vgic/vgic-v2.c   | 33 +++--
 virt/kvm/arm/vgic/vgic-v3.c   |  1 -
 4 files changed, 10 insertions(+), 32 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 34dba51..79c9e67 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -237,7 +237,6 @@ struct vgic_dist {
 struct vgic_v2_cpu_if {
u32 vgic_hcr;
u32 vgic_vmcr;
-   u64 vgic_elrsr; /* Saved only */
u32 vgic_apr;
u32 vgic_lr[VGIC_V2_MAX_LRS];
 };
@@ -246,7 +245,6 @@ struct vgic_v3_cpu_if {
u32 vgic_hcr;
u32 vgic_vmcr;
u32 vgic_sre;   /* Restored only, change ignored */
-   u32 vgic_elrsr; /* Saved only */
u32 vgic_ap0r[4];
u32 vgic_ap1r[4];
u64 vgic_lr[VGIC_V3_MAX_LRS];
diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index 91728fa..05548b2 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -222,15 +222,16 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu 
*vcpu)
if (used_lrs) {
int i;
u32 nr_pre_bits;
+   u32 elrsr;
 
-   cpu_if->vgic_elrsr = read_gicreg(ICH_ELSR_EL2);
+   elrsr = read_gicreg(ICH_ELSR_EL2);
 
write_gicreg(0, ICH_HCR_EL2);
val = read_gicreg(ICH_VTR_EL2);
nr_pre_bits = vtr_to_nr_pre_bits(val);
 
for (i = 0; i < used_lrs; i++) {
-   if (cpu_if->vgic_elrsr & (1 << i))
+   if (elrsr & (1 << i))
cpu_if->vgic_lr[i] &= ~ICH_LR_STATE;
else
cpu_if->vgic_lr[i] = __gic_v3_get_lr(i);
@@ -261,7 +262,6 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
	if (static_branch_unlikely(&vgic_v3_cpuif_trap))
write_gicreg(0, ICH_HCR_EL2);
 
-   cpu_if->vgic_elrsr = 0x;
cpu_if->vgic_ap0r[0] = 0;
cpu_if->vgic_ap0r[1] = 0;
cpu_if->vgic_ap0r[2] = 0;
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 259079b..df7e03b 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -247,7 +247,6 @@ void vgic_v2_enable(struct kvm_vcpu *vcpu)
 * anyway.
 */
vcpu->arch.vgic_cpu.vgic_v2.vgic_vmcr = 0;
-   vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0;
 
/* Get the show on the road... */
vcpu->arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN;
@@ -404,33 +403,19 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
return ret;
 }
 
-static void save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
-{
-   struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-   int nr_lr = kvm_vgic_global_state.nr_lr;
-   u32 elrsr0, elrsr1;
-
-   elrsr0 = readl_relaxed(base + GICH_ELRSR0);
-   if (unlikely(nr_lr > 32))
-   elrsr1 = readl_relaxed(base + GICH_ELRSR1);
-   else
-   elrsr1 = 0;
-
-#ifdef CONFIG_CPU_BIG_ENDIAN
-   cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
-#else
-   cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
-#endif
-}
-
 static void save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
 {
	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-   int i;
u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
+   u64 elrsr;
+   int i;
+
+   elrsr = readl_relaxed(base + GICH_ELRSR0);
+   if (unlikely(used_lrs > 32))
+   elrsr |= ((u64)readl_relaxed(base + GICH_ELRSR1)) << 32;
 
for (i = 0; i < used_lrs; i++) {
-   if (cpu_if->vgic_elrsr & (1UL << i))
+   if (elrsr & (1UL << i))
cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
else
cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i 
* 4));
@@ -452,13 +437,9 @@ void vgic_v2_save_state(struct kvm_vcpu *vcpu)
 
if (used_lrs) {
cpu_if->vgic_apr = readl_relaxed(base + 

[PATCH 25/37] KVM: arm64: Prepare to handle traps on remaining deferred EL1 sysregs

2017-10-12 Thread Christoffer Dall
Handle accesses during traps to any remaining EL1 registers which can be
deferred to vcpu_load and vcpu_put, by either accessing them directly on
the physical CPU when the latest version is stored there, or by
synchronizing the memory representation with the CPU state.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_emulate.h | 14 ---
 arch/arm64/kvm/inject_fault.c| 79 
 arch/arm64/kvm/sys_regs.c|  6 ++-
 3 files changed, 76 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h 
b/arch/arm64/include/asm/kvm_emulate.h
index 630dd60..69bb40d 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -66,11 +66,6 @@ static inline unsigned long *vcpu_pc(const struct kvm_vcpu 
*vcpu)
	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
 }
 
-static inline unsigned long *vcpu_elr_el1(const struct kvm_vcpu *vcpu)
-{
-   return (unsigned long *)&vcpu_gp_regs(vcpu)->elr_el1;
-}
-
 static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
 {
	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate;
@@ -120,15 +115,6 @@ static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 
reg_num,
vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val;
 }
 
-/* Get vcpu SPSR for current mode */
-static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
-{
-   if (vcpu_mode_is_32bit(vcpu))
-   return vcpu_spsr32(vcpu);
-
-   return (unsigned long *)&vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1];
-}
-
 static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu)
 {
u32 mode;
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index 45c7026..f4513fc 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -23,6 +23,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #define PSTATE_FAULT_BITS_64   (PSR_MODE_EL1h | PSR_A_BIT | PSR_F_BIT | \
@@ -33,13 +34,55 @@
 #define LOWER_EL_AArch64_VECTOR0x400
 #define LOWER_EL_AArch32_VECTOR0x600
 
+static u64 vcpu_get_vbar_el1(struct kvm_vcpu *vcpu)
+{
+   unsigned long vbar;
+
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   vbar = read_sysreg_el1(vbar);
+   else
+   vbar = vcpu_sys_reg(vcpu, VBAR_EL1);
+
+   if (vcpu_el1_is_32bit(vcpu))
+   return lower_32_bits(vbar);
+   return vbar;
+}
+
+static void vcpu_set_elr_el1(struct kvm_vcpu *vcpu, u64 val)
+{
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   write_sysreg_el1(val, elr);
+   else
+   vcpu_gp_regs(vcpu)->elr_el1 = val;
+}
+
+/* Set the SPSR for the current mode */
+static void vcpu_set_spsr(struct kvm_vcpu *vcpu, u64 val)
+{
+   if (vcpu_mode_is_32bit(vcpu))
+   *vcpu_spsr32(vcpu) = val;
+
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   write_sysreg_el1(val, spsr);
+   else
+   vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1] = val;
+}
+
+static u32 vcpu_get_c1_sctlr(struct kvm_vcpu *vcpu)
+{
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   return lower_32_bits(read_sysreg_el1(sctlr));
+   else
+   return vcpu_cp15(vcpu, c1_SCTLR);
+}
+
 static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
 {
unsigned long cpsr;
unsigned long new_spsr_value = *vcpu_cpsr(vcpu);
bool is_thumb = (new_spsr_value & COMPAT_PSR_T_BIT);
u32 return_offset = (is_thumb) ? 4 : 0;
-   u32 sctlr = vcpu_cp15(vcpu, c1_SCTLR);
+   u32 sctlr = vcpu_get_c1_sctlr(vcpu);
 
cpsr = mode | COMPAT_PSR_I_BIT;
 
@@ -51,14 +94,14 @@ static void prepare_fault32(struct kvm_vcpu *vcpu, u32 
mode, u32 vect_offset)
*vcpu_cpsr(vcpu) = cpsr;
 
/* Note: These now point to the banked copies */
-   *vcpu_spsr(vcpu) = new_spsr_value;
+   vcpu_set_spsr(vcpu, new_spsr_value);
*vcpu_reg32(vcpu, 14) = *vcpu_pc(vcpu) + return_offset;
 
/* Branch to exception vector */
if (sctlr & (1 << 13))
vect_offset += 0x;
else /* always have security exceptions */
-   vect_offset += vcpu_cp15(vcpu, c12_VBAR);
+   vect_offset += vcpu_get_vbar_el1(vcpu);
 
*vcpu_pc(vcpu) = vect_offset;
 }
@@ -79,6 +122,20 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool 
is_pabt,
u32 *far, *fsr;
bool is_lpae;
 
+   /*
+* We are going to need the latest values of the following system
+* registers:
+*   DFAR:  mapped to FAR_EL1
+*   IFAR:  mapped to FAR_EL1
+*   DFSR:  mapped to ESR_EL1
+*   TTBCR: mapped to TCR_EL1
+*/
+   if (vcpu->arch.sysregs_loaded_on_cpu) {
+   vcpu->arch.ctxt.sys_regs[FAR_EL1] = read_sysreg_el1(far);
+   vcpu->arch.ctxt.sys_regs[ESR_EL1] = read_sysreg_el1(esr);
+  

[PATCH 17/37] KVM: arm64: Move userspace system registers into separate function

2017-10-12 Thread Christoffer Dall
There's a semantic difference between the EL1 registers that control
operation of a kernel running in EL1 and EL1 registers that only control
userspace execution in EL0.  Since we can defer saving/restoring the
latter, move them into their own function.

We also take this chance to rename the function saving/restoring the
remaining system registers to make it clear this function deals with
the EL1 system registers.

No functional change.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/sysreg-sr.c | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index c4a3714..193c2b0 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -34,14 +34,18 @@ static void __hyp_text __sysreg_do_nothing(struct 
kvm_cpu_context *ctxt) { }
 
 static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
 {
+   ctxt->sys_regs[MDSCR_EL1]   = read_sysreg(mdscr_el1);
+   ctxt->gp_regs.regs.sp   = read_sysreg(sp_el0);
+}
+
+static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
+{
ctxt->sys_regs[ACTLR_EL1]   = read_sysreg(actlr_el1);
ctxt->sys_regs[TPIDR_EL0]   = read_sysreg(tpidr_el0);
ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0);
-   ctxt->sys_regs[MDSCR_EL1]   = read_sysreg(mdscr_el1);
-   ctxt->gp_regs.regs.sp   = read_sysreg(sp_el0);
 }
 
-static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
+static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 {
ctxt->sys_regs[MPIDR_EL1]   = read_sysreg(vmpidr_el2);
ctxt->sys_regs[CSSELR_EL1]  = read_sysreg(csselr_el1);
@@ -70,31 +74,37 @@ static void __hyp_text __sysreg_save_state(struct 
kvm_cpu_context *ctxt)
 }
 
 static hyp_alternate_select(__sysreg_call_save_host_state,
-   __sysreg_save_state, __sysreg_do_nothing,
+   __sysreg_save_el1_state, __sysreg_do_nothing,
ARM64_HAS_VIRT_HOST_EXTN);
 
 void __hyp_text __sysreg_save_host_state(struct kvm_cpu_context *ctxt)
 {
__sysreg_call_save_host_state()(ctxt);
__sysreg_save_common_state(ctxt);
+   __sysreg_save_user_state(ctxt);
 }
 
 void __hyp_text __sysreg_save_guest_state(struct kvm_cpu_context *ctxt)
 {
-   __sysreg_save_state(ctxt);
+   __sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
+   __sysreg_save_user_state(ctxt);
 }
 
 static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context 
*ctxt)
 {
-   write_sysreg(ctxt->sys_regs[ACTLR_EL1],   actlr_el1);
-   write_sysreg(ctxt->sys_regs[TPIDR_EL0],   tpidr_el0);
-   write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);
write_sysreg(ctxt->sys_regs[MDSCR_EL1],   mdscr_el1);
write_sysreg(ctxt->gp_regs.regs.sp,   sp_el0);
 }
 
-static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
+static void __hyp_text __sysreg_restore_user_state(struct kvm_cpu_context 
*ctxt)
+{
+   write_sysreg(ctxt->sys_regs[ACTLR_EL1], actlr_el1);
+   write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0);
+   write_sysreg(ctxt->sys_regs[TPIDRRO_EL0],   tpidrro_el0);
+}
+
+static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
 {
write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
write_sysreg(ctxt->sys_regs[CSSELR_EL1],csselr_el1);
@@ -123,19 +133,21 @@ static void __hyp_text __sysreg_restore_state(struct 
kvm_cpu_context *ctxt)
 }
 
 static hyp_alternate_select(__sysreg_call_restore_host_state,
-   __sysreg_restore_state, __sysreg_do_nothing,
+   __sysreg_restore_el1_state, __sysreg_do_nothing,
ARM64_HAS_VIRT_HOST_EXTN);
 
 void __hyp_text __sysreg_restore_host_state(struct kvm_cpu_context *ctxt)
 {
__sysreg_call_restore_host_state()(ctxt);
__sysreg_restore_common_state(ctxt);
+   __sysreg_restore_user_state(ctxt);
 }
 
 void __hyp_text __sysreg_restore_guest_state(struct kvm_cpu_context *ctxt)
 {
-   __sysreg_restore_state(ctxt);
+   __sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
+   __sysreg_restore_user_state(ctxt);
 }
 
 static void __hyp_text __fpsimd32_save_state(struct kvm_cpu_context *ctxt)
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 18/37] KVM: arm64: Rewrite sysreg alternatives to static keys

2017-10-12 Thread Christoffer Dall
As we are about to move calls around in the sysreg save/restore logic,
let's first rewrite the alternative function callers, because it is
going to make the next patches much easier to read.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/sysreg-sr.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 193c2b0..b55c4ad 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -22,9 +22,6 @@
 #include 
 #include 
 
-/* Yes, this does nothing, on purpose */
-static void __hyp_text __sysreg_do_nothing(struct kvm_cpu_context *ctxt) { }
-
 /*
  * Non-VHE: Both host and guest must save everything.
  *
@@ -73,13 +70,10 @@ static void __hyp_text __sysreg_save_el1_state(struct 
kvm_cpu_context *ctxt)
ctxt->gp_regs.regs.pstate   = read_sysreg_el2(spsr);
 }
 
-static hyp_alternate_select(__sysreg_call_save_host_state,
-   __sysreg_save_el1_state, __sysreg_do_nothing,
-   ARM64_HAS_VIRT_HOST_EXTN);
-
 void __hyp_text __sysreg_save_host_state(struct kvm_cpu_context *ctxt)
 {
-   __sysreg_call_save_host_state()(ctxt);
+   if (!has_vhe())
+   __sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
__sysreg_save_user_state(ctxt);
 }
@@ -132,13 +126,10 @@ static void __hyp_text __sysreg_restore_el1_state(struct 
kvm_cpu_context *ctxt)
write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr);
 }
 
-static hyp_alternate_select(__sysreg_call_restore_host_state,
-   __sysreg_restore_el1_state, __sysreg_do_nothing,
-   ARM64_HAS_VIRT_HOST_EXTN);
-
 void __hyp_text __sysreg_restore_host_state(struct kvm_cpu_context *ctxt)
 {
-   __sysreg_call_restore_host_state()(ctxt);
+   if (!has_vhe())
+   __sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
__sysreg_restore_user_state(ctxt);
 }
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 34/37] KVM: arm/arm64: Handle VGICv3 save/restore from the main VGIC code on VHE

2017-10-12 Thread Christoffer Dall
Just like we can program the GICv2 hypervisor control interface directly
from the core vgic code, we can do the same for the GICv3 hypervisor
control interface on VHE systems.

We do this by simply calling the save/restore functions when we have VHE
and we can then get rid of the save/restore function calls from the VHE
world switch function.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 3 ---
 virt/kvm/arm/vgic/vgic.c| 4 
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 90da506..0cdf6ae 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -366,8 +366,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
activate_traps_vhe(vcpu);
__activate_vm(vcpu);
 
-   __vgic_restore_state(vcpu);
-
sysreg_restore_guest_state_vhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
 
@@ -380,7 +378,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
goto again;
 
sysreg_save_guest_state_vhe(guest_ctxt);
-   __vgic_save_state(vcpu);
 
deactivate_traps_vhe(vcpu);
 
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 790fd66..f0072f7 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -689,6 +689,8 @@ static inline void vgic_save_state(struct kvm_vcpu *vcpu)
 {
	if (!static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
vgic_v2_save_state(vcpu);
+   else if (has_vhe())
+   __vgic_v3_save_state(vcpu);
 }
 
 /* Sync back the hardware VGIC state into our emulation after a guest's run. */
@@ -711,6 +713,8 @@ static inline void vgic_restore_state(struct kvm_vcpu *vcpu)
 {
	if (!static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
vgic_v2_restore_state(vcpu);
+   else if (has_vhe())
+   __vgic_v3_restore_state(vcpu);
 }
 
 /* Flush our emulation state into the GIC hardware before entering the guest. 
*/
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 33/37] KVM: arm/arm64: Move arm64-only vgic-v2-sr.c file to arm64

2017-10-12 Thread Christoffer Dall
The vgic-v2-sr.c file now only contains the logic to replay unaligned
accesses to the virtual CPU interface on 16K and 64K page systems, which
is only relevant on 64-bit platforms.  Therefore move this file to the
arm64 KVM tree, remove the compile directive from the 32-bit side
makefile, and remove the ifdef in the C file.

Signed-off-by: Christoffer Dall 
---
 arch/arm/kvm/hyp/Makefile | 1 -
 arch/arm64/kvm/hyp/Makefile   | 2 +-
 {virt/kvm/arm => arch/arm64/kvm}/hyp/vgic-v2-sr.c | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)
 rename {virt/kvm/arm => arch/arm64/kvm}/hyp/vgic-v2-sr.c (98%)

diff --git a/arch/arm/kvm/hyp/Makefile b/arch/arm/kvm/hyp/Makefile
index 8679405..4be5061 100644
--- a/arch/arm/kvm/hyp/Makefile
+++ b/arch/arm/kvm/hyp/Makefile
@@ -6,7 +6,6 @@ ccflags-y += -fno-stack-protector
 
 KVM=../../../../virt/kvm
 
-obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
 
diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
index 14c4e3b..b746fa5 100644
--- a/arch/arm64/kvm/hyp/Makefile
+++ b/arch/arm64/kvm/hyp/Makefile
@@ -6,10 +6,10 @@ ccflags-y += -fno-stack-protector
 
 KVM=../../../../virt/kvm
 
-obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
 
+obj-$(CONFIG_KVM_ARM_HOST) += vgic-v2-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += sysreg-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += debug-sr.o
 obj-$(CONFIG_KVM_ARM_HOST) += entry.o
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
similarity index 98%
rename from virt/kvm/arm/hyp/vgic-v2-sr.c
rename to arch/arm64/kvm/hyp/vgic-v2-sr.c
index b433257..fcd7b4e 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_ARM64
 /*
  * __vgic_v2_perform_cpuif_access -- perform a GICV access on behalf of the
  *  guest.
@@ -76,4 +75,3 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu 
*vcpu)
 
return 1;
 }
-#endif
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 27/37] KVM: arm64: Defer saving/restoring system registers to vcpu load/put on VHE

2017-10-12 Thread Christoffer Dall
Some system registers do not affect the host kernel's execution and can
therefore be loaded when we are about to run a VCPU and we don't have to
restore the host state to the hardware before the time when we are
actually about to return to userspace or schedule out the VCPU thread.

The EL1 system registers and the userspace state registers, which only
affect EL0 execution, do not affect the host kernel's execution.

The 32-bit system registers are not used by a VHE host kernel and
therefore don't need to be saved/restored on every entry/exit to/from
the guest, but can be deferred to vcpu_load and vcpu_put, respectively.

We have already prepared the trap handling code which accesses any of
these registers to directly access the registers on the physical CPU or
to sync the registers when needed.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c|  6 --
 arch/arm64/kvm/hyp/sysreg-sr.c | 46 ++
 2 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 6356bec..6a12504 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -337,11 +337,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
__vgic_restore_state(vcpu);
 
-   /*
-* We must restore the 32-bit state before the sysregs, thanks
-* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
-*/
-   __sysreg32_restore_state(vcpu);
sysreg_restore_guest_state_vhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
 
@@ -354,7 +349,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
goto again;
 
sysreg_save_guest_state_vhe(guest_ctxt);
-   __sysreg32_save_state(vcpu);
__vgic_save_state(vcpu);
 
__deactivate_traps(vcpu);
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 354ca02..6ff1ce5 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -25,8 +25,12 @@
 /*
  * Non-VHE: Both host and guest must save everything.
  *
- * VHE: Host must save tpidr*_el0, actlr_el1, mdscr_el1, sp_el0,
- * and guest must save everything.
+ * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and pstate,
+ * which are handled as part of the el2 return state) on every switch.
+ * tpidr_el0, tpidrro_el0, and actlr_el1 only need to be switched when going
+ * to host userspace or a different VCPU.  EL1 registers only need to be
+ * switched when potentially going to run a different VCPU.  The latter two
+ * classes are handled as part of kvm_arch_vcpu_load and kvm_arch_vcpu_put.
  */
 
 static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
@@ -85,14 +89,11 @@ void __hyp_text __sysreg_save_state_nvhe(struct 
kvm_cpu_context *ctxt)
 void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_save_common_state(ctxt);
-   __sysreg_save_user_state(ctxt);
 }
 
 void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt)
 {
-   __sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
-   __sysreg_save_user_state(ctxt);
__sysreg_save_el2_return_state(ctxt);
 }
 
@@ -153,14 +154,11 @@ void __hyp_text __sysreg_restore_state_nvhe(struct 
kvm_cpu_context *ctxt)
 void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_restore_common_state(ctxt);
-   __sysreg_restore_user_state(ctxt);
 }
 
 void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt)
 {
-   __sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
-   __sysreg_restore_user_state(ctxt);
__sysreg_restore_el2_return_state(ctxt);
 }
 
@@ -224,6 +222,26 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu 
*vcpu)
  */
 void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu)
 {
+   struct kvm_cpu_context *host_ctxt = vcpu->arch.host_cpu_context;
+   struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
+
+   if (!has_vhe())
+   return;
+
+   __sysreg_save_user_state(host_ctxt);
+
+
+   /*
+* Load guest EL1 and user state
+*
+* We must restore the 32-bit state before the sysregs, thanks
+* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
+*/
+   __sysreg32_restore_state(vcpu);
+   __sysreg_restore_user_state(guest_ctxt);
+   __sysreg_restore_el1_state(guest_ctxt);
+
+   vcpu->arch.sysregs_loaded_on_cpu = true;
 }
 
 /**
@@ -250,4 +268,16 @@ void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)
		__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
vcpu->arch.guest_vfp_loaded = 0;
}
+
+   if (!has_vhe())
+   return;
+
+   __sysreg_save_el1_state(guest_ctxt);
+   __sysreg_save_user_state(guest_ctxt);
+   __sysreg32_save_state(vcpu);
+
+   /* Restore host user state */
+   

[PATCH 28/37] KVM: arm64: Move common VHE/non-VHE trap config in separate functions

2017-10-12 Thread Christoffer Dall
As we are about to be more lazy with some of the trap configuration
register read/writes for VHE systems, move the logic that is currently
shared between VHE and non-VHE into a separate function which can be
called from either the world-switch path or from vcpu_load/vcpu_put.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 70 +
 1 file changed, 39 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 6a12504..c587416 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -23,6 +23,43 @@
 #include 
 #include 
 
+static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu)
+{
+   /*
+* We are about to set CPTR_EL2.TFP to trap all floating point
+* register accesses to EL2, however, the ARM ARM clearly states that
+* traps are only taken to EL2 if the operation would not otherwise
+* trap to EL1.  Therefore, always make sure that for 32-bit guests,
+* we set FPEXC.EN to prevent traps to EL1, when setting the TFP bit.
+* If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to
+* it will cause an exception.
+*/
+   if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd() &&
+   !vcpu->arch.guest_vfp_loaded) {
+   write_sysreg(1 << 30, fpexc32_el2);
+   isb();
+   }
+   write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
+
+   /* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */
+   write_sysreg(1 << 15, hstr_el2);
+   /*
+* Make sure we trap PMU access from EL0 to EL2. Also sanitize
+* PMSELR_EL0 to make sure it never contains the cycle
+* counter, which could make a PMXEVCNTR_EL0 access UNDEF at
+* EL1 instead of being trapped to EL2.
+*/
+   write_sysreg(0, pmselr_el0);
+   write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
+   write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
+}
+
+static void __hyp_text __deactivate_traps_common(void)
+{
+   write_sysreg(0, hstr_el2);
+   write_sysreg(0, pmuserenr_el0);
+}
+
 static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu)
 {
u64 val;
@@ -57,35 +94,7 @@ static hyp_alternate_select(__activate_traps_arch,
 
 static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 {
-   u64 val;
-
-   /*
-* We are about to set CPTR_EL2.TFP to trap all floating point
-* register accesses to EL2, however, the ARM ARM clearly states that
-* traps are only taken to EL2 if the operation would not otherwise
-* trap to EL1.  Therefore, always make sure that for 32-bit guests,
-* we set FPEXC.EN to prevent traps to EL1, when setting the TFP bit.
-* If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to
-* it will cause an exception.
-*/
-   val = vcpu->arch.hcr_el2;
-   if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd() &&
-   !vcpu->arch.guest_vfp_loaded) {
-   write_sysreg(1 << 30, fpexc32_el2);
-   isb();
-   }
-   write_sysreg(val, hcr_el2);
-   /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
-   write_sysreg(1 << 15, hstr_el2);
-   /*
-* Make sure we trap PMU access from EL0 to EL2. Also sanitize
-* PMSELR_EL0 to make sure it never contains the cycle
-* counter, which could make a PMXEVCNTR_EL0 access UNDEF at
-* EL1 instead of being trapped to EL2.
-*/
-   write_sysreg(0, pmselr_el0);
-   write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
-   write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
+   __activate_traps_common(vcpu);
__activate_traps_arch()(vcpu);
 }
 
@@ -131,9 +140,8 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu 
*vcpu)
if (vcpu->arch.hcr_el2 & HCR_VSE)
vcpu->arch.hcr_el2 = read_sysreg(hcr_el2);
 
+   __deactivate_traps_common();
__deactivate_traps_arch()();
-   write_sysreg(0, hstr_el2);
-   write_sysreg(0, pmuserenr_el0);
 }
 
 static inline void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 30/37] KVM: arm64: Configure c15, PMU, and debug register traps on cpu load/put for VHE

2017-10-12 Thread Christoffer Dall
We do not have to change the c15 trap setting on each switch to/from the
guest on VHE systems, because this setting only affects EL0.

The PMU and debug trap configuration can also be done on vcpu load/put
instead, because they don't affect how the host kernel can access the
debug registers while executing KVM kernel code, and KVM doesn't use
floating point itself.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 09be10f..13e137e 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -42,8 +42,6 @@ static void __hyp_text __activate_traps_fpsimd32(struct 
kvm_vcpu *vcpu)
 
 static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu)
 {
-   write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
-
/* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */
write_sysreg(1 << 15, hstr_el2);
/*
@@ -63,12 +61,15 @@ static void __hyp_text __deactivate_traps_common(void)
write_sysreg(0, pmuserenr_el0);
 }
 
+/* Activate the traps we can during vcpu_load with VHE */
 void activate_traps_vhe_load(struct kvm_vcpu *vcpu)
 {
u64 val;
 
+   /* Make sure 32-bit guests trap VFP */
__activate_traps_fpsimd32(vcpu);
 
+   /* Trap VFP accesses on a VHE system */
val = read_sysreg(cpacr_el1);
val |= CPACR_EL1_TTA;
if (vcpu->arch.guest_vfp_loaded)
@@ -76,11 +77,28 @@ void activate_traps_vhe_load(struct kvm_vcpu *vcpu)
else
val &= ~CPACR_EL1_FPEN;
write_sysreg(val, cpacr_el1);
+
+   /* Activate traps on impdef sysregs, PMU, and debug */
+   __activate_traps_common(vcpu);
 }
 
+/* Deactivate the traps we can during vcpu_put with VHE */
 void deactivate_traps_vhe_put(void)
 {
+   u64 mdcr_el2;
+
+   /* Re-enable host VFP access */
write_sysreg(CPACR_EL1_FPEN, cpacr_el1);
+
+   /* Re-enable host access to impdef sysregs and the PMU */
+   __deactivate_traps_common();
+
+   /* Re-enable host access to the debug regs */
+   mdcr_el2 = read_sysreg(mdcr_el2);
+   mdcr_el2 &= MDCR_EL2_HPMN_MASK |
+   MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT |
+   MDCR_EL2_TPMS;
+   write_sysreg(mdcr_el2, mdcr_el2);
 }
 
 static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu)
@@ -92,8 +110,13 @@ static void __hyp_text __activate_traps_nvhe(struct 
kvm_vcpu *vcpu)
 {
u64 val;
 
+   /* Activate traps on impdef sysregs, PMU, and debug */
+   __activate_traps_common(vcpu);
+
+   /* Make sure 32-bit guests trap VFP */
__activate_traps_fpsimd32(vcpu);
 
+   /* Trap VFP accesses on a non-VHE system */
val = CPTR_EL2_DEFAULT;
val |= CPTR_EL2_TTA;
if (vcpu->arch.guest_vfp_loaded)
@@ -109,20 +132,14 @@ static hyp_alternate_select(__activate_traps_arch,
 
 static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 {
-   __activate_traps_common(vcpu);
__activate_traps_arch()(vcpu);
+   write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
 }
 
 static void __hyp_text __deactivate_traps_vhe(void)
 {
extern char vectors[];  /* kernel exception vectors */
-   u64 mdcr_el2 = read_sysreg(mdcr_el2);
 
-   mdcr_el2 &= MDCR_EL2_HPMN_MASK |
-   MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT |
-   MDCR_EL2_TPMS;
-
-   write_sysreg(mdcr_el2, mdcr_el2);
write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
write_sysreg(vectors, vbar_el1);
 }
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 31/37] KVM: arm64: Separate activate_traps and deactive_traps for VHE and non-VHE

2017-10-12 Thread Christoffer Dall
To make the code more readable and to avoid the overhead of a function
call, let's get rid of a pair of the alternative function selectors and
explicitly call the VHE and non-VHE functions instead, telling the
compiler to try to inline the static function if it can.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 78 +
 1 file changed, 37 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 13e137e..5692aa0 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -101,9 +101,27 @@ void deactivate_traps_vhe_put(void)
write_sysreg(mdcr_el2, mdcr_el2);
 }
 
-static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu)
+static inline void activate_traps_vhe(struct kvm_vcpu *vcpu)
 {
-   write_sysreg(__kvm_hyp_vector, vbar_el1);
+   write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
+   write_sysreg_el2(__kvm_hyp_vector, vbar);
+}
+
+static inline void deactivate_traps_vhe(struct kvm_vcpu *vcpu)
+{
+   extern char vectors[];  /* kernel exception vectors */
+
+   /*
+* If we pended a virtual abort, preserve it until it gets
+* cleared. See D1.14.3 (Virtual Interrupts) for details, but
+* the crucial bit is "On taking a vSError interrupt,
+* HCR_EL2.VSE is cleared to 0."
+*/
+   if (vcpu->arch.hcr_el2 & HCR_VSE)
+   vcpu->arch.hcr_el2 = read_sysreg(hcr_el2);
+
+   write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+   write_sysreg(vectors, vbar_el1);
 }
 
 static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
@@ -124,44 +142,15 @@ static void __hyp_text __activate_traps_nvhe(struct 
kvm_vcpu *vcpu)
else
val |= CPTR_EL2_TFP;
write_sysreg(val, cptr_el2);
-}
 
-static hyp_alternate_select(__activate_traps_arch,
-   __activate_traps_nvhe, __activate_traps_vhe,
-   ARM64_HAS_VIRT_HOST_EXTN);
-
-static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
-{
-   __activate_traps_arch()(vcpu);
+   /* Configure all other hypervisor traps and features */
write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
 }
 
-static void __hyp_text __deactivate_traps_vhe(void)
-{
-   extern char vectors[];  /* kernel exception vectors */
-
-   write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
-   write_sysreg(vectors, vbar_el1);
-}
-
-static void __hyp_text __deactivate_traps_nvhe(void)
+static void __hyp_text __deactivate_traps_nvhe(struct kvm_vcpu *vcpu)
 {
-   u64 mdcr_el2 = read_sysreg(mdcr_el2);
-
-   mdcr_el2 &= MDCR_EL2_HPMN_MASK;
-   mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
-
-   write_sysreg(mdcr_el2, mdcr_el2);
-   write_sysreg(HCR_RW, hcr_el2);
-   write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
-}
-
-static hyp_alternate_select(__deactivate_traps_arch,
-   __deactivate_traps_nvhe, __deactivate_traps_vhe,
-   ARM64_HAS_VIRT_HOST_EXTN);
+   u64 mdcr_el2;
 
-static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
-{
/*
 * If we pended a virtual abort, preserve it until it gets
 * cleared. See D1.14.3 (Virtual Interrupts) for details, but
@@ -172,7 +161,14 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu 
*vcpu)
vcpu->arch.hcr_el2 = read_sysreg(hcr_el2);
 
__deactivate_traps_common();
-   __deactivate_traps_arch()();
+
+   mdcr_el2 = read_sysreg(mdcr_el2);
+   mdcr_el2 &= MDCR_EL2_HPMN_MASK;
+   mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
+
+   write_sysreg(mdcr_el2, mdcr_el2);
+   write_sysreg(HCR_RW, hcr_el2);
+   write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
 }
 
 static inline void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
@@ -371,7 +367,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
sysreg_save_host_state_vhe(host_ctxt);
 
-   __activate_traps(vcpu);
+   activate_traps_vhe(vcpu);
__activate_vm(vcpu);
 
__vgic_restore_state(vcpu);
@@ -390,7 +386,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
sysreg_save_guest_state_vhe(guest_ctxt);
__vgic_save_state(vcpu);
 
-   __deactivate_traps(vcpu);
+   deactivate_traps_vhe(vcpu);
 
sysreg_restore_host_state_vhe(host_ctxt);
 
@@ -418,7 +414,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
__sysreg_save_state_nvhe(host_ctxt);
 
-   __activate_traps(vcpu);
+   __activate_traps_nvhe(vcpu);
__activate_vm(vcpu);
 
__vgic_restore_state(vcpu);
@@ -445,7 +441,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
__timer_disable_traps(vcpu);
__vgic_save_state(vcpu);
 
-   __deactivate_traps(vcpu);
+   __deactivate_traps_nvhe(vcpu);
__deactivate_vm(vcpu);
 
__sysreg_restore_state_nvhe(host_ctxt);
@@ -471,7 

[PATCH 37/37] KVM: arm/arm64: Avoid VGICv3 save/restore on VHE with no IRQs

2017-10-12 Thread Christoffer Dall
We can finally get completely rid of any calls to the VGICv3
save/restore functions when the AP lists are empty on VHE systems.  This
requires carefully factoring out trap configuration from saving and
restoring state, and carefully choosing what to do on the VHE and
non-VHE path.

One of the challenges is that we cannot save/restore the VMCR lazily
because we can only write the VMCR when ICC_SRE_EL1.SRE is cleared when
emulating a GICv2-on-GICv3, since otherwise all Group-0 interrupts end
up being delivered as FIQ.

To solve this problem, and still provide fast performance in the fast
path of exiting a VM when no interrupts are pending (which also
optimizes the latency for actually delivering virtual interrupts coming
from physical interrupts), we orchestrate a dance of only doing the
activate/deactivate traps in vgic load/put for VHE systems (which can
have ICC_SRE_EL1.SRE cleared when running in the host), and doing the
configuration on every round-trip on non-VHE systems.
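
In other words, on VHE the vgic vcpu load/put path ends up doing something
roughly like this (a simplified sketch only; the vgic_v3_load()/vgic_v3_put()
hook names are illustrative, the real wiring follows from the diff below and
the rest of the series):

	/* called via kvm_arch_vcpu_load() */
	void vgic_v3_load(struct kvm_vcpu *vcpu)
	{
		if (!has_vhe())
			return;		/* non-VHE: traps handled on every round-trip */
		__vgic_v3_activate_traps(vcpu);
	}

	/* called via kvm_arch_vcpu_put() */
	void vgic_v3_put(struct kvm_vcpu *vcpu)
	{
		if (!has_vhe())
			return;
		__vgic_v3_deactivate_traps(vcpu);
	}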

Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_hyp.h   |   2 +
 arch/arm/kvm/hyp/switch.c|   8 ++-
 arch/arm64/include/asm/kvm_hyp.h |   2 +
 arch/arm64/kvm/hyp/switch.c  |   8 ++-
 virt/kvm/arm/hyp/vgic-v3-sr.c| 116 +--
 virt/kvm/arm/vgic/vgic-v3.c  |   6 ++
 virt/kvm/arm/vgic/vgic.c |   7 +--
 7 files changed, 101 insertions(+), 48 deletions(-)

diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index b3dd4f4..d01676e 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -109,6 +109,8 @@ void __sysreg_restore_state(struct kvm_cpu_context *ctxt);
 
 void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
 void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
+void __vgic_v3_activate_traps(struct kvm_vcpu *vcpu);
+void __vgic_v3_deactivate_traps(struct kvm_vcpu *vcpu);
 void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu);
 void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c
index 0d834f8..6b487c2 100644
--- a/arch/arm/kvm/hyp/switch.c
+++ b/arch/arm/kvm/hyp/switch.c
@@ -89,14 +89,18 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu 
*vcpu)
 
 static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu)
 {
-   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
+   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
__vgic_v3_save_state(vcpu);
+   __vgic_v3_deactivate_traps(vcpu);
+   }
 }
 
 static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
 {
-   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
+   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
+   __vgic_v3_activate_traps(vcpu);
__vgic_v3_restore_state(vcpu);
+   }
 }
 
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 62872ce..1c21a70 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -125,6 +125,8 @@ int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
 void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
+void __vgic_v3_activate_traps(struct kvm_vcpu *vcpu);
+void __vgic_v3_deactivate_traps(struct kvm_vcpu *vcpu);
 void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu);
 void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu);
 int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 0cdf6ae..42e6d56 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -184,14 +184,18 @@ static inline void __hyp_text __deactivate_vm(struct 
kvm_vcpu *vcpu)
 
 static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu)
 {
-   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
+   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
__vgic_v3_save_state(vcpu);
+   __vgic_v3_deactivate_traps(vcpu);
+   }
 }
 
 static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
 {
-   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
+   if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif)) {
+   __vgic_v3_activate_traps(vcpu);
__vgic_v3_restore_state(vcpu);
+   }
 }
 
 static bool __hyp_text __true_value(void)
diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index ed5da75..34d71d2 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -208,15 +208,15 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu 
*vcpu)
 {
	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
-   u64 val;
 
/*
 * 

[PATCH 32/37] KVM: arm/arm64: Handle VGICv2 save/restore from the main VGIC code

2017-10-12 Thread Christoffer Dall
We can program the GICv2 hypervisor control interface logic directly
from the core vgic code, and do the save/restore directly from the
flush/sync functions instead, which can lead to a number of future
optimizations.

Signed-off-by: Christoffer Dall 
---
 arch/arm/kvm/hyp/switch.c|  4 --
 arch/arm64/include/asm/kvm_hyp.h |  2 -
 arch/arm64/kvm/hyp/switch.c  |  4 --
 virt/kvm/arm/hyp/vgic-v2-sr.c| 83 
 virt/kvm/arm/vgic/vgic-init.c| 22 ++
 virt/kvm/arm/vgic/vgic-v2.c  | 92 
 virt/kvm/arm/vgic/vgic.c | 21 -
 virt/kvm/arm/vgic/vgic.h |  5 +++
 8 files changed, 130 insertions(+), 103 deletions(-)

diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c
index c3b9799..0d834f8 100644
--- a/arch/arm/kvm/hyp/switch.c
+++ b/arch/arm/kvm/hyp/switch.c
@@ -91,16 +91,12 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu 
*vcpu)
 {
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
__vgic_v3_save_state(vcpu);
-   else
-   __vgic_v2_save_state(vcpu);
 }
 
 static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
 {
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
__vgic_v3_restore_state(vcpu);
-   else
-   __vgic_v2_restore_state(vcpu);
 }
 
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 28d5f3c..bd3fe64 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,8 +121,6 @@ typeof(orig) * __hyp_text fname(void)   
\
return val; \
 }
 
-void __vgic_v2_save_state(struct kvm_vcpu *vcpu);
-void __vgic_v2_restore_state(struct kvm_vcpu *vcpu);
 int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 5692aa0..90da506 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -186,16 +186,12 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu 
*vcpu)
 {
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
__vgic_v3_save_state(vcpu);
-   else
-   __vgic_v2_save_state(vcpu);
 }
 
 static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
 {
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
__vgic_v3_restore_state(vcpu);
-   else
-   __vgic_v2_restore_state(vcpu);
 }
 
 static bool __hyp_text __true_value(void)
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index a3f18d3..b433257 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -22,89 +22,6 @@
 #include 
 #include 
 
-static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
-{
-   struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-   int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
-   u32 elrsr0, elrsr1;
-
-   elrsr0 = readl_relaxed(base + GICH_ELRSR0);
-   if (unlikely(nr_lr > 32))
-   elrsr1 = readl_relaxed(base + GICH_ELRSR1);
-   else
-   elrsr1 = 0;
-
-#ifdef CONFIG_CPU_BIG_ENDIAN
-   cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
-#else
-   cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
-#endif
-}
-
-static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
-{
-   struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-   int i;
-   u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
-
-   for (i = 0; i < used_lrs; i++) {
-   if (cpu_if->vgic_elrsr & (1UL << i))
-   cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
-   else
-   cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i 
* 4));
-
-   writel_relaxed(0, base + GICH_LR0 + (i * 4));
-   }
-}
-
-/* vcpu is already in the HYP VA space */
-void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
-{
-   struct kvm *kvm = kern_hyp_va(vcpu->kvm);
-   struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-   struct vgic_dist *vgic = &kvm->arch.vgic;
-   void __iomem *base = kern_hyp_va(vgic->vctrl_base);
-   u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
-
-   if (!base)
-   return;
-
-   if (used_lrs) {
-   cpu_if->vgic_apr = readl_relaxed(base + GICH_APR);
-
-   save_elrsr(vcpu, base);
-   save_lrs(vcpu, base);
-
-   writel_relaxed(0, base + GICH_HCR);
-   } else {
-   cpu_if->vgic_elrsr = ~0UL;
-   cpu_if->vgic_apr = 0;
-   }
-}
-
-/* vcpu is already in the HYP VA space */

[PATCH 26/37] KVM: arm64: Prepare to handle traps on deferred AArch32 sysregs

2017-10-12 Thread Christoffer Dall
Handle accesses to any AArch32 EL1 system registers where we can defer
saving and restoring them to vcpu_load and vcpu_put, and which are
stored in special EL2 registers only used to support 32-bit guests.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/inject_fault.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index f4513fc..02990f5 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -59,9 +59,18 @@ static void vcpu_set_elr_el1(struct kvm_vcpu *vcpu, u64 val)
 /* Set the SPSR for the current mode */
 static void vcpu_set_spsr(struct kvm_vcpu *vcpu, u64 val)
 {
-   if (vcpu_mode_is_32bit(vcpu))
+   if (vcpu_mode_is_32bit(vcpu)) {
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   __sysreg32_save_state(vcpu);
+
*vcpu_spsr32(vcpu) = val;
 
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   __sysreg32_restore_state(vcpu);
+
+   return;
+   }
+
if (vcpu->arch.sysregs_loaded_on_cpu)
write_sysreg_el1(val, spsr);
else
@@ -129,11 +138,13 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool 
is_pabt,
 *   IFAR:  mapped to FAR_EL1
 *   DFSR:  mapped to ESR_EL1
 *   TTBCR: mapped to TCR_EL1
+*   IFSR:  stored in IFSR32_EL2
 */
if (vcpu->arch.sysregs_loaded_on_cpu) {
vcpu->arch.ctxt.sys_regs[FAR_EL1] = read_sysreg_el1(far);
vcpu->arch.ctxt.sys_regs[ESR_EL1] = read_sysreg_el1(esr);
vcpu->arch.ctxt.sys_regs[TCR_EL1] = read_sysreg_el1(tcr);
+   vcpu->arch.ctxt.sys_regs[IFSR32_EL2] = read_sysreg(ifsr32_el2);
}
 
if (is_pabt) {
@@ -161,6 +172,7 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool 
is_pabt,
if (vcpu->arch.sysregs_loaded_on_cpu) {
write_sysreg_el1(vcpu->arch.ctxt.sys_regs[FAR_EL1], far);
write_sysreg_el1(vcpu->arch.ctxt.sys_regs[ESR_EL1], esr);
+   write_sysreg(vcpu->arch.ctxt.sys_regs[IFSR32_EL2], ifsr32_el2);
}
 }
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 20/37] KVM: arm64: Unify non-VHE host/guest sysreg save and restore functions

2017-10-12 Thread Christoffer Dall
There is no need to have multiple identical functions with different
names for saving host and guest state.  When saving and restoring state
for the host and guest, the state is the same for both contexts, and
that's why we have the kvm_cpu_context structure.  Delete one
version and rename the other to simply save/restore.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_hyp.h |  6 ++
 arch/arm64/kvm/hyp/switch.c  | 10 +-
 arch/arm64/kvm/hyp/sysreg-sr.c   | 18 ++
 3 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 998152d..3f54c55 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -132,10 +132,8 @@ int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
 void __timer_enable_traps(struct kvm_vcpu *vcpu);
 void __timer_disable_traps(struct kvm_vcpu *vcpu);
 
-void __sysreg_save_host_state_nvhe(struct kvm_cpu_context *ctxt);
-void __sysreg_restore_host_state_nvhe(struct kvm_cpu_context *ctxt);
-void __sysreg_save_guest_state_nvhe(struct kvm_cpu_context *ctxt);
-void __sysreg_restore_guest_state_nvhe(struct kvm_cpu_context *ctxt);
+void __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt);
+void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt);
 void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt);
 void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt);
 void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 7c4d430..6356bec 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -383,7 +383,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;
 
-   __sysreg_save_host_state_nvhe(host_ctxt);
+   __sysreg_save_state_nvhe(host_ctxt);
 
__activate_traps(vcpu);
__activate_vm(vcpu);
@@ -396,7 +396,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 * to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
 */
__sysreg32_restore_state(vcpu);
-   __sysreg_restore_guest_state_nvhe(guest_ctxt);
+   __sysreg_restore_state_nvhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
 
/* Jump in the fire! */
@@ -407,7 +407,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
if (fixup_guest_exit(vcpu, &exit_code))
goto again;
 
-   __sysreg_save_guest_state_nvhe(guest_ctxt);
+   __sysreg_save_state_nvhe(guest_ctxt);
__sysreg32_save_state(vcpu);
__timer_disable_traps(vcpu);
__vgic_save_state(vcpu);
@@ -415,7 +415,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
__deactivate_traps(vcpu);
__deactivate_vm(vcpu);
 
-   __sysreg_restore_host_state_nvhe(host_ctxt);
+   __sysreg_restore_state_nvhe(host_ctxt);
 
/*
 * This must come after restoring the host sysregs, since a non-VHE
@@ -440,7 +440,7 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 
elr, u64 par,
__timer_disable_traps(vcpu);
__deactivate_traps(vcpu);
__deactivate_vm(vcpu);
-   __sysreg_restore_host_state_nvhe(__host_ctxt);
+   __sysreg_restore_state_nvhe(__host_ctxt);
}
 
/*
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index d8c42de..f5c1b44 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -70,14 +70,7 @@ static void __hyp_text __sysreg_save_el1_state(struct 
kvm_cpu_context *ctxt)
ctxt->gp_regs.regs.pstate   = read_sysreg_el2(spsr);
 }
 
-void __hyp_text __sysreg_save_host_state_nvhe(struct kvm_cpu_context *ctxt)
-{
-   __sysreg_save_el1_state(ctxt);
-   __sysreg_save_common_state(ctxt);
-   __sysreg_save_user_state(ctxt);
-}
-
-void __hyp_text __sysreg_save_guest_state_nvhe(struct kvm_cpu_context *ctxt)
+void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
@@ -138,14 +131,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct 
kvm_cpu_context *ctxt)
write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr);
 }
 
-void __hyp_text __sysreg_restore_host_state_nvhe(struct kvm_cpu_context *ctxt)
-{
-   __sysreg_restore_el1_state(ctxt);
-   __sysreg_restore_common_state(ctxt);
-   __sysreg_restore_user_state(ctxt);
-}
-
-void __hyp_text __sysreg_restore_guest_state_nvhe(struct kvm_cpu_context *ctxt)
+void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu

[PATCH 22/37] KVM: arm64: Change 32-bit handling of VM system registers

2017-10-12 Thread Christoffer Dall
We currently handle 32-bit accesses to trapped VM system registers using
the 32-bit index into the coproc array on the vcpu structure, which is a
union of the coprog array and the sysreg array.

Since all the 32-bit coproc indices are created to correspond to the
architectural mapping between 64-bit system registers and 32-bit
coprocessor registers, and because the AArch64 system registers are
double the size of the AArch32 coprocessor registers, we can always find
the system register entry that we must update by dividing the 32-bit
coproc index by 2.
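
For illustration, the index arithmetic boils down to something like the
following stand-alone sketch (not part of the patch; the helper name and
plain-C types are made up for this example):

#include <stdint.h>

/*
 * Hypothetical illustration only: 32-bit coproc register r lives in
 * sys_regs[r / 2], and r % 2 selects which 32-bit half of that slot it
 * occupies (odd index -> upper half, even index -> lower half), which
 * mirrors the new logic in access_vm_reg() below.
 */
static uint64_t fold_cp15_write(uint64_t sysreg64, uint32_t regval32,
				unsigned int cp15_idx)
{
	if (cp15_idx % 2)	/* odd index: replace the upper 32 bits */
		return ((uint64_t)regval32 << 32) | (uint32_t)sysreg64;
	/* even index: replace the lower 32 bits */
	return (sysreg64 & 0xffffffff00000000ULL) | regval32;
}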

This is going to make our lives much easier when we have to start
accessing system registers that use deferred save/restore and might
have to be read directly from the physical CPU.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_host.h |  8 
 arch/arm64/kvm/sys_regs.c | 20 +++-
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 5e09eb9..9f5761f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -289,14 +289,6 @@ struct kvm_vcpu_arch {
 #define vcpu_cp14(v,r) ((v)->arch.ctxt.copro[(r)])
 #define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r)])
 
-#ifdef CONFIG_CPU_BIG_ENDIAN
-#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r))
-#define vcpu_cp15_64_low(v,r)  vcpu_cp15((v),(r) + 1)
-#else
-#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r) + 1)
-#define vcpu_cp15_64_low(v,r)  vcpu_cp15((v),(r))
-#endif
-
 struct kvm_vm_stat {
ulong remote_tlb_flush;
 };
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index bb0e41b..dbe35fd 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -120,16 +120,26 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  const struct sys_reg_desc *r)
 {
bool was_enabled = vcpu_has_cache_enabled(vcpu);
+   u64 val;
+   int reg = r->reg;
 
BUG_ON(!p->is_write);
 
-   if (!p->is_aarch32) {
-   vcpu_sys_reg(vcpu, r->reg) = p->regval;
+   /* See the 32bit mapping in kvm_host.h */
+   if (p->is_aarch32)
+   reg = r->reg / 2;
+
+   if (!p->is_aarch32 || !p->is_32bit) {
+   val = p->regval;
} else {
-   if (!p->is_32bit)
-   vcpu_cp15_64_high(vcpu, r->reg) = 
upper_32_bits(p->regval);
-   vcpu_cp15_64_low(vcpu, r->reg) = lower_32_bits(p->regval);
+   val = vcpu_sys_reg(vcpu, reg);
+   if (r->reg % 2)
+   val = (p->regval << 32) | (u64)lower_32_bits(val);
+   else
+   val = ((u64)upper_32_bits(val) << 32) |
+   (u64)lower_32_bits(p->regval);
}
+   vcpu_sys_reg(vcpu, reg) = val;
 
kvm_toggle_cache(vcpu, was_enabled);
return true;
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 36/37] KVM: arm/arm64: Move VGIC APR save/restore to vgic put/load

2017-10-12 Thread Christoffer Dall
The APRs can only have bits set when the guest acknowledges an interrupt
in the LR and can only have a bit cleared when the guest EOIs an
interrupt in the LR.  Therefore, if we have no LRs with any
pending/active interrupts, the APR cannot change value and there is no
need to clear it on every exit from the VM (hint: it will have already
been cleared when we exited the guest the last time with the LRs all
EOIed).

The only case we need to take care of is when we migrate the VCPU away
from a CPU or migrate a new VCPU onto a CPU, or when we return to
userspace to capture the state of the VCPU for migration.  To make sure
this works, factor out the APR save/restore functionality into separate
functions called from the VCPU (and by extension VGIC) put/load hooks.
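
The restore-side hunk of the diff below is cut off in this archive; as a
rough sketch (reconstructed from the switch statements being removed, so
not necessarily verbatim what the patch adds), the factored-out save
helper ends up looking approximately like this, with the AP1R half
following the same pattern via __vgic_v3_read_ap1rn():

void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu)
{
	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
	u64 val = read_gicreg(ICH_VTR_EL2);
	u32 nr_pre_bits = vtr_to_nr_pre_bits(val);

	switch (nr_pre_bits) {
	case 7:
		cpu_if->vgic_ap0r[3] = __vgic_v3_read_ap0rn(3);
		cpu_if->vgic_ap0r[2] = __vgic_v3_read_ap0rn(2);
		/* fall through */
	case 6:
		cpu_if->vgic_ap0r[1] = __vgic_v3_read_ap0rn(1);
		/* fall through */
	default:
		cpu_if->vgic_ap0r[0] = __vgic_v3_read_ap0rn(0);
	}

	/* ... and the same switch again for cpu_if->vgic_ap1r[]. */
}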

Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_hyp.h   |   2 +
 arch/arm64/include/asm/kvm_hyp.h |   2 +
 virt/kvm/arm/hyp/vgic-v3-sr.c| 123 +--
 virt/kvm/arm/vgic/vgic-v2.c  |   7 +--
 virt/kvm/arm/vgic/vgic-v3.c  |   5 ++
 5 files changed, 77 insertions(+), 62 deletions(-)

diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index ab20ffa..b3dd4f4 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -109,6 +109,8 @@ void __sysreg_restore_state(struct kvm_cpu_context *ctxt);
 
 void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
 void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
+void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu);
+void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu);
 
 asmlinkage void __vfp_save_state(struct vfp_hard_struct *vfp);
 asmlinkage void __vfp_restore_state(struct vfp_hard_struct *vfp);
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index bd3fe64..62872ce 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -125,6 +125,8 @@ int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
 void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
+void __vgic_v3_save_aprs(struct kvm_vcpu *vcpu);
+void __vgic_v3_restore_aprs(struct kvm_vcpu *vcpu);
 int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 void __timer_enable_traps(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index 05548b2..ed5da75 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -221,14 +221,11 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu 
*vcpu)
 
if (used_lrs) {
int i;
-   u32 nr_pre_bits;
u32 elrsr;
 
elrsr = read_gicreg(ICH_ELSR_EL2);
 
write_gicreg(0, ICH_HCR_EL2);
-   val = read_gicreg(ICH_VTR_EL2);
-   nr_pre_bits = vtr_to_nr_pre_bits(val);
 
for (i = 0; i < used_lrs; i++) {
if (elrsr & (1 << i))
@@ -238,38 +235,9 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 
__gic_v3_set_lr(0, i);
}
-
-   switch (nr_pre_bits) {
-   case 7:
-   cpu_if->vgic_ap0r[3] = __vgic_v3_read_ap0rn(3);
-   cpu_if->vgic_ap0r[2] = __vgic_v3_read_ap0rn(2);
-   case 6:
-   cpu_if->vgic_ap0r[1] = __vgic_v3_read_ap0rn(1);
-   default:
-   cpu_if->vgic_ap0r[0] = __vgic_v3_read_ap0rn(0);
-   }
-
-   switch (nr_pre_bits) {
-   case 7:
-   cpu_if->vgic_ap1r[3] = __vgic_v3_read_ap1rn(3);
-   cpu_if->vgic_ap1r[2] = __vgic_v3_read_ap1rn(2);
-   case 6:
-   cpu_if->vgic_ap1r[1] = __vgic_v3_read_ap1rn(1);
-   default:
-   cpu_if->vgic_ap1r[0] = __vgic_v3_read_ap1rn(0);
-   }
} else {
if (static_branch_unlikely(&vgic_v3_cpuif_trap))
write_gicreg(0, ICH_HCR_EL2);
-
-   cpu_if->vgic_ap0r[0] = 0;
-   cpu_if->vgic_ap0r[1] = 0;
-   cpu_if->vgic_ap0r[2] = 0;
-   cpu_if->vgic_ap0r[3] = 0;
-   cpu_if->vgic_ap1r[0] = 0;
-   cpu_if->vgic_ap1r[1] = 0;
-   cpu_if->vgic_ap1r[2] = 0;
-   cpu_if->vgic_ap1r[3] = 0;
}
 
val = read_gicreg(ICC_SRE_EL2);
@@ -286,8 +254,6 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu 
*vcpu)
 {
struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
-   u64 val;
-   u32 nr_pre_bits;
int i;
 
/*
@@ -305,32 +271,9 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu 
*vcpu)
write_gicreg(cpu_if->vgic_vmcr, ICH_VMCR_EL2);
}
 
-   val = read_gicreg(ICH_VTR_EL2);
-   

[PATCH 29/37] KVM: arm64: Configure FPSIMD traps on vcpu load/put for VHE

2017-10-12 Thread Christoffer Dall
There is no need to enable/disable traps to FP registers on every switch
to/from the VM, because the host kernel does not use this resource
without calling vcpu_put.  We can therefore move things around enough
that we still always write FPEXC32_EL2 before programming CPTR_EL2 but
only program these during vcpu load/put.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_hyp.h |  3 +++
 arch/arm64/kvm/hyp/switch.c  | 34 --
 arch/arm64/kvm/hyp/sysreg-sr.c   |  4 
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 3f54c55..28d5f3c 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -148,6 +148,9 @@ void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
 void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
 bool __fpsimd_enabled(void);
 
+void activate_traps_vhe_load(struct kvm_vcpu *vcpu);
+void deactivate_traps_vhe_put(void);
+
 u64 __guest_enter(struct kvm_vcpu *vcpu, struct kvm_cpu_context *host_ctxt);
 void __noreturn __hyp_do_panic(unsigned long, ...);
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index c587416..09be10f 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -23,22 +23,25 @@
 #include 
 #include 
 
-static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu)
+static void __hyp_text __activate_traps_fpsimd32(struct kvm_vcpu *vcpu)
 {
/*
-* We are about to set CPTR_EL2.TFP to trap all floating point
-* register accesses to EL2, however, the ARM ARM clearly states that
-* traps are only taken to EL2 if the operation would not otherwise
-* trap to EL1.  Therefore, always make sure that for 32-bit guests,
-* we set FPEXC.EN to prevent traps to EL1, when setting the TFP bit.
-* If FP/ASIMD is not implemented, FPEXC is UNDEFINED and any access to
-* it will cause an exception.
+* We are about to trap all floating point register accesses to EL2,
+* however, traps are only taken to EL2 if the operation would not
+* otherwise trap to EL1.  Therefore, always make sure that for 32-bit
+* guests, we set FPEXC.EN to prevent traps to EL1, when setting the
+* TFP bit.  If FP/ASIMD is not implemented, FPEXC is UNDEFINED and
+* any access to it will cause an exception.
 */
if (vcpu_el1_is_32bit(vcpu) && system_supports_fpsimd() &&
!vcpu->arch.guest_vfp_loaded) {
write_sysreg(1 << 30, fpexc32_el2);
isb();
}
+}
+
+static void __hyp_text __activate_traps_common(struct kvm_vcpu *vcpu)
+{
write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
 
/* Trap on AArch32 cp15 c15 (impdef sysregs) accesses (EL1 or EL0) */
@@ -60,10 +63,12 @@ static void __hyp_text __deactivate_traps_common(void)
write_sysreg(0, pmuserenr_el0);
 }
 
-static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu)
+void activate_traps_vhe_load(struct kvm_vcpu *vcpu)
 {
u64 val;
 
+   __activate_traps_fpsimd32(vcpu);
+
val = read_sysreg(cpacr_el1);
val |= CPACR_EL1_TTA;
if (vcpu->arch.guest_vfp_loaded)
@@ -71,7 +76,15 @@ static void __hyp_text __activate_traps_vhe(struct kvm_vcpu 
*vcpu)
else
val &= ~CPACR_EL1_FPEN;
write_sysreg(val, cpacr_el1);
+}
 
+void deactivate_traps_vhe_put(void)
+{
+   write_sysreg(CPACR_EL1_FPEN, cpacr_el1);
+}
+
+static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu)
+{
write_sysreg(__kvm_hyp_vector, vbar_el1);
 }
 
@@ -79,6 +92,8 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu 
*vcpu)
 {
u64 val;
 
+   __activate_traps_fpsimd32(vcpu);
+
val = CPTR_EL2_DEFAULT;
val |= CPTR_EL2_TTA;
if (vcpu->arch.guest_vfp_loaded)
@@ -109,7 +124,6 @@ static void __hyp_text __deactivate_traps_vhe(void)
 
write_sysreg(mdcr_el2, mdcr_el2);
write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
-   write_sysreg(CPACR_EL1_FPEN, cpacr_el1);
write_sysreg(vectors, vbar_el1);
 }
 
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 6ff1ce5..54a4f55 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -242,6 +242,8 @@ void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu)
__sysreg_restore_el1_state(guest_ctxt);
 
vcpu->arch.sysregs_loaded_on_cpu = true;
+
+   activate_traps_vhe_load(vcpu);
 }
 
 /**
@@ -272,6 +274,8 @@ void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)
if (!has_vhe())
return;
 
+   deactivate_traps_vhe_put();
+
__sysreg_save_el1_state(guest_ctxt);
__sysreg_save_user_state(guest_ctxt);
__sysreg32_save_state(vcpu);
-- 
2.9.0


[PATCH 24/37] KVM: arm64: Prepare to handle traps on deferred EL0 sysregs

2017-10-12 Thread Christoffer Dall
We can trap access to ACTLR_EL1 which we can later defer to only
save/restore during vcpu_load and vcpu_put, so let's read the value
directly from the CPU when necessary.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/sys_regs_generic_v8.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/sys_regs_generic_v8.c 
b/arch/arm64/kvm/sys_regs_generic_v8.c
index 969ade1..348f227 100644
--- a/arch/arm64/kvm/sys_regs_generic_v8.c
+++ b/arch/arm64/kvm/sys_regs_generic_v8.c
@@ -38,7 +38,10 @@ static bool access_actlr(struct kvm_vcpu *vcpu,
if (p->is_write)
return ignore_write(vcpu, p);
 
-   p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1);
+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   p->regval = read_sysreg(actlr_el1);
+   else
+   p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1);
return true;
 }
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 23/37] KVM: arm64: Prepare to handle traps on deferred VM sysregs

2017-10-12 Thread Christoffer Dall
When we defer the save/restore of system registers to vcpu_load and
vcpu_put, we need to take care of the emulation code that handles traps
to these registers, since simply reading the memory array will return
stale data.

Therefore, introduce two functions to directly read/write the registers
from the physical CPU when we're on a VHE system that has loaded the
system registers onto the physical CPU.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_host.h |  4 +++
 arch/arm64/kvm/sys_regs.c | 54 +--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 9f5761f..dcded44 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -278,6 +278,10 @@ struct kvm_vcpu_arch {
 
/* Detect first run of a vcpu */
bool has_run_once;
+
+   /* True when deferrable sysregs are loaded on the physical CPU,
+* see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
+   bool sysregs_loaded_on_cpu;
 };
 
 #define vcpu_gp_regs(v)(&(v)->arch.ctxt.gp_regs)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index dbe35fd..f7887dd 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -110,8 +111,57 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
return true;
 }
 
+static u64 read_deferrable_vm_reg(struct kvm_vcpu *vcpu, int reg)
+{
+   if (vcpu->arch.sysregs_loaded_on_cpu) {
+   switch (reg) {
+   case SCTLR_EL1: return read_sysreg_el1(sctlr);
+   case TTBR0_EL1: return read_sysreg_el1(ttbr0);
+   case TTBR1_EL1: return read_sysreg_el1(ttbr1);
+   case TCR_EL1:   return read_sysreg_el1(tcr);
+   case ESR_EL1:   return read_sysreg_el1(esr);
+   case FAR_EL1:   return read_sysreg_el1(far);
+   case AFSR0_EL1: return read_sysreg_el1(afsr0);
+   case AFSR1_EL1: return read_sysreg_el1(afsr1);
+   case MAIR_EL1:  return read_sysreg_el1(mair);
+   case AMAIR_EL1: return read_sysreg_el1(amair);
+   case CONTEXTIDR_EL1:return read_sysreg_el1(contextidr);
+   case DACR32_EL2:return read_sysreg(dacr32_el2);
+   case IFSR32_EL2:return read_sysreg(ifsr32_el2);
+   default:BUG();
+   }
+   }
+
+   return vcpu_sys_reg(vcpu, reg);
+}
+
+static void write_deferrable_vm_reg(struct kvm_vcpu *vcpu, int reg, u64 val)
+{
+   if (vcpu->arch.sysregs_loaded_on_cpu) {
+   switch (reg) {
+   case SCTLR_EL1: write_sysreg_el1(val, sctlr);   return;
+   case TTBR0_EL1: write_sysreg_el1(val, ttbr0);   return;
+   case TTBR1_EL1: write_sysreg_el1(val, ttbr1);   return;
+   case TCR_EL1:   write_sysreg_el1(val, tcr); return;
+   case ESR_EL1:   write_sysreg_el1(val, esr); return;
+   case FAR_EL1:   write_sysreg_el1(val, far); return;
+   case AFSR0_EL1: write_sysreg_el1(val, afsr0);   return;
+   case AFSR1_EL1: write_sysreg_el1(val, afsr1);   return;
+   case MAIR_EL1:  write_sysreg_el1(val, mair);return;
+   case AMAIR_EL1: write_sysreg_el1(val, amair);   return;
+   case CONTEXTIDR_EL1:write_sysreg_el1(val, contextidr); 
return;
+   case DACR32_EL2:write_sysreg(val, dacr32_el2); return;
+   case IFSR32_EL2:write_sysreg(val, ifsr32_el2); return;
+   default:BUG();
+   }
+   }
+
+   vcpu_sys_reg(vcpu, reg) = val;
+}
+
 /*
  * Generic accessor for VM registers. Only called as long as HCR_TVM
  * is set. If the guest enables the MMU, we stop trapping the VM
  * sys_regs and leave it in complete control of the caches.
  */
@@ -132,14 +182,14 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
if (!p->is_aarch32 || !p->is_32bit) {
val = p->regval;
} else {
-   val = vcpu_sys_reg(vcpu, reg);
+   val = read_deferrable_vm_reg(vcpu, reg);
if (r->reg % 2)
val = (p->regval << 32) | (u64)lower_32_bits(val);
else
val = ((u64)upper_32_bits(val) << 32) |
(u64)lower_32_bits(p->regval);
}
-   vcpu_sys_reg(vcpu, reg) = val;
+   write_deferrable_vm_reg(vcpu, reg, val);
 
kvm_toggle_cache(vcpu, was_enabled);
return true;
-- 
2.9.0


[PATCH 12/37] KVM: arm64: Factor out fault info population and gic workarounds

2017-10-12 Thread Christoffer Dall
The current world-switch function has functionality to detect a number
of cases where we need to fixup some part of the exit condition and
possibly run the guest again, before having restored the host state.

This includes populating missing fault info, emulating GICv2 CPU
interface accesses when mapped at unaligned addresses, and emulating
the GICv3 CPU interface on systems that need it.

As we are about to have an alternative switch function for VHE systems,
but VHE systems still need the same early fixup logic, factor out this
logic into a separate function that can be shared by both switch
functions.

No functional change.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 91 ++---
 1 file changed, 52 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index e270cba..ed30af5 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -258,50 +258,24 @@ static void __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
write_sysreg_el2(*vcpu_pc(vcpu), elr);
 }
 
-int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
+/*
+ * Return true when we were able to fixup the guest exit and should return to
+ * the guest, false when we should restore the host state and return to the
+ * main run loop.
+ */
+static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
-   struct kvm_cpu_context *host_ctxt;
-   struct kvm_cpu_context *guest_ctxt;
-   u64 exit_code;
-
-   vcpu = kern_hyp_va(vcpu);
-
-   host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
-   host_ctxt->__hyp_running_vcpu = vcpu;
-   guest_ctxt = &vcpu->arch.ctxt;
-
-   __sysreg_save_host_state(host_ctxt);
-
-   __activate_traps(vcpu);
-   __activate_vm(vcpu);
-
-   __vgic_restore_state(vcpu);
-   __timer_enable_traps(vcpu);
-
-   /*
-* We must restore the 32-bit state before the sysregs, thanks
-* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
-*/
-   __sysreg32_restore_state(vcpu);
-   __sysreg_restore_guest_state(guest_ctxt);
-   __debug_switch_to_guest(vcpu);
-
-   /* Jump in the fire! */
-again:
-   exit_code = __guest_enter(vcpu, host_ctxt);
-   /* And we're baaack! */
-
/*
 * We're using the raw exception code in order to only process
 * the trap if no SError is pending. We will come back to the
 * same PC once the SError has been injected, and replay the
 * trapping instruction.
 */
-   if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
-   goto again;
+   if (*exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
+   return true;
 
if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
-   exit_code == ARM_EXCEPTION_TRAP) {
+   *exit_code == ARM_EXCEPTION_TRAP) {
bool valid;
 
valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
@@ -315,13 +289,13 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
if (ret == 1) {
__skip_instr(vcpu);
-   goto again;
+   return true;
}
 
if (ret == -1) {
/* Promote an illegal access to an SError */
__skip_instr(vcpu);
-   exit_code = ARM_EXCEPTION_EL1_SERROR;
+   *exit_code = ARM_EXCEPTION_EL1_SERROR;
}
 
/* 0 falls through to be handler out of EL2 */
@@ -329,19 +303,58 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
}
 
if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
-   exit_code == ARM_EXCEPTION_TRAP &&
+   *exit_code == ARM_EXCEPTION_TRAP &&
(kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 ||
 kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
int ret = __vgic_v3_perform_cpuif_access(vcpu);
 
if (ret == 1) {
__skip_instr(vcpu);
-   goto again;
+   return true;
}
 
/* 0 falls through to be handled out of EL2 */
}
 
+   return false;
+}
+
+int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
+{
+   struct kvm_cpu_context *host_ctxt;
+   struct kvm_cpu_context *guest_ctxt;
+   u64 exit_code;
+
+   vcpu = kern_hyp_va(vcpu);
+
+   host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
+   host_ctxt->__hyp_running_vcpu = vcpu;
+   guest_ctxt = &vcpu->arch.ctxt;
+
+   __sysreg_save_host_state(host_ctxt);
+
+   __activate_traps(vcpu);
+   __activate_vm(vcpu);
+
+   __vgic_restore_state(vcpu);
+   

[PATCH 13/37] KVM: arm64: Introduce VHE-specific kvm_vcpu_run

2017-10-12 Thread Christoffer Dall
So far this is just a copy of the legacy non-VHE switch function, where
we only change the existing calls to has_vhe() in both the original and
new functions.

Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_asm.h   |  4 +++
 arch/arm64/include/asm/kvm_asm.h |  2 ++
 arch/arm64/kvm/hyp/switch.c  | 57 
 virt/kvm/arm/arm.c   |  5 +++-
 4 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 36dd296..1a7bc5f 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -70,8 +70,12 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu 
*vcpu);
 
 extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high);
 
+/* no VHE on 32-bit :( */
+static inline int kvm_vcpu_run(struct kvm_vcpu *vcpu) { return 0; }
+
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
+
 extern void __init_stage2_translation(void);
 
 extern u64 __vgic_v3_get_ich_vtr_el2(void);
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 7e48a39..2eb5b23 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -57,6 +57,8 @@ extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
 
 extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high);
 
+extern int kvm_vcpu_run(struct kvm_vcpu *vcpu);
+
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
 extern u64 __vgic_v3_get_ich_vtr_el2(void);
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index ed30af5..8a0f38f 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -319,6 +319,63 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu 
*vcpu, u64 *exit_code)
return false;
 }
 
+/* Switch to the guest for VHE systems running in EL2 */
+int kvm_vcpu_run(struct kvm_vcpu *vcpu)
+{
+   struct kvm_cpu_context *host_ctxt;
+   struct kvm_cpu_context *guest_ctxt;
+   u64 exit_code;
+
+   vcpu = kern_hyp_va(vcpu);
+
+   host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
+   host_ctxt->__hyp_running_vcpu = vcpu;
+   guest_ctxt = &vcpu->arch.ctxt;
+
+   __sysreg_save_host_state(host_ctxt);
+
+   __activate_traps(vcpu);
+   __activate_vm(vcpu);
+
+   __vgic_restore_state(vcpu);
+   __timer_enable_traps(vcpu);
+
+   /*
+* We must restore the 32-bit state before the sysregs, thanks
+* to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
+*/
+   __sysreg32_restore_state(vcpu);
+   __sysreg_restore_guest_state(guest_ctxt);
+   __debug_switch_to_guest(vcpu);
+
+   /* Jump in the fire! */
+again:
+   exit_code = __guest_enter(vcpu, host_ctxt);
+   /* And we're baaack! */
+
+   if (fixup_guest_exit(vcpu, &exit_code))
+   goto again;
+
+   __sysreg_save_guest_state(guest_ctxt);
+   __sysreg32_save_state(vcpu);
+   __timer_disable_traps(vcpu);
+   __vgic_save_state(vcpu);
+
+   __deactivate_traps(vcpu);
+   __deactivate_vm(vcpu);
+
+   __sysreg_restore_host_state(host_ctxt);
+
+   /*
+* This must come after restoring the host sysregs, since a non-VHE
+* system may enable SPE here and make use of the TTBRs.
+*/
+   __debug_switch_to_host(vcpu);
+
+   return exit_code;
+}
+
+/* Switch to the guest for legacy non-VHE systems */
 int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 {
struct kvm_cpu_context *host_ctxt;
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index cf121b2..b11647a 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -706,7 +706,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
trace_kvm_entry(*vcpu_pc(vcpu));
guest_enter_irqoff();
 
-   ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
+   if (has_vhe())
+   ret = kvm_vcpu_run(vcpu);
+   else
+   ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
 
vcpu->mode = OUTSIDE_GUEST_MODE;
vcpu->stat.exits++;
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 15/37] KVM: arm64: Don't deactivate VM on VHE systems

2017-10-12 Thread Christoffer Dall
There is no need to reset the VTTBR to zero when exiting the guest on
VHE systems.  VHE systems don't use stage 2 translations for the EL2&0
translation regime used by the host.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index b72dc66..2cedf12 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -136,13 +136,13 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu 
*vcpu)
write_sysreg(0, pmuserenr_el0);
 }
 
-static void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
+static inline void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
 {
struct kvm *kvm = kern_hyp_va(vcpu->kvm);
write_sysreg(kvm->arch.vttbr, vttbr_el2);
 }
 
-static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu)
+static inline void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu)
 {
write_sysreg(0, vttbr_el2);
 }
@@ -360,7 +360,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
__vgic_save_state(vcpu);
 
__deactivate_traps(vcpu);
-   __deactivate_vm(vcpu);
 
__sysreg_restore_host_state(host_ctxt);
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 19/37] KVM: arm64: Introduce separate VHE/non-VHE sysreg save/restore functions

2017-10-12 Thread Christoffer Dall
We are about to handle system registers quite differently between VHE
and non-VHE systems.  In preparation for that, we need to split some of
the handling functions into VHE and non-VHE variants.

For now, we simply copy the non-VHE functions, but we do change the use
of static keys for VHE and non-VHE functionality now that we have
separate functions.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_hyp.h | 12 
 arch/arm64/kvm/hyp/switch.c  | 20 ++--
 arch/arm64/kvm/hyp/sysreg-sr.c   | 40 
 3 files changed, 50 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index a0e5a70..998152d 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -132,10 +132,14 @@ int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
 void __timer_enable_traps(struct kvm_vcpu *vcpu);
 void __timer_disable_traps(struct kvm_vcpu *vcpu);
 
-void __sysreg_save_host_state(struct kvm_cpu_context *ctxt);
-void __sysreg_restore_host_state(struct kvm_cpu_context *ctxt);
-void __sysreg_save_guest_state(struct kvm_cpu_context *ctxt);
-void __sysreg_restore_guest_state(struct kvm_cpu_context *ctxt);
+void __sysreg_save_host_state_nvhe(struct kvm_cpu_context *ctxt);
+void __sysreg_restore_host_state_nvhe(struct kvm_cpu_context *ctxt);
+void __sysreg_save_guest_state_nvhe(struct kvm_cpu_context *ctxt);
+void __sysreg_restore_guest_state_nvhe(struct kvm_cpu_context *ctxt);
+void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt);
+void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt);
+void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt);
+void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt);
 void __sysreg32_save_state(struct kvm_vcpu *vcpu);
 void __sysreg32_restore_state(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index b98b73b..7c4d430 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -330,7 +330,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;
 
-   __sysreg_save_host_state(host_ctxt);
+   sysreg_save_host_state_vhe(host_ctxt);
 
__activate_traps(vcpu);
__activate_vm(vcpu);
@@ -342,7 +342,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
 * to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
 */
__sysreg32_restore_state(vcpu);
-   __sysreg_restore_guest_state(guest_ctxt);
+   sysreg_restore_guest_state_vhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
 
/* Jump in the fire! */
@@ -353,13 +353,13 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
if (fixup_guest_exit(vcpu, &exit_code))
goto again;
 
-   __sysreg_save_guest_state(guest_ctxt);
+   sysreg_save_guest_state_vhe(guest_ctxt);
__sysreg32_save_state(vcpu);
__vgic_save_state(vcpu);
 
__deactivate_traps(vcpu);
 
-   __sysreg_restore_host_state(host_ctxt);
+   sysreg_restore_host_state_vhe(host_ctxt);
 
/*
 * This must come after restoring the host sysregs, since a non-VHE
@@ -383,7 +383,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;
 
-   __sysreg_save_host_state(host_ctxt);
+   __sysreg_save_host_state_nvhe(host_ctxt);
 
__activate_traps(vcpu);
__activate_vm(vcpu);
@@ -396,7 +396,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 * to erratum #852523 (Cortex-A57) or #853709 (Cortex-A72).
 */
__sysreg32_restore_state(vcpu);
-   __sysreg_restore_guest_state(guest_ctxt);
+   __sysreg_restore_guest_state_nvhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
 
/* Jump in the fire! */
@@ -407,7 +407,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
if (fixup_guest_exit(vcpu, &exit_code))
goto again;
 
-   __sysreg_save_guest_state(guest_ctxt);
+   __sysreg_save_guest_state_nvhe(guest_ctxt);
__sysreg32_save_state(vcpu);
__timer_disable_traps(vcpu);
__vgic_save_state(vcpu);
@@ -415,7 +415,7 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
__deactivate_traps(vcpu);
__deactivate_vm(vcpu);
 
-   __sysreg_restore_host_state(host_ctxt);
+   __sysreg_restore_host_state_nvhe(host_ctxt);
 
/*
 * This must come after restoring the host sysregs, since a non-VHE
@@ -440,7 +440,7 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 
elr, u64 par,
__timer_disable_traps(vcpu);
__deactivate_traps(vcpu);
__deactivate_vm(vcpu);
-   __sysreg_restore_host_state(__host_ctxt);
+   

[PATCH 21/37] KVM: arm64: Don't save the host ELR_EL2 and SPSR_EL2 on VHE systems

2017-10-12 Thread Christoffer Dall
On non-VHE systems we need to save the ELR_EL2 and SPSR_EL2 so that we
can return to the host in EL1 in the same state and location where we
issued a hypercall to EL2, but these registers don't contain anything
important on VHE, because all of the host runs in EL2.  Therefore,
exclude them when saving/restoring the host state.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/sysreg-sr.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index f5c1b44..354ca02 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -66,6 +66,10 @@ static void __hyp_text __sysreg_save_el1_state(struct 
kvm_cpu_context *ctxt)
ctxt->gp_regs.sp_el1= read_sysreg(sp_el1);
ctxt->gp_regs.elr_el1   = read_sysreg_el1(elr);
ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr);
+}
+
+static void __hyp_text __sysreg_save_el2_return_state(struct kvm_cpu_context 
*ctxt)
+{
ctxt->gp_regs.regs.pc   = read_sysreg_el2(elr);
ctxt->gp_regs.regs.pstate   = read_sysreg_el2(spsr);
 }
@@ -75,6 +79,7 @@ void __hyp_text __sysreg_save_state_nvhe(struct 
kvm_cpu_context *ctxt)
__sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
__sysreg_save_user_state(ctxt);
+   __sysreg_save_el2_return_state(ctxt);
 }
 
 void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
@@ -88,6 +93,7 @@ void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt)
__sysreg_save_el1_state(ctxt);
__sysreg_save_common_state(ctxt);
__sysreg_save_user_state(ctxt);
+   __sysreg_save_el2_return_state(ctxt);
 }
 
 static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context 
*ctxt)
@@ -127,6 +133,11 @@ static void __hyp_text __sysreg_restore_el1_state(struct 
kvm_cpu_context *ctxt)
write_sysreg(ctxt->gp_regs.sp_el1,  sp_el1);
write_sysreg_el1(ctxt->gp_regs.elr_el1, elr);
write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr);
+}
+
+static void __hyp_text
+__sysreg_restore_el2_return_state(struct kvm_cpu_context *ctxt)
+{
write_sysreg_el2(ctxt->gp_regs.regs.pc, elr);
write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr);
 }
@@ -136,6 +147,7 @@ void __hyp_text __sysreg_restore_state_nvhe(struct 
kvm_cpu_context *ctxt)
__sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
__sysreg_restore_user_state(ctxt);
+   __sysreg_restore_el2_return_state(ctxt);
 }
 
 void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt)
@@ -149,6 +161,7 @@ void sysreg_restore_guest_state_vhe(struct kvm_cpu_context 
*ctxt)
__sysreg_restore_el1_state(ctxt);
__sysreg_restore_common_state(ctxt);
__sysreg_restore_user_state(ctxt);
+   __sysreg_restore_el2_return_state(ctxt);
 }
 
 static void __hyp_text __fpsimd32_save_state(struct kvm_cpu_context *ctxt)
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 10/37] KVM: arm64: Slightly improve debug save/restore functions

2017-10-12 Thread Christoffer Dall
The debug save/restore functions can be improved by using the has_vhe()
static key instead of the instruction alternative.  The static key follows
the same paradigm we're going to use elsewhere, makes the code more
readable, and generates slightly better code (no stack setup and
function calls unless necessary).

We also use a static key on the restore path, because it will be
marginally faster than loading a value from memory.

Finally, we don't have to conditionally clear the debug dirty flag if
it's set, we can just clear it.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/debug-sr.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
index 0fc0758..a2291b6 100644
--- a/arch/arm64/kvm/hyp/debug-sr.c
+++ b/arch/arm64/kvm/hyp/debug-sr.c
@@ -75,11 +75,6 @@
 
 #define psb_csync()asm volatile("hint #17")
 
-static void __hyp_text __debug_save_spe_vhe(u64 *pmscr_el1)
-{
-   /* The vcpu can run. but it can't hide. */
-}
-
 static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1)
 {
u64 reg;
@@ -109,10 +104,6 @@ static void __hyp_text __debug_save_spe_nvhe(u64 
*pmscr_el1)
dsb(nsh);
 }
 
-static hyp_alternate_select(__debug_save_spe,
-   __debug_save_spe_nvhe, __debug_save_spe_vhe,
-   ARM64_HAS_VIRT_HOST_EXTN);
-
 static void __hyp_text __debug_restore_spe(u64 pmscr_el1)
 {
if (!pmscr_el1)
@@ -174,17 +165,22 @@ void __hyp_text __debug_cond_save_host_state(struct 
kvm_vcpu *vcpu)
 {
__debug_save_state(vcpu, &vcpu->arch.host_debug_state.regs,
   kern_hyp_va(vcpu->arch.host_cpu_context));
-   __debug_save_spe()(&vcpu->arch.host_debug_state.pmscr_el1);
+
+   /* Non-VHE: Disable and flush SPE data generation
+* VHE: The vcpu can run, but it can't hide. */
+   if (!has_vhe())
+   __debug_save_spe_nvhe(&vcpu->arch.host_debug_state.pmscr_el1);
 }
 
 void __hyp_text __debug_cond_restore_host_state(struct kvm_vcpu *vcpu)
 {
-   __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
+   if (!has_vhe())
+   __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
+
__debug_restore_state(vcpu, &vcpu->arch.host_debug_state.regs,
  kern_hyp_va(vcpu->arch.host_cpu_context));
 
-   if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
-   vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY;
+   vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY;
 }
 
 u32 __hyp_text __kvm_get_mdcr_el2(void)
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 16/37] KVM: arm64: Remove noop calls to timer save/restore from VHE switch

2017-10-12 Thread Christoffer Dall
The VHE switch function calls __timer_enable_traps and
__timer_disable_traps which don't do anything on VHE systems.
Therefore, simply remove these calls from the VHE switch function and
make the functions non-conditional as they are now only called from the
non-VHE switch path.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c |  2 --
 virt/kvm/arm/hyp/timer-sr.c | 10 ++
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 2cedf12..b98b73b 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -336,7 +336,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
__activate_vm(vcpu);
 
__vgic_restore_state(vcpu);
-   __timer_enable_traps(vcpu);
 
/*
 * We must restore the 32-bit state before the sysregs, thanks
@@ -356,7 +355,6 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
__sysreg_save_guest_state(guest_ctxt);
__sysreg32_save_state(vcpu);
-   __timer_disable_traps(vcpu);
__vgic_save_state(vcpu);
 
__deactivate_traps(vcpu);
diff --git a/virt/kvm/arm/hyp/timer-sr.c b/virt/kvm/arm/hyp/timer-sr.c
index f398616..82c217e 100644
--- a/virt/kvm/arm/hyp/timer-sr.c
+++ b/virt/kvm/arm/hyp/timer-sr.c
@@ -53,16 +53,10 @@ void __hyp_text disable_el1_phys_timer_access(void)
 
 void __hyp_text __timer_disable_traps(struct kvm_vcpu *vcpu)
 {
-   /*
-* We don't need to do this for VHE since the host kernel runs in EL2
-* with HCR_EL2.TGE ==1, which makes those bits have no impact.
-*/
-   if (!has_vhe())
-   enable_el1_phys_timer_access();
+   enable_el1_phys_timer_access();
 }
 
 void __hyp_text __timer_enable_traps(struct kvm_vcpu *vcpu)
 {
-   if (!has_vhe())
-   disable_el1_phys_timer_access();
+   disable_el1_phys_timer_access();
 }
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 14/37] KVM: arm64: Remove kern_hyp_va() use in VHE switch function

2017-10-12 Thread Christoffer Dall
VHE kernels run completely in EL2 and therefore don't have a notion of
kernel and hyp addresses; they are all just kernel addresses.  Therefore
don't call kern_hyp_va() in the VHE switch function.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 8a0f38f..b72dc66 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -326,9 +326,7 @@ int kvm_vcpu_run(struct kvm_vcpu *vcpu)
struct kvm_cpu_context *guest_ctxt;
u64 exit_code;
 
-   vcpu = kern_hyp_va(vcpu);
-
-   host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
+   host_ctxt = vcpu->arch.host_cpu_context;
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = >arch.ctxt;
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 09/37] KVM: arm64: Move debug dirty flag calculation out of world switch

2017-10-12 Thread Christoffer Dall
There is no need to figure out inside the world-switch whether we should
save/restore the debug registers or not; we might as well do that in the
higher-level debug setup code, making it easier to optimize down the
line.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/debug.c| 9 +
 arch/arm64/kvm/hyp/debug-sr.c | 6 --
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index dbadfaf..62550de19 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -193,6 +193,15 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
if (trap_debug)
vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;
 
+   /*
+* If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY is set, perform
+* a full save/restore cycle.
+*/
+   if ((vcpu_sys_reg(vcpu, MDSCR_EL1) & DBG_MDSCR_KDE) ||
+   (vcpu_sys_reg(vcpu, MDSCR_EL1) & DBG_MDSCR_MDE))
+   vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
+
+
trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
trace_kvm_arm_set_dreg32("MDSCR_EL1", vcpu_sys_reg(vcpu, MDSCR_EL1));
 }
diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
index f5154ed..0fc0758 100644
--- a/arch/arm64/kvm/hyp/debug-sr.c
+++ b/arch/arm64/kvm/hyp/debug-sr.c
@@ -172,12 +172,6 @@ void __hyp_text __debug_restore_state(struct kvm_vcpu 
*vcpu,
 
 void __hyp_text __debug_cond_save_host_state(struct kvm_vcpu *vcpu)
 {
-   /* If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY is set, perform
-* a full save/restore cycle. */
-   if ((vcpu->arch.ctxt.sys_regs[MDSCR_EL1] & DBG_MDSCR_KDE) ||
-   (vcpu->arch.ctxt.sys_regs[MDSCR_EL1] & DBG_MDSCR_MDE))
-   vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
-
__debug_save_state(vcpu, &vcpu->arch.host_debug_state.regs,
   kern_hyp_va(vcpu->arch.host_cpu_context));
__debug_save_spe()(&vcpu->arch.host_debug_state.pmscr_el1);
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 06/37] KVM: arm/arm64: Only load/put VCPU state for KVM_RUN

2017-10-12 Thread Christoffer Dall
We only want to do things like invalidating TLBs or loading timer state
onto the physical CPU if we really intend to run the VCPU, not if we are
simply retrieving some in-kernel value on behalf of userspace, for example.

Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/arm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 6e9513e..d495453 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -338,6 +338,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
int *last_ran;
 
+   if (vcpu->ioctl != KVM_RUN)
+   return;
+
last_ran = this_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran);
 
/*
@@ -359,6 +362,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+   if (vcpu->ioctl != KVM_RUN)
+   return;
+
kvm_timer_vcpu_put(vcpu);
kvm_vgic_put(vcpu);
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 05/37] KVM: Record the executing ioctl number on the vcpu struct

2017-10-12 Thread Christoffer Dall
Some architectures may decide to do different things during
kvm_arch_vcpu_load depending on the ioctl being executed.  For example,
arm64 is about to do significant work in vcpu load/put when running a
vcpu, but not when doing things like KVM_SET_ONE_REG or
KVM_SET_MP_STATE.

Therefore, store the ioctl number that we are executing on the VCPU
during the first vcpu_load() which succeeds in getting the vcpu->mutex
and set the ioctl number to 0 when exiting kvm_vcpu_ioctl() after
successfully loading the vcpu.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Christoffer Dall 
---
 arch/x86/kvm/vmx.c   | 2 +-
 arch/x86/kvm/x86.c   | 8 
 include/linux/kvm_host.h | 3 ++-
 virt/kvm/kvm_main.c  | 6 --
 4 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6970249..d729889 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9493,7 +9493,7 @@ static void vmx_free_vcpu_nested(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
int r;
 
-   r = vcpu_load(vcpu);
+   r = vcpu_load(vcpu, vcpu->ioctl);
BUG_ON(r);
vmx_switch_vmcs(vcpu, &vmx->vmcs01);
free_nested(vmx);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cd17b7d..68d9c95 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7723,7 +7723,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
int r;
 
kvm_vcpu_mtrr_init(vcpu);
-   r = vcpu_load(vcpu);
+   r = vcpu_load(vcpu, vcpu->ioctl);
if (r)
return r;
kvm_vcpu_reset(vcpu, false);
@@ -7739,7 +7739,7 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
 
kvm_hv_vcpu_postcreate(vcpu);
 
-   if (vcpu_load(vcpu))
+   if (vcpu_load(vcpu, vcpu->ioctl))
return;
msr.data = 0x0;
msr.index = MSR_IA32_TSC;
@@ -7759,7 +7759,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
int r;
vcpu->arch.apf.msr_val = 0;
 
-   r = vcpu_load(vcpu);
+   r = vcpu_load(vcpu, vcpu->ioctl);
BUG_ON(r);
kvm_mmu_unload(vcpu);
vcpu_put(vcpu);
@@ -8116,7 +8116,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
 {
int r;
-   r = vcpu_load(vcpu);
+   r = vcpu_load(vcpu, vcpu->ioctl);
BUG_ON(r);
kvm_mmu_unload(vcpu);
vcpu_put(vcpu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6882538..da0acc0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -274,6 +274,7 @@ struct kvm_vcpu {
bool preempted;
struct kvm_vcpu_arch arch;
struct dentry *debugfs_dentry;
+   unsigned int ioctl; /* ioctl currently executing or 0 */
 };
 
 static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
@@ -533,7 +534,7 @@ static inline int kvm_vcpu_get_idx(struct kvm_vcpu *vcpu)
 int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id);
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
 
-int __must_check vcpu_load(struct kvm_vcpu *vcpu);
+int __must_check vcpu_load(struct kvm_vcpu *vcpu, unsigned int ioctl);
 void vcpu_put(struct kvm_vcpu *vcpu);
 
 #ifdef __KVM_HAVE_IOAPIC
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9deb5a2..1911ef0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -147,12 +147,13 @@ bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 /*
  * Switches to specified vcpu, until a matching vcpu_put()
  */
-int vcpu_load(struct kvm_vcpu *vcpu)
+int vcpu_load(struct kvm_vcpu *vcpu, unsigned int ioctl)
 {
int cpu;
 
if (mutex_lock_killable(&vcpu->mutex))
return -EINTR;
+   vcpu->ioctl = ioctl;
cpu = get_cpu();
preempt_notifier_register(&vcpu->preempt_notifier);
kvm_arch_vcpu_load(vcpu, cpu);
@@ -2529,7 +2530,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 #endif
 
 
-   r = vcpu_load(vcpu);
+   r = vcpu_load(vcpu, ioctl);
if (r)
return r;
switch (ioctl) {
@@ -2704,6 +2705,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
}
 out:
vcpu_put(vcpu);
+   vcpu->ioctl = 0;
kfree(fpu);
kfree(kvm_sregs);
return r;
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 07/37] KVM: arm/arm64: Add kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs

2017-10-12 Thread Christoffer Dall
As we are about to move a bunch of save/restore logic for VHE kernels to
the load and put functions, we need some infrastructure to do this.

Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_host.h   |  3 +++
 arch/arm64/include/asm/kvm_host.h |  3 +++
 arch/arm64/kvm/hyp/sysreg-sr.c| 27 +++
 virt/kvm/arm/arm.c|  2 ++
 4 files changed, 35 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 1100170..13f8165 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -290,4 +290,7 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
 
+static inline void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu) {}
+static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
+
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 27305e7..7d3bfa7 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -383,4 +383,7 @@ static inline void __cpu_init_stage2(void)
  "PARange is %d bits, unsupported configuration!", parange);
 }
 
+void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu);
+void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu);
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index c54cc2a..b7438c8 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -183,3 +183,30 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu 
*vcpu)
if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)
write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2);
 }
+
+/**
+ * kvm_vcpu_load_sysregs - Load guest system register to physical CPU
+ *
+ * @vcpu: The VCPU pointer
+ *
+ * If the kernel runs in EL2 then load the system register state for the VCPU
+ * for EL1 onto the physical CPU so that we can go back and forth between the
+ * VM and the hypervisor without switching all this state around.
+ */
+void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu)
+{
+}
+
+/**
+ * kvm_vcpu_put_sysregs - Restore host system register state to physical CPU
+ *
+ * @vcpu: The VCPU pointer
+ *
+ * If the kernel runs in EL2 and the physical register state belongs to the
+ * VCPU, then restore the system register state for the host for EL1 onto the
+ * physical CPU so that we can run userspace and other threads on this
+ * physical CPU.
+ */
+void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)
+{
+}
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index d495453..cf121b2 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -358,6 +358,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
kvm_arm_set_running_vcpu(vcpu);
kvm_vgic_load(vcpu);
kvm_timer_vcpu_load(vcpu);
+   kvm_vcpu_load_sysregs(vcpu);
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
@@ -365,6 +366,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
if (vcpu->ioctl != KVM_RUN)
return;
 
+   kvm_vcpu_put_sysregs(vcpu);
kvm_timer_vcpu_put(vcpu);
kvm_vgic_put(vcpu);
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 04/37] KVM: arm/arm64: Get rid of vcpu->arch.irq_lines

2017-10-12 Thread Christoffer Dall
We currently have a separate read-modify-write of the HCR_EL2 on entry
to the guest for the sole purpose of setting the VF and VI bits, if set.
Since this is rarely the case (only when using a userspace IRQ chip
and interrupts are in flight), let's get rid of this operation and
instead modify the bits in the vcpu->arch.hcr[_el2] directly when
needed.
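
For illustration only (the call sites that change are in the part of the
diff that is cut off below), a caller that previously contributed the
VI/VF bits through irq_lines would now flip the bit directly through the
new vcpu_hcr() accessor, roughly like this hypothetical sketch:

static void set_virtual_irq(struct kvm_vcpu *vcpu, bool level)
{
	if (level)
		*vcpu_hcr(vcpu) |= HCR_VI;	/* pend a virtual IRQ */
	else
		*vcpu_hcr(vcpu) &= ~HCR_VI;	/* line dropped: clear it */
}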

Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_emulate.h   |  9 ++---
 arch/arm/include/asm/kvm_host.h  |  3 ---
 arch/arm/kvm/emulate.c   |  2 +-
 arch/arm/kvm/hyp/switch.c|  2 +-
 arch/arm64/include/asm/kvm_emulate.h |  9 ++---
 arch/arm64/include/asm/kvm_host.h|  3 ---
 arch/arm64/kvm/hyp/switch.c  |  6 --
 arch/arm64/kvm/inject_fault.c|  2 +-
 virt/kvm/arm/arm.c   | 11 ++-
 virt/kvm/arm/mmu.c   |  6 +++---
 10 files changed, 16 insertions(+), 37 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h 
b/arch/arm/include/asm/kvm_emulate.h
index 98089ff..34663a8 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -62,14 +62,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
vcpu->arch.hcr = HCR_GUEST_MASK;
 }
 
-static inline unsigned long vcpu_get_hcr(const struct kvm_vcpu *vcpu)
+static inline unsigned long *vcpu_hcr(const struct kvm_vcpu *vcpu)
 {
-   return vcpu->arch.hcr;
-}
-
-static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
-{
-   vcpu->arch.hcr = hcr;
return (unsigned long *)&vcpu->arch.hcr;
 }
 
 static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6..1100170 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -153,9 +153,6 @@ struct kvm_vcpu_arch {
/* HYP trapping configuration */
u32 hcr;
 
-   /* Interrupt related fields */
-   u32 irq_lines;  /* IRQ and FIQ levels */
-
/* Exception Information */
struct kvm_vcpu_fault_info fault;
 
diff --git a/arch/arm/kvm/emulate.c b/arch/arm/kvm/emulate.c
index 0064b86..4286a89 100644
--- a/arch/arm/kvm/emulate.c
+++ b/arch/arm/kvm/emulate.c
@@ -313,5 +313,5 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long 
addr)
  */
 void kvm_inject_vabt(struct kvm_vcpu *vcpu)
 {
-   vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VA);
+   *vcpu_hcr(vcpu) |= HCR_VA;
 }
diff --git a/arch/arm/kvm/hyp/switch.c b/arch/arm/kvm/hyp/switch.c
index 330c9ce..c3b9799 100644
--- a/arch/arm/kvm/hyp/switch.c
+++ b/arch/arm/kvm/hyp/switch.c
@@ -43,7 +43,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu 
*vcpu, u32 *fpexc_host)
isb();
}
 
-   write_sysreg(vcpu->arch.hcr | vcpu->arch.irq_lines, HCR);
+   write_sysreg(vcpu->arch.hcr, HCR);
/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
write_sysreg(HSTR_T(15), HSTR);
write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR);
diff --git a/arch/arm64/include/asm/kvm_emulate.h 
b/arch/arm64/include/asm/kvm_emulate.h
index e5df3fc..1fbfe96 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -51,14 +51,9 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
vcpu->arch.hcr_el2 &= ~HCR_RW;
 }
 
-static inline unsigned long vcpu_get_hcr(struct kvm_vcpu *vcpu)
+static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
 {
-   return vcpu->arch.hcr_el2;
-}
-
-static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
-{
-   vcpu->arch.hcr_el2 = hcr;
+   return (unsigned long *)&vcpu->arch.hcr_el2;
 }
 
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 806ccef..27305e7 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -266,9 +266,6 @@ struct kvm_vcpu_arch {
/* IO related fields */
struct kvm_decode mmio_decode;
 
-   /* Interrupt related fields */
-   u64 irq_lines;  /* IRQ and FIQ levels */
-
/* Cache some mmu pages needed inside spinlock regions */
struct kvm_mmu_memory_cache mmu_page_cache;
 
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index bcf1a79..7703d63 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -168,12 +168,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu 
*vcpu)
 
 static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
 {
-   u64 val;
-
-   val = read_sysreg(hcr_el2);
-   val |= vcpu->arch.irq_lines;
-   write_sysreg(val, hcr_el2);
-
	if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
__vgic_v3_restore_state(vcpu);
else
diff --git 

[PATCH 01/37] KVM: arm64: Avoid storing the vcpu pointer on the stack

2017-10-12 Thread Christoffer Dall
We already have the percpu area for the host cpu state, which points to
the VCPU, so there's no need to store the VCPU pointer on the stack on
every context switch.  We can be a little more clever and just use
tpidr_el2 for the percpu offset and load the VCPU pointer from the host
context.

This requires us to have a scratch register though, so we take the
chance to rearrange some of the el1_sync code to only look at the
vttbr_el2 to determine if this is a trap from the guest or an HVC from
the host.  We do add an extra check to call the panic code if the kernel
is configured with debugging enabled and we saw a trap from the host
which wasn't an HVC, indicating that we left some EL2 trap configured by
mistake.
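
For readers who prefer C, a hedged rendering of what the two new assembly
macros below compute; the this_cpu_ptr()-style accessor stands in for the
mrs/add sequence on tpidr_el2, and only kvm_host_cpu_state, kern_hyp_va
and __hyp_running_vcpu are names taken from the patch:

/* Sketch only: locate the per-cpu host context via the tpidr_el2 percpu
 * offset, convert the kernel VA to a HYP VA, and read the running vcpu
 * pointer back out of it instead of spilling it to the stack. */
static struct kvm_vcpu *hyp_running_vcpu(void)
{
	struct kvm_cpu_context *host_ctxt;

	host_ctxt = kern_hyp_va(this_cpu_ptr(&kvm_host_cpu_state));
	return kern_hyp_va(host_ctxt->__hyp_running_vcpu);
}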

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_asm.h | 20 
 arch/arm64/kernel/asm-offsets.c  |  1 +
 arch/arm64/kvm/hyp/entry.S   |  5 +
 arch/arm64/kvm/hyp/hyp-entry.S   | 39 ++-
 arch/arm64/kvm/hyp/switch.c  |  2 +-
 5 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ab4d0a9..7e48a39 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -70,4 +70,24 @@ extern u32 __init_stage2_translation(void);
 
 #endif
 
+#ifdef __ASSEMBLY__
+.macro get_host_ctxt reg, tmp
+   /*
+* '=kvm_host_cpu_state' is a host VA from the constant pool, it may
+* not be accessible by this address from EL2, hyp_panic() converts
+* it with kern_hyp_va() before use.
+*/
+   ldr \reg, =kvm_host_cpu_state
+   mrs \tmp, tpidr_el2
+   add \reg, \reg, \tmp
+   kern_hyp_va \reg
+.endm
+
+.macro get_vcpu vcpu, ctxt
+   ldr \vcpu, [\ctxt, #HOST_CONTEXT_VCPU]
+   kern_hyp_va \vcpu
+.endm
+
+#endif
+
 #endif /* __ARM_KVM_ASM_H__ */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 71bf088..612021d 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -135,6 +135,7 @@ int main(void)
   DEFINE(CPU_FP_REGS,  offsetof(struct kvm_regs, fp_regs));
   DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, 
arch.ctxt.sys_regs[FPEXC32_EL2]));
   DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, 
arch.host_cpu_context));
+  DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, 
__hyp_running_vcpu));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_SUSPEND_SZ,   sizeof(struct cpu_suspend_ctx));
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 9a8ab5d..76cd48f 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -62,9 +62,6 @@ ENTRY(__guest_enter)
// Store the host regs
save_callee_saved_regs x1
 
-   // Store host_ctxt and vcpu for use at exit time
-   stp x1, x0, [sp, #-16]!
-
add x18, x0, #VCPU_CONTEXT
 
// Restore guest regs x0-x17
@@ -119,7 +116,7 @@ ENTRY(__guest_exit)
save_callee_saved_regs x1
 
// Restore the host_ctxt from the stack
-   ldr x2, [sp], #16
+   get_host_ctxt   x2, x3
 
// Now restore the host regs
restore_callee_saved_regs x2
diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index e4f37b9..2950f26 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -56,19 +56,16 @@ ENDPROC(__vhe_hyp_call)
 el1_sync:  // Guest trapped into EL2
stp x0, x1, [sp, #-16]!
 
-alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
-   mrs x1, esr_el2
-alternative_else
-   mrs x1, esr_el1
-alternative_endif
-   lsr x0, x1, #ESR_ELx_EC_SHIFT
-
-   cmp x0, #ESR_ELx_EC_HVC64
-   b.neel1_trap
-
mrs x1, vttbr_el2   // If vttbr is valid, the 64bit guest
cbnzx1, el1_trap// called HVC
 
+#ifdef CONFIG_DEBUG
+   mrs x0, esr_el2
+   lsr x0, x0, #ESR_ELx_EC_SHIFT
+   cmp x0, #ESR_ELx_EC_HVC64
+   b.ne__hyp_panic
+#endif
+
/* Here, we're pretty sure the host called HVC. */
ldp x0, x1, [sp], #16
 
@@ -101,10 +98,15 @@ alternative_endif
eret
 
 el1_trap:
+   get_host_ctxt   x0, x1
+   get_vcpux1, x0
+
+   mrs x0, esr_el2
+   lsr x0, x0, #ESR_ELx_EC_SHIFT
/*
 * x0: ESR_EC
+* x1: vcpu pointer
 */
-   ldr x1, [sp, #16 + 8]   // vcpu stored by __guest_enter
 
/*
 * We trap the first access to the FP/SIMD to save the host context
@@ -122,13 +124,15 @@ alternative_else_nop_endif
 
 el1_irq:
stp x0, x1, [sp, #-16]!
-   ldr x1, [sp, #16 + 8]
+   get_host_ctxt   x0, x1
+   get_vcpux1, x0
mov x0, #ARM_EXCEPTION_IRQ
b   __guest_exit
 
 el1_error:
   

[PATCH 03/37] KVM: arm64: Move HCR_INT_OVERRIDE to default HCR_EL2 guest flag

2017-10-12 Thread Christoffer Dall
From: Shih-Wei Li 

We always set the IMO and FMO bits in the HCR_EL2 when running the
guest, regardless of whether we use the vgic or not.  By moving these flags to
HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of
HCR_EL2 in the world switch code, and we can also soon get rid of the
other one.

Signed-off-by: Shih-Wei Li 
Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_arm.h | 4 ++--
 arch/arm64/kvm/hyp/switch.c  | 3 ---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 61d694c..e67e279 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -79,9 +79,9 @@
  */
 #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
 HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
-HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW)
+HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | \
+HCR_FMO | HCR_IMO)
 #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
-#define HCR_INT_OVERRIDE   (HCR_FMO | HCR_IMO)
 #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
 
 /* TCR_EL2 Registers bits */
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index a50ddf3..bcf1a79 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -164,8 +164,6 @@ static void __hyp_text __vgic_save_state(struct kvm_vcpu 
*vcpu)
__vgic_v3_save_state(vcpu);
else
__vgic_v2_save_state(vcpu);
-
-   write_sysreg(read_sysreg(hcr_el2) & ~HCR_INT_OVERRIDE, hcr_el2);
 }
 
 static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
@@ -173,7 +171,6 @@ static void __hyp_text __vgic_restore_state(struct kvm_vcpu 
*vcpu)
u64 val;
 
val = read_sysreg(hcr_el2);
-   val |=  HCR_INT_OVERRIDE;
val |= vcpu->arch.irq_lines;
write_sysreg(val, hcr_el2);
 
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 02/37] KVM: arm64: Rework hyp_panic for VHE and non-VHE

2017-10-12 Thread Christoffer Dall
VHE actually doesn't rely on clearing the VTTBR when returning to the
host kernel, and that is the current key mechanism of hyp_panic to
figure out how to attempt to return to a state good enough to print a
panic statement.

Therefore, we split the hyp_panic function into two functions, a VHE and
a non-VHE, keeping the non-VHE version intact, but changing the VHE
behavior.

The vttbr_el2 check on VHE doesn't really make that much sense, because
the only situation where we can get here on VHE is when the hypervisor
assembly code actually called into hyp_panic, which only happens when
VBAR_EL2 has been set to the KVM exception vectors.  On VHE, we can
always safely disable the traps and restore the host registers at this
point, so we simply do that unconditionally and call into the panic
function directly.

Signed-off-by: Christoffer Dall 
---
 arch/arm64/kvm/hyp/switch.c | 45 +++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index a0123ad..a50ddf3 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -394,10 +394,20 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx 
ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";
 
 static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,
-struct kvm_vcpu *vcpu)
+struct kvm_cpu_context 
*__host_ctxt)
 {
+   struct kvm_vcpu *vcpu;
unsigned long str_va;
 
+   vcpu = __host_ctxt->__hyp_running_vcpu;
+
+   if (read_sysreg(vttbr_el2)) {
+   __timer_disable_traps(vcpu);
+   __deactivate_traps(vcpu);
+   __deactivate_vm(vcpu);
+   __sysreg_restore_host_state(__host_ctxt);
+   }
+
/*
 * Force the panic string to be loaded from the literal pool,
 * making sure it is a kernel address and not a PC-relative
@@ -411,40 +421,31 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, 
u64 elr, u64 par,
   read_sysreg(hpfar_el2), par, vcpu);
 }
 
-static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
-   struct kvm_vcpu *vcpu)
+static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
+struct kvm_cpu_context *host_ctxt)
 {
+   struct kvm_vcpu *vcpu;
+   vcpu = host_ctxt->__hyp_running_vcpu;
+
+   __deactivate_traps_vhe();
+   __sysreg_restore_host_state(host_ctxt);
+
panic(__hyp_panic_string,
  spsr,  elr,
  read_sysreg_el2(esr),   read_sysreg_el2(far),
  read_sysreg(hpfar_el2), par, vcpu);
 }
 
-static hyp_alternate_select(__hyp_call_panic,
-   __hyp_call_panic_nvhe, __hyp_call_panic_vhe,
-   ARM64_HAS_VIRT_HOST_EXTN);
-
 void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt)
 {
-   struct kvm_vcpu *vcpu = NULL;
-
u64 spsr = read_sysreg_el2(spsr);
u64 elr = read_sysreg_el2(elr);
u64 par = read_sysreg(par_el1);
 
-   if (read_sysreg(vttbr_el2)) {
-   struct kvm_cpu_context *host_ctxt;
-
-   host_ctxt = __host_ctxt;
-   vcpu = host_ctxt->__hyp_running_vcpu;
-   __timer_disable_traps(vcpu);
-   __deactivate_traps(vcpu);
-   __deactivate_vm(vcpu);
-   __sysreg_restore_host_state(host_ctxt);
-   }
-
-   /* Call panic for real */
-   __hyp_call_panic()(spsr, elr, par, vcpu);
+   if (!has_vhe())
+   __hyp_call_panic_nvhe(spsr, elr, par, __host_ctxt);
+   else
+   __hyp_call_panic_vhe(spsr, elr, par, __host_ctxt);
 
unreachable();
 }
-- 
2.9.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 08/37] KVM: arm64: Defer restoring host VFP state to vcpu_put

2017-10-12 Thread Christoffer Dall
Avoid saving the guest VFP registers and restoring the host VFP
registers on every exit from the VM.  Only when we're about to run
userspace or other threads in the kernel do we really have to switch the
state back to the host state.

We still initially configure the VFP registers to trap when entering the
VM, but the difference is that we now leave the guest state in the
hardware registers while running the VM.
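
A minimal sketch of the deferred restore this enables at vcpu_put time,
assuming a helper along these lines (the actual sysreg-sr.c hunk is not
quoted here; only guest_vfp_loaded, __fpsimd_save_state and
__fpsimd_restore_state are names used by the series):

/* Sketch only: if the guest ever took the FP/SIMD trap and got its state
 * loaded into the hardware, save it back and restore the host state once,
 * at vcpu_put time, rather than on every exit. */
static void vfp_put_deferred(struct kvm_vcpu *vcpu,
			     struct kvm_cpu_context *host_ctxt)
{
	struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;

	if (!vcpu->arch.guest_vfp_loaded)
		return;

	__fpsimd_save_state(&guest_ctxt->gp_regs.fp_regs);
	__fpsimd_restore_state(&host_ctxt->gp_regs.fp_regs);
	vcpu->arch.guest_vfp_loaded = 0;
}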

Signed-off-by: Christoffer Dall 
---
 arch/arm64/include/asm/kvm_emulate.h |  5 
 arch/arm64/include/asm/kvm_host.h|  3 +++
 arch/arm64/kernel/asm-offsets.c  |  1 +
 arch/arm64/kvm/hyp/entry.S   |  3 +++
 arch/arm64/kvm/hyp/switch.c  | 47 +++-
 arch/arm64/kvm/hyp/sysreg-sr.c   | 21 +---
 6 files changed, 44 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h 
b/arch/arm64/include/asm/kvm_emulate.h
index 1fbfe96..630dd60 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -56,6 +56,11 @@ static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
 	return (unsigned long *)&vcpu->arch.hcr_el2;
 }
 
+static inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
+{
+   return (!(vcpu->arch.hcr_el2 & HCR_RW));
+}
+
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
 {
return (unsigned long *)_gp_regs(vcpu)->regs.pc;
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 7d3bfa7..5e09eb9 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -210,6 +210,9 @@ struct kvm_vcpu_arch {
/* Guest debug state */
u64 debug_flags;
 
+   /* 1 if the guest VFP state is loaded into the hardware */
+   u64 guest_vfp_loaded;
+
/*
 * We maintain more than a single set of debug registers to support
 * debugging the guest from the host and to maintain separate host and
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 612021d..9946732 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -133,6 +133,7 @@ int main(void)
   DEFINE(CPU_GP_REGS,  offsetof(struct kvm_cpu_context, gp_regs));
   DEFINE(CPU_USER_PT_REGS, offsetof(struct kvm_regs, regs));
   DEFINE(CPU_FP_REGS,  offsetof(struct kvm_regs, fp_regs));
+  DEFINE(VCPU_GUEST_VFP_LOADED,offsetof(struct kvm_vcpu, 
arch.guest_vfp_loaded));
   DEFINE(VCPU_FPEXC32_EL2, offsetof(struct kvm_vcpu, 
arch.ctxt.sys_regs[FPEXC32_EL2]));
   DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, 
arch.host_cpu_context));
   DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, 
__hyp_running_vcpu));
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 76cd48f..b3e7191 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -185,6 +185,9 @@ alternative_endif
add x0, x2, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
bl  __fpsimd_restore_state
 
+   mov x0, #1
+   str x0, [x3, #VCPU_GUEST_VFP_LOADED]
+
// Skip restoring fpexc32 for AArch64 guests
mrs x1, hcr_el2
tbnzx1, #HCR_RW_SHIFT, 1f
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 7703d63..ef05c59 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -23,43 +23,31 @@
 #include 
 #include 
 
-static bool __hyp_text __fpsimd_enabled_nvhe(void)
-{
-   return !(read_sysreg(cptr_el2) & CPTR_EL2_TFP);
-}
-
-static bool __hyp_text __fpsimd_enabled_vhe(void)
-{
-   return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN);
-}
-
-static hyp_alternate_select(__fpsimd_is_enabled,
-   __fpsimd_enabled_nvhe, __fpsimd_enabled_vhe,
-   ARM64_HAS_VIRT_HOST_EXTN);
-
-bool __hyp_text __fpsimd_enabled(void)
-{
-   return __fpsimd_is_enabled()();
-}
-
-static void __hyp_text __activate_traps_vhe(void)
+static void __hyp_text __activate_traps_vhe(struct kvm_vcpu *vcpu)
 {
u64 val;
 
val = read_sysreg(cpacr_el1);
val |= CPACR_EL1_TTA;
-   val &= ~CPACR_EL1_FPEN;
+   if (vcpu->arch.guest_vfp_loaded)
+   val |= CPACR_EL1_FPEN;
+   else
+   val &= ~CPACR_EL1_FPEN;
write_sysreg(val, cpacr_el1);
 
write_sysreg(__kvm_hyp_vector, vbar_el1);
 }
 
-static void __hyp_text __activate_traps_nvhe(void)
+static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
 {
u64 val;
 
val = CPTR_EL2_DEFAULT;
-   val |= CPTR_EL2_TTA | CPTR_EL2_TFP;
+   val |= CPTR_EL2_TTA;
+   if (vcpu->arch.guest_vfp_loaded)
+   val &= ~CPTR_EL2_TFP;
+   else
+   val |= CPTR_EL2_TFP;
write_sysreg(val, cptr_el2);
 }
 
@@ -81,7 +69,8 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 * it will cause 

[PATCH 00/37] Optimize KVM/ARM for VHE systems

2017-10-12 Thread Christoffer Dall
This series redesigns parts of KVM/ARM to optimize the performance on
VHE systems.  The general approach is to try to do as little work as
possible when transitioning between the VM and the hypervisor.  This has
the benefit of lower latency when waiting for interrupts and delivering
virtual interrupts, and reduces the overhead of emulating behavior and
I/O in the host kernel.

Patches 01 through 04 are not VHE specific, but rework parts of KVM/ARM
that can be generally improved.  We then add infrastructure to move more
logic into vcpu_load and vcpu_put, and we improve handling of VFP and debug
registers.

We then introduce a new world-switch function for VHE systems, which we
can tweak and optimize for VHE systems.  To do that, we rework a lot of
the system register save/restore handling and emulation code that may
need access to system registers, so that we can defer as many system
register save/restore operations to vcpu_load and vcpu_put, and move
this logic out of the VHE world switch function.
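
The shape of that deferral, under the hook names already visible in the
kvm_arch_vcpu_load()/put() hunk earlier in this digest; the two __sysreg_*
helpers here are illustrative placeholders, not the series' final names:

/* Sketch only: on VHE the guest's EL1 system register state can stay
 * resident in the CPU for the whole KVM_RUN ioctl; non-VHE keeps doing
 * the switch inside the world-switch function itself. */
void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu)
{
	if (!has_vhe())
		return;
	__sysreg_restore_guest_state(&vcpu->arch.ctxt);	/* guest regs -> HW */
}

void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)
{
	if (!has_vhe())
		return;
	__sysreg_save_guest_state(&vcpu->arch.ctxt);	/* HW -> guest regs */
}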

We then optimize the configuration of traps.  On non-VHE systems, both
the host and VM kernels run in EL1, but because the host kernel should
have full access to the underlying hardware while the VM kernel should
not, we essentially make the host kernel more privileged than the VM
kernel despite them both running at the same privilege level by enabling
VE traps when entering the VM and disabling those traps when exiting the
VM.  On VHE systems, the host kernel runs in EL2 and has full access to
the hardware (as much as allowed by secure side software), and is
unaffected by the trap configuration.  That means we can configure the
traps for VMs running in EL1 once, and don't have to switch them on and
off for every entry/exit to/from the VM.

Finally, we improve our VGIC handling by moving all save/restore logic
out of the VHE world-switch, and we make it possible to truly only
evaluate if the AP list is empty and not do *any* VGIC work if that is
the case, and only do the minimal amount of work required in the course
of the VGIC processing when we have virtual interrupts in flight.
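
The "no work when nothing is in flight" part boils down to a cheap check
on the AP list before touching any list registers; a hedged sketch using
the existing ap_list_head field and flush entry point, with the wrapper
name being an assumption for illustration:

/* Sketch only: skip every list-register access when no virtual interrupts
 * are queued for this vcpu, and fall back to the normal flush path when
 * the ap_list is non-empty. */
static inline void vgic_flush_if_needed(struct kvm_vcpu *vcpu)
{
	if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head))
		return;

	kvm_vgic_flush_hwstate(vcpu);
}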

The patches are based on v4.14-rc2 plus the timer optimization series
[1] and the patches that use the TPIDR_EL2 for per-cpu offsets in the
host kernel and stops storing the VCPU pointer in TPIDR_EL2 from the
SDEI series [2].

I've given the patches a fair amount of testing on Thunder-X, Mustang,
Seattle, and TC2 (32-bit) for non-VHE testing, and tested VHE
functionality on the Foundation model, running both 64-bit VMs and
32-bit VMs side-by-side and using both GICv3-on-GICv3 and
GICv2-on-GICv3.

The patches are also available in the vhe-optimize branch on my
kernel.org repository [3].

Thanks,
-Christoffer

[1]: https://lists.cs.columbia.edu/pipermail/kvmarm/2017-September/027245.html
[2]: https://lists.cs.columbia.edu/pipermail/kvmarm/2017-September/027230.html
[3]: git://git.kernel.org/pub/scm/linux/kernel/git/cdall/linux.git vhe-optimize

Christoffer Dall (36):
  KVM: arm64: Avoid storing the vcpu pointer on the stack
  KVM: arm64: Rework hyp_panic for VHE and non-VHE
  KVM: arm/arm64: Get rid of vcpu->arch.irq_lines
  KVM: Record the executing ioctl number on the vcpu struct
  KVM: arm/arm64: Only load/put VCPU state for KVM_RUN
  KVM: arm/arm64: Add kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs
  KVM: arm64: Defer restoring host VFP state to vcpu_put
  KVM: arm64: Move debug dirty flag calculation out of world switch
  KVM: arm64: Slightly improve debug save/restore functions
  KVM: arm64: Improve debug register save/restore flow
  KVM: arm64: Factor out fault info population and gic workarounds
  KVM: arm64: Introduce VHE-specific kvm_vcpu_run
  KVM: arm64: Remove kern_hyp_va() use in VHE switch function
  KVM: arm64: Don't deactivate VM on VHE systems
  KVM: arm64: Remove noop calls to timer save/restore from VHE switch
  KVM: arm64: Move userspace system registers into separate function
  KVM: arm64: Rewrite sysreg alternatives to static keys
  KVM: arm64: Introduce separate VHE/non-VHE sysreg save/restore
functions
  KVM: arm64: Unify non-VHE host/guest sysreg save and restore functions
  KVM: arm64: Don't save the host ELR_EL2 and SPSR_EL2 on VHE systems
  KVM: arm64: Change 32-bit handling of VM system registers
  KVM: arm64: Prepare to handle traps on deferred VM sysregs
  KVM: arm64: Prepare to handle traps on deferred EL0 sysregs
  KVM: arm64: Prepare to handle traps on remaining deferred EL1 sysregs
  KVM: arm64: Prepare to handle traps on deferred AArch32 sysregs
  KVM: arm64: Defer saving/restoring system registers to vcpu load/put
on VHE
  KVM: arm64: Move common VHE/non-VHE trap config in separate functions
  KVM: arm64: Configure FPSIMD traps on vcpu load/put for VHE
  KVM: arm64: Configure c15, PMU, and debug register traps on cpu
load/put for VHE
  KVM: arm64: Separate activate_traps and deactive_traps for VHE and
non-VHE
  KVM: arm/arm64: Handle VGICv2 save/restore from the main VGIC code
  KVM: