Re: [PATCH 2/2] KVM: arm64: Leave vcpu FPSIMD synchronization in host

2020-07-14 Thread Dave Martin
On Mon, Jul 13, 2020 at 09:42:04PM +0100, Andrew Scull wrote:
> On Mon, Jul 13, 2020 at 05:04:21PM +0100, Dave Martin wrote:
> > On Fri, Jul 10, 2020 at 10:57:54AM +0100, Andrew Scull wrote:
> > > The task state can be checked by the host and the vcpu flags updated
> > > before calling into hyp. This more neatly separates the concerns and
> > > removes the need to map the task flags to EL2.
> > > 
> > > Hyp acts on the state provided to it by the host and updates it when
> > > switching to the vcpu state.
> > 
> > Can this patch be split up?  We have a few overlapping changes here.
> > 
> > i.e., renaming and refactoring of hooks; moving some code around; and a
> > couple of other changes that are not directly related (noted below).
> 
> Indeed it can, into at least 3.
> 
> > Overall this looks like a decent cleanup however.  It was always a bit
> > nasty to have to map the thread flags into Hyp.
> 
> Glad to hear, I'll have to get it in a better shape.
> 
> > Side question: currently we do fpsimd_save_and_flush_cpu_state() in
> > kvm_arch_vcpu_put_fp().  Can we remove the flush so that the vcpu state
> > lingers in the CPU regs and can be reclaimed when switching back to the
> > KVM thread?
> > 
> > This could remove some overhead when the KVM run loop is preempted by a
> > kernel thread and subsequently resumed without passing through userspace.
> > 
> > We would need to flush this state when anything else tries to change the
> > vcpu FP regs, which is one reason I skipped this previously: it would
> > require a bit of refactoring of fpsimd_flush_task_state() so that a non-
> > task context can also be flushed.
> > 
> > (This isn't directly related to this series.)
> 
> I don't plan to address this at the moment but I do believe there are
> chances to reduce the need for saves and restores. If the flush is
> removed a similar check to that done for tasks could also apply to vCPUs
> i.e. if the last FPSIMD state this CPU had was the vCPU and the vCPU
> last ran on this CPU then the vCPU's FPSIMD state is already loaded.

Sounds reasonable.

As you observe, this is mostly a case of refactoring the code a bit and
making the vcpu context slightly less of a special case.

(And if you don't do it, no worries -- I just couldn't resist getting it
done "for free" ;)

> 
> > Additional minor comments below.
> > 
> > > 
> > > No functional change.
> > > 
> > > Signed-off-by: Andrew Scull 
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h   |  5 +--
> > >  arch/arm64/kvm/arm.c|  4 +-
> > >  arch/arm64/kvm/fpsimd.c | 57 ++---
> > >  arch/arm64/kvm/hyp/include/hyp/switch.h | 19 -
> > >  arch/arm64/kvm/hyp/nvhe/switch.c|  3 +-
> > >  arch/arm64/kvm/hyp/vhe/switch.c |  3 +-
> > >  6 files changed, 48 insertions(+), 43 deletions(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > > b/arch/arm64/include/asm/kvm_host.h
> > > index b06f24b5f443..ca1621eeb9d9 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -24,7 +24,6 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > -#include 
> > >  
> > >  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
> > >  
> > > @@ -320,7 +319,6 @@ struct kvm_vcpu_arch {
> > >   struct kvm_guest_debug_arch vcpu_debug_state;
> > >   struct kvm_guest_debug_arch external_debug_state;
> > >  
> > > - struct thread_info *host_thread_info;   /* hyp VA */
> > >   struct user_fpsimd_state *host_fpsimd_state;/* hyp VA */
> > >  
> > >   struct {
> > > @@ -616,7 +614,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> > >  /* Guest/host FPSIMD coordination helpers */
> > >  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> > >  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> > > -void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> > > +void kvm_arch_vcpu_enter_ctxsync_fp(struct kvm_vcpu *vcpu);
> > > +void kvm_arch_vcpu_exit_ctxsync_fp(struct kvm_vcpu *vcpu);
> > 
> > I find these names a bit confusing.
> > 
> > Maybe
> > 
> > kvm_arch_vcpu_ctxsync_fp_before_guest_enter()
> > kvm_arch_vcpu_ctxsync_fp_after_guest_exit()
> > 
> > (we could probably drop the "ctx" to make these slightly shorter).
> 
> Changed to kvm_arch_vcpu_sync_fp_{before,after}_run()

Fair enough.

> > >  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> > >  
> > >  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index 98f05bdac3c1..c7a711ca840e 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -682,6 +682,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > >  
> > >   local_irq_disable();
> > >  
> > > + kvm_arch_vcpu_enter_ctxsync_fp(vcpu);
> > > +
> > >   kvm_vgic_flush_hwstate(vcpu);
> > >  
> > >   /*
> > > @@ -769,7 +771,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu 

Re: [PATCH 2/2] KVM: arm64: Leave vcpu FPSIMD synchronization in host

2020-07-13 Thread Andrew Scull
On Mon, Jul 13, 2020 at 05:04:21PM +0100, Dave Martin wrote:
> On Fri, Jul 10, 2020 at 10:57:54AM +0100, Andrew Scull wrote:
> > The task state can be checked by the host and the vcpu flags updated
> > before calling into hyp. This more neatly separates the concerns and
> > removes the need to map the task flags to EL2.
> > 
> > Hyp acts on the state provided to it by the host and updates it when
> > switching to the vcpu state.
> 
> Can this patch be split up?  We have a few overlapping changes here.
> 
> i.e., renaming and refactoring of hooks; moving some code around; and a
> couple of other changes that are not directly related (noted below).

Indeed it can, into at least 3.

> Overall this looks like a decent cleanup however.  It was always a bit
> nasty to have to map the thread flags into Hyp.

Glad to hear, I'll have to get it in a better shape.

> Side question: currently we do fpsimd_save_and_flush_cpu_state() in
> kvm_arch_vcpu_put_fp().  Can we remove the flush so that the vcpu state
> lingers in the CPU regs and can be reclaimed when switching back to the
> KVM thread?
> 
> This could remove some overhead when the KVM run loop is preempted by a
> kernel thread and subsequently resumed without passing through userspace.
> 
> We would need to flush this state when anything else tries to change the
> vcpu FP regs, which is one reason I skipped this previously: it would
> require a bit of refactoring of fpsimd_flush_task_state() so that a non-
> task context can also be flushed.
> 
> (This isn't directly related to this series.)

I don't plan to address this at the moment but I do believe there are
chances to reduce the need for saves and restores. If the flush is
removed a similar check to that done for tasks could also apply to vCPUs
i.e. if the last FPSIMD state this CPU had was the vCPU and the vCPU
last ran on this CPU then the vCPU's FPSIMD state is already loaded.

> Additional minor comments below.
> 
> > 
> > No functional change.
> > 
> > Signed-off-by: Andrew Scull 
> > ---
> >  arch/arm64/include/asm/kvm_host.h   |  5 +--
> >  arch/arm64/kvm/arm.c|  4 +-
> >  arch/arm64/kvm/fpsimd.c | 57 ++---
> >  arch/arm64/kvm/hyp/include/hyp/switch.h | 19 -
> >  arch/arm64/kvm/hyp/nvhe/switch.c|  3 +-
> >  arch/arm64/kvm/hyp/vhe/switch.c |  3 +-
> >  6 files changed, 48 insertions(+), 43 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > b/arch/arm64/include/asm/kvm_host.h
> > index b06f24b5f443..ca1621eeb9d9 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -24,7 +24,6 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> >  
> >  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
> >  
> > @@ -320,7 +319,6 @@ struct kvm_vcpu_arch {
> > struct kvm_guest_debug_arch vcpu_debug_state;
> > struct kvm_guest_debug_arch external_debug_state;
> >  
> > -   struct thread_info *host_thread_info;   /* hyp VA */
> > struct user_fpsimd_state *host_fpsimd_state;/* hyp VA */
> >  
> > struct {
> > @@ -616,7 +614,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >  /* Guest/host FPSIMD coordination helpers */
> >  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> > -void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> > +void kvm_arch_vcpu_enter_ctxsync_fp(struct kvm_vcpu *vcpu);
> > +void kvm_arch_vcpu_exit_ctxsync_fp(struct kvm_vcpu *vcpu);
> 
> I find these names a bit confusing.
> 
> Maybe
> 
>   kvm_arch_vcpu_ctxsync_fp_before_guest_enter()
>   kvm_arch_vcpu_ctxsync_fp_after_guest_exit()
> 
> (we could probably drop the "ctx" to make these slightly shorter).

Changed to kvm_arch_vcpu_sync_fp_{before,after}_run()

> >  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> >  
> >  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 98f05bdac3c1..c7a711ca840e 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -682,6 +682,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  
> > local_irq_disable();
> >  
> > +   kvm_arch_vcpu_enter_ctxsync_fp(vcpu);
> > +
> > kvm_vgic_flush_hwstate(vcpu);
> >  
> > /*
> > @@ -769,7 +771,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > if (static_branch_unlikely(_irqchip_in_use))
> > kvm_timer_sync_user(vcpu);
> >  
> > -   kvm_arch_vcpu_ctxsync_fp(vcpu);
> > +   kvm_arch_vcpu_exit_ctxsync_fp(vcpu);
> >  
> > /*
> >  * We may have taken a host interrupt in HYP mode (ie
> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > index 3e081d556e81..aec91f43df62 100644
> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -27,22 +27,13 @@ int 

Re: [PATCH 2/2] KVM: arm64: Leave vcpu FPSIMD synchronization in host

2020-07-13 Thread Dave Martin
On Fri, Jul 10, 2020 at 10:57:54AM +0100, Andrew Scull wrote:
> The task state can be checked by the host and the vcpu flags updated
> before calling into hyp. This more neatly separates the concerns and
> removes the need to map the task flags to EL2.
> 
> Hyp acts on the state provided to it by the host and updates it when
> switching to the vcpu state.

Can this patch be split up?  We have a few overlapping changes here.

i.e., renaming and refactoring of hooks; moving some code around; and a
couple of other changes that are not directly related (noted below).

Overall this looks like a decent cleanup however.  It was always a bit
nasty to have to map the thread flags into Hyp.



Side question: currently we do fpsimd_save_and_flush_cpu_state() in
kvm_arch_vcpu_put_fp().  Can we remove the flush so that the vcpu state
lingers in the CPU regs and can be reclaimed when switching back to the
KVM thread?

This could remove some overhead when the KVM run loop is preempted by a
kernel thread and subsequently resumed without passing through userspace.

We would need to flush this state when anything else tries to change the
vcpu FP regs, which is one reason I skipped this previously: it would
require a bit of refactoring of fpsimd_flush_task_state() so that a non-
task context can also be flushed.

(This isn't directly related to this series.)



Additional minor comments below.

> 
> No functional change.
> 
> Signed-off-by: Andrew Scull 
> ---
>  arch/arm64/include/asm/kvm_host.h   |  5 +--
>  arch/arm64/kvm/arm.c|  4 +-
>  arch/arm64/kvm/fpsimd.c | 57 ++---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 19 -
>  arch/arm64/kvm/hyp/nvhe/switch.c|  3 +-
>  arch/arm64/kvm/hyp/vhe/switch.c |  3 +-
>  6 files changed, 48 insertions(+), 43 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index b06f24b5f443..ca1621eeb9d9 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -24,7 +24,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  
>  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
>  
> @@ -320,7 +319,6 @@ struct kvm_vcpu_arch {
>   struct kvm_guest_debug_arch vcpu_debug_state;
>   struct kvm_guest_debug_arch external_debug_state;
>  
> - struct thread_info *host_thread_info;   /* hyp VA */
>   struct user_fpsimd_state *host_fpsimd_state;/* hyp VA */
>  
>   struct {
> @@ -616,7 +614,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> -void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> +void kvm_arch_vcpu_enter_ctxsync_fp(struct kvm_vcpu *vcpu);
> +void kvm_arch_vcpu_exit_ctxsync_fp(struct kvm_vcpu *vcpu);

I find these names a bit confusing.

Maybe

kvm_arch_vcpu_ctxsync_fp_before_guest_enter()
kvm_arch_vcpu_ctxsync_fp_after_guest_exit()

(we could probably drop the "ctx" to make these slightly shorter).

>  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
>  
>  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 98f05bdac3c1..c7a711ca840e 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -682,6 +682,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  
>   local_irq_disable();
>  
> + kvm_arch_vcpu_enter_ctxsync_fp(vcpu);
> +
>   kvm_vgic_flush_hwstate(vcpu);
>  
>   /*
> @@ -769,7 +771,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>   if (static_branch_unlikely(_irqchip_in_use))
>   kvm_timer_sync_user(vcpu);
>  
> - kvm_arch_vcpu_ctxsync_fp(vcpu);
> + kvm_arch_vcpu_exit_ctxsync_fp(vcpu);
>  
>   /*
>* We may have taken a host interrupt in HYP mode (ie
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 3e081d556e81..aec91f43df62 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -27,22 +27,13 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
>  {
>   int ret;
>  
> - struct thread_info *ti = >thread_info;
>   struct user_fpsimd_state *fpsimd = >thread.uw.fpsimd_state;
>  
> - /*
> -  * Make sure the host task thread flags and fpsimd state are
> -  * visible to hyp:
> -  */
> - ret = create_hyp_mappings(ti, ti + 1, PAGE_HYP);
> - if (ret)
> - goto error;
> -
> + /* Make sure the host task fpsimd state is visible to hyp: */
>   ret = create_hyp_mappings(fpsimd, fpsimd + 1, PAGE_HYP);
>   if (ret)
>   goto error;
>  
> - vcpu->arch.host_thread_info = kern_hyp_va(ti);
>   vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
>  error:
>