Re: [PATCH v10 10/18] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing

2018-05-24 Thread Dave Martin
On Thu, May 24, 2018 at 11:09:02AM +0100, Alex Bennée wrote:
> 
> Dave Martin  writes:
> 
> > This patch refactors KVM to align the host and guest FPSIMD
> > save/restore logic with each other for arm64.  This reduces the
> > number of redundant save/restore operations that must occur, and
> > reduces the common-case IRQ blackout time during guest exit storms
> > by saving the host state lazily and optimising away the need to
> > restore the host state before returning to the run loop.
> >
> > Four hooks are defined in order to enable this:
> >
> >  * kvm_arch_vcpu_run_map_fp():
> >Called on PID change to map necessary bits of current to Hyp.
> >
> >  * kvm_arch_vcpu_load_fp():
> >Set up FP/SIMD for entering the KVM run loop (parse as
> >"vcpu_load fp").
> >
> >  * kvm_arch_vcpu_ctxsync_fp():
> >Get FP/SIMD into a safe state for re-enabling interrupts after a
> >guest exit back to the run loop.
> >
> >For arm64 specifically, this involves updating the host kernel's
> >FPSIMD context tracking metadata so that kernel-mode NEON use
> >will cause the vcpu's FPSIMD state to be saved back correctly
> >into the vcpu struct.  This must be done before re-enabling
> >interrupts because kernel-mode NEON may be used by softirqs.
> >
> >  * kvm_arch_vcpu_put_fp():
> >Save guest FP/SIMD state back to memory and dissociate from the
> >CPU ("vcpu_put fp").
> >
> > Also, the arm64 FPSIMD context switch code is updated to enable it
> > to save back FPSIMD state for a vcpu, not just current.  A few
> > helpers drive this:
> >
> >  * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
> >mark this CPU as having context fp (which may belong to a vcpu)
> >currently loaded in its registers.  This is the non-task
> >equivalent of the static function fpsimd_bind_to_cpu() in
> >fpsimd.c.
> >
> >  * task_fpsimd_save():
> >exported to allow KVM to save the guest's FPSIMD state back to
> >memory on exit from the run loop.
> >
> >  * fpsimd_flush_state():
> >invalidate any context's FPSIMD state that is currently loaded.
> >Used to disassociate the vcpu from the CPU regs on run loop exit.
> >
> > These changes allow the run loop to enable interrupts (and thus
> > softirqs that may use kernel-mode NEON) without having to save the
> > guest's FPSIMD state eagerly.
> >
> > Some new vcpu_arch fields are added to make all this work.  Because
> > host FPSIMD state can now be saved back directly into current's
> > thread_struct as appropriate, host_cpu_context is no longer used
> > for preserving the FPSIMD state.  However, it is still needed for
> > preserving other things such as the host's system registers.  To
> > avoid ABI churn, the redundant storage space in host_cpu_context is
> > not removed for now.
> >
> > arch/arm is not addressed by this patch and continues to use its
> > current save/restore logic.  It could provide implementations of
> > the helpers later if desired.
> >
> > Signed-off-by: Dave Martin 
> > Reviewed-by: Marc Zyngier 
> > Reviewed-by: Christoffer Dall 
> > Acked-by: Catalin Marinas 
> >
> > ---
> >
> > Reviewers note: tags retained because this delta is straightforward by
> > itself.  Please shout if you're not happy!
> >
> > Changes since v9:
> >
> >  * Remove redundant set_thread_flag(TIF_FOREIGN_FPSTATE) that is now
> >implicit in fpsimd_flush_cpu_state().
> > ---
> >  arch/arm/include/asm/kvm_host.h   |   8 +++
> >  arch/arm64/include/asm/fpsimd.h   |   6 +++
> >  arch/arm64/include/asm/kvm_host.h |  21 
> >  arch/arm64/kernel/fpsimd.c|  17 --
> >  arch/arm64/kvm/Kconfig|   1 +
> >  arch/arm64/kvm/Makefile   |   2 +-
> >  arch/arm64/kvm/fpsimd.c   | 111 
> > ++
> >  arch/arm64/kvm/hyp/switch.c   |  51 +-
> >  virt/kvm/arm/arm.c|   4 ++
> >  9 files changed, 191 insertions(+), 30 deletions(-)
> >  create mode 100644 arch/arm64/kvm/fpsimd.c
> >
> > diff --git a/arch/arm/include/asm/kvm_host.h 
> > b/arch/arm/include/asm/kvm_host.h
> > index c7c28c8..ac870b2 100644
> > --- a/arch/arm/include/asm/kvm_host.h
> > +++ b/arch/arm/include/asm/kvm_host.h
> > @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> >  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >struct kvm_device_attr *attr);
> >
> > +/*
> > + * VFP/NEON switching is all done by the hyp switch code, so no need to
> > + * coordinate with host context handling for this state:
> > + */
> > +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
> > +
> >  /* All host FP/SIMD state is restored on guest exit, so nothing to 

Re: [PATCH v10 10/18] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing

2018-05-24 Thread Alex Bennée

Dave Martin  writes:

> This patch refactors KVM to align the host and guest FPSIMD
> save/restore logic with each other for arm64.  This reduces the
> number of redundant save/restore operations that must occur, and
> reduces the common-case IRQ blackout time during guest exit storms
> by saving the host state lazily and optimising away the need to
> restore the host state before returning to the run loop.
>
> Four hooks are defined in order to enable this:
>
>  * kvm_arch_vcpu_run_map_fp():
>Called on PID change to map necessary bits of current to Hyp.
>
>  * kvm_arch_vcpu_load_fp():
>Set up FP/SIMD for entering the KVM run loop (parse as
>"vcpu_load fp").
>
>  * kvm_arch_vcpu_ctxsync_fp():
>Get FP/SIMD into a safe state for re-enabling interrupts after a
>guest exit back to the run loop.
>
>For arm64 specifically, this involves updating the host kernel's
>FPSIMD context tracking metadata so that kernel-mode NEON use
>will cause the vcpu's FPSIMD state to be saved back correctly
>into the vcpu struct.  This must be done before re-enabling
>interrupts because kernel-mode NEON may be used by softirqs.
>
>  * kvm_arch_vcpu_put_fp():
>Save guest FP/SIMD state back to memory and dissociate from the
>CPU ("vcpu_put fp").
>
> Also, the arm64 FPSIMD context switch code is updated to enable it
> to save back FPSIMD state for a vcpu, not just current.  A few
> helpers drive this:
>
>  * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
>mark this CPU as having context fp (which may belong to a vcpu)
>currently loaded in its registers.  This is the non-task
>equivalent of the static function fpsimd_bind_to_cpu() in
>fpsimd.c.
>
>  * task_fpsimd_save():
>exported to allow KVM to save the guest's FPSIMD state back to
>memory on exit from the run loop.
>
>  * fpsimd_flush_state():
>invalidate any context's FPSIMD state that is currently loaded.
>Used to disassociate the vcpu from the CPU regs on run loop exit.
>
> These changes allow the run loop to enable interrupts (and thus
> softirqs that may use kernel-mode NEON) without having to save the
> guest's FPSIMD state eagerly.
>
> Some new vcpu_arch fields are added to make all this work.  Because
> host FPSIMD state can now be saved back directly into current's
> thread_struct as appropriate, host_cpu_context is no longer used
> for preserving the FPSIMD state.  However, it is still needed for
> preserving other things such as the host's system registers.  To
> avoid ABI churn, the redundant storage space in host_cpu_context is
> not removed for now.
>
> arch/arm is not addressed by this patch and continues to use its
> current save/restore logic.  It could provide implementations of
> the helpers later if desired.
>
> Signed-off-by: Dave Martin 
> Reviewed-by: Marc Zyngier 
> Reviewed-by: Christoffer Dall 
> Acked-by: Catalin Marinas 
>
> ---
>
> Reviewers note: tags retained because this delta is straightforward by
> itself.  Please shout if you're not happy!
>
> Changes since v9:
>
>  * Remove redundant set_thread_flag(TIF_FOREIGN_FPSTATE) that is now
>implicit in fpsimd_flush_cpu_state().
> ---
>  arch/arm/include/asm/kvm_host.h   |   8 +++
>  arch/arm64/include/asm/fpsimd.h   |   6 +++
>  arch/arm64/include/asm/kvm_host.h |  21 
>  arch/arm64/kernel/fpsimd.c|  17 --
>  arch/arm64/kvm/Kconfig|   1 +
>  arch/arm64/kvm/Makefile   |   2 +-
>  arch/arm64/kvm/fpsimd.c   | 111 
> ++
>  arch/arm64/kvm/hyp/switch.c   |  51 +-
>  virt/kvm/arm/arm.c|   4 ++
>  9 files changed, 191 insertions(+), 30 deletions(-)
>  create mode 100644 arch/arm64/kvm/fpsimd.c
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index c7c28c8..ac870b2 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  struct kvm_device_attr *attr);
>
> +/*
> + * VFP/NEON switching is all done by the hyp switch code, so no need to
> + * coordinate with host context handling for this state:
> + */
> +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
> +
>  /* All host FP/SIMD state is restored on guest exit, so nothing to save: */
>  static inline void kvm_fpsimd_flush_cpu_state(void) {}
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index aa7162a..3e00f70 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -41,6 +41,8 @@ struct