Re: [PATCH v10 10/18] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
On Thu, May 24, 2018 at 11:09:02AM +0100, Alex Bennée wrote: > > Dave Martinwrites: > > > This patch refactors KVM to align the host and guest FPSIMD > > save/restore logic with each other for arm64. This reduces the > > number of redundant save/restore operations that must occur, and > > reduces the common-case IRQ blackout time during guest exit storms > > by saving the host state lazily and optimising away the need to > > restore the host state before returning to the run loop. > > > > Four hooks are defined in order to enable this: > > > > * kvm_arch_vcpu_run_map_fp(): > >Called on PID change to map necessary bits of current to Hyp. > > > > * kvm_arch_vcpu_load_fp(): > >Set up FP/SIMD for entering the KVM run loop (parse as > >"vcpu_load fp"). > > > > * kvm_arch_vcpu_ctxsync_fp(): > >Get FP/SIMD into a safe state for re-enabling interrupts after a > >guest exit back to the run loop. > > > >For arm64 specifically, this involves updating the host kernel's > >FPSIMD context tracking metadata so that kernel-mode NEON use > >will cause the vcpu's FPSIMD state to be saved back correctly > >into the vcpu struct. This must be done before re-enabling > >interrupts because kernel-mode NEON may be used by softirqs. > > > > * kvm_arch_vcpu_put_fp(): > >Save guest FP/SIMD state back to memory and dissociate from the > >CPU ("vcpu_put fp"). > > > > Also, the arm64 FPSIMD context switch code is updated to enable it > > to save back FPSIMD state for a vcpu, not just current. A few > > helpers drive this: > > > > * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): > >mark this CPU as having context fp (which may belong to a vcpu) > >currently loaded in its registers. This is the non-task > >equivalent of the static function fpsimd_bind_to_cpu() in > >fpsimd.c. > > > > * task_fpsimd_save(): > >exported to allow KVM to save the guest's FPSIMD state back to > >memory on exit from the run loop. > > > > * fpsimd_flush_state(): > >invalidate any context's FPSIMD state that is currently loaded. > >Used to disassociate the vcpu from the CPU regs on run loop exit. > > > > These changes allow the run loop to enable interrupts (and thus > > softirqs that may use kernel-mode NEON) without having to save the > > guest's FPSIMD state eagerly. > > > > Some new vcpu_arch fields are added to make all this work. Because > > host FPSIMD state can now be saved back directly into current's > > thread_struct as appropriate, host_cpu_context is no longer used > > for preserving the FPSIMD state. However, it is still needed for > > preserving other things such as the host's system registers. To > > avoid ABI churn, the redundant storage space in host_cpu_context is > > not removed for now. > > > > arch/arm is not addressed by this patch and continues to use its > > current save/restore logic. It could provide implementations of > > the helpers later if desired. > > > > Signed-off-by: Dave Martin > > Reviewed-by: Marc Zyngier > > Reviewed-by: Christoffer Dall > > Acked-by: Catalin Marinas > > > > --- > > > > Reviewers note: tags retained because this delta is straightforward by > > itself. Please shout if you're not happy! > > > > Changes since v9: > > > > * Remove redundant set_thread_flag(TIF_FOREIGN_FPSTATE) that is now > >implicit in fpsimd_flush_cpu_state(). > > --- > > arch/arm/include/asm/kvm_host.h | 8 +++ > > arch/arm64/include/asm/fpsimd.h | 6 +++ > > arch/arm64/include/asm/kvm_host.h | 21 > > arch/arm64/kernel/fpsimd.c| 17 -- > > arch/arm64/kvm/Kconfig| 1 + > > arch/arm64/kvm/Makefile | 2 +- > > arch/arm64/kvm/fpsimd.c | 111 > > ++ > > arch/arm64/kvm/hyp/switch.c | 51 +- > > virt/kvm/arm/arm.c| 4 ++ > > 9 files changed, 191 insertions(+), 30 deletions(-) > > create mode 100644 arch/arm64/kvm/fpsimd.c > > > > diff --git a/arch/arm/include/asm/kvm_host.h > > b/arch/arm/include/asm/kvm_host.h > > index c7c28c8..ac870b2 100644 > > --- a/arch/arm/include/asm/kvm_host.h > > +++ b/arch/arm/include/asm/kvm_host.h > > @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > >struct kvm_device_attr *attr); > > > > +/* > > + * VFP/NEON switching is all done by the hyp switch code, so no need to > > + * coordinate with host context handling for this state: > > + */ > > +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {} > > +static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {} > > +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {} > > + > > /* All host FP/SIMD state is restored on guest exit, so nothing to
Re: [PATCH v10 10/18] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
Dave Martinwrites: > This patch refactors KVM to align the host and guest FPSIMD > save/restore logic with each other for arm64. This reduces the > number of redundant save/restore operations that must occur, and > reduces the common-case IRQ blackout time during guest exit storms > by saving the host state lazily and optimising away the need to > restore the host state before returning to the run loop. > > Four hooks are defined in order to enable this: > > * kvm_arch_vcpu_run_map_fp(): >Called on PID change to map necessary bits of current to Hyp. > > * kvm_arch_vcpu_load_fp(): >Set up FP/SIMD for entering the KVM run loop (parse as >"vcpu_load fp"). > > * kvm_arch_vcpu_ctxsync_fp(): >Get FP/SIMD into a safe state for re-enabling interrupts after a >guest exit back to the run loop. > >For arm64 specifically, this involves updating the host kernel's >FPSIMD context tracking metadata so that kernel-mode NEON use >will cause the vcpu's FPSIMD state to be saved back correctly >into the vcpu struct. This must be done before re-enabling >interrupts because kernel-mode NEON may be used by softirqs. > > * kvm_arch_vcpu_put_fp(): >Save guest FP/SIMD state back to memory and dissociate from the >CPU ("vcpu_put fp"). > > Also, the arm64 FPSIMD context switch code is updated to enable it > to save back FPSIMD state for a vcpu, not just current. A few > helpers drive this: > > * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): >mark this CPU as having context fp (which may belong to a vcpu) >currently loaded in its registers. This is the non-task >equivalent of the static function fpsimd_bind_to_cpu() in >fpsimd.c. > > * task_fpsimd_save(): >exported to allow KVM to save the guest's FPSIMD state back to >memory on exit from the run loop. > > * fpsimd_flush_state(): >invalidate any context's FPSIMD state that is currently loaded. >Used to disassociate the vcpu from the CPU regs on run loop exit. > > These changes allow the run loop to enable interrupts (and thus > softirqs that may use kernel-mode NEON) without having to save the > guest's FPSIMD state eagerly. > > Some new vcpu_arch fields are added to make all this work. Because > host FPSIMD state can now be saved back directly into current's > thread_struct as appropriate, host_cpu_context is no longer used > for preserving the FPSIMD state. However, it is still needed for > preserving other things such as the host's system registers. To > avoid ABI churn, the redundant storage space in host_cpu_context is > not removed for now. > > arch/arm is not addressed by this patch and continues to use its > current save/restore logic. It could provide implementations of > the helpers later if desired. > > Signed-off-by: Dave Martin > Reviewed-by: Marc Zyngier > Reviewed-by: Christoffer Dall > Acked-by: Catalin Marinas > > --- > > Reviewers note: tags retained because this delta is straightforward by > itself. Please shout if you're not happy! > > Changes since v9: > > * Remove redundant set_thread_flag(TIF_FOREIGN_FPSTATE) that is now >implicit in fpsimd_flush_cpu_state(). > --- > arch/arm/include/asm/kvm_host.h | 8 +++ > arch/arm64/include/asm/fpsimd.h | 6 +++ > arch/arm64/include/asm/kvm_host.h | 21 > arch/arm64/kernel/fpsimd.c| 17 -- > arch/arm64/kvm/Kconfig| 1 + > arch/arm64/kvm/Makefile | 2 +- > arch/arm64/kvm/fpsimd.c | 111 > ++ > arch/arm64/kvm/hyp/switch.c | 51 +- > virt/kvm/arm/arm.c| 4 ++ > 9 files changed, 191 insertions(+), 30 deletions(-) > create mode 100644 arch/arm64/kvm/fpsimd.c > > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h > index c7c28c8..ac870b2 100644 > --- a/arch/arm/include/asm/kvm_host.h > +++ b/arch/arm/include/asm/kvm_host.h > @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > struct kvm_device_attr *attr); > > +/* > + * VFP/NEON switching is all done by the hyp switch code, so no need to > + * coordinate with host context handling for this state: > + */ > +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {} > +static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {} > +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {} > + > /* All host FP/SIMD state is restored on guest exit, so nothing to save: */ > static inline void kvm_fpsimd_flush_cpu_state(void) {} > > diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h > index aa7162a..3e00f70 100644 > --- a/arch/arm64/include/asm/fpsimd.h > +++ b/arch/arm64/include/asm/fpsimd.h > @@ -41,6 +41,8 @@ struct