Re: [PATCHv2] kvm: arm64: Add SVE support for nVHE.

2021-02-08 Thread Dave Martin
On Fri, Feb 05, 2021 at 12:12:51AM +, Daniel Kiss wrote:
> 
> 
> > On 4 Feb 2021, at 18:36, Dave Martin  wrote:
> > 
> > On Tue, Feb 02, 2021 at 07:52:54PM +0100, Daniel Kiss wrote:
> >> CPUs that support SVE are architecturally required to support the
> >> Virtualization Host Extensions (VHE), so far the kernel supported
> >> SVE alongside KVM with VHE enabled. In some cases it is desirable to
> >> run the nVHE config even when VHE is available.
> >> This patch adds support for SVE in the nVHE configuration too.
> >> 
> >> Tested on FVP with a Linux guest VM that runs with a different VL than
> >> the host system.
> >> 
> >> Signed-off-by: Daniel Kiss 

[...]

> >> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> >> index 3e081d556e81..8f29b468e989 100644
> >> --- a/arch/arm64/kvm/fpsimd.c
> >> +++ b/arch/arm64/kvm/fpsimd.c
> >> @@ -42,6 +42,16 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
> >>if (ret)
> >>goto error;
> >> 
> >> +  if (!has_vhe() && vcpu->arch.sve_state) {
> >> +          void *sve_state_end = vcpu->arch.sve_state +
> >> +                  SVE_SIG_REGS_SIZE(sve_vq_from_vl(vcpu->arch.sve_max_vl));
> >> +  ret = create_hyp_mappings(vcpu->arch.sve_state,
> >> +sve_state_end,
> >> +PAGE_HYP);
> >> +  if (ret)
> >> +  goto error;
> >> +  }
> >>vcpu->arch.host_thread_info = kern_hyp_va(ti);
> >>vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
> >> error:
> >> @@ -109,10 +119,22 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >>local_irq_save(flags);
> >> 
> >>if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> >> +  if (guest_has_sve) {
> >> +          if (has_vhe())
> >> +                  __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
> >> +          else {
> >> +                  __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);
> >> +                  /*
> >> +                   * The vcpu could set ZCR_EL1 to a shorter VL than the max
> >> +                   * VL, but the context is still valid there, so save the
> >> +                   * whole context.  In the nVHE case we need to reset
> >> +                   * ZCR_EL1 because the save will be done in EL1.
> >> +                   */
> >> +                  write_sysreg_s(sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> +                                 SYS_ZCR_EL1);
> > 
> > This still doesn't feel right.  We're already in EL1 here I think, in
> > which case ZCR_EL1 has an immediate effect on what state the
> > architecture guarantees to preserve: if we need to change ZCR_EL1, it's
> > because it might be wrong.  If it's wrong, it might be too small.  And
> > if it's too small, we may have already lost some SVE register bits that
> > the guest cares about.
> "On taking an exception from an Exception level that is more constrained
>  to a target Exception level that is less constrained, or on writing a
>  larger value to ZCR_ELx.LEN, then the previously inaccessible bits of
>  these registers that become accessible have a value of either zero or
>  the value they had before executing at the more constrained size."
> If the CPU zeros the register when ZCR is written or an exception is taken,
> my reading of the above is that the register content may be lost when we
> land in EL2. No code should count on the register content after reducing
> the VL in ZCR.
> 
> I see my comment was also not clear enough.
> Maybe we shouldn't save the guest's sve_max_vl here; it would be enough
> to save up to the actual VL.

Maybe you're right, but I may be missing some information here.

Can you sketch out more explicitly how it works, showing how all the
bits the host cares about (and only those bits) are saved/restored for
the host, and how all the bits the guest cares about (and only those
bits) are saved/restored for the guest?


Various optimisations are possible, but there is a risk of breaking
assumptions elsewhere.  For example, the KVM_{SET,GET}_ONE_REG code
makes assumptions about the layout of the data in
vcpu->arch.sve_state.

The architectural rules about when SVE register bits are preserved are also
complex,
with many interactions.  We also don't want to aggressively optimise in
a way that might be hard to apply to nested virt.


My instinct is to keep it simple while this patch matures, and continue
to save/restore based on vcpu->arch.sve_max_vl.  This keeps a clear
split between the emulated "hardware" (which doesn't change while the
guest runs), and changeable runtime state (like the guest's ZCR_EL1).

I'm happy to review proposed optimisations, but I think those should be
separated out as later patches (or a separate series).  My experience
is that it's much 

Re: [PATCHv2] kvm: arm64: Add SVE support for nVHE.

2021-02-05 Thread Daniel Kiss


> On 4 Feb 2021, at 18:36, Dave Martin  wrote:
> 
> On Tue, Feb 02, 2021 at 07:52:54PM +0100, Daniel Kiss wrote:
>> CPUs that support SVE are architecturally required to support the
>> Virtualization Host Extensions (VHE), so far the kernel supported
>> SVE alongside KVM with VHE enabled. In some cases it is desirable to
>> run the nVHE config even when VHE is available.
>> This patch adds support for SVE in the nVHE configuration too.
>> 
>> Tested on FVP with a Linux guest VM that runs with a different VL than
>> the host system.
>> 
>> Signed-off-by: Daniel Kiss 
>> ---
>> arch/arm64/Kconfig  |  7 -
>> arch/arm64/include/asm/fpsimd.h |  6 
>> arch/arm64/include/asm/fpsimdmacros.h   | 24 +--
>> arch/arm64/include/asm/kvm_host.h   | 17 +++
>> arch/arm64/kernel/entry-fpsimd.S|  5 
>> arch/arm64/kvm/arm.c|  5 
>> arch/arm64/kvm/fpsimd.c | 39 -
>> arch/arm64/kvm/hyp/fpsimd.S | 15 ++
>> arch/arm64/kvm/hyp/include/hyp/switch.h | 34 +++--
>> arch/arm64/kvm/hyp/nvhe/switch.c| 29 +-
>> arch/arm64/kvm/reset.c  |  6 +---
>> 11 files changed, 126 insertions(+), 61 deletions(-)
>> 
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index f39568b28ec1..049428f1bf27 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1676,7 +1676,6 @@ endmenu
>> config ARM64_SVE
>>  bool "ARM Scalable Vector Extension support"
>>  default y
>> -depends on !KVM || ARM64_VHE
>>  help
>>The Scalable Vector Extension (SVE) is an extension to the AArch64
>>execution state which complements and extends the SIMD functionality
>> @@ -1705,12 +1704,6 @@ config ARM64_SVE
>>booting the kernel.  If unsure and you are not observing these
>>symptoms, you should assume that it is safe to say Y.
>> 
>> -  CPUs that support SVE are architecturally required to support the
>> -  Virtualization Host Extensions (VHE), so the kernel makes no
>> -  provision for supporting SVE alongside KVM without VHE enabled.
>> -  Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
>> -  KVM in the same kernel image.
>> -
>> config ARM64_MODULE_PLTS
>>  bool "Use PLTs to allow module memory to spill over into vmalloc area"
>>  depends on MODULES
>> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
>> index bec5f14b622a..526d69f3eeb3 100644
>> --- a/arch/arm64/include/asm/fpsimd.h
>> +++ b/arch/arm64/include/asm/fpsimd.h
>> @@ -69,6 +69,12 @@ static inline void *sve_pffr(struct thread_struct *thread)
>> extern void sve_save_state(void *state, u32 *pfpsr);
>> extern void sve_load_state(void const *state, u32 const *pfpsr,
>> unsigned long vq_minus_1);
>> +/*
>> + * sve_load_state_nvhe is for hyp code, where the SVE registers are
>> + * handled at EL2 and the vector length is governed by ZCR_EL2.
>> + */
>> +extern void sve_load_state_nvhe(void const *state, u32 const *pfpsr,
>> +   unsigned long vq_minus_1);
>> extern void sve_flush_live(void);
>> extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
>> unsigned long vq_minus_1);
>> diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
>> index af43367534c7..d309c6071bce 100644
>> --- a/arch/arm64/include/asm/fpsimdmacros.h
>> +++ b/arch/arm64/include/asm/fpsimdmacros.h
>> @@ -205,6 +205,17 @@
>> 921:
>> .endm
>> 
>> +/* Update ZCR_EL2.LEN with the new VQ */
>> +.macro sve_load_vq_nvhe xvqminus1, xtmp, xtmp2
>> +mrs_s   \xtmp, SYS_ZCR_EL2
>> +bic \xtmp2, \xtmp, ZCR_ELx_LEN_MASK
>> +orr \xtmp2, \xtmp2, \xvqminus1
>> +cmp \xtmp2, \xtmp
>> +b.eq922f
>> +msr_s   SYS_ZCR_EL2, \xtmp2 //self-synchronising
>> +922:
>> +.endm
>> +
> 
> This looks a little better, but can we just give sve_load_vq an extra
> argument, say
> 
> .macro sve_load_vq ... , el=EL1
Sounds good, will do.
> 
> [...]
> 
>> +.macro sve_load_nvhe nxbase, xpfpsr, xvqminus1, nxtmp, xtmp2
>> +    sve_load_vq_nvhe \xvqminus1, x\nxtmp, \xtmp2
> 
> and do sve_load_vq \xvqminus1, x\nxtmp, \xtmp2, EL2
> 
>> +    _sve_load \nxbase, \xpfpsr, \nxtmp
>> +.endm
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 8fcfab0c2567..11a058c81c1d 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -376,6 +376,10 @@ struct kvm_vcpu_arch {
>> #define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
>>sve_ffr_offset((vcpu)->arch.sve_max_vl)))
>> 
>> +#define vcpu_sve_pffr_hyp

Re: [PATCHv2] kvm: arm64: Add SVE support for nVHE.

2021-02-04 Thread Dave Martin
On Tue, Feb 02, 2021 at 07:52:54PM +0100, Daniel Kiss wrote:
> CPUs that support SVE are architecturally required to support the
> Virtualization Host Extensions (VHE), so far the kernel supported
> SVE alongside KVM with VHE enabled. In some cases it is desirable to
> run the nVHE config even when VHE is available.
> This patch adds support for SVE in the nVHE configuration too.
> 
> Tested on FVP with a Linux guest VM that runs with a different VL than
> the host system.
> 
> Signed-off-by: Daniel Kiss 
> ---
>  arch/arm64/Kconfig  |  7 -
>  arch/arm64/include/asm/fpsimd.h |  6 
>  arch/arm64/include/asm/fpsimdmacros.h   | 24 +--
>  arch/arm64/include/asm/kvm_host.h   | 17 +++
>  arch/arm64/kernel/entry-fpsimd.S|  5 
>  arch/arm64/kvm/arm.c|  5 
>  arch/arm64/kvm/fpsimd.c | 39 -
>  arch/arm64/kvm/hyp/fpsimd.S | 15 ++
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 34 +++--
>  arch/arm64/kvm/hyp/nvhe/switch.c| 29 +-
>  arch/arm64/kvm/reset.c  |  6 +---
>  11 files changed, 126 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f39568b28ec1..049428f1bf27 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1676,7 +1676,6 @@ endmenu
>  config ARM64_SVE
>   bool "ARM Scalable Vector Extension support"
>   default y
> - depends on !KVM || ARM64_VHE
>   help
> The Scalable Vector Extension (SVE) is an extension to the AArch64
> execution state which complements and extends the SIMD functionality
> @@ -1705,12 +1704,6 @@ config ARM64_SVE
> booting the kernel.  If unsure and you are not observing these
> symptoms, you should assume that it is safe to say Y.
>  
> -   CPUs that support SVE are architecturally required to support the
> -   Virtualization Host Extensions (VHE), so the kernel makes no
> -   provision for supporting SVE alongside KVM without VHE enabled.
> -   Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
> -   KVM in the same kernel image.
> -
>  config ARM64_MODULE_PLTS
>   bool "Use PLTs to allow module memory to spill over into vmalloc area"
>   depends on MODULES
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index bec5f14b622a..526d69f3eeb3 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -69,6 +69,12 @@ static inline void *sve_pffr(struct thread_struct *thread)
>  extern void sve_save_state(void *state, u32 *pfpsr);
>  extern void sve_load_state(void const *state, u32 const *pfpsr,
>  unsigned long vq_minus_1);
> +/*
> + * sve_load_state_nvhe is for hyp code, where the SVE registers are
> + * handled at EL2 and the vector length is governed by ZCR_EL2.
> + */
> +extern void sve_load_state_nvhe(void const *state, u32 const *pfpsr,
> +unsigned long vq_minus_1);
>  extern void sve_flush_live(void);
>  extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
>  unsigned long vq_minus_1);
> diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
> index af43367534c7..d309c6071bce 100644
> --- a/arch/arm64/include/asm/fpsimdmacros.h
> +++ b/arch/arm64/include/asm/fpsimdmacros.h
> @@ -205,6 +205,17 @@
>  921:
>  .endm
>  
> +/* Update ZCR_EL2.LEN with the new VQ */
> +.macro sve_load_vq_nvhe xvqminus1, xtmp, xtmp2
> + mrs_s   \xtmp, SYS_ZCR_EL2
> + bic \xtmp2, \xtmp, ZCR_ELx_LEN_MASK
> + orr \xtmp2, \xtmp2, \xvqminus1
> + cmp \xtmp2, \xtmp
> + b.eq922f
> + msr_s   SYS_ZCR_EL2, \xtmp2 //self-synchronising
> +922:
> +.endm
> +

This looks a little better, but can we just give sve_load_vq an extra
argument, say

.macro sve_load_vq ... , el=EL1

[...]

> +.macro sve_load_nvhe nxbase, xpfpsr, xvqminus1, nxtmp, xtmp2
> +	sve_load_vq_nvhe \xvqminus1, x\nxtmp, \xtmp2

and do sve_load_vq \xvqminus1, x\nxtmp, \xtmp2, EL2

> +	_sve_load \nxbase, \xpfpsr, \nxtmp
> +.endm
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 8fcfab0c2567..11a058c81c1d 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -376,6 +376,10 @@ struct kvm_vcpu_arch {
>  #define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
> sve_ffr_offset((vcpu)->arch.sve_max_vl)))
>  
> +#define vcpu_sve_pffr_hyp(vcpu) ((void *)((char *) \
> + (kern_hyp_va((vcpu)->arch.sve_state)) + \
> + sve_ffr_offset((vcpu)->arch.sve_m