On Fri, Feb 05, 2021 at 12:12:51AM +0000, Daniel Kiss wrote:
> 
> 
> > On 4 Feb 2021, at 18:36, Dave Martin <dave.mar...@arm.com> wrote:
> > 
> > On Tue, Feb 02, 2021 at 07:52:54PM +0100, Daniel Kiss wrote:
> >> CPUs that support SVE are architecturally required to support the
> >> Virtualization Host Extensions (VHE); so far the kernel has supported
> >> SVE alongside KVM only with VHE enabled. In some cases it is desirable
> >> to run an nVHE config even when VHE is available.
> >> This patch adds support for SVE in the nVHE configuration too.
> >> 
> >> Tested on FVP with a Linux guest VM that ran with a different VL than
> >> the host system.
> >> 
> >> Signed-off-by: Daniel Kiss <daniel.k...@arm.com>

[...]

> >> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> >> index 3e081d556e81..8f29b468e989 100644
> >> --- a/arch/arm64/kvm/fpsimd.c
> >> +++ b/arch/arm64/kvm/fpsimd.c
> >> @@ -42,6 +42,16 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
> >>    if (ret)
> >>            goto error;
> >> 
> >> +  if (!has_vhe() && vcpu->arch.sve_state) {
> >> +          void *sve_state_end = vcpu->arch.sve_state +
> >> +                                      SVE_SIG_REGS_SIZE(
> >> +                                          sve_vq_from_vl(vcpu->arch.sve_max_vl));
> >> +          ret = create_hyp_mappings(vcpu->arch.sve_state,
> >> +                                    sve_state_end,
> >> +                                    PAGE_HYP);
> >> +          if (ret)
> >> +                  goto error;
> >> +  }
> >>    vcpu->arch.host_thread_info = kern_hyp_va(ti);
> >>    vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
> >> error:
> >> @@ -109,10 +119,22 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >>    local_irq_save(flags);
> >> 
> >>    if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> >> +          if (guest_has_sve) {
> >> +                  if (has_vhe())
> >> +                          __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
> >> +                  else {
> >> +                          __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);
> >> +                          /*
> >> +                           * The vcpu could set ZCR_EL1 to a shorter VL than the
> >> +                           * max VL, but the context is still valid there, so save
> >> +                           * the whole context.  In the nVHE case we need to reset
> >> +                           * ZCR_EL1 for that, because the save will be done in EL1.
> >> +                           */
> >> +                          write_sysreg_s(sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> +                                         SYS_ZCR_EL1);
> > 
> > This still doesn't feel right.  We're already in EL1 here I think, in
> > which case ZCR_EL1 has an immediate effect on what state the
> > architecture guarantees to preserve: if we need to change ZCR_EL1, it's
> > because it might be wrong.  If it's wrong, it might be too small.  And
> > if it's too small, we may have already lost some SVE register bits that
> > the guest cares about.
> "On taking an exception from an Exception level that is more constrained
>  to a target Exception level that is less constrained, or on writing a larger 
> value
>  to ZCR_ELx.LEN, then the previously inaccessible bits of these registers 
> that 
>  become accessible have a value of either zero or the value they had before
>  executing at the more constrained size.”
> If the CPU zeros the register when ZCR is written or exception is taken my 
> reading
>  of the above is that the register content maybe lost when we land in EL2.
> No code shall not count on the register content after reduces the VL in ZCR.
> 
> I see my comment was also not clear enough.
> Maybe we shouldn't save up to the guest's sve_max_vl here; it would be
> enough to save up to the actual VL.

Maybe you're right, but I may be missing some information here.

Can you sketch out more explicitly how it works, showing how all the
bits the host cares about (and only those bits) are saved/restored for
the host, and how all the bits the guest cares about (and only those
bits) are saved/restored for the guest?


Various optimisations are possible, but there is a risk of breaking
assumptions elsewhere.  For example, the KVM_{SET,GET}_ONE_REG code
makes assumptions about the layout of the data in
vcpu->arch.sve_state.
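
For reference, that layout is tied to sve_max_vl roughly like this
(sketch only, using the signal frame macros from
<uapi/asm/sigcontext.h>; "n" is just a placeholder register index):

	unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);

	/* Where Zn lives inside vcpu->arch.sve_state: */
	void *zreg = vcpu->arch.sve_state +
		     (SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET);

	/* Where Pn lives inside vcpu->arch.sve_state: */
	void *preg = vcpu->arch.sve_state +
		     (SVE_SIG_PREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET);

Save the registers at a smaller vq and these offsets no longer match
what userspace reads back.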

The architectural rules about when SVE register bits are preserved are
also complex, with many interactions.  We also don't want to aggressively
optimise in a way that might be hard to apply to nested virt.


My instinct is to keep it simple while this patch matures, and continue
to save/restore based on vcpu->arch.sve_max_vl.  This keeps a clear
split between the emulated "hardware" (which doesn't change while the
guest runs), and changeable runtime state (like the guest's ZCR_EL1).
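
In other words, something like this for the nVHE case (untested
sketch; sve_save_guest_regs() is a made-up stand-in for the real
low-level save code):

	static void kvm_save_guest_sve(struct kvm_vcpu *vcpu)
	{
		unsigned int vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);

		/* Changeable runtime state: whatever LEN the guest set. */
		__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);

		/* Emulated "hardware": always the full sve_max_vl context. */
		sve_save_guest_regs(vcpu->arch.sve_state, vq); /* hypothetical */
	}

The guest's ZCR_EL1 never influences how much data we copy, so the
sve_state layout stays stable for the lifetime of the vcpu.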

I'm happy to review proposed optimisations, but I think those should be
separated out as later patches (or a separate series).  My experience
is that it's much easier to get this wrong than to get it right!

When it's wrong, it can be a nightmare to debug.


> > I think that we need to handle this on our way out of hyp instead,
> > _before_ returning to EL1.
> > 
> > When at EL2 exiting back to the host: if and only if
> > KVM_ARM64_FP_ENABLED is set then save the guest's ZCR_EL1 and reset
> > ZCR_EL1 to match the guest's sve_max_vl (possibly by just cloning
> > ZCR_EL2).
> > 
> >> +                  }
> >> +          }
> >>            fpsimd_save_and_flush_cpu_state();
> >> -
> >> -          if (guest_has_sve)
> >> -                  __vcpu_sys_reg(vcpu, ZCR_EL1) = 
> >> read_sysreg_s(SYS_ZCR_EL12);
> >>    } else if (host_has_sve) {
> >>            /*
> >>             * The FPSIMD/SVE state in the CPU has not been touched, and we

[...]

> >> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> >> index f3d0e9eca56c..df9e912d1278 100644
> >> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> >> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> >> @@ -45,6 +45,18 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
> >>    if (!update_fp_enabled(vcpu)) {
> >>            val |= CPTR_EL2_TFP;
> >>            __activate_traps_fpsimd32(vcpu);
> >> +  } else {
> >> +          if (vcpu_has_sve(vcpu)) {
> >> +                  /*
> >> +                   * The register access will not be trapped so restore
> >> +                   * ZCR_EL1/ZCR_EL2 because those were set for the host.
> >> +                   */
> >> +                  write_sysreg_s(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR_EL1);
> >> +                  write_sysreg_s(
> >> +                          sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> +                          SYS_ZCR_EL2);
> >> +                  val &= ~CPTR_EL2_TZ;
> >> +          }
> >>    }
> >> 
> >>    write_sysreg(val, cptr_el2);
> >> @@ -110,6 +122,17 @@ static void __load_host_stage2(void)
> >>    write_sysreg(0, vttbr_el2);
> >> }
> >> 
> >> +static void __restore_host_sve_state(struct kvm_vcpu *vcpu)
> >> +{
> >> +  /*
> >> +   * If the guest uses SVE, ZCR_EL2 was configured for the guest.
> >> +   * The host might save the context in EL1, but for that ZCR_EL2
> >> +   * needs to be reset to the host's default.
> > 
> > This isn't just about saving the guest context correctly.  The host may
> > have been using larger vectors than the guest's sve_max_vl allows.
> Right.

Can you clarify the comment please (unless you've done so already)?
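
Something along these lines, perhaps:

	/*
	 * ZCR_EL2 was set up for the guest's sve_max_vl, so the host may
	 * currently be constrained to a shorter VL than it was using.
	 * Reset ZCR_EL2 to the host's default (all LEN bits set, matching
	 * the el2_setup.h init) so that the host's EL1 state, and any EL1
	 * save of the guest context, can use the full vector length again.
	 */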

> > 
> >> +   */
> >> +  if (vcpu_has_sve(vcpu) && (vcpu->arch.flags |= KVM_ARM64_FP_ENABLED))
> >> +          write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> > 
> > I'm not sure if it's worth having a special #define for this, but it
> > would be a good idea at least to put comments here and in el2_setup.h to
> > remind people that the ZCR_EL2 settings need to match.  Otherwise this
> > might get mis-maintained in the future.
> I will add a comment to el2_setup.h

Can you put comments in both places please?  Maintainers of either bit
of code need to be aware of the other code.
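
For example (untested wording, just to show the cross-reference):

	/* In el2_setup.h, where ZCR_EL2.LEN is initialised: */
	/*
	 * __restore_host_sve_state() in kvm/hyp/nvhe/switch.c relies on
	 * writing ZCR_ELx_LEN_MASK recreating exactly this setup.
	 */

	/* And at the write_sysreg_s() in __restore_host_sve_state(): */
	/* This must match the ZCR_EL2 setup in el2_setup.h. */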

[...]

> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> >> index 47f3f035f3ea..17cc5e87adcd 100644
> >> --- a/arch/arm64/kvm/reset.c
> >> +++ b/arch/arm64/kvm/reset.c
> >> @@ -74,10 +74,6 @@ static int kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu)
> >>    if (!system_supports_sve())
> >>            return -EINVAL;
> >> 
> >> -  /* Verify that KVM startup enforced this when SVE was detected: */
> >> -  if (WARN_ON(!has_vhe()))
> >> -          return -EINVAL;
> >> -
> >>    vcpu->arch.sve_max_vl = kvm_sve_max_vl;
> >> 
> >>    /*
> >> @@ -113,7 +109,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
> >>    buf = kzalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)), GFP_KERNEL);
> >>    if (!buf)
> >>            return -ENOMEM;
> >> -
> >> +  __vcpu_sys_reg(vcpu, ZCR_EL1) = sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1;
> > 
> > What's this for?
> > 
> > __vcpu_sys_reg(vcpu, ZCR_EL1) should already be being reset sensibly
> > somewhere.  If not, that would be a bug…
> It is indeed not needed.

Ah, OK.


Cheers
---Dave
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
