On Mon, May 13, 2019 at 02:55:01PM +0100, Andrew Jones wrote:
> On Mon, May 13, 2019 at 01:31:11PM +0100, Dave Martin wrote:
> > On Sun, May 12, 2019 at 09:36:16AM +0100, Andrew Jones wrote:
> > > These are the SVE equivalents to kvm_arch_get/put_fpsimd.
> > >
> > > Signed-off-by: Andrew Jones <drjo...@redhat.com>
> > > ---
> > >  target/arm/kvm64.c | 127 +++++++++++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 123 insertions(+), 4 deletions(-)
> >
> > [...]
> >
> > > +static int kvm_arch_put_sve(CPUState *cs)
> > > +{
> > > +    ARMCPU *cpu = ARM_CPU(cs);
> > > +    CPUARMState *env = &cpu->env;
> > > +    struct kvm_one_reg reg;
> > > +    int n, ret;
> > > +
> > > +    for (n = 0; n < KVM_ARM64_SVE_NUM_ZREGS; n++) {
> > > +        uint64_t *q = aa64_vfp_qreg(env, n);
> > > +#ifdef HOST_WORDS_BIGENDIAN
> > > +        uint64_t d[ARM_MAX_VQ * 2];
> > > +        int i;
> > > +        for (i = 0; i < cpu->sve_max_vq * 2; i++) {
> > > +            d[i] = q[cpu->sve_max_vq * 2 - 1 - i];
> > > +        }
> >
> > Out of interest, why do all this swabbing?  It seems expensive.
> >
>
> QEMU keeps its 128-bit and larger words in the same order (least
> significant word first) for both host endian types. We need to
> do word swapping every time we set/get them to/from KVM.

I'm not sure whether this is appropriate here, though it depends on
what QEMU does with the data.

Something non-obvious to be aware of:

As exposed through the signal frame and the KVM ABI, the memory
representation of an SVE reg is invariant with respect to the
endianness.

IIUC, the byte order seen for a V-reg in KVM_REG_ARM_CORE and for the
equivalent Z-reg in KVM_REG_ARM64_SVE would be the opposite of each
other on BE, but the same on LE.

This is a feature of the architecture: a V-reg can be stored as a
single value, but Z-regs are in general too big to be treated as a
single value: they are always treated as a sequence of elements, and
the largest element size supported is 64 bits, not 128.

IIUC, there is no direct native way to store with 128-bit swabbing:
some explicit data processing operation would also be needed to swap
adjacent 64-bit elements in the vector around the store/load.

This is not specified in the ABI documentation -- I should address
that.

If this is infeasible for KVM to work with, we could perhaps change
it, but I'm not too keen on that at this stage.

KVM_REG_ARM64_SVE_VLS has a similar behaviour: it's a vector of 64-bit
possibly-swabbed words, not a single possibly-swabbed 512-bit word.

Looking at the kernel, I may have screwed up in places where the two
representations interact, like fpsimd_to_sve().  I should take a look
at that...  This doesn't affect the KVM ABI though.

Cheers
---Dave
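
To make the layout difference above concrete, here is a minimal,
host-independent sketch in C.  It assumes (going by the "same on LE"
remark) that the invariant Z-reg layout puts the least significant
byte of each element first, and that a V-reg on a BE host is stored as
one 128-bit value with the most significant byte first.  All helper
names are hypothetical; none of this is kernel or QEMU code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not kernel or QEMU functions. */

/* Store v with its least significant byte at p[0]. */
static void store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Store v with its most significant byte at p[0]. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Z-reg style (assumed): 64-bit elements, lowest element first,
 * each element in a fixed, host-independent byte order. */
static void store_zreg_style(uint8_t out[16], uint64_t lo, uint64_t hi)
{
    store_le64(out, lo);
    store_le64(out + 8, hi);
}

/* V-reg style on a big-endian host (assumed): one 128-bit value,
 * most significant byte first. */
static void store_vreg_style_be(uint8_t out[16], uint64_t lo, uint64_t hi)
{
    store_be64(out, hi);
    store_be64(out + 8, lo);
}

/* Returns 1 if a[] equals b[] with all 16 bytes reversed. */
static int is_byte_reversed(const uint8_t a[16], const uint8_t b[16])
{
    for (int i = 0; i < 16; i++) {
        if (a[i] != b[15 - i]) {
            return 0;
        }
    }
    return 1;
}

static void check_layouts(void)
{
    const uint64_t lo = 0x0706050403020100ULL;
    const uint64_t hi = 0x0f0e0d0c0b0a0908ULL;
    uint8_t z[16], v[16], swapped[16];

    store_zreg_style(z, lo, hi);
    store_vreg_style_be(v, lo, hi);

    /* The two images are byte-for-byte opposite of each other. */
    assert(is_byte_reversed(z, v));

    /* Swapping only the two 64-bit halves of the V-reg image does
     * not produce the Z-reg image: the bytes within each half are
     * still in the wrong order. */
    memcpy(swapped, v + 8, 8);
    memcpy(swapped + 8, v, 8);
    assert(memcmp(swapped, z, 16) != 0);
}
```

Under those assumptions, reversing the order of the 64-bit words is
not sufficient on its own to convert between the two representations
on a BE host; the bytes within each 64-bit element would also need
swapping.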