On Tue, 23 Dec 2025 at 01:22, Mark Brown <[email protected]> wrote: > > SME, the Scalable Matrix Extension, is an arm64 extension which adds > support for matrix operations, with core concepts patterned after SVE. > > SVE introduced some complication in the ABI since it adds new vector > floating point registers with runtime configurable size, the size being > controlled by a parameter called the vector length (VL). To provide control > of this to VMMs we offer two phase configuration of SVE, SVE must first be > enabled for the vCPU with KVM_ARM_VCPU_INIT(KVM_ARM_VCPU_SVE), after which > vector length may then be configured but the configurably sized floating > point registers are inaccessible until finalized with a call to > KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE) after which the configurably sized > registers can be accessed.
s/configurably/configurable > > SME introduces an additional independent configurable vector length > which as well as controlling the size of the new ZA register also > provides an alternative view of the configurably sized SVE registers > (known as streaming mode) with the guest able to switch between the two > modes as it pleases. There is also a fixed sized register ZT0 > introduced in SME2. As well as streaming mode the guest may enable and > disable ZA and (where SME2 is available) ZT0 dynamically independently > of streaming mode. These modes are controlled via the system register > SVCR. > > We handle the configuration of the vector length for SME in a similar > manner to SVE, requiring initialization and finalization of the feature > with a pseudo register controlling the available SME vector lengths as for > SVE. Further, if the guest has both SVE and SME then finalizing one > prevents further configuration of the vector length for the other. > > Where both SVE and SME are configured for the guest we always present > the SVE registers to userspace as having the larger of the configured > maximum SVE and SME vector lengths, discarding extra data at load time Looking at the code with the whole patch series applied, I don't think this is correct, rather, it depends on the active vector length. That said, it's not what the documentation below says, so it's only an issue in the commit message. > and zero padding on read as required if the active vector length is > lower. Note that this means that enabling or disabling streaming mode > while the guest is stopped will not zero Zn or Pn as it will when the > guest is running, but it does allow SVCR, Zn and Pn to be read and > written in any order. > > Userspace access to ZA and (if configured) ZT0 is always available, they > will be zeroed when the guest runs if disabled in SVCR and the value > read will be zero if the guest stops with them disabled. This mirrors > the behaviour of the architecture, enabling access causes ZA and ZT0 to > be zeroed, while allowing access to SVCR, ZA and ZT0 to be performed in > any order. > > Signed-off-by: Mark Brown <[email protected]> > --- > Documentation/virt/kvm/api.rst | 120 > +++++++++++++++++++++++++++++------------ > 1 file changed, 86 insertions(+), 34 deletions(-) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index 01a3abef8abb..e024b9783932 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -406,7 +406,7 @@ Errors: > instructions from device memory (arm64) > ENOSYS data abort outside memslots with no syndrome info and > KVM_CAP_ARM_NISV_TO_USER not enabled (arm64) > - EPERM SVE feature set but not finalized (arm64) > + EPERM SVE or SME feature set but not finalized (arm64) > ======= ============================================================== > > This ioctl is used to run a guest virtual cpu. While there are no > @@ -2606,11 +2606,11 @@ Specifically: > ======================= ========= ===== > ======================================= > > .. [1] These encodings are not accepted for SVE-enabled vcpus. See > - :ref:`KVM_ARM_VCPU_INIT`. > + :ref:`KVM_ARM_VCPU_INIT`. They are also not accepted when SME is > + enabled without SVE and the vcpu is in streaming mode. > > The equivalent register content can be accessed via bits [127:0] of > - the corresponding SVE Zn registers instead for vcpus that have SVE > - enabled (see below). > + the corresponding SVE Zn registers in these cases (see below). > > arm64 CCSIDR registers are demultiplexed by CSSELR value:: > > @@ -2641,24 +2641,39 @@ arm64 SVE registers have the following bit patterns:: > 0x6050 0000 0015 060 <slice:5> FFR bits[256*slice + 255 : 256*slice] > 0x6060 0000 0015 ffff KVM_REG_ARM64_SVE_VLS pseudo-register > > -Access to register IDs where 2048 * slice >= 128 * max_vq will fail with > -ENOENT. max_vq is the vcpu's maximum supported vector length in 128-bit > -quadwords: see [2]_ below. > +arm64 SME registers have the following bit patterns: > + > + 0x6080 0000 0017 00 <n:5> <slice:5> ZA.H[n] bits[2048*slice + 2047 : > 2048*slice] > + 0x6060 0000 0017 0100 ZT0 > + 0x6060 0000 0017 fffe KVM_REG_ARM64_SME_VLS pseudo-register > + > +Access to Z, P, FFR or ZA register IDs where 2048 * slice >= 128 * > +max_vq will fail with ENOENT. max_vq is the vcpu's current maximum > +supported vector length in 128-bit quadwords: see [2]_ below. > + > +Changing the value of SVCR.SM will result in the contents of > +the Z, P and FFR registers being reset to 0. When restoring the > +values of these registers for a VM with SME support it is > +important that SVCR.SM be configured first. > + > +Access to the ZA and ZT0 registers is only available if SVCR.ZA is set > +to 1. > > These registers are only accessible on vcpus for which SVE is enabled. > See KVM_ARM_VCPU_INIT for details. > > -In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not > -accessible until the vcpu's SVE configuration has been finalized > -using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE). See KVM_ARM_VCPU_INIT > -and KVM_ARM_VCPU_FINALIZE for more information about this procedure. > +In addition, except for KVM_REG_ARM64_SVE_VLS and > +KVM_REG_ARM64_SME_VLS, these registers are not accessible until the > +vcpu's SVE and SME configuration has been finalized using > +KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC). See KVM_ARM_VCPU_INIT and > +KVM_ARM_VCPU_FINALIZE for more information about this procedure. > > -KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector > -lengths supported by the vcpu to be discovered and configured by > -userspace. When transferred to or from user memory via KVM_GET_ONE_REG > -or KVM_SET_ONE_REG, the value of this register is of type > -__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as > -follows:: > +KVM_REG_ARM64_SVE_VLS and KVM_ARM64_VCPU_SME_VLS are pseudo-registers KVM_ARM64_VCPU_SME_VLS -> KVM_REG_ARM64_SME_VLS With this and the commit message fixed: Reviewed-by: Fuad Tabba <[email protected]> Cheers, /fuad > +that allows the set of vector lengths supported by the vcpu to be > +discovered and configured by userspace. When transferred to or from > +user memory via KVM_GET_ONE_REG or KVM_SET_ONE_REG, the value of this > +register is of type __u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the > +set of vector lengths as follows:: > > __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS]; > > @@ -2670,19 +2685,25 @@ follows:: > /* Vector length vq * 16 bytes not supported */ > > .. [2] The maximum value vq for which the above condition is true is > - max_vq. This is the maximum vector length available to the guest on > - this vcpu, and determines which register slices are visible through > - this ioctl interface. > + max_vq. This is the maximum vector length currently available to > + the guest on this vcpu, and determines which register slices are > + visible through this ioctl interface. > + > + If SME is supported then the max_vq used for the Z and P registers > + while SVCR.SM is 1 this vector length will be the maximum SME > + vector length max_vq_sme available for the guest, otherwise it > + will be the maximum SVE vector length max_vq_sve available. > > (See Documentation/arch/arm64/sve.rst for an explanation of the "vq" > nomenclature.) > > -KVM_REG_ARM64_SVE_VLS is only accessible after KVM_ARM_VCPU_INIT. > -KVM_ARM_VCPU_INIT initialises it to the best set of vector lengths that > -the host supports. > +KVM_REG_ARM64_SVE_VLS and KVM_REG_ARM_SME_VLS are only accessible > +after KVM_ARM_VCPU_INIT. KVM_ARM_VCPU_INIT initialises them to the > +best set of vector lengths that the host supports. > > -Userspace may subsequently modify it if desired until the vcpu's SVE > -configuration is finalized using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE). > +Userspace may subsequently modify these registers if desired until the > +vcpu's SVE and SME configuration is finalized using > +KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC). > > Apart from simply removing all vector lengths from the host set that > exceed some value, support for arbitrarily chosen sets of vector lengths > @@ -2690,8 +2711,8 @@ is hardware-dependent and may not be available. > Attempting to configure > an invalid set of vector lengths via KVM_SET_ONE_REG will fail with > EINVAL. > > -After the vcpu's SVE configuration is finalized, further attempts to > -write this register will fail with EPERM. > +After the vcpu's SVE or SME configuration is finalized, further > +attempts to write these registers will fail with EPERM. > > arm64 bitmap feature firmware pseudo-registers have the following bit > pattern:: > > @@ -3490,6 +3511,7 @@ The initial values are defined as: > - General Purpose registers, including PC and SP: set to 0 > - FPSIMD/NEON registers: set to 0 > - SVE registers: set to 0 > + - SME registers: set to 0 > - System registers: Reset to their architecturally defined > values as for a warm reset to EL1 (resp. SVC) or EL2 (in the > case of EL2 being enabled). > @@ -3533,7 +3555,7 @@ Possible features: > > - KVM_ARM_VCPU_SVE: Enables SVE for the CPU (arm64 only). > Depends on KVM_CAP_ARM_SVE. > - Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE): > + Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC): > > * After KVM_ARM_VCPU_INIT: > > @@ -3541,7 +3563,7 @@ Possible features: > initial value of this pseudo-register indicates the best set > of > vector lengths possible for a vcpu on this host. > > - * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE): > + * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC): > > - KVM_RUN and KVM_GET_REG_LIST are not available; > > @@ -3554,11 +3576,40 @@ Possible features: > KVM_SET_ONE_REG, to modify the set of vector lengths available > for the vcpu. > > - * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE): > + * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC): > > - the KVM_REG_ARM64_SVE_VLS pseudo-register is immutable, and > can > no longer be written using KVM_SET_ONE_REG. > > + - KVM_ARM_VCPU_SME: Enables SME for the CPU (arm64 only). > + Depends on KVM_CAP_ARM_SME. > + Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC): > + > + * After KVM_ARM_VCPU_INIT: > + > + - KVM_REG_ARM64_SME_VLS may be read using KVM_GET_ONE_REG: the > + initial value of this pseudo-register indicates the best set > of > + vector lengths possible for a vcpu on this host. > + > + * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC): > + > + - KVM_RUN and KVM_GET_REG_LIST are not available; > + > + - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access > + the scalable architectural SVE registers > + KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or > + KVM_REG_ARM64_SVE_FFR, the matrix register > + KVM_REG_ARM64_SME_ZA() or the LUT register KVM_REG_ARM64_ZT(); > + > + - KVM_REG_ARM64_SME_VLS may optionally be written using > + KVM_SET_ONE_REG, to modify the set of vector lengths available > + for the vcpu. > + > + * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_VEC): > + > + - the KVM_REG_ARM64_SME_VLS pseudo-register is immutable, and > can > + no longer be written using KVM_SET_ONE_REG. > + > - KVM_ARM_VCPU_HAS_EL2: Enable Nested Virtualisation support, > booting the guest from EL2 instead of EL1. > Depends on KVM_CAP_ARM_EL2. > @@ -5143,11 +5194,12 @@ Errors: > > Recognised values for feature: > > - ===== =========================================== > - arm64 KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE) > - ===== =========================================== > + ===== ============================================================== > + arm64 KVM_ARM_VCPU_VEC (requires KVM_CAP_ARM_SVE or KVM_CAP_ARM_SME) > + arm64 KVM_ARM_VCPU_SVE (alias for KVM_ARM_VCPU_VEC) > + ===== ============================================================== > > -Finalizes the configuration of the specified vcpu feature. > +Finalizes the configuration of the specified vcpu features. > > The vcpu must already have been initialised, enabling the affected feature, > by > means of a successful :ref:`KVM_ARM_VCPU_INIT <KVM_ARM_VCPU_INIT>` call with > the > > -- > 2.47.3 >
