SVE ACLE (intrinsics) are supported on LLVM/Clang 11 onwards.

-Miguel
________________________________
From: Rasmus Munk Larsen <[email protected]>
Sent: Tuesday, June 23, 2020 11:49 PM
To: eigen <[email protected]>
Cc: Miguel Tairum-Cruz <[email protected]>
Subject: Re: [eigen] Eigen Arm SVE backend RFC

Yes, clang in particular is important. Are SVE intrinsics supported?

Rasmus

On Tue, Jun 23, 2020 at 10:32 AM David Tellenbach 
<[email protected]> wrote:
Hi Rasmus,

The naming should be OK, but could a fixed-length version of this be made to 
work with older compilers? Eigen is deployed on a large number of platforms, 
and depending on GCC 10 would mean missing out on support on many of them. I 
could be wrong, but I suspect that for Eigen the main benefit is not so much 
the variable length aspect, but rather having _some_ long vector extension on 
newer Arm CPUs.

Old compilers do not support SVE intrinsics, so they would not be able to 
compile the proposed backend in any case. I agree that we should try to find a 
solution that works for all compilers with SVE support.

Cheers,
David

On 23. Jun 2020, at 19:22, Rasmus Munk Larsen 
<[email protected]> wrote:



On Tue, Jun 23, 2020 at 9:09 AM Miguel Tairum-Cruz 
<[email protected]> wrote:
Hi Rasmus,

Thank you for your feedback.

 Could we make the vector length a build config macro without a lot of code 
duplication for different lengths?
GCC 10 support for fixed SVE sizes could be used in this situation, by checking 
the SVE size in the SVE PacketMath code (e.g. #if __ARM_FEATURE_SVE_BITS == 512 
…).
However, the Packet names would be less descriptive, e.g.: 'PacketSVE' for any 
vector length instead of 'Packet16' for 512b vectors or 'Packet4' for 128b 
vectors. This should not be an issue, as far as I can tell, as the packets 
would still have the correct size.

The naming should be OK, but could a fixed-length version of this be made to 
work with older compilers? Eigen is deployed on a large number of platforms, 
and depending on GCC 10 would mean missing out on support on many of them. I 
could be wrong, but I suspect that for Eigen the main benefit is not so much 
the variable length aspect, but rather having _some_ long vector extension on 
newer Arm CPUs.


We will work on a merge request with these changes in mind. Any implementation 
suggestions or recommendations on this are welcome.

Best regards,
Miguel

________________________________
From: Rasmus Munk Larsen <[email protected]>
Sent: Monday, June 22, 2020 11:20 PM
To: eigen <[email protected]>; Miguel Tairum-Cruz <[email protected]>
Subject: Re: [eigen] Eigen Arm SVE backend RFC

+Miguel directly.

On Mon, Jun 22, 2020 at 3:15 PM Rasmus Munk Larsen 
<[email protected]> wrote:
Miguel,

Thank you very much for the RFC. I think that support for Arm SVE would be a 
useful addition to Eigen. As you mention, doing it with fixed-sized vectors 
will probably be necessary to match the existing Eigen architecture. Could we 
make the vector length a build config macro without a lot of code duplication 
for different lengths?

 Could I ask your team to submit this as a merge request against head on the 
main branch for easier review and testing?

Best regards,
   Rasmus

On Wed, Jun 17, 2020 at 2:48 AM Miguel Tairum-Cruz 
<[email protected]> wrote:
Hi all,


I would like to present to the Eigen community a Request for Comments (RFC) for 
a new proof-of-concept vector backend based on the Arm Scalable Vector 
Extension (SVE) architecture.

With Eigen being widely used across multiple projects such as TensorFlow, we 
believe that adding support for this new vector length (VL) agnostic 
architecture will benefit performance on upcoming Arm micro-architectures and 
systems.

This proof-of-concept SVE backend keeps in line with the existing vector 
backends, using the Arm C Language Extensions (ACLE) for SVE to optimize 
Eigen’s functions.
Using the NEON backend as a starting point, we have ported most of the NEON 
functions to SVE. Please be aware that this work is built upon a version of 
Eigen from December 2019 / January 2020. Upstream commits made to the NEON 
backend since then are not yet included in this version.

The introduced changes are provided in the form of patch files, specifically 
for two SVE vector lengths: 128-bit and 512-bit. You can find more information 
on how to apply them in the provided README file.

One caveat of this initial version is the requirement for fixed SVE vector 
lengths. The Eigen codebase and its vector optimizations are not fully 
compatible with the vector-length agnostic data types that SVE introduces, 
which is a barrier to full upstream support. Optimizing the SVE backend for 
specific VLs (in this case 128-bit and 512-bit) is a necessary workaround for 
this initial proof-of-concept.

An additional goal of this work is to integrate the Eigen SVE backend with 
TensorFlow. So far, due to the caveats stated above, we have not been able to 
integrate TensorFlow with Eigen SVE. However, the recent release of GCC 10.1 
brings a new feature to enable fixed vector sizes at compile time, which we 
believe will allow building TensorFlow with the proof-of-concept fixed-VL SVE 
implementation of Eigen.

Below is the formal RFC document, where we detail the design choices and 
discuss drawbacks and potential solutions to enable a complete implementation 
of an SVE backend for Eigen.



Regards,

Miguel


--------


Eigen Arm SVE backend RFC

- Authors: Miguel Tairum ([email protected])
- Updated: 2020-05-15

Summary

The purpose of this RFC is to share an experimental proof-of-concept Arm 
Scalable Vector Extension (SVE) backend for Eigen and to gather feedback and 
ideas from the Eigen development community on how to properly integrate 
scalable vectors into the Eigen library codebase.

More information on how to apply the RFC patch can be found in the README file.

Motivation

SVE<https://developer.arm.com/docs/101726/latest/explore-the-scalable-vector-extension-sve/what-is-the-scalable-vector-extension>
 is the next-generation SIMD architectural extension to the Armv8 architecture, 
introducing scalable vector length, per-lane predication, gather-loads and 
scatter-stores, among other features.

Eigen is a mature linear algebra library that supports many vector 
architectures, including Arm NEON, and is used in multiple projects, including 
TensorFlow. We believe that supporting SVE could not only improve compatibility 
with future micro-architectures, but also enable better performance.

Guide-level explanation

In this initial assessment, we present a proof-of-concept SVE port of the 
PacketMath backend in Eigen, using the Arm C Language Extensions (ACLE). As in 
the existing vector backends, the SVE intrinsics are used in Eigen's 
PacketMath, MathFunctions and TypeCasting source files. In this initial 
release, complex math is not available (due to time constraints).

This proof-of-concept release provides a "fixed-sized" SVE backend, with vector 
lengths of 128 and 512 bits. This means that the implemented functions are 
validated only when executed on those specific SVE lengths, as optimizations 
were only made for them. To facilitate this, we provide a patch file for each 
VL. All currently implemented NEON functions except for the Complex math 
(Complex.h) are included in the SVE backend. This is up to date with commit 
312c8e77<https://gitlab.com/libeigen/eigen/-/commit/312c8e77ff653d718cf4b318c9633d4b45bb725f>
 from December 2019, plus the changes introduced to the NEON backend up until 
commit 
da5a7afe<https://gitlab.com/libeigen/eigen/-/commit/da5a7afed056596b089a4241b62a7e17f2c43119>
 from 10 January 2020 (these are included in the patch files). This commit 
was chosen to be compatible with TensorFlow 1.x, which uses a similar version 
of Eigen, plus any NEON updates at the time of this work. This initial release 
also contains an updated PacketMath test, with SVE validation.

Reference-level explanation



The changes presented in this RFC are based on commit 
312c8e77<https://gitlab.com/libeigen/eigen/-/commit/312c8e77ff653d718cf4b318c9633d4b45bb725f>
 in the master branch.

The Eigen SVE backend can be found at Eigen/src/Core/arch/SVE.
SVE intrinsics are implemented for float, int and double elements. 
Similar to the NEON backend at this time, half packets are not implemented. 
Therefore, the available packet sizes for 512-bit VL are: 16 elements for 
int/float, 8 elements for double; and for 128-bit VL are: 4 elements for 
int/float, 2 elements for double.
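
The packet sizes above follow from dividing the vector length by the element 
width. A minimal sketch of that arithmetic, assuming 32-bit int/float and 
64-bit double; sketch_lanes is a hypothetical helper for illustration and is 
not part of Eigen or the patches:

```cpp
#include <cassert>
#include <cstddef>

// Lanes per packet = vector length in bits / element width in bits.
// sketch_lanes is a hypothetical helper, not an Eigen function.
template <typename Scalar>
constexpr std::size_t sketch_lanes(std::size_t vl_bits) {
  return vl_bits / (8 * sizeof(Scalar));
}

// 512-bit VL: 16 lanes for int/float, 8 for double
// (assumes 32-bit int/float and 64-bit double).
static_assert(sketch_lanes<float>(512) == 16, "512-bit float packet");
static_assert(sketch_lanes<int>(512) == 16, "512-bit int packet");
static_assert(sketch_lanes<double>(512) == 8, "512-bit double packet");

// 128-bit VL: 4 lanes for int/float, 2 for double.
static_assert(sketch_lanes<float>(128) == 4, "128-bit float packet");
static_assert(sketch_lanes<int>(128) == 4, "128-bit int packet");
static_assert(sketch_lanes<double>(128) == 2, "128-bit double packet");
```

The same formula would give the packet sizes for any other vector length that 
is a multiple of 128 bits.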

For most functions, SVE intrinsics are analogous to the ones used in the NEON 
backend. More complex functions have comments that explain the logic behind 
their implementation.

Regarding the ptranspose function, the PacketBlock structure was duplicated and 
modified into PacketBlockSVE, a new structure of SVE vector pointers. This 
structure is in Eigen/src/Core/GenericPacketMath.h. This is required to support 
the vector-length agnostic data types introduced in SVE. Since these data types 
do not have a fixed size at compile time, they cannot be stored as array 
elements, and thus pointers are needed.
The included SVE PacketMath tests (available in /test/packetmath.cc and 
/test/packetmath_sve_resnet.c) make use of this new structure to validate the 
transpose function.
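
To illustrate the difference between the two structures, here is a minimal 
sketch. SketchPacketBlock mirrors the array-of-packets layout of Eigen's 
PacketBlock; SketchPacketBlockPtr is only a guess at the general shape of a 
pointer-based block and is not the PacketBlockSVE definition from the patches. 
SketchPacket4f and sketch_swap are hypothetical stand-ins so that the snippet 
builds on any host:

```cpp
#include <cassert>

// Stand-in for a fixed-size SIMD register so the sketch builds on any host.
struct SketchPacket4f {
  float lanes[4];
};

// Array-based block, modelled on Eigen's PacketBlock: packets are stored by
// value, so Packet must have a size known at compile time.
template <typename Packet, int N>
struct SketchPacketBlock {
  Packet packet[N];
};

// Pointer-based variant (hypothetical layout): each entry points at a packet
// owned by the caller, so the block never stores a Packet by value.
template <typename Packet, int N>
struct SketchPacketBlockPtr {
  Packet* packet[N];
};

// Swap two packets through the pointer block; the pointers themselves stay
// unchanged, only the pointed-to packets are exchanged.
template <typename Packet, int N>
void sketch_swap(const SketchPacketBlockPtr<Packet, N>& block, int i, int j) {
  Packet tmp = *block.packet[i];
  *block.packet[i] = *block.packet[j];
  *block.packet[j] = tmp;
}
```

With a sizeless SVE type as Packet, the first template could not be 
instantiated, while the second only needs pointers to packets that live 
elsewhere (for example in local variables).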

Outside of PacketMath and the previously mentioned locations, other small SVE 
modifications were made wherever a NEON implementation was present in the code. 
Additionally, the cmake files were modified to accommodate the new backend.

Drawbacks and future possibilities

The initial release demonstrates a proof of concept for an SVE backend with 
128-bit and 512-bit vector lengths. Although it can be compiled for SVE 
architectures with different vector lengths, some functions will not validate, 
as they were tuned for these specific VLs.

One of the main features of SVE, Vector Length Agnosticism (VLA), is not fully 
supported by Eigen, which relies on fixed vector sizes to better exploit vector 
performance. SVE vectors have sizeless types, identified by the size of their 
elements, independently of the maximum vector length set. As such, some 
structures in Eigen's backend are not compatible with these types, like 
PacketBlock, a structure containing an array of Packets. This structure is then 
used in other parts of the project (e.g. the transpose function), which require 
a workaround to support these data types.

Work still needs to be done to either abstract the vector length in function 
optimization, or to consider all possible SVE vector lengths and optimize 
accordingly. In order to fully integrate a vector length agnostic SVE backend 
with Eigen, changes to Eigen's core are also required. The aforementioned 
PacketBlock is one of them, but the code needs to be revised in order to 
seamlessly support sizeless vectors without breaking support for the existing 
fixed-size vector architectures. Ultimately, this would ensure compatibility 
with other projects such as TensorFlow, which currently cannot be built with 
Eigen SVE. As it stands in the proof-of-concept, benchmarks need to be 
carefully written to use the SVE backend.

As of mid-May, the GCC 10.1 stable build has been released, adding the ability 
to create fixed-length SVE types. This allows sizeless data types to be 
replaced with fixed-size ones, solving the above incompatibility with the 
PacketBlock structure. However, this is not a complete solution, as it does not 
bring support for the desired SVE VLA.
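
A minimal sketch of how such fixed-length types could be declared, assuming a 
build with a fixed vector length (e.g. -msve-vector-bits=512 on GCC 10) so that 
__ARM_FEATURE_SVE_BITS is defined and non-zero. SketchPacketf, SketchBlock4f 
and kSketchVlBits are hypothetical names, and the non-SVE branch is only a 
stand-in so the snippet also builds on hosts without SVE:

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_FEATURE_SVE_BITS) && (__ARM_FEATURE_SVE_BITS > 0)
#include <arm_sve.h>
// Fixed-length alias of the sizeless svfloat32_t, using the ACLE
// arm_sve_vector_bits attribute.
typedef svfloat32_t SketchPacketf
    __attribute__((arm_sve_vector_bits(__ARM_FEATURE_SVE_BITS)));
constexpr std::size_t kSketchVlBits = __ARM_FEATURE_SVE_BITS;
#else
// Assumed 512-bit stand-in so the sketch also builds on hosts without SVE.
struct SketchPacketf {
  float lanes[16];
};
constexpr std::size_t kSketchVlBits = 512;
#endif

// With a sized packet type, an array-of-packets block in the style of
// Eigen's PacketBlock is well-formed again.
struct SketchBlock4f {
  SketchPacketf packet[4];
};

static_assert(sizeof(SketchPacketf) * 8 == kSketchVlBits,
              "a packet spans exactly one vector");
```

The resulting code is tied to the vector length chosen at compile time, which 
is why this approach does not provide VLA.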
We are currently performing some tests and evaluating this GCC feature with a 
TensorFlow build. The goal is to be able to build TensorFlow and run some 
benchmarks using the proof-of-concept Eigen with the SVE backend and a fixed VL.

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
