One long-standing problem with the implementation of the SVE ACLE
is that .H, .S, and .D predicate operations tend to have VNx8BI,
VNx4BI, and VNx2BI results.  As discussed in the fix for PR121118,
this representation is usually incorrect, since every bit of an
svbool_t result is significant:

    https://gcc.gnu.org/pipermail/gcc-patches/2025-July/691024.html

In PR121294, this representation actively leads to wrong code.
.H, .S, and .D permutations operate on 2-bit, 4-bit, and 8-bit
predicate elements, but they copy all bits across verbatim.
That isn't something we need or rely on when permuting natural
VNx8BI, VNx4BI, or VNx2BI predicates, but it is something that
is guaranteed by the ACLE intrinsics.  The current representation
instead allows RTL optimisers to substitute one type of ptrue
for another, as long as the low bit of each element doesn't change.
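
To make the failure mode concrete, here is a minimal sketch in portable
C rather than real ACLE code.  It assumes a 128-bit vector length, so
that a predicate register is 16 bits wide, and it models a .H reversal
as a reversal of eight 2-bit elements.  The names pred16, model_rev_h,
and equal_low_bits_h are hypothetical and exist only for illustration;
they are not part of GCC or the ACLE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 16-bit predicate register, as for a 128-bit
   vector length.  Bit N governs byte N of the vector.  */
typedef uint16_t pred16;

/* Model of a .H reversal: treat the predicate as eight 2-bit elements
   and reverse their order, copying both bits of each element.  */
pred16 model_rev_h(pred16 p)
{
    unsigned int result = 0;
    for (int i = 0; i < 8; i++) {
        unsigned int elt = (p >> (2 * i)) & 0x3u;
        result |= elt << (2 * (7 - i));
    }
    return (pred16) result;
}

/* Equality as seen through a VNx8BI-style view: only the low bit of
   each 2-bit element is compared.  */
int equal_low_bits_h(pred16 a, pred16 b)
{
    return ((a ^ b) & 0x5555u) == 0;
}

/* The two ptrues look interchangeable in the VNx8BI-style view, but
   reversing them gives results that differ as full 16-bit values.  */
void model_check(void)
{
    const pred16 ptrue_b = 0xFFFF;  /* every bit set */
    const pred16 ptrue_h = 0x5555;  /* low bit of each 2-bit element */

    assert(equal_low_bits_h(ptrue_b, ptrue_h));
    assert(model_rev_h(ptrue_b) != model_rev_h(ptrue_h));
}
```

In this model, swapping one ptrue for the other is harmless if only
the low bit of each element matters, but it changes the upper bit of
every element in the reversed result, which is visible to anything
that reads the result as an svbool_t.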

Tested on aarch64-linux-gnu.  OK for trunk and for backports?

Richard

Richard Sandiford (2):
  aarch64: Use VNx16BI for more permutations [PR121294]
  aarch64: Use VNx16BI for svrev_b* [PR121294]

 .../aarch64/aarch64-sve-builtins-base.cc      |  5 +-
 .../aarch64/aarch64-sve-builtins-functions.h  |  5 +-
 gcc/config/aarch64/aarch64-sve.md             | 62 ++++++++++--
 gcc/config/aarch64/aarch64.cc                 |  3 +-
 gcc/config/aarch64/aarch64.md                 |  1 +
 gcc/config/aarch64/iterators.md               |  4 +-
 .../aarch64/sve/acle/general/perm_2.c         | 96 +++++++++++++++++++
 .../aarch64/sve/acle/general/perm_3.c         | 96 +++++++++++++++++++
 .../aarch64/sve/acle/general/perm_4.c         | 96 +++++++++++++++++++
 .../aarch64/sve/acle/general/perm_5.c         | 96 +++++++++++++++++++
 .../aarch64/sve/acle/general/perm_6.c         | 96 +++++++++++++++++++
 .../aarch64/sve/acle/general/perm_7.c         | 96 +++++++++++++++++++
 .../aarch64/sve/acle/general/rev_2.c          | 24 +++++
 13 files changed, 666 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/perm_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/perm_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/perm_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/perm_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/perm_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/perm_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev_2.c

-- 
2.43.0
