Hi all,

SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
(x & z) | (~x & ~z)
which is ~(x ^ z).
Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
Advanced SIMD and SVE modes when TARGET_SVE2.
This patch does that.
The tied register requirements of BSL2N and the MOVPRFX rules mean we can't
use the MOVPRFX form here so I have not included that alternative.
Correct me if I'm wrong on this.
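As a quick sanity check of the identity above, here is a minimal scalar sketch in plain C. It only models the bitwise formula quoted in this mail on 64-bit integers; the names bsl2n_model, eon_ref and count_mismatches are hypothetical helpers made up for illustration, and are not part of the patch, GCC, or the ACLE intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the bit-select formula quoted above:
   BSL2N (x, y, z) = (x & z) | (~y & ~z).
   Hypothetical helper for illustration only.  */
static uint64_t
bsl2n_model (uint64_t x, uint64_t y, uint64_t z)
{
  return (x & z) | (~y & ~z);
}

/* Reference EON (XNOR) of two values: ~(a ^ b).  */
static uint64_t
eon_ref (uint64_t a, uint64_t b)
{
  return ~(a ^ b);
}

/* Compare bsl2n_model (a, a, b) against eon_ref (a, b) for every pair
   drawn from a small set of sample values and return the number of
   mismatches.  The identity says this should be zero.  */
static int
count_mismatches (void)
{
  static const uint64_t samples[] = {
    0, 1, ~(uint64_t) 0, 0x00ff00ff00ff00ffULL,
    0x0123456789abcdefULL, 0x8000000000000000ULL
  };
  const size_t n = sizeof (samples) / sizeof (samples[0]);
  int mismatches = 0;

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      if (bsl2n_model (samples[i], samples[i], samples[j])
	  != eon_ref (samples[i], samples[j]))
	mismatches++;

  return mismatches;
}
```

Tying the first two operands to the same value is what the x == y case corresponds to, so bsl2n_model (a, a, b) stands in for the "bsl2n zd, zd, zd, zm" form. This is only a bit-level check of the algebra and says nothing about register allocation or the MOVPRFX question.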
For code like:

uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return EON(a, b); }

We now generate:
eon_q:
        bsl2n   z0.d, z0.d, z0.d, z1.d
        ret

eon_z:
        bsl2n   z0.d, z0.d, z0.d, z1.d
        ret

eon_z_mp:
        bsl2n   z1.d, z1.d, z1.d, z2.d
        mov     z0.d, z1.d
        ret

instead of the previous:
eon_q:
        eor     v0.16b, v0.16b, v1.16b
        not     v0.16b, v0.16b
        ret

eon_z:
        eor     z0.d, z0.d, z1.d
        ptrue   p3.b, all
        not     z0.d, p3/m, z0.d
        ret

eon_z_mp:
        eor     z0.d, z1.d, z2.d
        ptrue   p3.b, all
        not     z0.d, p3/m, z0.d
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>

gcc/

	* config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon<mode>):
	New pattern.
	(*aarch64_sve2_eon_bsl2n_unpred<mode>): Likewise.

gcc/testsuite/

	* gcc.target/aarch64/sve2/eon_bsl2n.c: New test.
0002-aarch64-Use-SVE2-BSL2N-for-vector-EON.patch