Hi all,

SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y, this computes
(x & z) | (~x & ~z), which is ~(x ^ z), i.e. a vector EON.
Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
Advanced SIMD and SVE modes when TARGET_SVE2.
This patch does that. The tied-register requirements of BSL2N and the MOVPRFX
rules mean we can't use the MOVPRFX form here, so I have not included that
alternative. Correct me if I'm wrong on this.
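
As a sanity check of the identity above, here is a minimal scalar sketch in C.
It is not part of the patch: the helper names bsl2n_model, eon_model and
count_mismatches are hypothetical, and the models operate on a single 64-bit
lane rather than on real vector registers.

```c
#include <stdint.h>

/* Scalar model of BSL2N as described above: (x & z) | (~y & ~z).
   Hypothetical helper, one 64-bit lane only.  */
static uint64_t bsl2n_model(uint64_t x, uint64_t y, uint64_t z)
{
    return (x & z) | (~y & ~z);
}

/* Scalar model of EON: ~(x ^ z).  Hypothetical helper.  */
static uint64_t eon_model(uint64_t x, uint64_t z)
{
    return ~(x ^ z);
}

/* Returns the number of sample pairs for which BSL2N with x == y
   disagrees with EON.  The sample values are arbitrary bit patterns.  */
static int count_mismatches(void)
{
    static const uint64_t samples[] = {
        0x0ULL, 0x1ULL, 0xffffffffffffffffULL,
        0x5555555555555555ULL, 0xaaaaaaaaaaaaaaaaULL,
        0x0123456789abcdefULL, 0x8000000000000000ULL
    };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (bsl2n_model(samples[i], samples[i], samples[j])
                != eon_model(samples[i], samples[j]))
                mismatches++;
    return mismatches;
}
```

Calling count_mismatches () from a small main should return 0. Since the
operation is purely bitwise, each bit position behaves independently, so the
four (x, z) bit combinations already cover every case.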

For code like:

uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return EON(a, b); }

We now generate:
eon_q:
        bsl2n z0.d, z0.d, z0.d, z1.d
        ret

eon_z:
        bsl2n z0.d, z0.d, z0.d, z1.d
        ret

eon_z_mp:
        bsl2n z1.d, z1.d, z1.d, z2.d
        mov z0.d, z1.d
        ret

instead of the previous:
eon_q:
        eor v0.16b, v0.16b, v1.16b
        not v0.16b, v0.16b
        ret

eon_z:
        eor z0.d, z0.d, z1.d
        ptrue p3.b, all
        not z0.d, p3/m, z0.d
        ret

eon_z_mp:
        eor z0.d, z1.d, z2.d
        ptrue p3.b, all
        not z0.d, p3/m, z0.d
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>

gcc/

        * config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon<mode>):
        New pattern.
        (*aarch64_sve2_eon_bsl2n_unpred<mode>): Likewise.

gcc/testsuite/

        * gcc.target/aarch64/sve2/eon_bsl2n.c: New test.

Attachment: 0002-aarch64-Use-SVE2-BSL2N-for-vector-EON.patch
