On Fri, Jun 12, 2020 at 7:50 AM Przemyslaw Wirkus
<[email protected]> wrote:
>
> Hi all,
>
> The pattern "(x | y) - y" can be optimized to a simple "(x & ~y)" andn pattern.
Isn't it better to do this transformation at the GIMPLE level rather
than in a target-specific form? Or at least do it at the RTL level in a
generic form rather than adding target-specific patterns?
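For reference, the identity itself does not depend on the target: the
bits of (x & ~y) and y are disjoint, so (x | y) == (x & ~y) + y with no
carries, and subtracting y leaves x & ~y. A minimal sketch of a sanity
check follows. It is not part of the patch, and the helper names
(ior_minus, and_not, identity_holds_8bit) are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* The form the patch matches: OR followed by subtract. */
static uint32_t ior_minus(uint32_t x, uint32_t y)
{
    return (x | y) - y;
}

/* The andn form (a single BIC on AArch64). */
static uint32_t and_not(uint32_t x, uint32_t y)
{
    return x & ~y;
}

/* Exhaustively compare both forms over all 8-bit operand pairs.
   Returns 1 if they agree everywhere, 0 otherwise. */
static int identity_holds_8bit(void)
{
    for (uint32_t x = 0; x < 256; x++)
        for (uint32_t y = 0; y < 256; y++)
            if (ior_minus(x, y) != and_not(x, y))
                return 0;
    return 1;
}
```

Calling identity_holds_8bit() from a test harness should return 1; the
same argument about disjoint bits applies to wider types.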
Thanks,
Andrew Pinski
>
> So, for the example code:
>
> $ cat main.c
> int
> f_i(int x, int y)
> {
>   return (x | y) - y;
> }
>
> long long
> f_l(long long x, long long y)
> {
>   return (x | y) - y;
> }
>
> typedef int v4si __attribute__ ((vector_size (16)));
> typedef long long v2di __attribute__ ((vector_size (16)));
>
> v4si
> f_v4si(v4si a, v4si b) {
>   return (a | b) - b;
> }
>
> v2di
> f_v2di(v2di a, v2di b) {
>   return (a | b) - b;
> }
>
> void
> f(v4si *d, v4si *a, v4si *b) {
>   for (int i=0; i<N; i++)
>     d[i] = (a[i] | b[i]) - b[i];
> }
>
> Before this patch:
> $ ./aarch64-none-linux-gnu-gcc -S -O2 main.c -dp
>
> f_i:
> orr w0, w0, w1 // 8 [c=4 l=4] iorsi3/0
> sub w0, w0, w1 // 14 [c=4 l=4] subsi3
> ret // 24 [c=0 l=4] *do_return
> f_l:
> orr x0, x0, x1 // 8 [c=4 l=4] iordi3/0
> sub x0, x0, x1 // 14 [c=4 l=4] subdi3/0
> ret // 24 [c=0 l=4] *do_return
> f_v4si:
> orr v0.16b, v0.16b, v1.16b // 8 [c=8 l=4] iorv4si3/0
> sub v0.4s, v0.4s, v1.4s // 14 [c=8 l=4] subv4si3
> ret // 24 [c=0 l=4] *do_return
> f_v2di:
> orr v0.16b, v0.16b, v1.16b // 8 [c=8 l=4] iorv2di3/0
> sub v0.2d, v0.2d, v1.2d // 14 [c=8 l=4] subv2di3
> ret // 24 [c=0 l=4] *do_return
>
> After this patch:
> $ ./aarch64-none-linux-gnu-gcc -S -O2 main.c -dp
>
> f_i:
> bic w0, w0, w1 // 13 [c=8 l=4] *bic_and_not_si3
> ret // 23 [c=0 l=4] *do_return
> f_l:
> bic x0, x0, x1 // 13 [c=8 l=4] *bic_and_not_di3
> ret // 23 [c=0 l=4] *do_return
> f_v4si:
> bic v0.16b, v0.16b, v1.16b // 13 [c=16 l=4] *bic_and_not_simd_v4si3
> ret // 23 [c=0 l=4] *do_return
> f_v2di:
> bic v0.16b, v0.16b, v1.16b // 13 [c=16 l=4] *bic_and_not_simd_v2di3
> ret // 23 [c=0 l=4] *do_return
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> OK for master?
>
> Cheers,
> Przemyslaw
>
> gcc/ChangeLog:
>
> PR tree-optimization/94880
> * config/aarch64/aarch64.md (bic_and_not_<mode>3): New define_insn.
> * config/aarch64/aarch64-simd.md (bic_and_not_simd_<mode>3): New
> define_insn.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/94880
> * gcc.target/aarch64/bic_and_not_di3.c: New test.
> * gcc.target/aarch64/bic_and_not_si3.c: New test.
> * gcc.target/aarch64/bic_and_not_v2di3.c: New test.
> * gcc.target/aarch64/bic_and_not_v4si3.c: New test.