On Fri, Jun 21, 2024 at 1:04 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Fri, Jun 21, 2024 at 9:12 AM Eikansh Gupta <quic_eikag...@quicinc.com> wrote:
> >
> > We can optimize (vec_cond eq/ne vec_cond) when vec_cond is a
> > result of (vec CMP vec). The optimization is because of the
> > observation that in vec_cond, (-1 != 0) is true. So, we can
> > generate vec_cond of xor of vec resulting in a single
> > VEC_COND_EXPR instead of 3.
> >
> > The patch adds match pattern for vec a, b:
> > (a ? -1 : 0) != (b ? -1 : 0) --> (a^b) ? -1 : 0
> > (a ? -1 : 0) == (b ? -1 : 0) --> ~(a^b) ? -1 : 0
>
> Why should this only work for uniform -1 and 0 vectors?
> It seems to me it's valid for arbitrary values, thus
>
>  (a ? x : y) != (b ? x : y) -> a^b ? x : y
>  (a ? x : y) == (b ? x : y) -> a^b ? y : x
>
> no?

Well I think it should be:
(a ? x : y) != (b ? x : y) -> a^b ? TRUE : FALSE
(a ? x : y) == (b ? x : y) -> a^b ? FALSE : TRUE

In that the result is a truth value, not one of the values x/y.
This is true for scalar (cond) too; a GIMPLE testcase which can be used:
```
__GIMPLE()
_Bool   f4_  (int a, int b, int c, int d, int e, int f)   {
  _Bool X;
  _Bool Y;
  _Bool t;
  int t1;
  int t2;
  X = a == b;
  Y = c == d;
  t1 = X ? e : f;
  t2 = Y ? e : f;
  t = t1 == t2;
  return t;
}
```
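For what it's worth, the scalar identity can be checked exhaustively with a small plain-C sketch. This is only an illustration, not part of the patch or the testsuite; the helper names `fold_is_valid` and `run_checks` are made up for this example. It also shows that the fold only holds when x and y differ: with x == y both arms are equal, so the != form is always false regardless of a^b.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not from the patch): returns true when, for the
   given x and y, the folded forms agree with the original expressions
   for every combination of the conditions a and b:
     (a ? x : y) != (b ? x : y)  <->  a^b ? TRUE : FALSE
     (a ? x : y) == (b ? x : y)  <->  a^b ? FALSE : TRUE  */
static bool fold_is_valid(int x, int y)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            bool ne = (a ? x : y) != (b ? x : y);
            bool eq = (a ? x : y) == (b ? x : y);
            bool folded = (a ^ b) != 0;

            if (ne != folded)
                return false;
            if (eq != !folded)
                return false;
        }
    }
    return true;
}

/* Hypothetical driver: a few sample values for x and y. */
static void run_checks(void)
{
    assert(fold_is_valid(-1, 0));  /* the -1/0 case from the patch */
    assert(fold_is_valid(3, 7));   /* arbitrary distinct values */
    assert(!fold_is_valid(5, 5));  /* x == y: the fold does not hold */
}
```

Calling `run_checks()` from a `main` is enough to exercise it; any real pattern would presumably need to guard against x and y being equal.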
I will work with Eikansh to finish this off list.

>
> >         PR tree-optimization/111150
> >
> > gcc/ChangeLog:
> >
> >         * match.pd: Optimization for above mentioned pattern.

Oh, I just noticed the ChangeLog entry should be improved too.
Eikansh,
  The wording there needs to stand on its own, independent of the commit
message, as it gets added to the ChangeLog file and has no real reference
back to the commit message.

Thanks,
Andrew Pinski

> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/pr111150.c: New test.
> >
> > Signed-off-by: Eikansh Gupta <quic_eikag...@quicinc.com>
> > ---
> >  gcc/match.pd                             | 18 ++++++++++++++++++
> >  gcc/testsuite/gcc.dg/tree-ssa/pr111150.c | 19 +++++++++++++++++++
> >  2 files changed, 37 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr111150.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 3d0689c9312..5cb78bd7ff9 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -5522,6 +5522,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >    (vec_cond (bit_and (bit_not @0) @1) @2 @3)))
> >  #endif
> >
> > +/* (a ? -1 : 0) != (b ? -1 : 0) --> (a^b) ? -1 : 0 */
> > +/* (a ? -1 : 0) == (b ? -1 : 0) --> ~(a^b) ? -1 : 0 */
> > +(for eqne (eq ne)
> > + (simplify
> > +  (eqne:c (vec_cond @0 uniform_integer_cst_p@2 uniform_integer_cst_p@3)
> > +         (vec_cond @1 @2 @3))
> > +  (with
> > +   {
> > +     tree newop1 = @2;
> > +     tree newop2 = @3;
> > +     if (eqne == NE_EXPR)
> > +       std::swap (newop1, newop2);
> > +   }
> > +   (if (integer_all_onesp (@2) && integer_zerop (@3))
> > +    (vec_cond (bit_xor @0 @1) {newop1;} {newop2;})
> > +    (if (integer_all_onesp (@3) && integer_zerop (@2))
> > +     (vec_cond (bit_xor @0 @1) {newop2;} {newop1;}))))))
> > +
> >  /* Canonicalize mask ? { 0, ... } : { -1, ...} to ~mask if the mask
> >     types are compatible.  */
> >  (simplify
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr111150.c b/gcc/testsuite/gcc.dg/tree-ssa/pr111150.c
> > new file mode 100644
> > index 00000000000..d10564fd722
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr111150.c
> > @@ -0,0 +1,19 @@
> > +/* PR tree-optimization/111150 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O1 -fdump-tree-forwprop1" } */
> > +
> > +typedef int v4si __attribute((__vector_size__(4 * sizeof(int))));
> > +
> > +v4si f1_(v4si a, v4si b, v4si c, v4si d) {
> > +  v4si X = a == b;
> > +  v4si Y = c == d;
> > +  return (X != Y);
> > +}
> > +
> > +v4si f2_(v4si a, v4si b, v4si c, v4si d) {
> > +  v4si X = a == b;
> > +  v4si Y = c == d;
> > +  return (X == Y);
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times " VEC_COND_EXPR " 2 "forwprop1" } } */
> > --
> > 2.17.1
> >
