Re: [PATCH] MATCH: Add simplifications for `(a * zero_one) ==/!= CST`

2023-09-18 Thread Andrew Pinski via Gcc-patches
On Mon, Sep 18, 2023 at 12:09 AM Richard Biener via Gcc-patches
 wrote:
>
> On Sat, Sep 16, 2023 at 7:50 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > Transforming `(a * b@[0,1]) != 0` into `((cast)b) & a != 0`
>
> that isn't strictly a simplification (one more op), and your
> alternate transform is even worse in this regard.

Right, I agree here. I was trying to work around a ranger issue (see below).

>
> > will produce better code as a lot of the time b is defined
> > by a comparison.
>
> what if not?  How does it simplify then?
>
> > Also since canonicalize `a & -zero_one` into `a * zero_one` we
> > start to lose information when doing comparisons against 0.
> > In the case of PR 110992, we lose that `a != 0` on the branch
>
> How so?  Ranger should be happy with both forms, no?

Ranger does not handle going backwards on the multiply case; only on
the bit_and case.
I tried to figure out how that works, but I got lost in
the ranger code.  Maybe Andrew or Aldy could look into
how to improve ranger here.

Thanks,
Andrew

>
> > and then don't do a jump threading like we should.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > PR tree-optimization/110992
> >
> > gcc/ChangeLog:
> >
> > * match.pd (`a * zero_one !=/== CST`): New pattern.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/vrp116.c: Update test to avoid the
> > extra comparison.
> > * gcc.c-torture/execute/pr110992-1.c: New test.
> > * gcc.dg/tree-ssa/pr110992-1.c: New test.
> > * gcc.dg/tree-ssa/pr110992-2.c: New test.
> > ---
> >  gcc/match.pd  | 15 +++
> >  .../gcc.c-torture/execute/pr110992-1.c| 43 +++
> >  gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c| 21 +
> >  gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c| 17 
> >  gcc/testsuite/gcc.dg/tree-ssa/vrp116.c|  2 +-
> >  5 files changed, 97 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110992-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 39c9c81966a..97405e6a5c3 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -2197,6 +2197,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >   (if (INTEGRAL_TYPE_P (type))
> >(bit_and @0 @1)))
> >
> > +/* (a * b@[0,1]) == CST
> > + ->
> > +   CST == 0 ? (a == CST | b == 0) : (a == CST & b != 0)
> > +   (a * b@[0,1]) != CST
> > + ->
> > +   CST != 0 ? (a != CST | b == 0) : (a != CST & b != 0)  */
> > +(for cmp (ne eq)
> > + (simplify
> > +  (cmp (mult:cs @0 zero_one_valued_p@1) INTEGER_CST@2)
> > +  (if ((cmp == EQ_EXPR) ^ (wi::to_wide (@2) != 0))
> > +   (bit_ior
> > +(cmp @0 @2)
> > +(convert (bit_xor @1 { build_one_cst (TREE_TYPE (@1)); })))
> > +   (bit_and (cmp @0 @2) (convert @1)
> > +
> >  (for cmp (tcc_comparison)
> >   icmp (inverted_tcc_comparison)
> >   /* Fold (((a < b) & c) | ((a >= b) & d)) into (a < b ? c : d) & 1.  */
> > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110992-1.c 
> > b/gcc/testsuite/gcc.c-torture/execute/pr110992-1.c
> > new file mode 100644
> > index 000..edb7eb75ef2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.c-torture/execute/pr110992-1.c
> > @@ -0,0 +1,43 @@
> > +#define CST 5
> > +#define OP !=
> > +#define op_eq ==
> > +#define op_ne !=
> > +
> > +#define function(vol,op, cst) \
> > +__attribute__((noipa)) \
> > +_Bool func_##op##_##cst##_##vol(vol int a, vol _Bool b) \
> > +{ \
> > +  vol int d = (a * b); \
> > +  return d op_##op cst; \
> > +}
> > +
> > +#define funcdefs(op,cst) \
> > +function(,op,cst) \
> > +function(volatile,op,cst)
> > +
> > +#define funcs(f) \
> > +f(eq,0) \
> > +f(eq,1) \
> > +f(eq,5) \
> > +f(ne,0) \
> > +f(ne,1) \
> > +f(ne,5)
> > +
> > +funcs(funcdefs)
> > +
> > +#define test(op,cst) \
> > +do { \
> > + if(func_##op##_##cst##_(a,b) != func_##op##_##cst##_volatile(a,b))\
> > +   __builtin_abort(); \
> > +} while(0);
> > +
> > +int main(void)
> > +{
> > +for(int a = -10; a <= 10; a++)
> > +{
> > +  

[PATCH] Remove xfail from gcc.dg/tree-ssa/20040204-1.c

2023-09-17 Thread Andrew Pinski via Gcc-patches
The xfail was there because, at one point, whether
logical-op-non-short-circuit was set to 1 or 0 affected
whether a conditional could be optimized away.
That has not been true in this case for over 10 years, so
instead of continuing to add to the xfail list, removing it
is the right thing to do.

Committed as obvious after a test on x86_64-linux-gnu.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/20040204-1.c: Remove xfail.
---
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c
index b9f8fd21ac9..aa9f68b8b42 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c
@@ -29,8 +29,4 @@ void test55 (int x, int y)
 
 /* There should be not link_error calls, if there is any the
optimization has failed */
-/* ??? Ug.  This one may or may not fail based on how fold decides
-   that the && should be emitted (based on BRANCH_COST).  Fix this
-   by teaching dom to look through && and register all components
-   as true.  */
-/* { dg-final { scan-tree-dump-times "link_error" 0 "optimized" { xfail { ! "alpha*-*-* arm*-*-* aarch64*-*-* powerpc*-*-* cris-*-* hppa*-*-* i?86-*-* mmix-*-* mips*-*-* m68k*-*-* moxie-*-* nds32*-*-* s390*-*-* sh*-*-* sparc*-*-* visium-*-* x86_64-*-* riscv*-*-* or1k*-*-* msp430-*-* pru*-*-* nvptx*-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "link_error" 0 "optimized" } } */
-- 
2.31.1



[PATCH] MATCH: Make zero_one_valued_p non-recusive fully

2023-09-17 Thread Andrew Pinski via Gcc-patches
It turns out VN can't handle any kind of recursion in match patterns. In this
case we have `b = a & -1`; we try to match a as being zero_one_valued_p,
VN returns b as being the value, and we go into an infinite loop at
that point.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Note genmatch should warn (or error out) if this gets detected, so I filed
PR 111446, which I will be looking into next week or the week after so we
don't run into this issue again.

PR tree-optimization/111442

gcc/ChangeLog:

* match.pd (zero_one_valued_p): Have the bit_and match not be
recursive.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr111442-1.c: New test.
---
 gcc/match.pd |  5 -
 gcc/testsuite/gcc.c-torture/compile/pr111442-1.c | 13 +
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr111442-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 887665633d4..773c3810f51 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2183,8 +2183,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* (a&1) is always [0,1] too. This is useful again when
the range is not known. */
+/* Note this can't be recusive due to VN handling of equivalents,
+   VN and would cause an infinite recusion. */
 (match zero_one_valued_p
- (bit_and:c@0 @1 zero_one_valued_p))
+ (bit_and:c@0 @1 integer_onep)
+ (if (INTEGRAL_TYPE_P (type))))
 
 /* A conversion from an zero_one_valued_p is still a [0,1].
This is useful when the range of a variable is not known */
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr111442-1.c b/gcc/testsuite/gcc.c-torture/compile/pr111442-1.c
new file mode 100644
index 000..5814ee938de
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr111442-1.c
@@ -0,0 +1,13 @@
+
+int *a, b;
+int main() {
+  int d = 1, e;
+  if (d)
+e = a ? 0 % 0 : 0;
+  if (d)
+a = 
+  d = -1;
+  b = d & e;
+  b = 2 * e ^ 1;
+  return 0;
+}
-- 
2.31.1



[PATCH] MATCH: Avoid recusive zero_one_valued_p for conversions

2023-09-16 Thread Andrew Pinski via Gcc-patches
When VN finds a name which has a nop conversion, it says
both names are equivalent to each other, and the valueization
function for one will return the other. This normally does not
cause any issues, as there are no recursive matches. But
r14-4038-gb975c0dc3be285 added one, so we would recurse
infinitely on the match and never finish.
This fixes the issue (and adds a comment in match.pd) by
handling just one level for converts instead of always
recursing.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Note the testcase was reduced from tree-ssa-loop-niter.cc and then
changed slightly into C rather than C++, but it still needs exceptions
turned on to get the IR in which VN produces this equivalence relationship.
I also had to turn off early inlining to force put to be inlined later.

PR tree-optimization/111435

gcc/ChangeLog:

* match.pd (zero_one_valued_p): Don't do recursion
on converts.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr111435-1.c: New test.
---
 gcc/match.pd   |  8 +++-
 .../gcc.c-torture/compile/pr111435-1.c | 18 ++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr111435-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 97405e6a5c3..887665633d4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2188,8 +2188,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* A conversion from an zero_one_valued_p is still a [0,1].
This is useful when the range of a variable is not known */
+/* Note this matches can't be recusive because of the way VN handles
+   nop conversions being equivalent and then recusive between them. */
 (match zero_one_valued_p
- (convert@0 zero_one_valued_p))
+ (convert@0 @1)
+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+  && (TYPE_UNSIGNED (TREE_TYPE (@1))
+ || TYPE_PRECISION (TREE_TYPE (@1)) > 1)
+  && wi::leu_p (tree_nonzero_bits (@1), 1))))
 
 /* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 }.  */
 (simplify
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr111435-1.c b/gcc/testsuite/gcc.c-torture/compile/pr111435-1.c
new file mode 100644
index 000..afa84dd59dd
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr111435-1.c
@@ -0,0 +1,18 @@
+/* { dg-options "-fexceptions -fno-early-inlining" } */
+/* { dg-require-effective-target exceptions } */
+
+void find_slot_with_hash(const int *);
+
+void put(const int *k, const int *) {
+find_slot_with_hash(k);
+}
+unsigned len();
+int *address();
+void h(int header, int **bounds) {
+  if (!*bounds)
+return;
+  unsigned t = *bounds ? len() : 0;
+  int queue_index = t;
+  address()[(unsigned)queue_index] = 0;
+  put(, _index);
+}
-- 
2.31.1



[PATCH] MATCH: Add simplifications of `(a == CST) & a`

2023-09-16 Thread Andrew Pinski via Gcc-patches
`(a == CST) & a` can be simplified to either `a == CST`
or 0, depending on the lowest bit of the CST.
This is an extension of the existing `X & !X` pattern and allows
us to remove the 2 xfails on gcc.dg/binop-notand1a.c and
gcc.dg/binop-notand4a.c.
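
To make the claim above concrete, here is a small C sketch (not part of the patch; the helper names are made up for illustration). Since `a == CST` is either 0 or 1, `(a == CST) & a` can only be nonzero when a equals CST and the lowest bit of CST is set. The sketch brute-forces that over a small range of a and a handful of constants chosen as assumptions for the example.

```c
#include <assert.h>

/* Hypothetical helper: the folded form of `(a == cst) & a`.
   If the lowest bit of cst is set the result is `a == cst`, else 0.  */
static int and_eq_folded (int a, int cst)
{
  return (cst & 1) ? (a == cst) : 0;
}

/* Count the inputs where the folded form disagrees with the original.  */
int count_and_eq_mismatches (void)
{
  static const int csts[] = { 0, 1, 2, 5, 1025 };
  int mismatches = 0;
  for (int a = -2000; a <= 2000; a++)
    for (unsigned i = 0; i < sizeof csts / sizeof csts[0]; i++)
      {
        int b = (a == csts[i]);
        mismatches += ((a & b) != and_eq_folded (a, csts[i]));
      }
  return mismatches;
}

/* Aborts if any input disagrees.  */
void check_and_eq (void)
{
  assert (count_and_eq_mismatches () == 0);
}
```

Calling `check_and_eq ()` from a test driver should return without aborting.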

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/111431

gcc/ChangeLog:

* match.pd (`(a == CST) & a`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/binop-notand1a.c: Remove xfail.
* gcc.dg/binop-notand4a.c: Likewise.
* gcc.c-torture/execute/pr111431-1.c: New test.
* gcc.dg/binop-andeq1.c: New test.
* gcc.dg/binop-andeq2.c: New test.
* gcc.dg/binop-notand7.c: New test.
* gcc.dg/binop-notand7a.c: New test.
---
 gcc/match.pd  |  8 
 .../gcc.c-torture/execute/pr111431-1.c| 39 +++
 gcc/testsuite/gcc.dg/binop-andeq1.c   | 12 ++
 gcc/testsuite/gcc.dg/binop-andeq2.c   | 14 +++
 gcc/testsuite/gcc.dg/binop-notand1a.c |  4 +-
 gcc/testsuite/gcc.dg/binop-notand4a.c |  4 +-
 gcc/testsuite/gcc.dg/binop-notand7.c  | 12 ++
 gcc/testsuite/gcc.dg/binop-notand7a.c | 12 ++
 8 files changed, 99 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr111431-1.c
 create mode 100644 gcc/testsuite/gcc.dg/binop-andeq1.c
 create mode 100644 gcc/testsuite/gcc.dg/binop-andeq2.c
 create mode 100644 gcc/testsuite/gcc.dg/binop-notand7.c
 create mode 100644 gcc/testsuite/gcc.dg/binop-notand7a.c

diff --git a/gcc/match.pd b/gcc/match.pd
index ebb50ee0581..65960a1701e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5172,6 +5172,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  )
 )
 
+/* `(a == CST) & a` can be simplified to `0` or `(a == CST)` depending
+   on the first bit of the CST.  */
+(simplify
+ (bit_and:c (convert@2 (eq @0 INTEGER_CST@1)) (convert? @0))
+ (if ((wi::to_wide (@1) & 1) != 0)
+  @2
+  { build_zero_cst (type); }))
+
 /* Optimize
# x_5 in range [cst1, cst2] where cst2 = cst1 + 1
x_5 ? cstN ? cst4 : cst3
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr111431-1.c b/gcc/testsuite/gcc.c-torture/execute/pr111431-1.c
new file mode 100644
index 000..a96dbadf2b5
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr111431-1.c
@@ -0,0 +1,39 @@
+int
+foo (int a)
+{
+  int b = a == 0;
+  return (a & b);
+}
+
+#define function(vol,cst) \
+__attribute__((noipa)) \
+_Bool func_##cst##_##vol(vol int a) \
+{ \
+  vol int b = a == cst; \
+  return (a & b); \
+}
+
+#define funcdefs(cst) \
+function(,cst) \
+function(volatile,cst)
+
+#define funcs(f) \
+f(0) \
+f(1) \
+f(5)
+
+funcs(funcdefs)
+
+#define test(cst) \
+do { \
+ if(func_##cst##_(a) != func_##cst##_volatile(a))\
+   __builtin_abort(); \
+} while(0);
+int main(void)
+{
+  for(int a = -10; a <= 10; a++)
+   {
+ funcs(test)
+   }
+}
+
diff --git a/gcc/testsuite/gcc.dg/binop-andeq1.c b/gcc/testsuite/gcc.dg/binop-andeq1.c
new file mode 100644
index 000..2a92b8f95df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/binop-andeq1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/111431 */
+
+int
+foo (int a)
+{
+  int b = a == 2;
+  return (a & b);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/binop-andeq2.c b/gcc/testsuite/gcc.dg/binop-andeq2.c
new file mode 100644
index 000..895262fc17e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/binop-andeq2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/111431 */
+
+int
+foo (int a)
+{
+  int b = a == 1025;
+  return (a & b);
+}
+
+/* { dg-final { scan-tree-dump-not "return 0"  "optimized" } } */
+/* { dg-final { scan-tree-dump-not " & "  "optimized" } } */
+/* { dg-final { scan-tree-dump-times " == 1025;" 1  "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/binop-notand1a.c b/gcc/testsuite/gcc.dg/binop-notand1a.c
index c7e932b2638..d94685eb4ce 100644
--- a/gcc/testsuite/gcc.dg/binop-notand1a.c
+++ b/gcc/testsuite/gcc.dg/binop-notand1a.c
@@ -7,6 +7,4 @@ foo (char a, unsigned short b)
   return (a & !a) | (b & !b);
 }
 
-/* As long as comparisons aren't boolified and casts from boolean-types
-   aren't preserved, the folding of  X & !X to zero fails.  */
-/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized"  } } */
diff --git a/gcc/testsuite/gcc.dg/binop-notand4a.c b/gcc/testsuite/gcc.dg/binop-notand4a.c
index dce6a5c7eb5..bd9c7cce638 100644
--- a/gcc/testsuite/gcc.dg/binop-notand4a.c
+++ b/gcc/testsuite/gcc.dg/binop-notand4a.c
@@ -7,6 +7,4 @@ foo (unsigned char a, _Bool b)
   return (!a & a) | (b & !b);
 }
 
-/* As long as comparisons aren't boolified and casts from boolean-types
-   aren't 

[PATCH] MATCH: Add simplifications for `(a * zero_one) ==/!= CST`

2023-09-15 Thread Andrew Pinski via Gcc-patches
Transforming `(a * b@[0,1]) != 0` into `((cast)b) & a != 0`
will produce better code, as a lot of the time b is defined
by a comparison.
Also, since we canonicalize `a & -zero_one` into `a * zero_one`, we
start to lose information when doing comparisons against 0.
In the case of PR 110992, we lose the fact that `a != 0` on the branch
and then don't do the jump threading that we should.
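
As a sanity check on the identity behind this pattern, here is a minimal C sketch (not part of the patch; the helper names are made up for illustration). It takes the rewritten forms from the match.pd comment, restricted to b in [0,1], and compares them against the plain multiply for a small range of a and the constants 0, 1 and 5, the same constants the torture test uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: rewritten form of `(a * b) == cst` for b in [0,1].  */
static bool mul_eq_rewritten (int a, int b, int cst)
{
  assert (b == 0 || b == 1);  /* b must be zero_one valued.  */
  return cst == 0 ? ((a == cst) | (b == 0)) : ((a == cst) & (b != 0));
}

/* Hypothetical helper: rewritten form of `(a * b) != cst` for b in [0,1].  */
static bool mul_ne_rewritten (int a, int b, int cst)
{
  assert (b == 0 || b == 1);  /* b must be zero_one valued.  */
  return cst != 0 ? ((a != cst) | (b == 0)) : ((a != cst) & (b != 0));
}

/* Count the inputs where a rewritten form disagrees with the multiply.  */
int count_mul_cmp_mismatches (void)
{
  static const int csts[] = { 0, 1, 5 };
  int mismatches = 0;
  for (int a = -10; a <= 10; a++)
    for (int b = 0; b <= 1; b++)
      for (unsigned i = 0; i < sizeof csts / sizeof csts[0]; i++)
        {
          int cst = csts[i];
          mismatches += ((a * b == cst) != mul_eq_rewritten (a, b, cst));
          mismatches += ((a * b != cst) != mul_ne_rewritten (a, b, cst));
        }
  return mismatches;
}
```

The count should come out as 0, which is only a check of the arithmetic, not of how the pattern interacts with ranger or jump threading.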

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110992

gcc/ChangeLog:

* match.pd (`a * zero_one !=/== CST`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp116.c: Update test to avoid the
extra comparison.
* gcc.c-torture/execute/pr110992-1.c: New test.
* gcc.dg/tree-ssa/pr110992-1.c: New test.
* gcc.dg/tree-ssa/pr110992-2.c: New test.
---
 gcc/match.pd  | 15 +++
 .../gcc.c-torture/execute/pr110992-1.c| 43 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c| 21 +
 gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c| 17 
 gcc/testsuite/gcc.dg/tree-ssa/vrp116.c|  2 +-
 5 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110992-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 39c9c81966a..97405e6a5c3 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2197,6 +2197,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type))
   (bit_and @0 @1)))
 
+/* (a * b@[0,1]) == CST
+ ->
+   CST == 0 ? (a == CST | b == 0) : (a == CST & b != 0)
+   (a * b@[0,1]) != CST
+ ->
+   CST != 0 ? (a != CST | b == 0) : (a != CST & b != 0)  */
+(for cmp (ne eq)
+ (simplify
+  (cmp (mult:cs @0 zero_one_valued_p@1) INTEGER_CST@2)
+  (if ((cmp == EQ_EXPR) ^ (wi::to_wide (@2) != 0))
+   (bit_ior
+(cmp @0 @2)
+(convert (bit_xor @1 { build_one_cst (TREE_TYPE (@1)); })))
+   (bit_and (cmp @0 @2) (convert @1)))))
+
 (for cmp (tcc_comparison)
  icmp (inverted_tcc_comparison)
  /* Fold (((a < b) & c) | ((a >= b) & d)) into (a < b ? c : d) & 1.  */
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110992-1.c b/gcc/testsuite/gcc.c-torture/execute/pr110992-1.c
new file mode 100644
index 000..edb7eb75ef2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110992-1.c
@@ -0,0 +1,43 @@
+#define CST 5
+#define OP !=
+#define op_eq ==
+#define op_ne !=
+
+#define function(vol,op, cst) \
+__attribute__((noipa)) \
+_Bool func_##op##_##cst##_##vol(vol int a, vol _Bool b) \
+{ \
+  vol int d = (a * b); \
+  return d op_##op cst; \
+}
+
+#define funcdefs(op,cst) \
+function(,op,cst) \
+function(volatile,op,cst)
+
+#define funcs(f) \
+f(eq,0) \
+f(eq,1) \
+f(eq,5) \
+f(ne,0) \
+f(ne,1) \
+f(ne,5)
+
+funcs(funcdefs)
+
+#define test(op,cst) \
+do { \
+ if(func_##op##_##cst##_(a,b) != func_##op##_##cst##_volatile(a,b))\
+   __builtin_abort(); \
+} while(0);
+
+int main(void)
+{
+for(int a = -10; a <= 10; a++)
+{
+_Bool b = 0;
+funcs(test)
+b = 1;
+funcs(test)
+}
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c
new file mode 100644
index 000..825fd63f84c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110992-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+static unsigned b;
+static short c = 4;
+void foo(void);
+static short(a)(short d, short g) { return d * g; }
+void e();
+static char f() {
+  b = 0;
+  return 0;
+}
+int main() {
+  int h = b;
+  if ((short)(a(c && e, 65535) & h)) {
+foo();
+h || f();
+  }
+}
+
+/* There should be no calls to foo left. */
+/* { dg-final { scan-tree-dump-not " foo " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c
new file mode 100644
index 000..6082949a218
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110992-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+static unsigned b;
+static short c = 4;
+void foo(void);
+int main() {
+  int h = b;
+  int d = c != 0;
+  if (h*d) {
+foo();
+if (!h) b = 20;
+  }
+}
+
+
+/* There should be no calls to foo left. */
+/* { dg-final { scan-tree-dump-not " foo " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp116.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp116.c
index 9e68a774aee..16b31e320a0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp116.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp116.c
@@ -6,7 +6,7 @@ f (int m1, int m2, int c)
 {
   int d = m1 > m2;
   int e = d * c;
-  return e ? m1 : m2;
+  return e;
 }
 
 /* { dg-final { scan-tree-dump-times "\\? c_\[0-9\]\\(D\\) : 0" 1 "optimized" 
} } */
-- 
2.31.1



Re: Question on -fwrapv and -fwrapv-pointer

2023-09-15 Thread Andrew Pinski via Gcc-patches
On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
>
>
>
> > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> >
> > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
>  CLANG already provided -fsanitize=unsigned-integer-overflow. GCC
>  might need to do the same.
> >>>
> >>> NO. There is no such thing as unsigned integer overflow. That option
> >>> is badly designed and the GCC community has rejected a few times now
> >>> having that sanitizer before. It is bad form to have a sanitizer for
> >>> well defined code.
> >>
> >> Even though unsigned integer overflow is well defined, it might be
> >> unintentional, shall we warn user about this?
> >
> > *Everything* could be unintentional and should be warned then.  GCC is a
> > compiler, not an advanced AI educating the programmers.
>
> Well, you are right in some sense. -:)
>
> However, overflow is one important source for security flaws, it’s important  
> for compilers to detect
> overflows in the programs in general.

Except it is NOT an overflow. Rather, it is wrapping. That is a big
point here. Unsigned arithmetic wraps and does NOT overflow. Yes, there
is a major difference.
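
To illustrate the distinction, here is a short C sketch (the function names are made up for illustration). The C standard defines unsigned arithmetic modulo one more than the largest value of the type, so the results below are fully specified and involve no undefined behavior.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned addition is reduced modulo UINT_MAX + 1; it wraps and is
   well defined for every pair of inputs.  */
unsigned wrapping_add (unsigned x, unsigned y)
{
  return x + y;
}

/* Aborts if any of the wrapped results is not the expected one.  */
void check_unsigned_wraps (void)
{
  assert (wrapping_add (UINT_MAX, 1u) == 0u);
  assert (wrapping_add (UINT_MAX, UINT_MAX) == UINT_MAX - 1u);
  assert (0u - 1u == UINT_MAX);
}
```

The same additions on `int` at INT_MAX would be undefined behavior without -fwrapv, which is the case -fsanitize=signed-integer-overflow is meant to catch.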

>
> Qing
> >
> > --
> > Xi Ruoyao 
> > School of Aerospace Science and Technology, Xidian University
>


Re: [PATCH] MATCH: Improve zero_one_valued_p for cases without range information

2023-09-15 Thread Andrew Pinski via Gcc-patches
On Thu, Sep 14, 2023 at 11:28 PM Richard Biener via Gcc-patches
 wrote:
>
> On Fri, Sep 15, 2023 at 3:09 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > I noticed we sometimes lose range information in forwprop due to a few
> > match and simplify patterns optimizing away casts. So the easier way
> > to these cases is to add a match for zero_one_valued_p wich mathes
> > a cast from another zero_one_valued_p.
> > This also adds the case of `x & zero_one_valued_p` as being 
> > zero_one_valued_p
> > which allows catching more cases too.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> OK.
>
> I wonder if it would make a difference if we'd enable ranger unconditionally
> in forwprop (maybe with -O2+), currently it gets enabled sometimes only.

I was thinking about that, though currently zero_one_valued_p only uses
the global ranger info via tree_nonzero_bits; I have patches to use
the local one too.

Thanks,
Andrew

>
> Richard.
>
> > gcc/ChangeLog:
> >
> > * match.pd (zero_one_valued_p): Match a cast from a 
> > zero_one_valued_p.
> > Also match `a & zero_one_valued_p` too.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/bool-13.c: Update testcase as we now do
> > the MIN/MAX during forwprop1.
> > ---
> >  gcc/match.pd| 10 ++
> >  gcc/testsuite/gcc.dg/tree-ssa/bool-13.c | 15 +--
> >  2 files changed, 15 insertions(+), 10 deletions(-)
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 97db0eb5f25..39c9c81966a 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -2181,6 +2181,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >&& (TYPE_UNSIGNED (type)
> >   || TYPE_PRECISION (type) > 1
> >
> > +/* (a&1) is always [0,1] too. This is useful again when
> > +   the range is not known. */
> > +(match zero_one_valued_p
> > + (bit_and:c@0 @1 zero_one_valued_p))
> > +
> > +/* A conversion from an zero_one_valued_p is still a [0,1].
> > +   This is useful when the range of a variable is not known */
> > +(match zero_one_valued_p
> > + (convert@0 zero_one_valued_p))
> > +
> >  /* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 }.  */
> >  (simplify
> >   (mult zero_one_valued_p@0 zero_one_valued_p@1)
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c 
> > b/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
> > index 438f15a484a..de8c99a7727 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original 
> > -fdump-tree-phiopt1 -fdump-tree-forwprop2" } */
> > +/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original 
> > -fdump-tree-forwprop1 -fdump-tree-forwprop2" } */
> >  #define bool _Bool
> >  int maxbool(bool ab, bool bb)
> >  {
> > @@ -22,15 +22,10 @@ int minbool(bool ab, bool bb)
> >  /* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "original" } } */
> >  /* { dg-final { scan-tree-dump-times "if " 0 "original" } } */
> >
> > -/* PHI-OPT1 should have kept it as min/max. */
> > -/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
> > -/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
> > -/* { dg-final { scan-tree-dump-times "if " 0 "phiopt1" } } */
> > -
> > -/* Forwprop2 (after ccp) will convert it into &\| */
> > -/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "forwprop2" } } */
> > -/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "forwprop2" } } */
> > -/* { dg-final { scan-tree-dump-times "if " 0 "forwprop2" } } */
> > +/* Forwprop1 will convert it into &\| as we can detect that the arguments 
> > are one_zero. */
> > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "forwprop1" } } */
> > +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "forwprop1" } } */
> > +/* { dg-final { scan-tree-dump-times "if " 0 "forwprop1" } } */
> >
> >  /* By optimize there should be no min/max nor if  */
> >  /* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "optimized" } } */
> > --
> > 2.31.1
> >


[PATCH] MATCH: Improve zero_one_valued_p for cases without range information

2023-09-14 Thread Andrew Pinski via Gcc-patches
I noticed we sometimes lose range information in forwprop due to a few
match and simplify patterns optimizing away casts. So the easier way
to handle these cases is to add a match for zero_one_valued_p which matches
a cast from another zero_one_valued_p.
This also adds the case of `x & zero_one_valued_p` as being zero_one_valued_p,
which allows catching more cases too.
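
The value-range reasoning behind both additions can be checked with a small C sketch (not GCC code; the helper names are made up for illustration). If b is 0 or 1, then `x & b` is 0 or 1 for any x, and converting b to another integer type that can hold 1 keeps it 0 or 1. A signed 1-bit type is the exception, since it can only hold 0 and -1.

```c
#include <assert.h>

/* Hypothetical helper: true if v lies in the range [0,1].  */
static int is_zero_one (long v)
{
  return v == 0 || v == 1;
}

/* Count the cases where `x & b` or a conversion of b leaves [0,1].  */
int count_zero_one_violations (void)
{
  int violations = 0;
  for (int x = -1000; x <= 1000; x++)
    for (int b = 0; b <= 1; b++)
      {
        violations += !is_zero_one (x & b);
        violations += !is_zero_one ((unsigned char) b);
        violations += !is_zero_one ((long) b);
      }
  return violations;
}

/* Aborts if any case leaves [0,1].  */
void check_zero_one (void)
{
  assert (count_zero_one_violations () == 0);
}
```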

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* match.pd (zero_one_valued_p): Match a cast from a zero_one_valued_p.
Also match `a & zero_one_valued_p` too.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-13.c: Update testcase as we now do
the MIN/MAX during forwprop1.
---
 gcc/match.pd| 10 ++
 gcc/testsuite/gcc.dg/tree-ssa/bool-13.c | 15 +--
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 97db0eb5f25..39c9c81966a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2181,6 +2181,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && (TYPE_UNSIGNED (type)
  || TYPE_PRECISION (type) > 1
 
+/* (a&1) is always [0,1] too. This is useful again when
+   the range is not known. */
+(match zero_one_valued_p
+ (bit_and:c@0 @1 zero_one_valued_p))
+
+/* A conversion from an zero_one_valued_p is still a [0,1].
+   This is useful when the range of a variable is not known */
+(match zero_one_valued_p
+ (convert@0 zero_one_valued_p))
+
 /* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 }.  */
 (simplify
  (mult zero_one_valued_p@0 zero_one_valued_p@1)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c b/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
index 438f15a484a..de8c99a7727 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original -fdump-tree-phiopt1 -fdump-tree-forwprop2" } */
+/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original -fdump-tree-forwprop1 -fdump-tree-forwprop2" } */
 #define bool _Bool
 int maxbool(bool ab, bool bb)
 {
@@ -22,15 +22,10 @@ int minbool(bool ab, bool bb)
 /* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "original" } } */
 /* { dg-final { scan-tree-dump-times "if " 0 "original" } } */
 
-/* PHI-OPT1 should have kept it as min/max. */
-/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
-/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
-/* { dg-final { scan-tree-dump-times "if " 0 "phiopt1" } } */
-
-/* Forwprop2 (after ccp) will convert it into &\| */
-/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "forwprop2" } } */
-/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "forwprop2" } } */
-/* { dg-final { scan-tree-dump-times "if " 0 "forwprop2" } } */
+/* Forwprop1 will convert it into &\| as we can detect that the arguments are one_zero. */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "if " 0 "forwprop1" } } */
 
 /* By optimize there should be no min/max nor if  */
 /* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "optimized" } } */
-- 
2.31.1



Re: Question on -fwrapv and -fwrapv-pointer

2023-09-14 Thread Andrew Pinski via Gcc-patches
On Thu, Sep 14, 2023 at 1:50 PM Qing Zhao via Gcc-patches
 wrote:
>
>
>
> > On Sep 14, 2023, at 12:18 PM, Xi Ruoyao  wrote:
> >
> > On Thu, 2023-09-14 at 15:57 +, Qing Zhao via Gcc-patches wrote:
> >> Currently, GCC behaves as following:
> >>
> >> /* True if overflow wraps around for the given integral or pointer type.  
> >> That
> >>is, TYPE_MAX + 1 == TYPE_MIN.  */
> >> #define TYPE_OVERFLOW_WRAPS(TYPE) \
> >>   (POINTER_TYPE_P (TYPE)\
> >>? flag_wrapv_pointer \
> >>: (ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag  \
> >>   || flag_wrapv))
> >>
> >> /* True if overflow is undefined for the given integral or pointer type.
> >>We may optimize on the assumption that values in the type never 
> >> overflow.
> >>
> >>IMPORTANT NOTE: Any optimization based on TYPE_OVERFLOW_UNDEFINED
> >>must issue a warning based on warn_strict_overflow.  In some cases
> >>it will be appropriate to issue the warning immediately, and in
> >>other cases it will be appropriate to simply set a flag and let the
> >>caller decide whether a warning is appropriate or not.  */
> >> #define TYPE_OVERFLOW_UNDEFINED(TYPE)   \
> >>   (POINTER_TYPE_P (TYPE)\
> >>? !flag_wrapv_pointer\
> >>: (!ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag \
> >>   && !flag_wrapv && !flag_trapv))
> >>
> >> The logic above seems treating the pointer default as signed integer, 
> >> right?
> >
> > It only says the pointers cannot overflow, not the pointers are signed.
> >
> > printf("%d\n", (char *)(intptr_t)-1 > (char *)(intptr_t)1);
> >
> > produces 1 instead of 0.  Technically this is invoking undefined
> > behavior and a conforming implementation can output anything.  But
> > consider a 32-bit bare metal target where the linker can locate a "char
> > x[512]" at [0x7f00, 0x8100).  The standard then requires [512]
> >> [0], but if we do a signed comparison here we'll end up "[512] <
> > [0]", this is non-conforming.
>
> So, are both the above examples showing that pointer based comparisons are 
> similar as unsigned integer comparison?  -:)
> Do we have examples on treating the pointer arithmetic as signed integer 
> arithmetic? (Really curious on this….)
>
> But anyway, if we cannot treat pointer type consistently as signed or 
> unsigned, shall we still need to catch pointer overflow?
>
> Currently, In GCC, we have -fsanitize=signed-integer-overflow to catch signed 
> integer overflow.
> But we don’t have options to catch unsigned integer overflow and pointer 
> overflow.
>
> Shall we add two more options to catch unsigned integer overflow and pointer 
> overflow, like:
>
> -fsanitize=unsigned-integer-overflow
> -fsanitize=pointer-overflow
>
> CLANG already provided -fsanitize=unsigned-integer-overflow. GCC might need 
> to do the same.

NO. There is no such thing as unsigned integer overflow. That option
is badly designed, and the GCC community has already rejected adding
that sanitizer a few times. It is bad form to have a sanitizer for
well-defined code.
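
To make the "well-defined" point concrete, here is a minimal C sketch. The
helper names (wrap_add, wrap_sub, check_unsigned_wrap) are made up for this
note and do not come from GCC. It only relies on the C rule that unsigned
arithmetic is reduced modulo one more than the type's maximum value.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1 by definition, so
   neither helper can invoke undefined behavior.  */
unsigned wrap_add (unsigned a, unsigned b) { return a + b; }
unsigned wrap_sub (unsigned a, unsigned b) { return a - b; }

/* Hypothetical self-check: both results are required by the language.  */
void check_unsigned_wrap (void)
{
  assert (wrap_add (UINT_MAX, 1u) == 0u);
  assert (wrap_sub (0u, 1u) == UINT_MAX);
}
```

Calling check_unsigned_wrap () from main should pass on any conforming
implementation, which is why a sanitizer has nothing erroneous to report here.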

Now, -fsanitize=pointer-overflow is already there for GCC; it was
added in r8-2238-gc9b39a4955f56fe609ef5478. LLVM/clang added it in
the same timeframe too.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80998

Thanks,
Andrew

>
> And both Clang and GCC might also need to add -fsanitize=pointer-overflow?
>
> > IIUC, pointers are not integers, at all.  If we treat them as integers
> > in the brain we'll end up invoking undefined behavior sooner or later.
> > Thus the wrapping/overflowing behavior of pointer is controlled by a
> > different option than integers.
>
> However, the wrapping/overflowing behavior of pointers is still based on the 
> corresponding integer(or unsigned integer) wrapping/overflowing, right?
> Do we have special pointer wrapping/overflowing?
>
> Qing
>
> >
> > --
> > Xi Ruoyao 
> > School of Aerospace Science and Technology, Xidian University
>


[PATCH] MATCH: Fix `(1 >> X) != 0` pattern for vector types

2023-09-14 Thread Andrew Pinski via Gcc-patches
I had missed that integer_onep can match vector types with a uniform constant
of `1`. This means the shifter could be a scalar type, and then comparing it
against `0` would be an invalid transformation.
This fixes the problem by adding a check that the type of the integer_onep
operand is an INTEGRAL_TYPE_P (which does not match a vector type).
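
For reference, the scalar case the pattern is meant for can be sketched in
plain C. This is only an illustration under the assumption of a scalar int
operand; the function names are hypothetical and the vector case from the PR
is not modelled.

```c
#include <assert.h>
#include <limits.h>

/* `(1 >> X) != 0` and the form it is simplified to, `X == 0`.  */
int shift_form (int x) { return (1 >> x) != 0; }
int simplified_form (int x) { return x == 0; }

/* Count disagreements over every valid shift count of an int.  */
int count_shift_mismatches (void)
{
  int width = (int) (sizeof (int) * CHAR_BIT);
  int bad = 0;
  for (int x = 0; x < width; x++)
    if (shift_form (x) != simplified_form (x))
      bad++;
  return bad;
}

/* Hypothetical self-check for the scalar equivalence.  */
void check_shift_pattern (void)
{
  assert (count_shift_mismatches () == 0);
}
```

With a vector `1`, the shift result is itself a vector, so replacing the
comparison with one on a scalar shift count no longer has a matching type.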

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/111414

gcc/ChangeLog:

* match.pd (`(1 >> X) != 0`): Check to see if
the integer_onep was an integral type (not a vector type).

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr111414-1.c: New test.
---
 gcc/match.pd |  5 +++--
 gcc/testsuite/gcc.c-torture/compile/pr111414-1.c | 13 +
 2 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr111414-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 07ffd831132..97db0eb5f25 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4206,8 +4206,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  /* `(1 >> X) != 0` -> `X == 0` */
  /* `(1 >> X) == 0` -> `X != 0` */
  (simplify
-  (cmp (rshift integer_onep @0) integer_zerop)
-   (icmp @0 { build_zero_cst (TREE_TYPE (@0)); })))
+  (cmp (rshift integer_onep@1 @0) integer_zerop)
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@1)))
+(icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))))
 
 /* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1)
(CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr111414-1.c 
b/gcc/testsuite/gcc.c-torture/compile/pr111414-1.c
new file mode 100644
index 000..13fbdae7230
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr111414-1.c
@@ -0,0 +1,13 @@
+int a, b, c, d, e, f, g;
+int h(int i) { return b >= 2 ?: i >> b; }
+void j() {
+  int k;
+  int *l = 
+  for (; d; d++) {
+g = h(0 != j);
+f = g >> a;
+k = f << 7;
+e = k > 5 ? k : 0;
+*l ^= e;
+  }
+}
-- 
2.31.1



[PATCH] MATCH: Support `(a != (CST+1)) & (a > CST)` optimizations

2023-09-13 Thread Andrew Pinski via Gcc-patches
Even though this is done via reassociation, match can support
these with a simple change to detect that the difference is just
one. This allows them to be optimized earlier, for example during
phiopt.

This patch adds the following cases:
(a != (CST+1)) & (a > CST) -> a > (CST+1)
(a != (CST-1)) & (a < CST) -> a < (CST-1)
(a == (CST-1)) | (a >= CST) -> a >= (CST-1)
(a == (CST+1)) | (a <= CST) -> a <= (CST+1)

Canonicalization of comparisons causes this case to show up more.
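
The four cases listed above can be sanity-checked with a small brute-force
loop. This is a sketch only: the function names and the tested ranges are
arbitrary choices for this note, not part of the patch or its testsuite.

```c
#include <assert.h>

/* Count (a, CST) pairs where a combined comparison disagrees with the
   single comparison it is supposed to simplify to.  */
int count_off_by_one_mismatches (void)
{
  int bad = 0;
  for (int c = -10; c <= 10; c++)
    for (int a = -20; a <= 20; a++)
      {
        bad += ((a != c + 1) & (a > c))  != (a > c + 1);
        bad += ((a != c - 1) & (a < c))  != (a < c - 1);
        bad += ((a == c - 1) | (a >= c)) != (a >= c - 1);
        bad += ((a == c + 1) | (a <= c)) != (a <= c + 1);
      }
  return bad;
}

/* Hypothetical self-check: no pair should disagree.  */
void check_off_by_one (void)
{
  assert (count_off_by_one_mismatches () == 0);
}
```

The ranges stay far away from INT_MIN and INT_MAX, so `c + 1` and `c - 1`
cannot overflow in the check itself.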

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/106164

gcc/ChangeLog:

* match.pd (`(X CMP1 CST1) AND/IOR (X CMP2 CST2)`):
Expand to support constants that are off by one.

gcc/testsuite/ChangeLog:

* gcc.dg/pr21643.c: Update test now that match does
the combining of the comparisons.
* gcc.dg/tree-ssa/cmpbit-5.c: New test.
* gcc.dg/tree-ssa/phi-opt-35.c: New test.
---
 gcc/match.pd   | 44 ++-
 gcc/testsuite/gcc.dg/pr21643.c |  6 ++-
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-5.c   | 51 ++
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-35.c | 13 ++
 4 files changed, 111 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-5.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-35.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7ecf5568599..07ffd831132 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2970,10 +2970,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& operand_equal_p (@1, @2)))
 (with
  {
+  bool one_before = false;
+  bool one_after = false;
   int cmp = 0;
   if (TREE_CODE (@1) == INTEGER_CST
  && TREE_CODE (@2) == INTEGER_CST)
-   cmp = tree_int_cst_compare (@1, @2);
+   {
+ cmp = tree_int_cst_compare (@1, @2);
+ if (cmp < 0
+ && wi::to_wide (@1) == wi::to_wide (@2) - 1)
+   one_before = true;
+ if (cmp > 0
+ && wi::to_wide (@1) == wi::to_wide (@2) + 1)
+   one_after = true;
+   }
   bool val;
   switch (code2)
 {
@@ -2998,6 +3008,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& code2 == LE_EXPR
   && cmp == 0)
(lt @0 @1))
+  /* (a != (b+1)) & (a > b) -> a > (b+1) */
+  (if (code1 == NE_EXPR
+   && code2 == GT_EXPR
+  && one_after)
+   (gt @0 @1))
+  /* (a != (b-1)) & (a < b) -> a < (b-1) */
+  (if (code1 == NE_EXPR
+   && code2 == LT_EXPR
+  && one_before)
+   (lt @0 @1))
  )
 )
)
@@ -3069,10 +3089,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& operand_equal_p (@1, @2)))
 (with
  {
+  bool one_before = false;
+  bool one_after = false;
   int cmp = 0;
   if (TREE_CODE (@1) == INTEGER_CST
  && TREE_CODE (@2) == INTEGER_CST)
-   cmp = tree_int_cst_compare (@1, @2);
+   {
+ cmp = tree_int_cst_compare (@1, @2);
+ if (cmp < 0
+ && wi::to_wide (@1) == wi::to_wide (@2) - 1)
+   one_before = true;
+ if (cmp > 0
+ && wi::to_wide (@1) == wi::to_wide (@2) + 1)
+   one_after = true;
+   }
   bool val;
   switch (code2)
{
@@ -3097,6 +3127,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& code2 == LT_EXPR
   && cmp == 0)
(le @0 @1))
+  /* (a == (b-1)) | (a >= b) -> a >= (b-1) */
+  (if (code1 == EQ_EXPR
+   && code2 == GE_EXPR
+  && one_before)
+   (ge @0 @1))
+  /* (a == (b+1)) | (a <= b) -> a <= (b+1) */
+  (if (code1 == EQ_EXPR
+   && code2 == LE_EXPR
+  && one_after)
+   (le @0 @1))
  )
 )
)
diff --git a/gcc/testsuite/gcc.dg/pr21643.c b/gcc/testsuite/gcc.dg/pr21643.c
index 4e7f93d351a..42517b5af1e 100644
--- a/gcc/testsuite/gcc.dg/pr21643.c
+++ b/gcc/testsuite/gcc.dg/pr21643.c
@@ -86,4 +86,8 @@ f9 (unsigned char c)
   return 1;
 }
 
-/* { dg-final { scan-tree-dump-times "Optimizing range tests c_\[0-9\]*.D. 
-.0, 31. and -.32, 32.\[\n\r\]* into" 6 "reassoc1" } }  */
+/* Note with match being able to simplify this, optimizing range tests is no 
longer needed here. */
+/* Equivalence: _7 | _2 -> c_5(D) <= 32 */
+/* old test: dg-final  scan-tree-dump-times "Optimizing range tests 
c_\[0-9\]*.D. -.0, 31. and -.32, 32.\[\n\r\]* into" 6 "reassoc1"   */
+/* { dg-final { scan-tree-dump-times "Equivalence: _\[0-9\]+ \\\| _\[0-9\]+ -> 
c_\[0-9\]+.D. <= 32" 5 "reassoc1" } }  */
+/* { dg-final { scan-tree-dump-times "Equivalence: _\[0-9\]+ \& _\[0-9\]+ -> 
c_\[0-9\]+.D. > 32" 1 "reassoc1" } }  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-5.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-5.c
new file mode 100644
index 000..d81a129825b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-5.c
@@ -0,0 +1,51 @@
+/* PR tree-optimization/106164 */
+/* { dg-do compile } */
+/* { 

[PATCH] Improve error message for if with an else part while in switch

2023-09-13 Thread Andrew Pinski via Gcc-patches
While writing some match.pd code, I was trying to figure
out why I was getting an `expected ), got (` error message
while writing an if statement with an else clause. For switch
statements, the if statements cannot have an else clause so
it would be better to have a decent error message saying that
explicitly.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* genmatch.cc (parser::parse_result): For an else clause
of an if statement inside a switch, error out explicitly.
---
 gcc/genmatch.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index a1925a747a7..03d325efdf6 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -4891,6 +4891,8 @@ parser::parse_result (operand *result, predicate_id 
*matcher)
ife->trueexpr = parse_result (result, matcher);
  else
ife->trueexpr = parse_op ();
+ if (peek ()->type == CPP_OPEN_PAREN)
+   fatal_at (peek(), "if inside switch cannot have an else");
  eat_token (CPP_CLOSE_PAREN);
}
  else
-- 
2.31.1



Re: [PATCH 1/2] MATCH: [PR111364] Add some more minmax cmp operand simplifications

2023-09-13 Thread Andrew Pinski via Gcc-patches
On Tue, Sep 12, 2023 at 11:45 PM Richard Biener via Gcc-patches
 wrote:
>
> On Tue, Sep 12, 2023 at 5:31 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > This adds a few more minmax cmp operand simplifications which were missed 
> > before.
> > `MIN(a,b) < a` -> `a > b`
> > `MIN(a,b) >= a` -> `a <= b`
> > `MAX(a,b) > a` -> `a < b`
> > `MAX(a,b) <= a` -> `a >= b`
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu.
>
> OK.  I wonder if any of these are also valid for FP types?

I was thinking about that too. I will look into that later this week.

Thanks,
Andrew

>
> > Note gcc.dg/pr96708-negative.c needed to be updated to remove the
> > check for MIN/MAX as they have been optimized (correctly) away.
> >
> > PR tree-optimization/111364
> >
> > gcc/ChangeLog:
> >
> > * match.pd (`MIN (X, Y) == X`): Extend
> > to min/lt, min/ge, max/gt, max/le.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.c-torture/execute/minmaxcmp-1.c: New test.
> > * gcc.dg/tree-ssa/minmaxcmp-2.c: New test.
> > * gcc.dg/pr96708-negative.c: Update testcase.
> > * gcc.dg/pr96708-positive.c: Add comment about `return 0`.
> > ---
> >  gcc/match.pd  |  8 +--
> >  .../gcc.c-torture/execute/minmaxcmp-1.c   | 51 +++
> >  gcc/testsuite/gcc.dg/pr96708-negative.c   |  4 +-
> >  gcc/testsuite/gcc.dg/pr96708-positive.c   |  1 +
> >  gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c   | 30 +++
> >  5 files changed, 89 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 51985c1bad4..36e3da4841b 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3902,9 +3902,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >(maxmin @0 (bit_not @1
> >
> >  /* MIN (X, Y) == X -> X <= Y  */
> > -(for minmax (min min max max)
> > - cmp(eq  ne  eq  ne )
> > - out(le  gt  ge  lt )
> > +/* MIN (X, Y) < X -> X > Y  */
> > +/* MIN (X, Y) >= X -> X <= Y  */
> > +(for minmax (min min min min max max max max)
> > + cmp(eq  ne  lt  ge  eq  ne  gt  le )
> > + out(le  gt  gt  le  ge  lt  lt  ge )
> >   (simplify
> >(cmp:c (minmax:c @0 @1) @0)
> >(if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> > diff --git a/gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c 
> > b/gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c
> > new file mode 100644
> > index 000..6705a053768
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c
> > @@ -0,0 +1,51 @@
> > +#define func(vol, op1, op2)\
> > +_Bool op1##_##op2##_##vol (int a, int b)   \
> > +{  \
> > + vol int x = op_##op1(a, b);   \
> > + return op_##op2(x, a);\
> > +}
> > +
> > +#define op_lt(a, b) ((a) < (b))
> > +#define op_le(a, b) ((a) <= (b))
> > +#define op_eq(a, b) ((a) == (b))
> > +#define op_ne(a, b) ((a) != (b))
> > +#define op_gt(a, b) ((a) > (b))
> > +#define op_ge(a, b) ((a) >= (b))
> > +#define op_min(a, b) ((a) < (b) ? (a) : (b))
> > +#define op_max(a, b) ((a) > (b) ? (a) : (b))
> > +
> > +
> > +#define funcs(a) \
> > + a(min,lt) \
> > + a(max,lt) \
> > + a(min,gt) \
> > + a(max,gt) \
> > + a(min,le) \
> > + a(max,le) \
> > + a(min,ge) \
> > + a(max,ge) \
> > + a(min,ne) \
> > + a(max,ne) \
> > + a(min,eq) \
> > + a(max,eq)
> > +
> > +#define funcs1(a,b) \
> > +func(,a,b) \
> > +func(volatile,a,b)
> > +
> > +funcs(funcs1)
> > +
> > +#define test(op1,op2)   \
> > +do {\
> > +  if (op1##_##op2##_(x,y) != op1##_##op2##_volatile(x,y))   \
> > +__builtin_abort();  \
> > +} while(0);
> > +
> > +int main()
> > +{
> > +  for(int x = -10; x < 10; x++)
> > +for(int y = -10; y < 10; y++)
> > +{
> > +funcs(test)
> > +}
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/pr96708-negative.c 
> > b/gcc/testsuite/gcc.dg/pr96708-negative.c
> > index 91964d3b971..c9c1aa85558 100644
> > --- a/gcc/testsuite/gcc.dg/pr9

[PATCH] MATCH: Simplify `(X % Y) < Y` pattern.

2023-09-12 Thread Andrew Pinski via Gcc-patches
This merges the two patterns to catch
`(X % Y) < Y` and `Y > (X % Y)` into one by
using :c on the comparison operator.
It does not change any code generation nor
anything else. It is more to allow for better
maintainability of this pattern.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd (`Y > (X % Y)`): Merge
into ...
(`(X % Y) < Y`): Pattern by adding `:c`
on the comparison.
---
 gcc/match.pd | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 39c7ea1088f..24fd29863fb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1483,14 +1483,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* X % Y is smaller than Y.  */
 (for cmp (lt ge)
  (simplify
-  (cmp (trunc_mod @0 @1) @1)
+  (cmp:c (trunc_mod @0 @1) @1)
   (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
{ constant_boolean_node (cmp == LT_EXPR, type); })))
-(for cmp (gt le)
- (simplify
-  (cmp @1 (trunc_mod @0 @1))
-  (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
-   { constant_boolean_node (cmp == GT_EXPR, type); })))
 
 /* x | ~0 -> ~0  */
 (simplify
-- 
2.31.1



[PATCH 2/2] MATCH: Move `X <= MAX(X, Y)` before `MIN (X, C1) < C2` pattern

2023-09-12 Thread Andrew Pinski via Gcc-patches
Move the pattern because matching C1 as C2 here will decrease how many other
simplifications need to happen to get the final answer.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd (`X <= MAX(X, Y)`):
Move before `MIN (X, C1) < C2` pattern.
---
 gcc/match.pd | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 36e3da4841b..34b67df784e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3931,13 +3931,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (wi::lt_p (wi::to_wide (@1), wi::to_wide (@2),
  TYPE_SIGN (TREE_TYPE (@0
 (cmp @0 @2)
-/* MIN (X, C1) < C2 -> X < C2 || C1 < C2  */
-(for minmax (min min max max min min max max)
- cmp(lt  le  gt  ge  gt  ge  lt  le )
- comb   (bit_ior bit_ior bit_ior bit_ior bit_and bit_and bit_and bit_and)
- (simplify
-  (cmp (minmax @0 INTEGER_CST@1) INTEGER_CST@2)
-  (comb (cmp @0 @2) (cmp @1 @2))))
 
 /* X <= MAX(X, Y) -> true
X > MAX(X, Y) -> false 
@@ -3949,6 +3942,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (cmp:c @0 (minmax:c @0 @1))
   { constant_boolean_node (cmp == GE_EXPR || cmp == LE_EXPR, type); } ))
 
+/* MIN (X, C1) < C2 -> X < C2 || C1 < C2  */
+(for minmax (min min max max min min max max)
+ cmp(lt  le  gt  ge  gt  ge  lt  le )
+ comb   (bit_ior bit_ior bit_ior bit_ior bit_and bit_and bit_and bit_and)
+ (simplify
+  (cmp (minmax @0 INTEGER_CST@1) INTEGER_CST@2)
+  (comb (cmp @0 @2) (cmp @1 @2))))
+
 /* Undo fancy ways of writing max/min or other ?: expressions, like
a - ((a - b) & -(a < b))  and  a - (a - b) * (a < b) into (a < b) ? b : a.
People normally use ?: and that is what we actually try to optimize.  */
-- 
2.31.1



[PATCH 1/2] MATCH: [PR111364] Add some more minmax cmp operand simplifications

2023-09-12 Thread Andrew Pinski via Gcc-patches
This adds a few more minmax cmp operand simplifications which were missed before.
`MIN(a,b) < a` -> `a > b`
`MIN(a,b) >= a` -> `a <= b`
`MAX(a,b) > a` -> `a < b`
`MAX(a,b) <= a` -> `a >= b`
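
The four identities above can be checked exhaustively over a small range.
This is only a sketch for integer operands: the MIN/MAX macros and function
names are local to this note, and the range mirrors the one used by the
executable test in the patch.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Count (a, b) pairs where a min/max comparison against `a` disagrees
   with the plain comparison it is simplified to.  */
int count_minmax_cmp_mismatches (void)
{
  int bad = 0;
  for (int a = -10; a < 10; a++)
    for (int b = -10; b < 10; b++)
      {
        bad += (MIN (a, b) <  a) != (a >  b);
        bad += (MIN (a, b) >= a) != (a <= b);
        bad += (MAX (a, b) >  a) != (a <  b);
        bad += (MAX (a, b) <= a) != (a >= b);
      }
  return bad;
}

/* Hypothetical self-check: all four identities hold for ints.  */
void check_minmax_cmp (void)
{
  assert (count_minmax_cmp_mismatches () == 0);
}
```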

OK? Bootstrapped and tested on x86_64-linux-gnu.

Note gcc.dg/pr96708-negative.c needed to be updated to remove the
check for MIN/MAX as they have been optimized (correctly) away.

PR tree-optimization/111364

gcc/ChangeLog:

* match.pd (`MIN (X, Y) == X`): Extend
to min/lt, min/ge, max/gt, max/le.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/minmaxcmp-1.c: New test.
* gcc.dg/tree-ssa/minmaxcmp-2.c: New test.
* gcc.dg/pr96708-negative.c: Update testcase.
* gcc.dg/pr96708-positive.c: Add comment about `return 0`.
---
 gcc/match.pd  |  8 +--
 .../gcc.c-torture/execute/minmaxcmp-1.c   | 51 +++
 gcc/testsuite/gcc.dg/pr96708-negative.c   |  4 +-
 gcc/testsuite/gcc.dg/pr96708-positive.c   |  1 +
 gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c   | 30 +++
 5 files changed, 89 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 51985c1bad4..36e3da4841b 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3902,9 +3902,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (maxmin @0 (bit_not @1
 
 /* MIN (X, Y) == X -> X <= Y  */
-(for minmax (min min max max)
- cmp(eq  ne  eq  ne )
- out(le  gt  ge  lt )
+/* MIN (X, Y) < X -> X > Y  */
+/* MIN (X, Y) >= X -> X <= Y  */
+(for minmax (min min min min max max max max)
+ cmp(eq  ne  lt  ge  eq  ne  gt  le )
+ out(le  gt  gt  le  ge  lt  lt  ge )
  (simplify
   (cmp:c (minmax:c @0 @1) @0)
   (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)))
diff --git a/gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c 
b/gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c
new file mode 100644
index 000..6705a053768
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/minmaxcmp-1.c
@@ -0,0 +1,51 @@
+#define func(vol, op1, op2)\
+_Bool op1##_##op2##_##vol (int a, int b)   \
+{  \
+ vol int x = op_##op1(a, b);   \
+ return op_##op2(x, a);\
+}
+
+#define op_lt(a, b) ((a) < (b))
+#define op_le(a, b) ((a) <= (b))
+#define op_eq(a, b) ((a) == (b))
+#define op_ne(a, b) ((a) != (b))
+#define op_gt(a, b) ((a) > (b))
+#define op_ge(a, b) ((a) >= (b))
+#define op_min(a, b) ((a) < (b) ? (a) : (b))
+#define op_max(a, b) ((a) > (b) ? (a) : (b))
+
+
+#define funcs(a) \
+ a(min,lt) \
+ a(max,lt) \
+ a(min,gt) \
+ a(max,gt) \
+ a(min,le) \
+ a(max,le) \
+ a(min,ge) \
+ a(max,ge) \
+ a(min,ne) \
+ a(max,ne) \
+ a(min,eq) \
+ a(max,eq)
+
+#define funcs1(a,b) \
+func(,a,b) \
+func(volatile,a,b)
+
+funcs(funcs1)
+
+#define test(op1,op2)   \
+do {\
+  if (op1##_##op2##_(x,y) != op1##_##op2##_volatile(x,y))   \
+__builtin_abort();  \
+} while(0);
+
+int main()
+{
+  for(int x = -10; x < 10; x++)
+for(int y = -10; y < 10; y++)
+{
+funcs(test)
+}
+}
diff --git a/gcc/testsuite/gcc.dg/pr96708-negative.c 
b/gcc/testsuite/gcc.dg/pr96708-negative.c
index 91964d3b971..c9c1aa85558 100644
--- a/gcc/testsuite/gcc.dg/pr96708-negative.c
+++ b/gcc/testsuite/gcc.dg/pr96708-negative.c
@@ -42,7 +42,7 @@ int main()
 return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
+/* Even though test[1-4] originally has MIN/MAX, those can be optimized away
+   into just comparing a and b arguments. */
 /* { dg-final { scan-tree-dump-times "return 0;" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-not { "return 1;" } "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/pr96708-positive.c 
b/gcc/testsuite/gcc.dg/pr96708-positive.c
index 65af85344b6..12c5fedfd30 100644
--- a/gcc/testsuite/gcc.dg/pr96708-positive.c
+++ b/gcc/testsuite/gcc.dg/pr96708-positive.c
@@ -42,6 +42,7 @@ int main()
 return 0;
 }
 
+/* Note main has one `return 0`. */
 /* { dg-final { scan-tree-dump-times "return 0;" 3 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" } } */
 /* { dg-final { scan-tree-dump-not { "MAX_EXPR" } "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c
new file mode 100644
index 000..f64a9253cfb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-2.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/111364 */
+
+#define min1(a, b) ((a) < (b) ? (a) : (b))
+#define max1(a, b) ((a) > (b) ? (a) : (b))
+
+int minlt(int a, int b)
+{
+return min1(a, b) < a; // b < a or a > b

[PATCH] MATCH: Simplify (a CMP1 b) ^ (a CMP2 b)

2023-09-11 Thread Andrew Pinski via Gcc-patches
This adds the missing optimizations here.
Note we don't need to match where CMP1 and CMP2 are complements of each
other as that is already handled elsewhere.

I added a new executable testcase to make sure we optimize it correctly,
as I had originally messed up one of the entries for the resulting
comparison; the test makes sure they are 100% correct.
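
The result tables in the pattern (the rcmp rows in the match.pd hunk) can be
spot-checked with a brute-force sketch like the one below. The function names
are made up for this note, and only integer operands are covered.

```c
#include <assert.h>

/* (a CMP1 b) ^ (a CMP2 b), one line per cmp1/cmp2/rcmp column.  */
int count_xor_mismatches (void)
{
  int bad = 0;
  for (int a = -10; a < 10; a++)
    for (int b = -10; b < 10; b++)
      {
        bad += ((a <  b) ^ (a >  b)) != (a != b);
        bad += ((a <  b) ^ (a == b)) != (a <= b);
        bad += ((a <  b) ^ (a != b)) != (a >  b);
        bad += ((a <= b) ^ (a >= b)) != (a != b);
        bad += ((a <= b) ^ (a == b)) != (a <  b);
        bad += ((a <= b) ^ (a != b)) != (a >= b);
      }
  return bad;
}

/* (a CMP1 b) == (a CMP2 b): each result is the inverse of the row above.  */
int count_eq_mismatches (void)
{
  int bad = 0;
  for (int a = -10; a < 10; a++)
    for (int b = -10; b < 10; b++)
      {
        bad += ((a <  b) == (a >  b)) != (a == b);
        bad += ((a <  b) == (a == b)) != (a >  b);
        bad += ((a <  b) == (a != b)) != (a <= b);
        bad += ((a <= b) == (a >= b)) != (a == b);
        bad += ((a <= b) == (a == b)) != (a >= b);
        bad += ((a <= b) == (a != b)) != (a <  b);
      }
  return bad;
}

/* Hypothetical self-check covering both tables.  */
void check_cmp_tables (void)
{
  assert (count_xor_mismatches () == 0);
  assert (count_eq_mismatches () == 0);
}
```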

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/107881

gcc/ChangeLog:

* match.pd (`(a CMP1 b) ^ (a CMP2 b)`): New pattern.
(`(a CMP1 b) == (a CMP2 b)`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr107881-1.c: New test.
* gcc.dg/tree-ssa/cmpeq-4.c: New test.
* gcc.dg/tree-ssa/cmpxor-1.c: New test.
---
 gcc/match.pd  |  20 +++
 .../gcc.c-torture/execute/pr107881-1.c| 115 ++
 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-4.c   |  51 
 gcc/testsuite/gcc.dg/tree-ssa/cmpxor-1.c  |  51 
 4 files changed, 237 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr107881-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-4.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpxor-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e96e385c6fa..39c7ea1088f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3154,6 +3154,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   { constant_boolean_node (true, type); })
  ))
 
+/* Optimize (a CMP b) ^ (a CMP b)  */
+/* Optimize (a CMP b) != (a CMP b)  */
+(for op (bit_xor ne)
+ (for cmp1 (lt lt lt le le le)
+  cmp2 (gt eq ne ge eq ne)
+  rcmp (ne le gt ne lt ge)
+  (simplify
+   (op:c (cmp1:c @0 @1) (cmp2:c @0 @1))
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))
+(rcmp @0 @1)))))
+
+/* Optimize (a CMP b) == (a CMP b)  */
+(for cmp1 (lt lt lt le le le)
+ cmp2 (gt eq ne ge eq ne)
+ rcmp (eq gt le eq ge lt)
+ (simplify
+  (eq:c (cmp1:c @0 @1) (cmp2:c @0 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))
+(rcmp @0 @1))))
+
 /* We can't reassociate at all for saturating types.  */
 (if (!TYPE_SATURATING (type))
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr107881-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr107881-1.c
new file mode 100644
index 000..063ec4c2797
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr107881-1.c
@@ -0,0 +1,115 @@
+#define func(vol, op1, op2, op3)   \
+_Bool op1##_##op2##_##op3##_##vol (int a, int b)   \
+{  \
+ vol _Bool x = op_##op1(a, b); \
+ vol _Bool y = op_##op2(a, b); \
+ return op_##op3(x, y);\
+}
+
+#define op_lt(a, b) ((a) < (b))
+#define op_le(a, b) ((a) <= (b))
+#define op_eq(a, b) ((a) == (b))
+#define op_ne(a, b) ((a) != (b))
+#define op_gt(a, b) ((a) > (b))
+#define op_ge(a, b) ((a) >= (b))
+#define op_xor(a, b) ((a) ^ (b))
+
+
+#define funcs(a) \
+ a(lt,lt,ne) \
+ a(lt,lt,eq) \
+ a(lt,lt,xor) \
+ a(lt,le,ne) \
+ a(lt,le,eq) \
+ a(lt,le,xor) \
+ a(lt,gt,ne) \
+ a(lt,gt,eq) \
+ a(lt,gt,xor) \
+ a(lt,ge,ne) \
+ a(lt,ge,eq) \
+ a(lt,ge,xor) \
+ a(lt,eq,ne) \
+ a(lt,eq,eq) \
+ a(lt,eq,xor) \
+ a(lt,ne,ne) \
+ a(lt,ne,eq) \
+ a(lt,ne,xor) \
+  \
+ a(le,lt,ne) \
+ a(le,lt,eq) \
+ a(le,lt,xor) \
+ a(le,le,ne) \
+ a(le,le,eq) \
+ a(le,le,xor) \
+ a(le,gt,ne) \
+ a(le,gt,eq) \
+ a(le,gt,xor) \
+ a(le,ge,ne) \
+ a(le,ge,eq) \
+ a(le,ge,xor) \
+ a(le,eq,ne) \
+ a(le,eq,eq) \
+ a(le,eq,xor) \
+ a(le,ne,ne) \
+ a(le,ne,eq) \
+ a(le,ne,xor)  \
+ \
+ a(gt,lt,ne) \
+ a(gt,lt,eq) \
+ a(gt,lt,xor) \
+ a(gt,le,ne) \
+ a(gt,le,eq) \
+ a(gt,le,xor) \
+ a(gt,gt,ne) \
+ a(gt,gt,eq) \
+ a(gt,gt,xor) \
+ a(gt,ge,ne) \
+ a(gt,ge,eq) \
+ a(gt,ge,xor) \
+ a(gt,eq,ne) \
+ a(gt,eq,eq) \
+ a(gt,eq,xor) \
+ a(gt,ne,ne) \
+ a(gt,ne,eq) \
+ a(gt,ne,xor) \
+  \
+ a(ge,lt,ne) \
+ a(ge,lt,eq) \
+ a(ge,lt,xor) \
+ a(ge,le,ne) \
+ a(ge,le,eq) \
+ a(ge,le,xor) \
+ a(ge,gt,ne) \
+ a(ge,gt,eq) \
+ a(ge,gt,xor) \
+ a(ge,ge,ne) \
+ a(ge,ge,eq) \
+ a(ge,ge,xor) \
+ a(ge,eq,ne) \
+ a(ge,eq,eq) \
+ a(ge,eq,xor) \
+ a(ge,ne,ne) \
+ a(ge,ne,eq) \
+ a(ge,ne,xor)
+
+#define funcs1(a,b,c) \
+func(,a,b,c) \
+func(volatile,a,b,c)
+
+funcs(funcs1)
+
+#define test(op1,op2,op3)  \
+do {   \
+  if (op1##_##op2##_##op3##_(x,y)  \
+  != op1##_##op2##_##op3##_volatile(x,y))  \
+__builtin_abort(); \
+} while(0);
+
+int main()
+{
+  for(int x = -10; x < 10; x++)
+for(int y = -10; y < 10; y++)
+{
+funcs(test)
+}
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpeq-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cmpeq-4.c
new file mode 100644
index 000..868d80fdcca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpeq-4.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -fdump-tree-original" } */
+/* PR 

[PATCH] MATCH: [PR111348] add missing :c to cmp in the `(a CMP b) ? minmax : minmax` pattern

2023-09-11 Thread Andrew Pinski via Gcc-patches
When I added this pattern in r14-337-gc43819a9b4cd, I had missed the :c on the
cmp part of the pattern, meaning there might be some missed optimizations.
The testcase shows an example of the missed optimization.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/111348

gcc/ChangeLog:

* match.pd (`(a CMP b) ? minmax : minmax`): Add :c on
the cmp part of the pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/minmax-26.c: New test.
---
 gcc/match.pd  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/minmax-26.c | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-26.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 209b0599382..e96e385c6fa 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5417,7 +5417,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for minmax (min max)
  (for cmp (lt le gt ge ne)
   (simplify
-   (cond (cmp @1 @3) (minmax:c @1 @4) (minmax:c @2 @4))
+   (cond (cmp:c @1 @3) (minmax:c @1 @4) (minmax:c @2 @4))
(with
 {
   tree_code code = minmax_from_comparison (cmp, @1, @2, @1, @3);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-26.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-26.c
new file mode 100644
index 000..e4b7412e766
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-26.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -fdump-tree-original" } */
+/* PR tree-optimization/111348 */
+
+int test1(int a, int b, int c)
+{
+return (a > b) ? ((a > c) ? a : c) : ((b > c) ? b : c);
+}
+
+
+int test1_(int a, int b, int c)
+{
+return (b < a) ? ((a > c) ? a : c) : ((b > c) ? b : c);
+}
+
+/* test1 and test1_ should be able to optimize to `MAX_EXPR , 
c>;` during fold.  */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR , c>" 2 
"original" } } */
+/* { dg-final { scan-tree-dump-not "b > a" "original" } } */
+/* { dg-final { scan-tree-dump-not "a > b" "original" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR " 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "if " "optimized" } } */
+
-- 
2.31.1



[PATCH] MATCH: [PR111349] add missing :c to cmp in the `(a CMP CST1) ? max : a` pattern

2023-09-11 Thread Andrew Pinski via Gcc-patches
When I added this pattern in r14-1411-g17cca3c43e2f49, I had missed the :c on
the cmp part of the pattern, meaning there might be some missed optimizations.
The testcase shows an example of the missed optimization.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/111349

gcc/ChangeLog:

* match.pd (`(a CMP CST1) ? max : a`): Add :c on
the cmp part of the pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/minmax-25.c: New test.
---
 gcc/match.pd  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/minmax-25.c | 21 +
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-25.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a60fe04885e..209b0599382 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5431,7 +5431,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for cmp(gt  ge  lt  le)
  minmax (min min max max)
  (simplify
-  (cond (cmp @0 @1) (minmax:c@2 @0 @3) @4)
+  (cond (cmp:c @0 @1) (minmax:c@2 @0 @3) @4)
(with
 {
   tree_code code = minmax_from_comparison (cmp, @0, @1, @0, @4);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-25.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-25.c
new file mode 100644
index 000..b7a5bfd4c19
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-25.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original" } */
+/* PR tree-optimization/111349 */
+
+int f();
+int g();
+
+int test1(int a, int b)
+{
+return (a > b) ? ((a > b) ? a : b) : a;
+}
+
+int test1_(int a, int b)
+{
+return (b < a) ? ((a > b) ? a : b) : a;
+}
+
+/* test1 and test1_ should be able to optimize to `return a;` during fold.  */
+/* { dg-final { scan-tree-dump-times "return a;" 2 "original" } } */
+/* { dg-final { scan-tree-dump-not " MAX_EXPR " "original" } } */
+/* { dg-final { scan-tree-dump-times "return a" 2 "optimized" } } */
-- 
2.31.1



[PATCH] MATCH: [PR111346] `X CMP MINMAX` pattern missing :c on CMP

2023-09-10 Thread Andrew Pinski via Gcc-patches
I noticed this while working on other MINMAX optimizations. It was
hard to find a simplified testcase though because it was dependent on
the ssa name versions. Adding the `:c` to cmp allows the pattern to
be matched for the case where minmax is the first operand of the comparison
rather than the second.
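
To show what the two operand orders look like at the source level, here is a
small sketch for int operands. The macros and function names are assumptions
for this note; the patch's real coverage is the minmaxcmp-1.c test.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Each comparison is written with X first and with MIN/MAX first; both
   orders must fold to the same constant.  */
int count_order_mismatches (void)
{
  int bad = 0;
  for (int a = -10; a < 10; a++)
    for (int b = -10; b < 10; b++)
      {
        bad += !(a <= MAX (a, b));   /* X <= MAX (X, Y)  -> true  */
        bad += !(MAX (a, b) >= a);   /* same test, operands swapped */
        bad += (a > MAX (a, b));     /* X > MAX (X, Y)   -> false */
        bad += (MAX (a, b) < a);
        bad += !(a >= MIN (a, b));   /* X >= MIN (X, Y)  -> true  */
        bad += !(MIN (a, b) <= a);
        bad += (a < MIN (a, b));     /* X < MIN (X, Y)   -> false */
        bad += (MIN (a, b) > a);
      }
  return bad;
}

/* Hypothetical self-check: both operand orders agree everywhere.  */
void check_operand_order (void)
{
  assert (count_order_mismatches () == 0);
}
```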

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/111346

gcc/ChangeLog:

* match.pd (`X CMP MINMAX`): Add `:c` on the cmp part
of the pattern

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/minmaxcmp-1.c: New test.
---
 gcc/match.pd|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-1.c | 39 +
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index c7b6db4b543..a60fe04885e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3942,7 +3942,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for minmax (min min max max )
  cmp(ge  lt  le  gt  )
  (simplify
-  (cmp @0 (minmax:c @0 @1))
+  (cmp:c @0 (minmax:c @0 @1))
   { constant_boolean_node (cmp == GE_EXPR || cmp == LE_EXPR, type); } ))
 
 /* Undo fancy ways of writing max/min or other ?: expressions, like
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-1.c
new file mode 100644
index 000..0706c026076
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmaxcmp-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -fdump-tree-original" } */
+/* PR tree-optimization/111346 */
+
+int f();
+int g();
+
+_Bool test1(int a, int b)
+{
+return ((a > b) ? a : b) >= a; // return 1;
+}
+_Bool test1_(int a, int b)
+{
+return a <= ((a > b) ? a : b); // return 1;
+}
+/* test1 and test1_ should be able to optimize to `return 1;` during fold.  */
+/* { dg-final { scan-tree-dump-times "return 1;" 2 "original" } } */
+/* { dg-final { scan-tree-dump-not " MAX_EXPR " "original" } } */
+
+_Bool test2(int a, int b)
+{
+a = f();
+a = g();
+int t = a;
+if (t < b) t = b;
+return t >= a; // return 1;
+}
+
+_Bool test2_(int a, int b)
+{
+a = g();
+int t = a;
+if (t < b) t = b;
+return t >= a; // return 1;
+}
+
+/* All of these should be optimized to just be the function calls and `return 
1;` */
+/* { dg-final { scan-tree-dump-times "return 1;" 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " MAX_EXPR " "optimized" } } */
-- 
2.31.1



[PATCH] Fix PR 111331: wrong code for `a > 28 ? MIN : 29`

2023-09-08 Thread Andrew Pinski via Gcc-patches
The problem here is that after r6-7425-ga9fee7cdc3c62d0e51730,
the check for whether the transformation could be done was using the
wrong value. Instead of checking whether the inner value was LE (for MIN; GE for MAX)
the outer value, it was comparing the inner value to the value used in the comparison,
which was wrong.
The match patterns copied the same logic mistake when they were added in
r14-1411-g17cca3c43e2f49.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/111331
* match.pd (`(a CMP CST1) ? max : a`):
Fix the LE/GE comparison to the correct value.
* tree-ssa-phiopt.cc (minmax_replacement):
Fix the LE/GE comparison for the
`(a CMP CST1) ? max : a` optimization.

gcc/testsuite/ChangeLog:

PR tree-optimization/111331
* gcc.c-torture/execute/pr111331-1.c: New test.
* gcc.c-torture/execute/pr111331-2.c: New test.
* gcc.c-torture/execute/pr111331-3.c: New test.
---
 gcc/match.pd  |  4 ++--
 .../gcc.c-torture/execute/pr111331-1.c| 17 +
 .../gcc.c-torture/execute/pr111331-2.c| 19 +++
 .../gcc.c-torture/execute/pr111331-3.c| 15 +++
 gcc/tree-ssa-phiopt.cc|  8 
 5 files changed, 57 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr111331-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr111331-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr111331-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 8c24dae71cd..c7b6db4b543 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5438,11 +5438,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 }
 (if ((cmp == LT_EXPR || cmp == LE_EXPR)
 && code == MIN_EXPR
- && integer_nonzerop (fold_build2 (LE_EXPR, boolean_type_node, @3, 
@1)))
+ && integer_nonzerop (fold_build2 (LE_EXPR, boolean_type_node, @3, 
@4)))
  (min @2 @4)
  (if ((cmp == GT_EXPR || cmp == GE_EXPR)
  && code == MAX_EXPR
-  && integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node, @3, 
@1)))
+  && integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node, @3, 
@4)))
   (max @2 @4))
 
 #if GIMPLE
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr111331-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr111331-1.c
new file mode 100644
index 000..4c7f4fdbaa9
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr111331-1.c
@@ -0,0 +1,17 @@
+int a;
+int b;
+int c(int d, int e, int f) {
+  if (d < e)
+return e;
+  if (d > f)
+return f;
+  return d;
+}
+int main() {
+  int g = -1;
+  a = c(b + 30, 29, g + 29);
+  volatile t = a;
+  if (t != 28)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr111331-2.c 
b/gcc/testsuite/gcc.c-torture/execute/pr111331-2.c
new file mode 100644
index 000..5c677f2caa9
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr111331-2.c
@@ -0,0 +1,19 @@
+
+int a;
+int b;
+
+int main() {
+  int d = b+30;
+  {
+int t;
+if (d < 29)
+  t =  29;
+else
+  t = (d > 28) ? 28 : d;
+a = t;
+  }
+  volatile int t = a;
+  if (a != 28)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr111331-3.c 
b/gcc/testsuite/gcc.c-torture/execute/pr111331-3.c
new file mode 100644
index 000..213d9bdd539
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr111331-3.c
@@ -0,0 +1,15 @@
+int a;
+int b;
+
+int main() {
+  int d = b+30;
+  {
+int t;
+t = d < 29 ? 29 : ((d > 28) ? 28 : d);
+a = t;
+  }
+  volatile int t = a;
+  if (a != 28)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 9993bbe5b76..9b44ca9758a 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -2073,7 +2073,7 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb, basic_block alt_
 
  /* We need BOUND <= LARGER.  */
  if (!integer_nonzerop (fold_build2 (LE_EXPR, boolean_type_node,
- bound, larger)))
+ bound, arg_false)))
return false;
}
  else if (operand_equal_for_phi_arg_p (arg_false, smaller)
@@ -2104,7 +2104,7 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb, basic_block alt_
 
  /* We need BOUND >= SMALLER.  */
  if (!integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node,
- bound, smaller)))
+ bound, arg_false)))
return false;
}
  else
@@ -2144,7 +2144,7 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb, basic_block alt_
 
  /* We need BOUND >= LARGER.  */
  if 

Re: [PATCH] Support folding min(poly,poly) to const

2023-09-07 Thread Andrew Pinski via Gcc-patches
On Thu, Sep 7, 2023 at 10:25 PM Lehua Ding  wrote:
>
> Hi,
>
> This patch adds support that tries to fold `MIN (poly, poly)` to
> a constant. Consider the following C Code:

Does it make sense to handle max also?

Thanks,
Andrew


>
> ```
> void foo2 (int* restrict a, int* restrict b, int n)
> {
> for (int i = 0; i < 3; i += 1)
>   a[i] += b[i];
> }
> ```
>
> Before this patch:
>
> ```
> void foo2 (int * restrict a, int * restrict b, int n)
> {
>   vector([4,4]) int vect__7.27;
>   vector([4,4]) int vect__6.26;
>   vector([4,4]) int vect__4.23;
>   unsigned long _32;
>
>[local count: 268435456]:
>   _32 = MIN_EXPR <3, POLY_INT_CST [4, 4]>;
>   vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, _32, 0);
>   vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, _32, 0);
>   vect__7.27_9 = vect__6.26_15 + vect__4.23_20;
>   .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, _32, 0, vect__7.27_9); [tail 
> call]
>   return;
>
> }
> ```
>
> After this patch:
>
> ```
> void foo2 (int * restrict a, int * restrict b, int n)
> {
>   vector([4,4]) int vect__7.27;
>   vector([4,4]) int vect__6.26;
>   vector([4,4]) int vect__4.23;
>
>[local count: 268435456]:
>   vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, 3, 0);
>   vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, 3, 0);
>   vect__7.27_9 = vect__6.26_15 + vect__4.23_20;
>   .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, 3, 0, vect__7.27_9); [tail call]
>   return;
>
> }
> ```
>
> For RISC-V RVV, one branch instruction can be reduced:
>
> Before this patch:
>
> ```
> foo2:
> csrra4,vlenb
> srlia4,a4,2
> li  a5,3
> bleua5,a4,.L5
> mv  a5,a4
> .L5:
> vsetvli zero,a5,e32,m1,ta,ma
> ...
> ```
>
> After this patch.
>
> ```
> foo2:
> vsetivlizero,3,e32,m1,ta,ma
> ...
> ```
>
> Best,
> Lehua
>
> gcc/ChangeLog:
>
> * fold-const.cc (can_min_p): New function.
> (poly_int_binop): Try fold MIN_EXPR.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/div-1.c: Adjust.
> * gcc.target/riscv/rvv/autovec/vls/shift-3.c: Adjust.
> * gcc.target/riscv/rvv/autovec/fold-min-poly.c: New test.
>
> ---
>  gcc/fold-const.cc | 33 +++
>  .../riscv/rvv/autovec/fold-min-poly.c | 24 ++
>  .../gcc.target/riscv/rvv/autovec/vls/div-1.c  |  2 +-
>  .../riscv/rvv/autovec/vls/shift-3.c   |  2 +-
>  4 files changed, 59 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 1da498a3152..f7f793cc326 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -1213,6 +1213,34 @@ wide_int_binop (wide_int ,
>return true;
>  }
>
> +/* Returns true if we know who is smaller or equal, ARG1 or ARG2., and set 
> the
> +   min value to RES.  */
> +bool
> +can_min_p (const_tree arg1, const_tree arg2, poly_wide_int )
> +{
> +  if (tree_fits_poly_int64_p (arg1) && tree_fits_poly_int64_p (arg2))
> +{
> +  if (known_le (tree_to_poly_int64 (arg1), tree_to_poly_int64 (arg2)))
> +   res = wi::to_poly_wide (arg1);
> +  else if (known_le (tree_to_poly_int64 (arg2), tree_to_poly_int64 
> (arg1)))
> +   res = wi::to_poly_wide (arg2);
> +  else
> +   return false;
> +}
> +  else if (tree_fits_poly_uint64_p (arg1) && tree_fits_poly_uint64_p (arg2))
> +{
> +  if (known_le (tree_to_poly_uint64 (arg1), tree_to_poly_uint64 (arg2)))
> +   res = wi::to_poly_wide (arg1);
> +  else if (known_le (tree_to_poly_int64 (arg2), tree_to_poly_int64 
> (arg1)))
> +   res = wi::to_poly_wide (arg2);
> +  else
> +   return false;
> +}
> +  else
> +return false;
> +  return true;
> +}
> +
>  /* Combine two poly int's ARG1 and ARG2 under operation CODE to
> produce a new constant in RES.  Return FALSE if we don't know how
> to evaluate CODE at compile-time.  */
> @@ -1261,6 +1289,11 @@ poly_int_binop (poly_wide_int , enum tree_code 
> code,
> return false;
>break;
>
> +case MIN_EXPR:
> +  if (!can_min_p (arg1, arg2, res))
> +   return false;
> +  break;
> +
>  default:
>return false;
>  }
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
> new file mode 100644
> index 000..de4c472c76e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options " -march=rv64gcv_zvl128b -mabi=lp64d -O3 --param 
> riscv-autovec-preference=scalable --param riscv-autovec-lmul=m1 
> -fno-vect-cost-model" } */
> +
> +void foo1 (int* restrict a, int* restrict b, int n)
> +{
> +for (int i = 0; i < 4; i += 1)
> +  a[i] += b[i];
> +}
> +
> +void foo2 (int* restrict a, int* 

Re: [PATCH 17/12] _BitInt a ? ~b : b match.pd fix [PR102989]

2023-09-05 Thread Andrew Pinski via Gcc-patches
On Tue, Sep 5, 2023 at 2:51 PM Jakub Jelinek  wrote:
>
> On Tue, Sep 05, 2023 at 02:27:10PM -0700, Andrew Pinski wrote:
> > > I admit it isn't really clear to me what do you want to achieve by the
> > > above build_nonstandard_integer_type.  Is it because of BOOLEAN_TYPE
> > > or perhaps ENUMERAL_TYPE as well?
> >
> > Yes I was worried about types where the precision was set but MIN/MAX
> > of that type was not over the full precision and would not include
> > both 0 and allones in that range.
> > There is another match.pd pattern where we do a similar thing with
> > calling build_nonstandard_integer_type for a similar reason but
> > because we don't know if the type includes 0, 1, and allones in their
> > range.
>
> Ah, in that case you should use range_check_type, that is used already
> in multiple spots in match.pd for the same purpose.  It can return NULL and
> in that case one should punt on the optimization.  Otherwise, that is the
> function which ensures that the type is unsigned and max + 1 is min and min
> - 1 is max.
> And for me, I should add BITINT_TYPE handling to that function.

Hmm maybe range_check_type is the correct one here.


>
> > > If type is INTEGER_TYPE or BITINT_TYPE, one doesn't really need to create 
> > > a
> > > new type, type already is an integral type with that precision and
> > > signedness.  In other places using unsigned_type_for or signed_type_for
> > > might be better than using build_nonstandard_integer_type if that is what
> > > one wants to achieve, those functions handle BITINT_TYPE.
> >
> > Maybe here we should just use `signed_or_unsigned_type_for (type,
> > TYPE_SIGN (type));`
> > instead of build_nonstandard_integer_type.
>
> No, signed_or_unsigned_type_for (TYPE_UNSIGNED (type), type) will just return
> type.
>   if (ANY_INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type) == unsignedp)
> return type;

Oh I missed that.

Note I noticed another call to build_nonstandard_integer_type in this
match pattern which might also need to be fixed:
/* For (x << c) >> c, optimize into x & ((unsigned)-1 >> c) for
   unsigned x OR truncate into the precision(type) - c lowest bits
   of signed x (if they have mode precision or a precision of 1).  */
(simplify
 (rshift (nop_convert? (lshift @0 INTEGER_CST@1)) @@1)
 (if (wi::ltu_p (wi::to_wide (@1), element_precision (type)))
  (if (TYPE_UNSIGNED (type))
   (bit_and (convert @0) (rshift { build_minus_one_cst (type); } @1))
   (if (INTEGRAL_TYPE_P (type))
(with {
  int width = element_precision (type) - tree_to_uhwi (@1);
  tree stype = build_nonstandard_integer_type (width, 0);
 }
 (if (width == 1 || type_has_mode_precision_p (stype))
  (convert (convert:stype @0

Do we have ranges on BITINT_TYPEs? If so the two_value_replacement
pattern in match.pd has a similar issue too.
(That is where I originally copied the code that uses
build_nonstandard_integer_type from.)

Thanks,
Andrew


>
> Jakub
>


Re: [PATCH 18/12] Handle BITINT_TYPE in build_{, minus_}one_cst [PR102989]

2023-09-05 Thread Andrew Pinski via Gcc-patches
On Tue, Sep 5, 2023 at 12:31 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> Recent match.pd changes trigger ICE in build_minus_one_cst, apparently
> I forgot to handle BITINT_TYPE in these (while I've handled it in
> build_zero_cst).
>
> Will commit as obvious together with the rest of the series when the last
> patches are approved.

I assume there was a testcase that will be added when _BitInt
front-end support gets added.

Thanks,
Andrew

>
> 2023-09-05  Jakub Jelinek  
>
> PR c/102989
> * tree.cc (build_one_cst, build_minus_one_cst): Handle BITINT_TYPE
> like INTEGER_TYPE.
>
> --- gcc/tree.cc.jj  2023-09-04 09:45:33.444059843 +0200
> +++ gcc/tree.cc 2023-09-05 08:57:31.420059962 +0200
> @@ -2546,7 +2546,7 @@ build_one_cst (tree type)
>  {
>  case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE:
>  case POINTER_TYPE: case REFERENCE_TYPE:
> -case OFFSET_TYPE:
> +case OFFSET_TYPE: case BITINT_TYPE:
>return build_int_cst (type, 1);
>
>  case REAL_TYPE:
> @@ -2599,7 +2599,7 @@ build_minus_one_cst (tree type)
>  {
>  case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE:
>  case POINTER_TYPE: case REFERENCE_TYPE:
> -case OFFSET_TYPE:
> +case OFFSET_TYPE: case BITINT_TYPE:
>return build_int_cst (type, -1);
>
>  case REAL_TYPE:
>
> Jakub
>


Re: [PATCH 17/12] _BitInt a ? ~b : b match.pd fix [PR102989]

2023-09-05 Thread Andrew Pinski via Gcc-patches
On Tue, Sep 5, 2023 at 12:28 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Wed, Aug 09, 2023 at 12:19:54PM -0700, Andrew Pinski via Gcc-patches wrote:
> >   PR tree-optimization/110937
> >   PR tree-optimization/100798
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -6460,6 +6460,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >(if (cmp == NE_EXPR)
> > { constant_boolean_node (true, type); })))
> >
> > +#if GIMPLE
> > +/* a?~t:t -> (-(a))^t */
> > +(simplify
> > + (cond @0 @1 @2)
> > + (if (INTEGRAL_TYPE_P (type)
> > +  && bitwise_inverted_equal_p (@1, @2))
> > +  (with {
> > +auto prec = TYPE_PRECISION (type);
> > +auto unsign = TYPE_UNSIGNED (type);
> > +tree inttype = build_nonstandard_integer_type (prec, unsign);
> > +   }
> > +   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype 
> > @2))
> > +#endif
>
> This broke one bitint test - bitint-42.c for -O1 and -Os (in admittedly not 
> yet
> committed series).
> Using build_nonstandard_integer_type this way doesn't work well for larger
> precision BITINT_TYPEs, because it always creates an INTEGER_TYPE and
> say 467-bit INTEGER_TYPE doesn't work very well.  To get a BITINT_TYPE, one
> needs to use build_bitint_type instead (but similarly to
> build_nonstandard_integer_type one should first make sure such a type
> actually can be created).
>
> I admit it isn't really clear to me what do you want to achieve by the
> above build_nonstandard_integer_type.  Is it because of BOOLEAN_TYPE
> or perhaps ENUMERAL_TYPE as well?

Yes I was worried about types where the precision was set but MIN/MAX
of that type was not over the full precision and would not include
both 0 and allones in that range.
There is another match.pd pattern where we do a similar thing with
calling build_nonstandard_integer_type for a similar reason but
because we don't know if the type includes 0, 1, and allones in their
range.

>
> If type is INTEGER_TYPE or BITINT_TYPE, one doesn't really need to create a
> new type, type already is an integral type with that precision and
> signedness.  In other places using unsigned_type_for or signed_type_for
> might be better than using build_nonstandard_integer_type if that is what
> one wants to achieve, those functions handle BITINT_TYPE.

Maybe here we should just use `signed_or_unsigned_type_for (type,
TYPE_SIGN (type));`
instead of build_nonstandard_integer_type.

Thanks,
Andrew

>
> Or shall we instead test for == BOOLEAN_TYPE (or if ENUMERAL_TYPE for
> some reason needs the same treatment also || == ENUMERAL_TYPE)?
>
> 2023-09-05  Jakub Jelinek  
>
> PR c/102989
> * match.pd (a ? ~b : b): Don't use build_nonstandard_integer_type
> for INTEGER_TYPE or BITINT_TYPE.
>
> --- gcc/match.pd.jj 2023-09-04 09:45:33.553058301 +0200
> +++ gcc/match.pd2023-09-05 08:45:53.258078971 +0200
> @@ -6631,7 +6631,9 @@ (define_operator_list SYNC_FETCH_AND_AND
> (with {
>   auto prec = TYPE_PRECISION (type);
>   auto unsign = TYPE_UNSIGNED (type);
> - tree inttype = build_nonstandard_integer_type (prec, unsign);
> + tree inttype = type;
> + if (TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != BITINT_TYPE)
> +   inttype = build_nonstandard_integer_type (prec, unsign);
>  }
>  (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype 
> @2)))
>  #endif
>
>
> Jakub
>


Re: [PATCH] ssa_name_has_boolean_range vs signed-boolean:31 types

2023-09-05 Thread Andrew Pinski via Gcc-patches
On Tue, Sep 5, 2023 at 12:09 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 9/1/23 20:32, Andrew Pinski via Gcc-patches wrote:
> > This turns out to be a latent bug in ssa_name_has_boolean_range
> > where it would return true for all boolean types but all of the
> > uses of ssa_name_has_boolean_range were expecting 0/1 as the range
> > rather than [-1,0].
> > So when I fixed vector lower to do all comparisons in boolean_type
> > rather than still in the signed-boolean:31 type (to fix a different issue),
> > the pattern in match for `-(type)!A -> (type)A - 1.` would assume A (which
> > was signed-boolean:31) had a range of [0,1] which broke down and sometimes
> > gave us -1/-2 as values rather than what we were expecting of -1/0.
> >
> > This was the simplest patch I found while testing.
> >
> > We have another way of matching [0,1] range which we could use instead
> > of ssa_name_has_boolean_range except that uses only the global ranges
> > rather than the local range (during VRP).
> > I tried to clean this up slightly by using gimple_match_zero_one_valuedp
> > inside ssa_name_has_boolean_range but that failed due to using
> > only the global ranges. I then tried to change get_nonzero_bits to use
> > the local ranges at the optimization time but that failed also because
> > we would remove branches to __builtin_unreachable during evrp and lose
> > information as we don't set the global ranges during evrp.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu.
> >
> >   PR 110817
> >
> > gcc/ChangeLog:
> >
> >   * tree-ssanames.cc (ssa_name_has_boolean_range): Remove the
> >   check for boolean type as they don't have "[0,1]" range.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.c-torture/execute/pr110817-1.c: New test.
> >   * gcc.c-torture/execute/pr110817-2.c: New test.
> >   * gcc.c-torture/execute/pr110817-3.c: New test.
> I'm a bit surprised this didn't trigger any regressions.  Though maybe
> all the existing testcases were capturing cases where non-boolean types
> were known to have a 0/1 value.

Well except ssa_name_has_boolean_range will return true for `An
[unsigned] integral type with a single bit of precision` which the
normal boolean type for C is. So the only case where this makes a
difference is signed booleans. Vectors and Ada are the only 2 places I
know of which use signed booleans even.

This came up before too;
https://inbox.sourceware.org/gcc-patches/cafiyyc23zmevy6i9g1wpmpp7purcuzatg1qpwf2d_8n6f22...@mail.gmail.com/
.
Anyways the 3 uses of ssa_name_has_boolean_range in match.pd are:
 /* X / bool_range_Y is X.  */
which is not true for signed booleans; though division for boolean
types is not well defined
/* 1 - a is a ^ 1 if a had a bool range. */
Which is broken for signed booleans; though it might not show up in IR
for non 1-bit boolean types.
/* -(type)!A -> (type)A - 1.  */
 This one 100 % requires `A` and `A == 0` to be [0,1] range.

The other uses of ssa_name_has_boolean_range are in DOM.
The first 2 uses of ssa_name_has_boolean_range use
build_one_cst/build_one_cst which is definitely wrong there; they should
have used constant_boolean_node for N-bit signed boolean types.
The use `A COND_EXPR may create equivalences too.` actually does the
correct thing and uses constant_boolean_node.

Now maybe we miss some optimizations with Ada code with this change; I
am not 100% sure. Maybe the change should just add && TYPE_UNSIGNED
(type) to the check of boolean type and that will fix the issue too.

>
>
> OK.
> jeff


Re: [PATCH 2/2] VR-VALUES: Rewrite test_for_singularity using range_op_handler

2023-09-05 Thread Andrew Pinski via Gcc-patches
On Mon, Sep 4, 2023 at 11:06 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 9/1/23 11:30, Andrew Pinski via Gcc-patches wrote:
> > So it turns out there was a simpler way of starting to
> > improve VRP to start to fix PR 110131, PR 108360, and PR 108397.
> > That was to rewrite test_for_singularity to use range_op_handler
> > and Value_Range.
> >
> > This patch implements that and
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > gcc/ChangeLog:
> >
> >   * vr-values.cc (test_for_singularity): Add edge argument
> >   and rewrite using range_op_handler.
> >   (simplify_compare_using_range_pairs): Use Value_Range
> >   instead of value_range and update test_for_singularity call.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.dg/tree-ssa/vrp124.c: New test.
> >   * gcc.dg/tree-ssa/vrp125.c: New test.
> > ---
>
> > diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> > index 52ab4fe6109..2474e57ee90 100644
> > --- a/gcc/vr-values.cc
> > +++ b/gcc/vr-values.cc
> > @@ -904,69 +904,33 @@ simplify_using_ranges::simplify_bit_ops_using_ranges
> >   }
> >
> >   /* We are comparing trees OP1 and OP2 using COND_CODE.  OP1 has
> > -   a known value range VR.
> > +   a known value range OP1_RANGE.
> >
> >  If there is one and only one value which will satisfy the
> > -   conditional, then return that value.  Else return NULL.
> > -
> > -   If signed overflow must be undefined for the value to satisfy
> > -   the conditional, then set *STRICT_OVERFLOW_P to true.  */
> > +   conditional on the EDGE, then return that value.
> > +   Else return NULL.  */
> >
> >   static tree
> >   test_for_singularity (enum tree_code cond_code, tree op1,
> > -   tree op2, const value_range *vr)
> > +   tree op2, const int_range_max _range, bool edge)
> >   {
> > -  tree min = NULL;
> > -  tree max = NULL;
> > -
> > -  /* Extract minimum/maximum values which satisfy the conditional as it was
> > - written.  */
> > -  if (cond_code == LE_EXPR || cond_code == LT_EXPR)
> > +  /* This is already a singularity.  */
> > +  if (cond_code == NE_EXPR || cond_code == EQ_EXPR)
> > +return NULL;
> I don't think this is necessarily the right thing to do for NE.
>
> Consider if op1 has the range [0,1] and op2 has the value 1.  If the
> code is NE, then we should be able to return a singularity of 0 since
> that's the only value for x where x ne 1 is true given the range for x.

The "false" edge singularity is already known when NE is supplied. I
don't think changing it to the "true" edge singularity will be helpful
all of the time; preferring the value of 0 is a different story.
But that is a different patch and for a different location rather than
inside VRP; it should be in either isel or expand (more likely isel).

Thanks,
Andrew

>
>
>
> I like what you're trying to do, it just needs a bit of refinement I think.
>
> jeff


[PATCH] MATCH: Add `(x | c) & ~(y | c)` and `x & ~(y | x)` patterns [PR98710]

2023-09-03 Thread Andrew Pinski via Gcc-patches
Adding some more simple bit_and/bit_ior patterns.
How often these show up, I have no idea.

This was tested on top of
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629174.html .

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/98710
* match.pd (`(x | c) & ~(y | c)`, `(x & c) | ~(y & c)`): New pattern.
(`x & ~(y | x)`, `x | ~(y & x)`): New patterns.

gcc/testsuite/ChangeLog:

PR tree-optimization/98710
* gcc.dg/tree-ssa/andor-7.c: New test.
* gcc.dg/tree-ssa/andor-8.c: New test.
---
 gcc/match.pd| 14 +-
 gcc/testsuite/gcc.dg/tree-ssa/andor-7.c | 16 
 gcc/testsuite/gcc.dg/tree-ssa/andor-8.c | 19 +++
 3 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-8.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 3495f9451d1..a3f507a1e2e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1995,7 +1995,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   /* (x & y) | (x | z) -> (x | z) */
  (simplify
   (bitop:c (rbitop:c @0 @1) (bitop:c@3 @0 @2))
-  @3))
+  @3)
+ /* (x | c) & ~(y | c) -> x & ~(y | c) */
+ /* (x & c) | ~(y & c) -> x | ~(y & c) */
+ (simplify
+  (bitop:c (rbitop:c @0 @1) (bit_not@3 (rbitop:c @1 @2)))
+  (bitop @0 @3))
+ /* x & ~(y | x) -> 0 */
+ /* x | ~(y & x) -> -1 */
+ (simplify
+  (bitop:c @0 (bit_not (rbitop:c @0 @1)))
+  (if (bitop == BIT_AND_EXPR)
+   { build_zero_cst (type); }
+   { build_minus_one_cst (type); })))
 
 /* ((x | y) & z) | x -> (z & y) | x
((x ^ y) & z) | x -> (z & y) | x  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-7.c
new file mode 100644
index 000..63b70fa7888
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-7.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/98710 */
+
+signed foo(signed x, signed y, signed z)
+{
+return (x | z) & ~(y | z); // x & ~(y | z);
+}
+// Note . here is `(` or `)`
+/* { dg-final { scan-tree-dump "return x \& ~.y \\| z.;|return ~.y \\| z. \& 
x;" "original" } } */
+
+signed foo_or(signed a, signed b, signed c)
+{
+return (a & c) | ~(b & c); // a | ~(b & c);
+}
+/* { dg-final { scan-tree-dump "return a \\| ~.b \& c.;|return ~.b \& c. \\| 
a;" "original" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-8.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-8.c
new file mode 100644
index 000..0c2eb4c1a00
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-8.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/98710 */
+
+signed foo2(signed a, signed b, signed c)
+{
+return (a & ~(b | a)) & c; // 0
+}
+/* { dg-final { scan-tree-dump "return 0;" "original" } } */
+signed foo2_or(signed x, signed y, signed z)
+{
+return (x | ~(y & x)) & z; // -1 & z -> z
+}
+
+/* { dg-final { scan-tree-dump "return z;" "original" } } */
+/* All | and & should have been removed. */
+/* { dg-final { scan-tree-dump-not "~" "original" } } */
+/* { dg-final { scan-tree-dump-not " \& " "original" } } */
+/* { dg-final { scan-tree-dump-not " \\| " "original" } } */
-- 
2.31.1



[PATCH] MATCH: Add `~MAX(~X, Y)` pattern: [PR96694]

2023-09-03 Thread Andrew Pinski via Gcc-patches
This adds `~MAX(~X, Y)` and `~MIN(~X, Y)` patterns
that are like the `~(~a & b)` and `~(~a | b)` patterns
and allow reducing the number of `~` by one.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/96694

gcc/ChangeLog:

* match.pd (`~MAX(~X, Y)`, `~MIN(~X, Y)`): New patterns.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/minmax-24.c: New test.
---
 gcc/match.pd  |  7 -
 gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c | 31 +++
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e9ce48ea7fa..604c2c2360c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3786,7 +3786,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  maxmin (max min)
  (simplify
   (minmax (bit_not:s@2 @0) (bit_not:s@3 @1))
-  (bit_not (maxmin @0 @1
+  (bit_not (maxmin @0 @1)))
+/* ~MAX(~X, Y) --> MIN(X, ~Y) */
+/* ~MIN(~X, Y) --> MAX(X, ~Y) */
+ (simplify
+  (bit_not (minmax:cs (bit_not @0) @1))
+  (maxmin @0 (bit_not @1
 
 /* MIN (X, Y) == X -> X <= Y  */
 (for minmax (min min max max)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c
new file mode 100644
index 000..2b21f94eecf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/96694 */
+
+static inline int min(int a, int b)
+{
+  return a < b ? a : b;
+}
+
+static inline int max(int a, int b)
+{
+  return a > b ? a : b;
+}
+
+int max_not(int x, int y)
+{
+  return ~max(~x, y); // min (x, ~y)
+}
+/* { dg-final { scan-tree-dump "~y_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "~x_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump "MIN_EXPR |MIN_EXPR 
<_\[0-9\]+, x_\[0-9\]+.D.>" "optimized" } } */
+
+int min_not(int c, int d)
+{
+  return ~min(~c, d); // max (c, ~d)
+}
+/* { dg-final { scan-tree-dump "~d_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "~c_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump "MAX_EXPR |MIN_EXPR 
<_\[0-9\]+, c_\[0-9\]+.D.>" "optimized" } } */
+
+/* { dg-final { scan-tree-dump-times "~" 2 "optimized" } } */
-- 
2.31.1



[PATCH] MATCH: Add pattern for `(x | y) & (x & z)`

2023-09-03 Thread Andrew Pinski via Gcc-patches
Like the pattern already there for `(x | y) & x`,
this adds a simple pattern to optimize `(x | y) & (x & z)`
to just `x & z`.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/103536
* match.pd (`(x | y) & (x & z)`,
`(x & y) | (x | z)`): New patterns.

gcc/testsuite/ChangeLog:

PR tree-optimization/103536
* gcc.dg/tree-ssa/andor-6.c: New test.
* gcc.dg/tree-ssa/andor-bool-1.c: New test.
---
 gcc/match.pd |  7 ++-
 gcc/testsuite/gcc.dg/tree-ssa/andor-6.c  | 19 +++
 gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c | 13 +
 3 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 3efc971f7f6..3495f9451d1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1990,7 +1990,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (with { bool wascmp; }
(if (bitwise_inverted_equal_p (@0, @2, wascmp)
&& (!wascmp || element_precision (type) == 1))
-(bitop @0 @1)
+(bitop @0 @1
+  /* (x | y) & (x & z) -> (x & z) */
+  /* (x & y) | (x | z) -> (x | z) */
+ (simplify
+  (bitop:c (rbitop:c @0 @1) (bitop:c@3 @0 @2))
+  @3))
 
 /* ((x | y) & z) | x -> (z & y) | x
((x ^ y) & z) | x -> (z & y) | x  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-6.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-6.c
new file mode 100644
index 000..32e11730f98
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-6.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/103536 */
+
+int
+orand(int a, int b, int c)
+{
+return (a | b) & (a & c); // a & c
+}
+
+/* { dg-final { scan-tree-dump "return a \& c;" "original" } } */
+
+int
+andor(int d, int e, int f)
+{
+return (d & e) | (d | f); // d | f
+}
+
+/* { dg-final { scan-tree-dump "return d \\| f;" "original" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c
new file mode 100644
index 000..a1b974f3859
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/103536 */
+
+_Bool
+src_1 (_Bool a, _Bool b)
+{
+return (a || b) && (a && b);
+}
+
+/* { dg-final { scan-tree-dump "a_\[0-9\]+.D. \& b_\[0-9\]+.D." "optimized" } 
} */
+/* { dg-final { scan-tree-dump-not "a_\[0-9\]+.D. \\\| b_\[0-9\]+.D." 
"optimized" } } */
+/* { dg-final { scan-tree-dump-not "if " "optimized" } } */
-- 
2.31.1



[PATCH] MATCH: Transform `(1 >> X) !=/== 0` into `X ==/!= 0`

2023-09-03 Thread Andrew Pinski via Gcc-patches
We currently have a pattern for handling `(C >> X) & D == 0`
but if C is 1 and D is 1, the `& 1` might have been removed.
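
As a quick sanity check of the equivalence (a standalone sketch, not part
of the patch; the name shift_forms_agree is made up for illustration), the
following C compares both forms for every shift count for which `1 >> X`
is defined on `int`:

```c
#include <limits.h>

/* Returns 1 if `(1 >> x) != 0` agrees with `x == 0` and
   `(1 >> x) == 0` agrees with `x != 0` for every shift count
   in [0, width of int), and 0 otherwise.  Shift counts outside
   that interval are undefined in C and are not checked.  */
int shift_forms_agree(void)
{
    int width = (int)(sizeof(int) * CHAR_BIT);
    for (int x = 0; x < width; x++) {
        int before_ne = (1 >> x) != 0;
        int after_ne = (x == 0);
        int before_eq = (1 >> x) == 0;
        int after_eq = (x != 0);
        if (before_ne != after_ne || before_eq != after_eq)
            return 0;
    }
    return 1;
}
```

Only X == 0 leaves the single set bit of 1 in place, which is why the
comparison against zero collapses to a test on X itself.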

gcc/ChangeLog:

PR tree-optimization/105832
* match.pd (`(1 >> X) != 0`): New pattern

gcc/testsuite/ChangeLog:

PR tree-optimization/105832
* gcc.dg/tree-ssa/pr105832-1.c: New test.
* gcc.dg/tree-ssa/pr105832-2.c: New test.
* gcc.dg/tree-ssa/pr105832-3.c: New test.
---
 gcc/match.pd   | 10 -
 gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c | 25 
 gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c | 30 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c | 46 ++
 4 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 5270e4104ac..e9ce48ea7fa 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4026,7 +4026,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Simplify ((C << x) & D) != 0 where C and D are power of two constants,
either to false if D is smaller (unsigned comparison) than C, or to
-   x == log2 (D) - log2 (C).  Similarly for right shifts.  */
+   x == log2 (D) - log2 (C).  Similarly for right shifts.
+   Note for `(1 >> x)`, the & 1 has been removed so matching that separately.  */
 (for cmp (ne eq)
  icmp (eq ne)
  (simplify
@@ -4043,7 +4044,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
int c2 = wi::clz (wi::to_wide (@2)); }
  (if (c1 > c2)
   { constant_boolean_node (cmp == NE_EXPR ? false : true, type); }
-  (icmp @0 { build_int_cst (TREE_TYPE (@0), c2 - c1); }))
+  (icmp @0 { build_int_cst (TREE_TYPE (@0), c2 - c1); })
+ /* `(1 >> X) != 0` -> `X == 0` */
+ /* `(1 >> X) == 0` -> `X != 0` */
+ (simplify
+  (cmp (rshift integer_onep @0) integer_zerop)
+   (icmp @0 { build_zero_cst (TREE_TYPE (@0)); })))
 
 /* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1)
(CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c
new file mode 100644
index 000..d7029d39c85
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+/* PR tree-optimization/105832 */
+
+void foo(void);
+
+static struct {
+short a;
+signed char b;
+} c;
+
+static signed char d;
+
+int main() {
+signed char g = c.b > 4U ? c.b : c.b << 2;
+for (int h = 0; h < 5; h++) {
+d = (g >= 2 || 1 >> g) ? g : g << 1;
+if (d && 1 == g)
+foo();
+c.a = 0;
+}
+}
+
+/* The call of foo should have been removed. */
+/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c
new file mode 100644
index 000..2d2a33e2755
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c
@@ -0,0 +1,30 @@
+/* PR tree-optimization/105832 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* { dg-final { scan-tree-dump "return a == 0;" "original" } } */
+/* { dg-final { scan-tree-dump "return b == 0;" "original" } } */
+/* { dg-final { scan-tree-dump "return c != 0;" "original" } } */
+/* { dg-final { scan-tree-dump "return d != 0;" "original" } } */
+
+int
+f1 (int a)
+{
+  return (1 >> a) != 0;
+}
+
+int
+f2 (int b)
+{
+  return ((1 >> b) & 1) != 0;
+}
+int
+f3 (int c)
+{
+  return (1 >> c) == 0;
+}
+
+int
+f4 (int d)
+{
+  return ((1 >> d) & 1) == 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c
new file mode 100644
index 000..2bdd9afcbc7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c
@@ -0,0 +1,46 @@
+/* PR tree-optimization/105832 */
+/* { dg-do compile } */
+/* Disable the first forwprop1 as that will catch f2/f4 even though `&1`
+   will be removed during evrp. */
+/* { dg-options "-O2 -fdisable-tree-forwprop1 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump "a_\[0-9]+\\(D\\) == 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "b_\[0-9]+\\(D\\) == 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "c_\[0-9]+\\(D\\) != 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "d_\[0-9]+\\(D\\) != 0" "optimized" } } */
+
+int g(void);
+int h(void);
+
+int
+f1 (int a)
+{
+  int t = 1 >> a;
+  if (t != 0) return g();
+  return h();
+}
+
+int
+f2 (int b)
+{
+  int t = 1 >> b;
+  t &= 1;
+  if (t != 0) return g();
+  return h();
+}
+
+int
+f3 (int c)
+{
+  int t = 1 >> c;
+  if (t == 0) return g();
+  return h();
+}
+
+int
+f4 (int d)
+{
+  int t = 1 >> d;
+  t &= 1;
+  if (t == 0) return g();
+  return h();
+}
-- 
2.31.1



[PATCH] Improve rewrite_to_defined_overflow for lhs already the correct type

2023-09-03 Thread Andrew Pinski via Gcc-patches
This improves rewrite_to_defined_overflow slightly if we already
have an unsigned type. The only place where this seems to show up
is ifcombine. It removes one extra statement which gets added and
then later on removed.
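
For illustration only, here is roughly what the rewrite corresponds to at
the C source level. The function names are hypothetical and this is not
the GIMPLE that rewrite_to_defined_overflow emits. When the result is
signed, the arithmetic is done in the unsigned type and converted back;
when the result is already unsigned, there is nothing to convert back, so
the extra statement is unnecessary:

```c
/* Signed result: do the addition in unsigned, then convert back.
   The conversion back is implementation-defined in ISO C for values
   that do not fit; GCC documents it as reduction modulo 2^N.  */
int add_signed_defined(int a, int b)
{
    unsigned ua = (unsigned)a;
    unsigned ub = (unsigned)b;
    unsigned sum = ua + ub;   /* wraps around, never undefined */
    return (int)sum;          /* the "cast back" statement */
}

/* Result is already unsigned: no cast back is needed.  */
unsigned add_unsigned_defined(unsigned a, unsigned b)
{
    return a + b;
}
```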

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/111276
* gimple-fold.cc (rewrite_to_defined_overflow): Don't
add a new lhs if we already have the unsigned type.
---
 gcc/gimple-fold.cc | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index fd01810581a..2fcafeada37 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -8721,10 +8721,19 @@ rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
op = gimple_convert (&stmts, type, op);
gimple_set_op (stmt, i, op);
   }
-  gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
+  bool needs_cast_back = false;
+  if (!useless_type_conversion_p (type, TREE_TYPE (lhs)))
+{
+  gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
+  needs_cast_back = true;
+}
+
   if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
 gimple_assign_set_rhs_code (stmt, PLUS_EXPR);
-  gimple_set_modified (stmt, true);
+
+  if (needs_cast_back || stmts)
+gimple_set_modified (stmt, true);
+
   if (in_place)
 {
   gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
@@ -8734,6 +8743,10 @@ rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
 }
   else
 gimple_seq_add_stmt (&stmts, stmt);
+
+  if (!needs_cast_back)
+return stmts;
+
   gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, gimple_assign_lhs (stmt));
   if (in_place)
 {
-- 
2.31.1



[PATCH 1/3] Improve ssa_name_has_boolean_range slightly

2023-09-02 Thread Andrew Pinski via Gcc-patches
Right now ssa_name_has_boolean_range compares the range to
range_true_and_false, but we can instead get the nonzero bits of the
range and compare them to 1 (<=u 1).
The get_nonzero_bits comparison can be done in a similar fashion.
Note I think get_nonzero_bits is redundant as the range query will
return a more accurate version or the same value.
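
The difference between an equality check against 1 and `<=u 1` can be
modeled on a plain unsigned mask. This is only a sketch: old_check and
new_check are hypothetical stand-ins for the wi::eq_p and wi::leu_p calls,
and MASK stands for the "bits that may be nonzero" value:

```c
/* MASK has a 1 in every bit position that may be nonzero.  */

/* Equality check: exactly bit 0 may be set.  This rejects a mask of 0,
   i.e. a value known to be the constant zero.  */
int old_check(unsigned long mask)
{
    return mask == 1ul;
}

/* `<=u 1` check: no bit above bit 0 may be set, so the value can only
   be 0 or 1.  A mask of 0 is accepted as well.  */
int new_check(unsigned long mask)
{
    return mask <= 1ul;
}
```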

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* tree-ssanames.cc (ssa_name_has_boolean_range): Improve
using range's get_nonzero_bits and use `<=u 1`.
---
 gcc/tree-ssanames.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc
index 6c362995c1a..7940d9954d8 100644
--- a/gcc/tree-ssanames.cc
+++ b/gcc/tree-ssanames.cc
@@ -535,10 +535,11 @@ ssa_name_has_boolean_range (tree op)
 {
   int_range<2> r;
   if (get_range_query (cfun)->range_of_expr (r, op)
- && r == range_true_and_false (TREE_TYPE (op)))
+ && !r.undefined_p ()
+ && wi::leu_p (r.get_nonzero_bits (), 1))
return true;
 
-  if (wi::eq_p (get_nonzero_bits (op), 1))
+  if (wi::leu_p (get_nonzero_bits (op), 1))
return true;
 }
 
-- 
2.31.1



[PATCH 3/3] MATCH: Replace all uses of ssa_name_has_boolean_range with zero_one_valued_p

2023-09-02 Thread Andrew Pinski via Gcc-patches
This replaces all uses of ssa_name_has_boolean_range with zero_one_valued_p
except for the one in the definition of zero_one_valued_p. This simplifies
the code in general and leaves only one way of saying we have a range of [0,1].

Note this depends on the patch that adds ssa_name_has_boolean_range usage
to zero_one_valued_p.

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd: Move zero_one_valued_p and truth_valued_p
towards the beginning of the file.
(X / bool_range_Y): Use zero_one_valued_p instead
of ssa_name_has_boolean_range. Move after all other
`X / Y` patterns. Add check to make sure bool_range_Y
is not the literal 0.
(1 - a): Use zero_one_valued_p instead
of ssa_name_has_boolean_range.
(-(type)!A): Likewise.
---
 gcc/match.pd | 96 
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 04033546fc1..5270e4104ac 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -172,6 +172,38 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 )
 #endif
 
+/* Try simple folding for X op !X, and X op X with the help
+   of the truth_valued_p and logical_inverted_value predicates.  */
+(match truth_valued_p
+ @0
+ (if (INTEGRAL_TYPE_P (type) && TYPE_PRECISION (type) == 1)))
+(for op (tcc_comparison truth_and truth_andif truth_or truth_orif truth_xor)
+ (match truth_valued_p
+  (op @0 @1)))
+(match truth_valued_p
+  (truth_not @0))
+
+/* zero_one_valued_p will match when a value is known to be either
+   0 or 1 including constants 0 or 1.
+   Signed 1-bits includes -1 so they cannot match here. */
+/* Note ssa_name_has_boolean_range uses
+   the current ranger while tree_nonzero_bits uses only
+   the global one. */
+(match zero_one_valued_p
+ SSA_NAME@0
+ (if (ssa_name_has_boolean_range (@0))))
+(match zero_one_valued_p
+ @0
+ (if (INTEGRAL_TYPE_P (type)
+  && (TYPE_UNSIGNED (type)
+ || TYPE_PRECISION (type) > 1)
+  && wi::leu_p (tree_nonzero_bits (@0), 1))))
+(match zero_one_valued_p
+ truth_valued_p@0
+ (if (INTEGRAL_TYPE_P (type)
+  && (TYPE_UNSIGNED (type)
+ || TYPE_PRECISION (type) > 1))))
+
 /* Transform likes of (char) ABS_EXPR <(int) x> into (char) ABSU_EXPR 
ABSU_EXPR returns unsigned absolute value of the operand and the operand
of the ABSU_EXPR will have the corresponding signed type.  */
@@ -493,13 +525,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (div @0 integer_minus_onep@1)
   (if (!TYPE_UNSIGNED (type))
(negate @0)))
- /* X / bool_range_Y is X.  */ 
- (simplify
-  (div @0 SSA_NAME@1)
-  (if (INTEGRAL_TYPE_P (type)
-   && ssa_name_has_boolean_range (@1)
-   && !flag_non_call_exceptions)
-   @0))
  /* X / X is one.  */
  (simplify
   (div @0 @0)
@@ -525,7 +550,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& TYPE_OVERFLOW_UNDEFINED (type)
&& !integer_zerop (@0)
&& (!flag_non_call_exceptions || tree_expr_nonzero_p (@0)))
-{ build_minus_one_cst (type); })))
+{ build_minus_one_cst (type); }))
+ /* X / bool_range_Y is X.  */
+ (simplify
+  (div @0 zero_one_valued_p@1)
+  (if (INTEGRAL_TYPE_P (type)
+   /* But not for X / 0 so that we can get the proper warnings and errors.  */
+   && !integer_zerop (@1)
+   && !flag_non_call_exceptions)
+   @0)))
 
 /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  Similarly
@@ -1865,14 +1898,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (plus @0 (negate @1
 
 /* 1 - a is a ^ 1 if a had a bool range. */
-/* This is only enabled for gimple as sometimes
-   cfun is not set for the function which contains
-   the SSA_NAME (e.g. while IPA passes are happening,
-   fold might be called).  */
 (simplify
- (minus integer_onep@0 SSA_NAME@1)
-  (if (INTEGRAL_TYPE_P (type)
-   && ssa_name_has_boolean_range (@1))
+ (minus integer_onep@0 zero_one_valued_p@1)
+  (if (INTEGRAL_TYPE_P (type))
(bit_xor @1 @0)))
 
 /* Other simplifications of negation (c.f. fold_negate_expr_1).  */
@@ -2018,17 +2046,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (cst2)
(bitop @0 { cst2; }
 
-/* Try simple folding for X op !X, and X op X with the help
-   of the truth_valued_p and logical_inverted_value predicates.  */
-(match truth_valued_p
- @0
- (if (INTEGRAL_TYPE_P (type) && TYPE_PRECISION (type) == 1)))
-(for op (tcc_comparison truth_and truth_andif truth_or truth_orif truth_xor)
- (match truth_valued_p
-  (op @0 @1)))
-(match truth_valued_p
-  (truth_not @0))
-
 (match (logical_inverted_value @0)
  (truth_not @0))
 (match (logical_inverted_value @0)
@@ -2060,27 +2077,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (bit_not (bit_not @0))
   @0)
 
-/* zero_one_valued_p will match when a value is known to be either
-   0 or 1 including constants 0 or 1.
-   Signed 1-bits includes -1 so they cannot match here. */
-/* Note 

[PATCH 2/3] MATCH: Improve zero_one_valued_p by using ssa_name_has_boolean_range

2023-09-02 Thread Andrew Pinski via Gcc-patches
Currently zero_one_valued_p uses tree_nonzero_bits which uses
the global ranges of the SSA Names. We can improve this by using
ssa_name_has_boolean_range, which uses the local ranges
that are available while folding during VRP and other passes.

OK? Bootstrapped and tested on x86_64 with no regressions.

gcc/ChangeLog:

* match.pd (zero_one_valued_p): Match SSA_NAMES where
ssa_name_has_boolean_range returns true.
---
 gcc/match.pd | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index b94d71d2376..04033546fc1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2063,6 +2063,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* zero_one_valued_p will match when a value is known to be either
0 or 1 including constants 0 or 1.
Signed 1-bits includes -1 so they cannot match here. */
+/* Note ssa_name_has_boolean_range uses
+   the current ranger while tree_nonzero_bits uses only
+   the global one. */
+(match zero_one_valued_p
+ SSA_NAME@0
+ (if (ssa_name_has_boolean_range (@0))))
 (match zero_one_valued_p
  @0
  (if (INTEGRAL_TYPE_P (type)
-- 
2.31.1



[PATCH] MATCH: `(nop_convert)-(convert)a` into -(convert)a if we are converting from something smaller

2023-09-02 Thread Andrew Pinski via Gcc-patches
This allows removal of one conversion and, in the case of booleans, might
allow the negate and the other conversion to be removed later on.
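
A small C check of why this is safe when the source type is narrower
(a sketch, not part of the patch; the function names are made up and
`unsigned char` is just one example of a smaller type):

```c
#include <limits.h>

/* Before: convert to the intermediate signed type, negate there,
   then nop-convert to the final unsigned type.  */
unsigned neg_cast_before(unsigned char a)
{
    return (unsigned)(-(int)a);
}

/* After: convert once to the final type and negate there.  */
unsigned neg_cast_after(unsigned char a)
{
    return -(unsigned)a;
}

/* Returns 1 if both forms agree for every unsigned char value.
   The negation can never overflow because every value of the
   narrower type is far away from INT_MIN after widening.  */
int neg_cast_forms_agree(void)
{
    for (unsigned v = 0; v <= UCHAR_MAX; v++)
        if (neg_cast_before((unsigned char)v) != neg_cast_after((unsigned char)v))
            return 0;
    return 1;
}
```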

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/107137

gcc/ChangeLog:

* match.pd (`(nop_convert)-(convert)a`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/neg-cast-2.c: New test.
* gcc.dg/tree-ssa/neg-cast-3.c: New test.
---
 gcc/match.pd   | 11 +++
 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c | 20 
 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c | 15 +++
 3 files changed, 46 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 487a7e38719..3efc971f7f6 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -959,6 +959,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 #endif

 
+/* (nop_outer_cast)-(inner_cast)var -> -(outer_cast)(var)
+   if var is smaller in precision.
+   This is always safe for both doing the negative in signed or unsigned
+   as the value for undefined will not show up.  */
+(simplify
+ (convert (negate:s@1 (convert:s @0)))
+ (if (INTEGRAL_TYPE_P (type)
+  && tree_nop_conversion_p (type, TREE_TYPE (@1))
+  && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)))
+(negate (convert @0))))
+
 (for op (negate abs)
  /* Simplify cos(-x) and cos(|x|) -> cos(x).  Similarly for cosh.  */
  (for coss (COS COSH)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c
new file mode 100644
index 000..c1d5066cd4a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-fre1 -fdump-tree-optimized" } */
+/* part of PR tree-optimization/108397 */
+
+long long
+foo (unsigned char o)
+{
+  unsigned long long t1 = -(long long) (o == 0);
+  unsigned long long t2 = -(long long) (t1 != 0);
+  unsigned long long t3 = -(long long) (t1 <= t2);
+  return t3;
+}
+
+/* Should be able to optimize this down to just `return -1;` during fre1. */
+/* { dg-final { scan-tree-dump "return -1;" "fre1" } } */
+/* FRE does not remove all dead statements so a few negate expressions are left behind. */
+/* { dg-final { scan-tree-dump-not " -\[^1\]" "fre1" { xfail *-*-* } } } */
+
+/* { dg-final { scan-tree-dump "return -1;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " - " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c
new file mode 100644
index 000..7b23ca85d1f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop1 -fdump-tree-optimized" } */
+/* PR tree-optimization/107137 */
+
+unsigned f(_Bool a)
+{
+  int t = a;
+  t = -t;
+  return t;
+}
+
+/* There should be no cast to int at all. */
+/* Forwprop1 does not remove all of the statements. */
+/* { dg-final { scan-tree-dump-not "\\\(int\\\)" "forwprop1" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-not "\\\(int\\\)" "optimized" } } */
-- 
2.31.1



[PATCH] ssa_name_has_boolean_range vs signed-boolean:31 types

2023-09-01 Thread Andrew Pinski via Gcc-patches
This turns out to be a latent bug in ssa_name_has_boolean_range
where it would return true for all boolean types but all of the
uses of ssa_name_has_boolean_range were expecting 0/1 as the range
rather than [-1,0].
So when I fixed vector lower to do all comparisons in boolean_type
rather than still in the signed-boolean:31 type (to fix a different issue),
the pattern in match for `-(type)!A -> (type)A - 1` would assume A (which
was signed-boolean:31) had a range of [0,1], which broke down and sometimes
gave us -1/-2 as values rather than the -1/0 we were expecting.
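
The breakage can be reproduced with plain `int` standing in for the
signed-boolean type (a sketch only; negated_not and minus_one are
hypothetical names for the two sides of the pattern). The two forms agree
for inputs 0 and 1, but for a "true" value of -1 the rewritten form gives
-2 instead of 0:

```c
/* Left-hand side of the pattern: -(type)!A.  */
int negated_not(int a)
{
    return -(int)!a;
}

/* Right-hand side of the pattern: (type)A - 1.
   Only equivalent when A is known to be 0 or 1.  */
int minus_one(int a)
{
    return a - 1;
}
```

With a in {0, 1} both functions return -1 and 0 respectively; with
a == -1 the first returns 0 and the second returns -2.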

This was the simplest patch I found while testing.

We have another way of matching [0,1] range which we could use instead
of ssa_name_has_boolean_range except that uses only the global ranges
rather than the local range (during VRP).
I tried to clean this up slightly by using gimple_match_zero_one_valuedp
inside ssa_name_has_boolean_range but that failed because it uses
only the global ranges. I then tried to change get_nonzero_bits to use
the local ranges at optimization time but that also failed because
we would remove branches to __builtin_unreachable during evrp and lose
information as we don't set the global ranges during evrp.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR 110817

gcc/ChangeLog:

* tree-ssanames.cc (ssa_name_has_boolean_range): Remove the
check for boolean type as they don't have "[0,1]" range.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110817-1.c: New test.
* gcc.c-torture/execute/pr110817-2.c: New test.
* gcc.c-torture/execute/pr110817-3.c: New test.
---
 gcc/testsuite/gcc.c-torture/execute/pr110817-1.c | 13 +
 gcc/testsuite/gcc.c-torture/execute/pr110817-2.c | 16 
 gcc/testsuite/gcc.c-torture/execute/pr110817-3.c | 14 ++
 gcc/tree-ssanames.cc |  4 
 4 files changed, 43 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110817-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110817-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110817-3.c

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110817-1.c b/gcc/testsuite/gcc.c-torture/execute/pr110817-1.c
new file mode 100644
index 000..1d33fa9a207
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110817-1.c
@@ -0,0 +1,13 @@
+typedef unsigned long __attribute__((__vector_size__ (8))) V;
+
+
+V c;
+
+int
+main (void)
+{
+  V v = ~((V) { } <=0);
+  if (v[0])
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110817-2.c b/gcc/testsuite/gcc.c-torture/execute/pr110817-2.c
new file mode 100644
index 000..1f759178425
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110817-2.c
@@ -0,0 +1,16 @@
+
+typedef unsigned char u8;
+typedef unsigned __attribute__((__vector_size__ (8))) V;
+
+V v;
+unsigned char c;
+
+int
+main (void)
+{
+  V x = (v > 0) > (v != c);
+ // V x = foo ();
+  if (x[0] || x[1])
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110817-3.c b/gcc/testsuite/gcc.c-torture/execute/pr110817-3.c
new file mode 100644
index 000..36f09c88dd9
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110817-3.c
@@ -0,0 +1,14 @@
+typedef unsigned __attribute__((__vector_size__ (1*sizeof(unsigned)))) V;
+
+V v;
+unsigned char c;
+
+int
+main (void)
+{
+  V x = (v > 0) > (v != c);
+  volatile signed int t = x[0];
+  if (t)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc
index 23387b90fe3..6c362995c1a 100644
--- a/gcc/tree-ssanames.cc
+++ b/gcc/tree-ssanames.cc
@@ -521,10 +521,6 @@ ssa_name_has_boolean_range (tree op)
 {
   gcc_assert (TREE_CODE (op) == SSA_NAME);
 
-  /* Boolean types always have a range [0..1].  */
-  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
-return true;
-
   /* An integral type with a single bit of precision.  */
   if (INTEGRAL_TYPE_P (TREE_TYPE (op))
   && TYPE_UNSIGNED (TREE_TYPE (op))
-- 
2.31.1



[PATCH 2/2] VR-VALUES: Rewrite test_for_singularity using range_op_handler

2023-09-01 Thread Andrew Pinski via Gcc-patches
So it turns out there was a simpler way of starting to
improve VRP to start to fix PR 110131, PR 108360, and PR 108397.
That was to rewrite test_for_singularity to use range_op_handler
and Value_Range.

This patch implements that and
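
To make the intended effect concrete, here are two of the functions from
the new vrp124.c test next to the single comparison each is expected to
reduce to. This is a source-level sketch only: the *_expected functions
and the checker are written for illustration and say nothing about how
vr-values.cc arrives at the result.

```c
/* From vrp124.c: the known range of a is {-100} U [0, +INF], and
   `a < 0` holds for exactly one value in that range.  */
int g_orig(int a)
{
    if (a == -100 || a >= 0)
        ;
    else
        return 0;
    return a < 0;
}

int g_expected(int a) { return a == -100; }

/* From vrp124.c: the known range of a is {0} U [101, +INF], and
   `a < 50` holds for exactly one value in that range.  */
int f_orig(int a)
{
    if (a == 0 || a > 100)
        ;
    else
        return 0;
    return a < 50;
}

int f_expected(int a) { return a == 0; }

/* Returns 1 if each original function matches its expected form on
   a window of inputs around the interesting constants.  */
int singularity_forms_agree(void)
{
    for (int a = -300; a <= 300; a++)
        if (g_orig(a) != g_expected(a) || f_orig(a) != f_expected(a))
            return 0;
    return 1;
}
```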

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* vr-values.cc (test_for_singularity): Add edge argument
and rewrite using range_op_handler.
(simplify_compare_using_range_pairs): Use Value_Range
instead of value_range and update test_for_singularity call.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp124.c: New test.
* gcc.dg/tree-ssa/vrp125.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 
 gcc/testsuite/gcc.dg/tree-ssa/vrp125.c | 44 
 gcc/vr-values.cc   | 99 --
 3 files changed, 117 insertions(+), 70 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp125.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
new file mode 100644
index 000..6ccbda35d1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Should be optimized to a == -100 */
+int g(int a)
+{
+  if (a == -100 || a >= 0)
+;
+  else
+return 0;
+  return a < 0;
+}
+
+/* Should optimize to a == 0 */
+int f(int a)
+{
+  if (a == 0 || a > 100)
+;
+  else
+return 0;
+  return a < 50;
+}
+
+/* Should be optimized to a == 0. */
+int f2(int a)
+{
+  if (a == 0 || a > 100)
+;
+  else
+return 0;
+  return a < 100;
+}
+
+/* Should optimize to a == 100 */
+int f1(int a)
+{
+  if (a < 0 || a == 100)
+;
+  else
+return 0;
+  return a > 50;
+}
+
+/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
new file mode 100644
index 000..f6c2f8e35f1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Should be optimized to a == -100 */
+int g(int a)
+{
+  if (a == -100 || a == -50 || a >= 0)
+;
+  else
+return 0;
+  return a < -50;
+}
+
+/* Should optimize to a == 0 */
+int f(int a)
+{
+  if (a == 0 || a == 50 || a > 100)
+;
+  else
+return 0;
+  return a < 50;
+}
+
+/* Should be optimized to a == 0. */
+int f2(int a)
+{
+  if (a == 0 || a == 50 || a > 100)
+;
+  else
+return 0;
+  return a < 25;
+}
+
+/* Should optimize to a == 100 */
+int f1(int a)
+{
+  if (a < 0 || a == 50 || a == 100)
+;
+  else
+return 0;
+  return a > 50;
+}
+
+/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index 52ab4fe6109..2474e57ee90 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -904,69 +904,33 @@ simplify_using_ranges::simplify_bit_ops_using_ranges
 }
 
 /* We are comparing trees OP1 and OP2 using COND_CODE.  OP1 has
-   a known value range VR.
+   a known value range OP1_RANGE.
 
If there is one and only one value which will satisfy the
-   conditional, then return that value.  Else return NULL.
-
-   If signed overflow must be undefined for the value to satisfy
-   the conditional, then set *STRICT_OVERFLOW_P to true.  */
+   conditional on the EDGE, then return that value.
+   Else return NULL.  */
 
 static tree
 test_for_singularity (enum tree_code cond_code, tree op1,
- tree op2, const value_range *vr)
+ tree op2, const int_range_max &op1_range, bool edge)
 {
-  tree min = NULL;
-  tree max = NULL;
-
-  /* Extract minimum/maximum values which satisfy the conditional as it was
- written.  */
-  if (cond_code == LE_EXPR || cond_code == LT_EXPR)
+  /* This is already a singularity.  */
+  if (cond_code == NE_EXPR || cond_code == EQ_EXPR)
+return NULL;
+  auto range_op = range_op_handler (cond_code);
+  wide_int w = wi::to_wide (op2);
+  int_range<1> op2_range (TREE_TYPE (op2), w, w);
+  int_range_max vr;
+  if (range_op.op1_range (vr, TREE_TYPE (op1),
+ edge ? range_true () : range_false (),
+ op2_range))
 {
-  min = TYPE_MIN_VALUE (TREE_TYPE (op1));
-
-  max = op2;
-  if (cond_code == LT_EXPR)
-   {
- tree one = build_int_cst (TREE_TYPE (op1), 1);
- max = fold_build2 (MINUS_EXPR, TREE_TYPE (op1), max, one);
- /* Signal to compare_values_warnv this expr doesn't overflow.  */
- if (EXPR_P (max))
-   suppress_warning (max, OPT_Woverflow);
-   }
-}
-  else if (cond_code == GE_EXPR || cond_code == GT_EXPR)
-{
-  max = TYPE_MAX_VALUE (TREE_TYPE (op1));
-
-  min = op2;
-  if (cond_code == GT_EXPR)
-   {
- tree one = build_int_cst 

[PATCH 1/2] VR-VALUES: Rename op0/op1 to op1/op2 for test_for_singularity

2023-09-01 Thread Andrew Pinski via Gcc-patches
As requested, and to make this easier to understand alongside the new
ranger code, rename the arguments op0/op1 to op1/op2.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* vr-values.cc (test_for_singularity): Rename
arguments op0/op1 to op1/op2.
---
 gcc/vr-values.cc | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index a4fddd62841..52ab4fe6109 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -903,7 +903,7 @@ simplify_using_ranges::simplify_bit_ops_using_ranges
   return true;
 }
 
-/* We are comparing trees OP0 and OP1 using COND_CODE.  OP0 has
+/* We are comparing trees OP1 and OP2 using COND_CODE.  OP1 has
a known value range VR.
 
If there is one and only one value which will satisfy the
@@ -913,8 +913,8 @@ simplify_using_ranges::simplify_bit_ops_using_ranges
the conditional, then set *STRICT_OVERFLOW_P to true.  */
 
 static tree
-test_for_singularity (enum tree_code cond_code, tree op0,
- tree op1, const value_range *vr)
+test_for_singularity (enum tree_code cond_code, tree op1,
+ tree op2, const value_range *vr)
 {
   tree min = NULL;
   tree max = NULL;
@@ -923,13 +923,13 @@ test_for_singularity (enum tree_code cond_code, tree op0,
  written.  */
   if (cond_code == LE_EXPR || cond_code == LT_EXPR)
 {
-  min = TYPE_MIN_VALUE (TREE_TYPE (op0));
+  min = TYPE_MIN_VALUE (TREE_TYPE (op1));
 
-  max = op1;
+  max = op2;
   if (cond_code == LT_EXPR)
{
- tree one = build_int_cst (TREE_TYPE (op0), 1);
- max = fold_build2 (MINUS_EXPR, TREE_TYPE (op0), max, one);
+ tree one = build_int_cst (TREE_TYPE (op1), 1);
+ max = fold_build2 (MINUS_EXPR, TREE_TYPE (op1), max, one);
  /* Signal to compare_values_warnv this expr doesn't overflow.  */
  if (EXPR_P (max))
suppress_warning (max, OPT_Woverflow);
@@ -937,13 +937,13 @@ test_for_singularity (enum tree_code cond_code, tree op0,
 }
   else if (cond_code == GE_EXPR || cond_code == GT_EXPR)
 {
-  max = TYPE_MAX_VALUE (TREE_TYPE (op0));
+  max = TYPE_MAX_VALUE (TREE_TYPE (op1));
 
-  min = op1;
+  min = op2;
   if (cond_code == GT_EXPR)
{
- tree one = build_int_cst (TREE_TYPE (op0), 1);
- min = fold_build2 (PLUS_EXPR, TREE_TYPE (op0), min, one);
+ tree one = build_int_cst (TREE_TYPE (op1), 1);
+ min = fold_build2 (PLUS_EXPR, TREE_TYPE (op1), min, one);
  /* Signal to compare_values_warnv this expr doesn't overflow.  */
  if (EXPR_P (min))
suppress_warning (min, OPT_Woverflow);
@@ -951,10 +951,10 @@ test_for_singularity (enum tree_code cond_code, tree op0,
 }
 
   /* Now refine the minimum and maximum values using any
- value range information we have for op0.  */
+ value range information we have for op1.  */
   if (min && max)
 {
-  tree type = TREE_TYPE (op0);
+  tree type = TREE_TYPE (op1);
   tree tmin = wide_int_to_tree (type, vr->lower_bound ());
   tree tmax = wide_int_to_tree (type, vr->upper_bound ());
   if (compare_values (tmin, min) == 1)
-- 
2.31.1



Re: [PATCH 2/2] VR-VALUES: Rewrite test_for_singularity using range_op_handler

2023-09-01 Thread Andrew Pinski via Gcc-patches
On Fri, Aug 11, 2023 at 8:08 AM Andrew MacLeod via Gcc-patches
 wrote:
>
>
> On 8/11/23 05:51, Richard Biener wrote:
> > On Fri, Aug 11, 2023 at 11:17 AM Andrew Pinski via Gcc-patches
> >  wrote:
> >> So it turns out there was a simplier way of starting to
> >> improve VRP to start to fix PR 110131, PR 108360, and PR 108397.
> >> That was rewrite test_for_singularity to use range_op_handler
> >> and Value_Range.
> >>
> >> This patch implements that and
> >>
> >> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > I'm hoping Andrew/Aldy can have a look here.
> >
> > Richard.
> >
> >> gcc/ChangeLog:
> >>
> >>  * vr-values.cc (test_for_singularity): Add edge argument
> >>  and rewrite using range_op_handler.
> >>  (simplify_compare_using_range_pairs): Use Value_Range
> >>  instead of value_range and update test_for_singularity call.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>  * gcc.dg/tree-ssa/vrp124.c: New test.
> >>  * gcc.dg/tree-ssa/vrp125.c: New test.
> >> ---
> >>   gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 +
> >>   gcc/testsuite/gcc.dg/tree-ssa/vrp125.c | 44 +
> >>   gcc/vr-values.cc   | 91 --
> >>   3 files changed, 114 insertions(+), 65 deletions(-)
> >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c 
> >> b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >> new file mode 100644
> >> index 000..6ccbda35d1b
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >> @@ -0,0 +1,44 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> >> +
> >> +/* Should be optimized to a == -100 */
> >> +int g(int a)
> >> +{
> >> +  if (a == -100 || a >= 0)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 0;
> >> +}
> >> +
> >> +/* Should optimize to a == 0 */
> >> +int f(int a)
> >> +{
> >> +  if (a == 0 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 50;
> >> +}
> >> +
> >> +/* Should be optimized to a == 0. */
> >> +int f2(int a)
> >> +{
> >> +  if (a == 0 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 100;
> >> +}
> >> +
> >> +/* Should optimize to a == 100 */
> >> +int f1(int a)
> >> +{
> >> +  if (a < 0 || a == 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a > 50;
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c 
> >> b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
> >> new file mode 100644
> >> index 000..f6c2f8e35f1
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
> >> @@ -0,0 +1,44 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> >> +
> >> +/* Should be optimized to a == -100 */
> >> +int g(int a)
> >> +{
> >> +  if (a == -100 || a == -50 || a >= 0)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < -50;
> >> +}
> >> +
> >> +/* Should optimize to a == 0 */
> >> +int f(int a)
> >> +{
> >> +  if (a == 0 || a == 50 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 50;
> >> +}
> >> +
> >> +/* Should be optimized to a == 0. */
> >> +int f2(int a)
> >> +{
> >> +  if (a == 0 || a == 50 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 25;
> >> +}
> >> +
> >> +/* Should optimize to a == 100 */
> >> +int f1(int a)
> >> +{
> >> +  if (a < 0 || a == 50 || a == 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a > 50;
> >>

[PATCH] MATCH: `(nop_convert)-a` into -(nop_convert)a if the negate is single use and a is known not to be signed min value

2023-08-31 Thread Andrew Pinski via Gcc-patches
This pushes the conversion further down the chain, which allows more
conversions to be optimized away in many cases.
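
To illustrate the two shapes involved, here is a minimal sketch in plain C
(not the match.pd pattern itself). The function names are made up for this
example, and it assumes GCC's wrapping behaviour for the out-of-range
unsigned-to-int conversion, which is implementation-defined in ISO C.

```c
#include <limits.h>

/* Negate in the unsigned type, then convert back to int.  This is the
   `(nop_convert)-a` shape that the pattern matches.  The conversion of an
   out-of-range unsigned value to int is implementation-defined; GCC wraps.  */
int negate_then_convert(int input)
{
  unsigned t = (unsigned) input;
  return (int) -t;
}

/* Convert first, then negate in the signed type.  This is the
   `-(nop_convert)a` shape that the pattern produces.  Signed negation
   overflows for INT_MIN, which is why that value has to be excluded.  */
int convert_then_negate(int input)
{
  return -input; /* caller must ensure input != INT_MIN */
}
```

For every input other than INT_MIN the two functions should return the same
value, which is the condition the pattern checks through the nonzero bits or
the range query.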

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/107765
PR tree-optimization/107137

gcc/ChangeLog:

* match.pd (`(nop_convert)-a`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/neg-cast-1.c: New test.
* gcc.dg/tree-ssa/neg-cast-2.c: New test.
* gcc.dg/tree-ssa/neg-cast-3.c: New test.
---
 gcc/match.pd   | 31 ++
 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-1.c | 17 
 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c | 20 ++
 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c | 15 +++
 4 files changed, 83 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 487a7e38719..3cff9b03d92 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -959,6 +959,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 #endif

 
+/* (nop_cast)-var -> -(nop_cast)(var)
+   if -var is known to not overflow; that is does not include
+   the signed integer MIN. */
+(simplify
+ (convert (negate:s @0))
+ (if (INTEGRAL_TYPE_P (type)
+  && tree_nop_conversion_p (type, TREE_TYPE (@0)))
+  (with {
+/* If the top is not set, there is no overflow happening. */
+bool contains_signed_min = !wi::ges_p (tree_nonzero_bits (@0), 0);
+#if GIMPLE
+int_range_max vr;
+if (contains_signed_min
+&& TREE_CODE (@0) == SSA_NAME
+   && get_range_query (cfun)->range_of_expr (vr, @0)
+   && !vr.undefined_p ())
+  {
+tree stype = signed_type_for (type);
+   auto minvalue = wi::min_value (stype);
+   int_range_max valid_range (TREE_TYPE (@0), minvalue, minvalue);
+   vr.intersect (valid_range);
+   /* If the range does not include min value,
+  then we can do this change around. */
+   if (vr.undefined_p ())
+ contains_signed_min = false;
+  }
+#endif
+   }
+   (if (!contains_signed_min)
+(negate (convert @0))
+
 (for op (negate abs)
  /* Simplify cos(-x) and cos(|x|) -> cos(x).  Similarly for cosh.  */
  (for coss (COS COSH)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-1.c
new file mode 100644
index 000..7ddf40aca29
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+/* PR tree-optimization/107765 */
+
+#include <limits.h>
+
+int a(int input)
+{
+if (input == INT_MIN) __builtin_unreachable();
+unsigned t = input;
+int tt =  -t;
+return tt == -input;
+}
+
+/* Should be able to optimize this down to just `return 1;` during evrp. */
+/* { dg-final { scan-tree-dump "return 1;" "evrp" } } */
+/* { dg-final { scan-tree-dump-not " - " "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c
new file mode 100644
index 000..ce49079e235
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-fre3 -fdump-tree-optimized" } */
+/* part of PR tree-optimization/108397 */
+
+long long
+foo (unsigned char o)
+{
+  unsigned long long t1 = -(long long) (o == 0);
+  unsigned long long t2 = -(long long) (t1 != 0);
+  unsigned long long t3 = -(long long) (t1 <= t2);
+  return t3;
+}
+
+/* Should be able to optimize this down to just `return -1;` during fre3. */
+/* { dg-final { scan-tree-dump "return -1;" "fre3" } } */
+/* FRE does not remove all dead statements */
+/* { dg-final { scan-tree-dump-not " - " "fre3" { xfail *-*-* } } } */
+
+/* { dg-final { scan-tree-dump "return -1;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " - " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c
new file mode 100644
index 000..a26a6051bda
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/neg-cast-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop2 -fdump-tree-optimized" } */
+/* PR tree-optimization/107137 */
+
+unsigned f(_Bool a)
+{
+  int t = a;
+  t = -t;
+  return t;
+}
+
+/* There should be no cast to int at all. */
+/* Forwprop2 does not remove all of the statements. */
+/* { dg-final { scan-tree-dump-not "\\\(int\\\)" "forwprop2" { xfail *-*-* } } 
} */
+/* { dg-final { scan-tree-dump-not "\\\(int\\\)" "optimized" } } */
-- 
2.31.1



Re: [RFC] gimple ssa: SCCP - A new PHI optimization pass

2023-08-31 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 31, 2023 at 5:15 AM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, 31 Aug 2023, Filip Kastl wrote:
>
> > > The most obvious places would be right after SSA construction and before 
> > > RTL expansion.
> > > Can you provide measurements for those positions?
> >
> > > The algorithm should only remove PHIs that break SSA form minimality. Since
> > > GCC's SSA construction already produces minimal SSA form, the algorithm
> > > isn't expected to remove any PHIs if run right after the construction. I even
> > > measured it and indeed -- no PHIs got removed (except for 502.gcc_r, where
> > > the algorithm managed to remove exactly 1 PHI, which is weird).
> >
> > I tried putting the pass before pass_expand. There isn't a lot of PHIs to
> > remove at that point, but there still are some.
>
> That's interesting.  Your placement at
>
>   NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
>   NEXT_PASS (pass_phiopt, true /* early_p */);
> + NEXT_PASS (pass_sccp);
>
> and
>
>NEXT_PASS (pass_tsan);
>NEXT_PASS (pass_dse, true /* use DR analysis */);
>NEXT_PASS (pass_dce);
> +  NEXT_PASS (pass_sccp);
>
> isn't immediately after the "best" existing pass we have to
> remove dead PHIs which is pass_cd_dce.  phiopt might leave
> dead PHIs around and the second instance runs long after the
> last CD-DCE.

Actually the last phiopt is run before last pass_cd_dce:
  NEXT_PASS (pass_dce, true /* update_address_taken_p */);
  /* After late DCE we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);
  NEXT_PASS (pass_sink_code, true /* unsplit edges */);
  NEXT_PASS (pass_phiopt, false /* early_p */);
  NEXT_PASS (pass_fold_builtins);
  NEXT_PASS (pass_optimize_widening_mul);
  NEXT_PASS (pass_store_merging);
  /* If DCE is not run before checking for uninitialized uses,
 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 However, this also causes us to misdiagnose cases that should be
 real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
  NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);

Thanks,
Andrew Pinski


>
> So I wonder if your pass just detects unnecessary PHIs we'd have
> removed by other means and what survives until RTL expansion is
> what we should count?



>
> Can you adjust your original early placement to right after
> the cd-dce pass and for the late placement turn the dce pass
> before it into cd-dce and re-do your measurements?
>
> > 500.perlbench_r
> > Started with 43111
> > Ended with 42942
> > Removed PHI % .39201131961680313700
> >
> > 502.gcc_r
> > Started with 141392
> > Ended with 140455
> > Removed PHI % .66269661649881181400
> >
> > 505.mcf_r
> > Started with 482
> > Ended with 478
> > Removed PHI % .82987551867219917100
> >
> > 523.xalancbmk_r
> > Started with 136040
> > Ended with 135629
> > Removed PHI % .30211702440458688700
> >
> > 531.deepsjeng_r
> > Started with 2150
> > Ended with 2148
> > Removed PHI % .09302325581395348900
> >
> > 541.leela_r
> > Started with 4664
> > Ended with 4650
> > Removed PHI % .30017152658662092700
> >
> > 557.xz_r
> > Started with 43
> > Ended with 43
> > Removed PHI % 0
> >
> > > Can the pass somehow be used as part of propagations like during value 
> > > numbering?
> >
> > > I don't think that the pass could be used as a part of different
> > > optimizations since it works on the whole CFG (except for copy propagation
> > > as I noted in the RFC). I'm adding Honza into Cc. He'll have more insight
> > > into this.
> >
> > > Could the new file be called gimple-ssa-sccp.cc or something similar?
> >
> > Certainly. Though I'm not sure, but wouldn't tree-ssa-sccp.cc be more
> > appropriate?
> >
> > I'm thinking about naming the pass 'scc-copy' and the file
> > 'tree-ssa-scc-copy.cc'.
> >
> > > Removing some PHIs is nice, but it would be also interesting to know what
> > > are the effects on generated code size and/or performance.
> > > And also if it has any effects on debug information coverage.
> >
> > Regarding performance: I ran some benchmarks on a Zen3 machine with -O3 with
> > and without the new pass. *I got ~2% speedup for 505.mcf_r and 541.leela_r.
> > > Here are the full results. What do you think? Should I run more benchmarks?
> > > Or benchmark multiple times? Or run the benchmarks on different machines?*
> >
> > 500.perlbench_r
> > Without SCCP: 244.151807s
> > With SCCP: 242.448438s
> > -0.7025695913124297%
> >
> > 502.gcc_r
> > Without SCCP: 211.029606s
> > With SCCP: 211.614523s
> > +0.27640683243653763%
> >
> > 505.mcf_r
> > Without SCCP: 298.782621s
> > With SCCP: 291.671468s
> > -2.438069465197046%
> >
> > 523.xalancbmk_r
> > Without SCCP: 189.940639s
> > With SCCP: 189.876261s
> > -0.03390523894928332%
> >
> > 531.deepsjeng_r
> > Without SCCP: 250.63648s
> > With SCCP: 250.988624s
> > +0.1403027732444051%
> >
> > 541.leela_r
> > 

[PATCH] MATCH [PR19832]: Optimize some `(a != b) ? a OP b : c`

2023-08-31 Thread Andrew Pinski via Gcc-patches
This patch adds the following match patterns to optimize these:
 /* (a != b) ? (a - b) : 0 -> (a - b) */
 /* (a != b) ? (a ^ b) : 0 -> (a ^ b) */
 /* (a != b) ? (a & b) : a -> (a & b) */
 /* (a != b) ? (a | b) : a -> (a | b) */
 /* (a != b) ? min(a,b) : a -> min(a,b) */
 /* (a != b) ? max(a,b) : a -> max(a,b) */
 /* (a != b) ? (a * b) : (a * a) -> (a * b) */
 /* (a != b) ? (a + b) : (a + a) -> (a + b) */
 /* (a != b) ? (a + b) : (2 * a) -> (a + b) */
Note that currently only integer types (including vector types)
are handled. Floating point types can be added later on.
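
As a sanity check of why the conditional is redundant in these cases, here is
a small brute-force sketch in plain C. The helper names and the tested range
are assumptions made for this example only; it covers a subset of the listed
patterns on scalar int.

```c
/* Conditional forms from the list above (left-hand sides).  */
int cond_minus(int a, int b) { return (a != b) ? (a - b) : 0; }
int cond_xor(int a, int b)   { return (a != b) ? (a ^ b) : 0; }
int cond_and(int a, int b)   { return (a != b) ? (a & b) : a; }
int cond_ior(int a, int b)   { return (a != b) ? (a | b) : a; }
int cond_mult(int a, int b)  { return (a != b) ? (a * b) : (a * a); }
int cond_plus(int a, int b)  { return (a != b) ? (a + b) : (2 * a); }

/* Count the inputs in [-16, 16] x [-16, 16] where a conditional form
   disagrees with the unconditional expression it is rewritten to.  */
int count_mismatches(void)
{
  int bad = 0;
  for (int a = -16; a <= 16; a++)
    for (int b = -16; b <= 16; b++)
      {
        bad += (cond_minus(a, b) != (a - b));
        bad += (cond_xor(a, b) != (a ^ b));
        bad += (cond_and(a, b) != (a & b));
        bad += (cond_ior(a, b) != (a | b));
        bad += (cond_mult(a, b) != (a * b));
        bad += (cond_plus(a, b) != (a + b));
      }
  return bad;
}
```

When a == b, each "else" value is exactly what the operation yields anyway
(a - a is 0, a & a is a, and so on), so no mismatches are expected.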

OK? Bootstrapped and tested on x86_64-linux-gnu.

The first pattern still shows up in GCC in cse.c's preferable
function, which was the original motivation for this patch.

PR tree-optimization/19832

gcc/ChangeLog:

* match.pd: Add pattern to optimize
`(a != b) ? a OP b : c`.

gcc/testsuite/ChangeLog:

* g++.dg/opt/vectcond-1.C: New test.
* gcc.dg/tree-ssa/phi-opt-same-1.c: New test.
---
 gcc/match.pd  | 31 ++
 gcc/testsuite/g++.dg/opt/vectcond-1.C | 57 ++
 .../gcc.dg/tree-ssa/phi-opt-same-1.c  | 60 +++
 3 files changed, 148 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/opt/vectcond-1.C
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-same-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index c01362ee359..487a7e38719 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5261,6 +5261,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(convert @c0
 #endif
 
+(for cnd (cond vec_cond)
+ /* (a != b) ? (a - b) : 0 -> (a - b) */
+ (simplify
+  (cnd (ne:c @0 @1) (minus@2 @0 @1) integer_zerop)
+  @2)
+ /* (a != b) ? (a ^ b) : 0 -> (a ^ b) */
+ (simplify
+  (cnd (ne:c @0 @1) (bit_xor:c@2 @0 @1) integer_zerop)
+  @2)
+ /* (a != b) ? (a & b) : a -> (a & b) */
+ /* (a != b) ? (a | b) : a -> (a | b) */
+ /* (a != b) ? min(a,b) : a -> min(a,b) */
+ /* (a != b) ? max(a,b) : a -> max(a,b) */
+ (for op (bit_and bit_ior min max)
+  (simplify
+   (cnd (ne:c @0 @1) (op:c@2 @0 @1) @0)
+   @2))
+ /* (a != b) ? (a * b) : (a * a) -> (a * b) */
+ /* (a != b) ? (a + b) : (a + a) -> (a + b) */
+ (for op (mult plus)
+  (simplify
+   (cnd (ne:c @0 @1) (op@2 @0 @1) (op @0 @0))
+   (if (ANY_INTEGRAL_TYPE_P (type))
+@2)))
+ /* (a != b) ? (a + b) : (2 * a) -> (a + b) */
+ (simplify
+  (cnd (ne:c @0 @1) (plus@2 @0 @1) (mult @0 uniform_integer_cst_p@3))
+  (if (wi::to_wide (uniform_integer_cst_p (@3)) == 2)
+   @2))
+)
+
 /* These was part of minmax phiopt.  */
 /* Optimize (a CMP b) ? minmax : minmax
to minmax, c> */
diff --git a/gcc/testsuite/g++.dg/opt/vectcond-1.C 
b/gcc/testsuite/g++.dg/opt/vectcond-1.C
new file mode 100644
index 000..3877ad11414
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/vectcond-1.C
@@ -0,0 +1,57 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-optimized" } */
+/* This is the vector version of these optimizations. */
+/* PR tree-optimization/19832 */
+
+#define vector __attribute__((vector_size(sizeof(unsigned)*2)))
+
+static inline vector int max_(vector int a, vector int b)
+{
+   return (a > b)? a : b;
+}
+static inline vector int min_(vector int a, vector int b)
+{
+  return (a < b) ? a : b;
+}
+
+vector int f_minus(vector int a, vector int b)
+{
+  return (a != b) ? a - b : (a - a);
+}
+vector int f_xor(vector int a, vector int b)
+{
+  return (a != b) ? a ^ b : (a ^ a);
+}
+
+vector int f_ior(vector int a, vector int b)
+{
+  return (a != b) ? a | b : (a | a);
+}
+vector int f_and(vector int a, vector int b)
+{
+  return (a != b) ? a & b : (a & a);
+}
+vector int f_max(vector int a, vector int b)
+{
+  return (a != b) ? max_(a, b) : max_(a, a);
+}
+vector int f_min(vector int a, vector int b)
+{
+  return (a != b) ? min_(a, b) : min_(a, a);
+}
+vector int f_mult(vector int a, vector int b)
+{
+  return (a != b) ? a * b : (a * a);
+}
+vector int f_plus(vector int a, vector int b)
+{
+  return (a != b) ? a + b : (a + a);
+}
+vector int f_plus_alt(vector int a, vector int b)
+{
+  return (a != b) ? a + b : (a * 2);
+}
+
+/* All of the above function's VEC_COND_EXPR should have been optimized away. 
*/
+/* { dg-final { scan-tree-dump-not "VEC_COND_EXPR " "ccp1" } } */
+/* { dg-final { scan-tree-dump-not "VEC_COND_EXPR " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-same-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-same-1.c
new file mode 100644
index 000..24e757b9b9f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-same-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1 -fdump-tree-optimized" } */
+/* PR tree-optimization/19832 */
+
+static inline int max_(int a, int b)
+{
+  if (a > b) return a;
+  return b;
+}
+static inline int min_(int a, int b)
+{
+  if (a < b) return a;
+  return b;
+}
+
+int f_minus(int a, int b)
+{
+  if (a != b) return a - b;
+  return a - a;
+}
+int f_xor(int 

[PATCH] MATCH: extend min_value/max_value match to vectors

2023-08-30 Thread Andrew Pinski via Gcc-patches
This simple patch extends the min_value/max_value match to vector integer types.
Using uniform_integer_cst_p makes this easy.
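
For reference, a minimal sketch of the kind of vector code this affects,
written with the GCC vector extension. The typedef and function names are
invented for this example; it mirrors the unsigned `x != 0` case, where 0 is
the minimum value of the element type.

```c
/* Two-lane vectors using the GCC vector_size extension.  */
typedef unsigned v2u __attribute__((vector_size(2 * sizeof(unsigned))));
typedef int v2i __attribute__((vector_size(2 * sizeof(int))));

/* (x > y) & (x != 0): the extra test against the minimum value is implied
   by x > y, so it should be removable.  */
v2i with_min_check(v2u x, v2u y)
{
  const v2u zero = {0u, 0u};
  return (x > y) & (x != zero);
}

/* The simplified form: just x > y.  */
v2i without_min_check(v2u x, v2u y)
{
  return x > y;
}

/* Return 1 if both forms produce the same mask in every lane.  */
int lanes_agree(unsigned x0, unsigned x1, unsigned y0, unsigned y1)
{
  v2u x = {x0, x1};
  v2u y = {y0, y1};
  v2i lhs = with_min_check(x, y);
  v2i rhs = without_min_check(x, y);
  return lhs[0] == rhs[0] && lhs[1] == rhs[1];
}
```

Any lane where x is 0 cannot satisfy x > y for unsigned operands, so the two
masks should always agree.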

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

The testcases pr110915-*.c are the same as pr88784-*.c except using vector
types instead.

PR tree-optimization/110915

gcc/ChangeLog:

* match.pd (min_value, max_value): Extend to vector constants.

gcc/testsuite/ChangeLog:

* gcc.dg/pr110915-1.c: New test.
* gcc.dg/pr110915-10.c: New test.
* gcc.dg/pr110915-11.c: New test.
* gcc.dg/pr110915-12.c: New test.
* gcc.dg/pr110915-2.c: New test.
* gcc.dg/pr110915-3.c: New test.
* gcc.dg/pr110915-4.c: New test.
* gcc.dg/pr110915-5.c: New test.
* gcc.dg/pr110915-6.c: New test.
* gcc.dg/pr110915-7.c: New test.
* gcc.dg/pr110915-8.c: New test.
* gcc.dg/pr110915-9.c: New test.
---
 gcc/match.pd   | 24 ++
 gcc/testsuite/gcc.dg/pr110915-1.c  | 31 
 gcc/testsuite/gcc.dg/pr110915-10.c | 33 ++
 gcc/testsuite/gcc.dg/pr110915-11.c | 31 
 gcc/testsuite/gcc.dg/pr110915-12.c | 31 
 gcc/testsuite/gcc.dg/pr110915-2.c  | 31 
 gcc/testsuite/gcc.dg/pr110915-3.c  | 33 ++
 gcc/testsuite/gcc.dg/pr110915-4.c  | 33 ++
 gcc/testsuite/gcc.dg/pr110915-5.c  | 32 +
 gcc/testsuite/gcc.dg/pr110915-6.c  | 32 +
 gcc/testsuite/gcc.dg/pr110915-7.c  | 32 +
 gcc/testsuite/gcc.dg/pr110915-8.c  | 32 +
 gcc/testsuite/gcc.dg/pr110915-9.c  | 33 ++
 13 files changed, 400 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-10.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-11.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-12.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-6.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-8.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110915-9.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 6a7edde5736..c01362ee359 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2750,16 +2750,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   & (bitpos / BITS_PER_UNIT))); }
 
 (match min_value
- INTEGER_CST
- (if ((INTEGRAL_TYPE_P (type)
-   || POINTER_TYPE_P(type))
-  && wi::eq_p (wi::to_wide (t), wi::min_value (type)
+ uniform_integer_cst_p
+ (with {
+   tree int_cst = uniform_integer_cst_p (t);
+   tree inner_type = TREE_TYPE (int_cst);
+  }
+  (if ((INTEGRAL_TYPE_P (inner_type)
+|| POINTER_TYPE_P (inner_type))
+   && wi::eq_p (wi::to_wide (int_cst), wi::min_value (inner_type))
 
 (match max_value
- INTEGER_CST
- (if ((INTEGRAL_TYPE_P (type)
-   || POINTER_TYPE_P(type))
-  && wi::eq_p (wi::to_wide (t), wi::max_value (type)
+ uniform_integer_cst_p
+ (with {
+   tree int_cst = uniform_integer_cst_p (t);
+   tree itype = TREE_TYPE (int_cst);
+  }
+ (if ((INTEGRAL_TYPE_P (itype)
+   || POINTER_TYPE_P (itype))
+  && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))
 
 /* x >  y  &&  x != XXX_MIN  -->  x > y
x >  y  &&  x == XXX_MIN  -->  false . */
diff --git a/gcc/testsuite/gcc.dg/pr110915-1.c 
b/gcc/testsuite/gcc.dg/pr110915-1.c
new file mode 100644
index 000..2e1e871b9a0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr110915-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine" } */
+#define vector __attribute__((vector_size(sizeof(unsigned)*2)))
+
+#include <limits.h>
+
+vector signed and1(vector unsigned x, vector unsigned y)
+{
+  /* (x > y) & (x != 0)  --> x > y */
+  return (x > y) & (x != 0);
+}
+
+vector signed and2(vector unsigned x, vector unsigned y)
+{
+  /* (x < y) & (x != UINT_MAX)  --> x < y */
+  return (x < y) & (x != UINT_MAX);
+}
+
+vector signed and3(vector signed x, vector signed y)
+{
+  /* (x > y) & (x != INT_MIN)  --> x > y */
+  return (x > y) & (x != INT_MIN);
+}
+
+vector signed and4(vector signed x, vector signed y)
+{
+  /* (x < y) & (x != INT_MAX)  --> x < y */
+  return (x < y) & (x != INT_MAX);
+}
+
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr110915-10.c 
b/gcc/testsuite/gcc.dg/pr110915-10.c
new file mode 100644
index 000..b0644bf3123
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr110915-10.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { 

[PATCH] MATCH: Move `(x | y) & (~x ^ y)` over to use bitwise_inverted_equal_p

2023-08-28 Thread Andrew Pinski via Gcc-patches
This moves the match pattern `(x | y) & (~x ^ y)` over to use
bitwise_inverted_equal_p.
This now also allows comparisons to be optimized and also catches the missed
`(~x | y) & (x ^ y)` transformation into `~x & y`.
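
Both identities are purely bitwise, so they can be checked by brute force.
The sketch below is only an illustration in plain C (the function name is
made up here); it does not exercise bitwise_inverted_equal_p itself.

```c
/* Count operand pairs where the long form and the simplified form differ.
   The identities hold bit by bit, so 8-bit operand values are enough to
   cover every combination per bit position.  */
int count_bitop_mismatches(void)
{
  int bad = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        /* (x | y) & (~x ^ y) -> x & y */
        bad += (((x | y) & (~x ^ y)) != (x & y));
        /* (~x | y) & (x ^ y) -> ~x & y */
        bad += (((~x | y) & (x ^ y)) != (~x & y));
      }
  return bad;
}
```

The second identity is the first one with x replaced by ~x, which is why
matching "the inverse of @0" instead of a literal bit_not picks it up.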

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/47
* match.pd (`(x | y) & (~x ^ y)`) Use bitwise_inverted_equal_p
instead of matching bit_not.

gcc/testsuite/ChangeLog:

PR tree-optimization/47
* gcc.dg/tree-ssa/cmpbit-4.c: New test.
---
 gcc/match.pd |  7 +++-
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-4.c | 47 
 2 files changed, 52 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-4.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e6bdc3149b6..47d2733211a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1616,8 +1616,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* (x | y) & (~x ^ y) -> x & y */
 (simplify
- (bit_and:c (bit_ior:c @0 @1) (bit_xor:c @1 (bit_not @0)))
- (bit_and @0 @1))
+ (bit_and:c (bit_ior:c @0 @1) (bit_xor:c @1 @2))
+ (with { bool wascmp; }
+  (if (bitwise_inverted_equal_p (@0, @2, wascmp)
+   && (!wascmp || element_precision (type) == 1))
+   (bit_and @0 @1
 
 /* (~x | y) & (x | ~y) -> ~(x ^ y) */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-4.c
new file mode 100644
index 000..cdba5d623af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-4.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+int g(int x, int y)
+{
+  int xp = ~x;
+  return (x | y) & (xp ^ y); // x & y
+}
+int g0(int x, int y)
+{
+  int xp = ~x;
+  return (xp | y) & (x ^ y); // ~x & y
+}
+
+_Bool gb(_Bool x, _Bool y)
+{
+  _Bool xp = !x;
+  return (x | y) & (xp ^ y); // x & y
+}
+_Bool gb0(_Bool x, _Bool y)
+{
+  _Bool xp = !x;
+  return (xp | y) & (x ^ y); // !x & y
+}
+
+
+_Bool gbi(int a, int b)
+{
+  _Bool x = a < 2;
+  _Bool y = b < 3;
+  _Bool xp = !x;
+  return (x | y) & (xp ^ y); // x & y
+}
+_Bool gbi0(int a, int b)
+{
+  _Bool x = a < 2;
+  _Bool y = b < 3;
+  _Bool xp = !x;
+  return (xp | y) & (x ^ y); // !x & y
+}
+
+/* All of these should be optimized to `x & y` or `~x & y` */
+/* { dg-final { scan-tree-dump-times "le_expr, " 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "gt_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "bit_xor_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_and_expr, " 6 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_not_expr, " 2 "optimized" } } */
-- 
2.31.1



[PATCH] Fix cond-bool-2.c on powerpc and other targets

2023-08-28 Thread Andrew Pinski via Gcc-patches
This adds `--param logical-op-non-short-circuit=1` to the testcase
so it becomes a target-independent testcase now.
I filed PR 111217 for the variant of the testcase which fails independently
of the param.

Committed as obvious after testing to make sure it passes on powerpc now.

gcc/testsuite/ChangeLog:

PR testsuite/111215
* gcc.dg/tree-ssa/cond-bool-2.c: Add
`--param logical-op-non-short-circuit=1` to the options.
---
 gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c
index b3e7e25dec6..7de89cc0de2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* { dg-options "-O2 --param logical-op-non-short-circuit=1 
-fdump-tree-optimized-raw" } */
 
 /* PR tree-optimization/95929 */
 
-- 
2.31.1



[PATCH] MATCH: Remove redundant pattern for `(x | y) & ~x`

2023-08-27 Thread Andrew Pinski via Gcc-patches
After r14-2885-gb9237226fdc938, this pattern becomes
redundant as we match it using bitwise_inverted_equal_p.

There is already a testcase (gcc.dg/nand.c) for this pattern
and it still passes after the removal.
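
For completeness, a small sketch in plain C of the two identities the removed
pattern covered. The function names are invented for this example; it only
checks the arithmetic, not which match.pd pattern fires.

```c
/* (x | y) & ~x -> y & ~x */
unsigned ior_then_and(unsigned x, unsigned y) { return (x | y) & ~x; }
unsigned and_simplified(unsigned x, unsigned y) { return y & ~x; }

/* (x & y) | ~x -> y | ~x */
unsigned and_then_ior(unsigned x, unsigned y) { return (x & y) | ~x; }
unsigned ior_simplified(unsigned x, unsigned y) { return y | ~x; }

/* Return 1 if both identities hold for all 8-bit operand values.  */
int redundant_pattern_holds(void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      if (ior_then_and(x, y) != and_simplified(x, y)
          || and_then_ior(x, y) != ior_simplified(x, y))
        return 0;
  return 1;
}
```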

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/46
* match.pd (`(x | y) & ~x`, `(x & y) | ~x`): Remove
redundant pattern.
---
 gcc/match.pd | 8 
 1 file changed, 8 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index fa598d5ca2e..0076392c522 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1556,14 +1556,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (bit_ior:c (bit_xor:s @0 @1) (bit_not:s (bit_ior:s @0 @1)))
  (bit_not (bit_and @0 @1)))
 
-/* (x | y) & ~x -> y & ~x */
-/* (x & y) | ~x -> y | ~x */
-(for bitop (bit_and bit_ior)
- rbitop (bit_ior bit_and)
- (simplify
-  (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
-  (bitop @1 @2)))
-
 /* (x & y) ^ (x | y) -> x ^ y */
 (simplify
  (bit_xor:c (bit_and @0 @1) (bit_ior @0 @1))
-- 
2.31.1



[PATCH] IFCOMBINE: Remove outer condition for two same conditionals

2023-08-27 Thread Andrew Pinski via Gcc-patches
This adds a simple case to remove an outer condition if the two inner
conditionals are the same and lead to the same location.
This can show up due to jump threading or inlining, or because someone wrote
code like this.

ifcombine-1.c shows the simple case that this is supposed to solve.
Even though PRE can handle some cases, ifcombine is earlier and even runs
at -O1.

Note in the case of the PR here, it comes from jump threading.
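
The shape being targeted can be sketched at the source level as follows.
This is a hypothetical reduction written for this note, not one of the
testcases from the patch; the names and return values are arbitrary.

```c
/* Before: the outer test on `a` selects between two identical inner tests
   that lead to the same two destinations.  */
int nested(int a, int b)
{
  if (a == 0)
    {
      if (b == 1) return 9; else return 7;
    }
  else
    {
      if (b == 1) return 9; else return 7;
    }
}

/* After: the outer test contributes nothing, so only the inner test
   remains.  */
int flattened(int a, int b)
{
  (void) a; /* no longer inspected */
  return (b == 1) ? 9 : 7;
}
```

The two functions should be indistinguishable for every input, which is what
lets ifcombine drop the outer condition.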

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/110891
* tree-ssa-ifcombine.cc (ifcombine_bb_same): New function.
(tree_ssa_ifcombine_bb): Call ifcombine_bb_same.

gcc/testsuite/ChangeLog:

PR tree-optimization/110891
* gcc.dg/tree-ssa/ifcombine-1.c: New test.
* gcc.dg/tree-ssa/pr110891-1.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c |  27 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c  |  53 +++
 gcc/tree-ssa-ifcombine.cc   | 100 
 3 files changed, 180 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c
new file mode 100644
index 000..02d08efef87
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-ifcombine" } */
+
+int g();
+int h();
+
+int j, l;
+
+int f(int a, int *b)
+{
+if (a == 0)
+{
+if (b == ) goto L9; else goto L7;
+}
+else
+{
+if (b == ) goto L9; else goto L7;
+}
+L7: return g();
+L9: return h();
+}
+
+/* ifcombine can optimize away the outer most if here. */
+/* { dg-final { scan-tree-dump-times "optimized away the test from bb " 1 
"ifcombine" } } */
+/* We should have remove the outer if and one of the inner ifs; leaving us 
with one if. */
+/* { dg-final { scan-tree-dump-times "if " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "goto " 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c
new file mode 100644
index 000..320d8823077
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void foo(void);
+static int a, c = 7, d, o, q;
+static int *b = , *f, *j = , *n = , *ae;
+static short e, m;
+static short *i = 
+static char r;
+void __assert_fail(char *, char *, int, const char *) 
__attribute__((__noreturn__));
+static const short g();
+static void h();
+static int *k(int) {
+(*i)++;
+*j ^= *b;
+return 
+}
+static void l(unsigned p) {
+int *aa = 
+h();
+o = 5 ^ g() && p;
+if (f ==  || f ==  || f == )
+;
+else {
+foo();
+__assert_fail("", "", 3, __PRETTY_FUNCTION__);
+}
+*aa ^= *n;
+if (*aa)
+if (!(((p) >= 0) && ((p) <= 0))) {
+__builtin_unreachable();
+}
+k(p);
+}
+static const short g() { return q; }
+static void h() {
+unsigned ag = c;
+d = ag > r ? ag : 0;
+ae = k(c);
+f = ae;
+if (ae ==  || ae ==  || ae == )
+;
+else
+__assert_fail("", "", 4, __PRETTY_FUNCTION__);
+}
+int main() {
+l(a);
+m || (*b |= 64);
+*b &= 5;
+}
+
+/* We should be able to optimize away foo. */
+/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 46b076804f4..f79545b9a0b 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -666,6 +666,103 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool 
inner_inv,
   return false;
 }
 
+/* Function to remove an outer condition if two inner basic blocks have the 
same condition and both empty otherwise. */
+
+static bool
+ifcombine_bb_same (basic_block cond_bb, basic_block outer_cond_bb,
+  basic_block then_bb, basic_block else_bb)
+{
+  basic_block inner_cond_bbt = nullptr, inner_cond_bbf = nullptr;
+
+  /* See if the the outer condition is a condition. */
+  if (!recognize_if_then_else (outer_cond_bb, _cond_bbt, 
_cond_bbf))
+return false;
+  basic_block other_cond_bb;
+  if (cond_bb == inner_cond_bbt)
+other_cond_bb = inner_cond_bbf;
+  else
+other_cond_bb = inner_cond_bbt;
+
+  /* The other bb has to have a single predecessor too. */
+  if (!single_pred_p (other_cond_bb))
+return false;
+
+  /* Other inner conditional bb needs to go to the same then and else blocks. 
*/
+  if (!recognize_if_then_else (other_cond_bb, _bb, _bb))
+return false;
+
+  /* Both edges of both inner basic blocks need to have the same values for 
the incoming phi for both then and else basic blocks. */
+  if (!same_phi_args_p (cond_bb, other_cond_bb, 

[PATCH] PHIOPT: Add dump for match and simplify and early phiopt

2023-08-25 Thread Andrew Pinski via Gcc-patches
This adds a dump of the full result of match-and-simplify
for phiopt, specifically so we can tell whether something is being rejected
because we are in early phi-opt.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (gimple_simplify_phiopt): Add dump information
when resimplify returns true.
(match_simplify_replacement): Print only that the match-and-simplify
result was accepted rather than the full sequence.
---
 gcc/tree-ssa-phiopt.cc | 70 ++
 1 file changed, 44 insertions(+), 26 deletions(-)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 7e63fb115db..9993bbe5b76 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -499,7 +499,6 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
*comp_stmt,
tree arg0, tree arg1,
gimple_seq *seq)
 {
-  tree result;
   gimple_seq seq1 = NULL;
   enum tree_code comp_code = gimple_cond_code (comp_stmt);
   location_t loc = gimple_location (comp_stmt);
@@ -529,18 +528,29 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
*comp_stmt,
 
   if (op.resimplify (, follow_all_ssa_edges))
 {
-  /* Early we want only to allow some generated tree codes. */
-  if (!early_p
- || phiopt_early_allow (seq1, op))
+  bool allowed = !early_p || phiopt_early_allow (seq1, op);
+  tree result = maybe_push_res_to_seq (, );
+  if (dump_file && (dump_flags & TDF_FOLDING))
{
- result = maybe_push_res_to_seq (, );
+ fprintf (dump_file, "\nphiopt match-simplify back:\n");
+ if (seq1)
+   print_gimple_seq (dump_file, seq1, 0, TDF_VOPS|TDF_MEMSYMS);
+ fprintf (dump_file, "result: ");
  if (result)
-   {
- if (loc != UNKNOWN_LOCATION)
-   annotate_all_with_location (seq1, loc);
- gimple_seq_add_seq_without_update (seq, seq1);
- return result;
-   }
+   print_generic_expr (dump_file, result);
+ else
+   fprintf (dump_file, " (none)");
+ fprintf (dump_file, "\n");
+ if (!allowed)
+   fprintf (dump_file, "rejected because early\n");
+   }
+  /* Early we want only to allow some generated tree codes. */
+  if (allowed && result)
+   {
+ if (loc != UNKNOWN_LOCATION)
+   annotate_all_with_location (seq1, loc);
+ gimple_seq_add_seq_without_update (seq, seq1);
+ return result;
}
 }
   gimple_seq_discard (seq1);
@@ -572,18 +582,29 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple 
*comp_stmt,
 
   if (op1.resimplify (, follow_all_ssa_edges))
 {
-  /* Early we want only to allow some generated tree codes. */
-  if (!early_p
- || phiopt_early_allow (seq1, op1))
+  bool allowed = !early_p || phiopt_early_allow (seq1, op1);
+  tree result = maybe_push_res_to_seq (&op1, &seq1);
+  if (dump_file && (dump_flags & TDF_FOLDING))
{
- result = maybe_push_res_to_seq (&op1, &seq1);
+ fprintf (dump_file, "\nphiopt match-simplify back:\n");
+ if (seq1)
+   print_gimple_seq (dump_file, seq1, 0, TDF_VOPS|TDF_MEMSYMS);
+ fprintf (dump_file, "result: ");
  if (result)
-   {
- if (loc != UNKNOWN_LOCATION)
-   annotate_all_with_location (seq1, loc);
- gimple_seq_add_seq_without_update (seq, seq1);
- return result;
-   }
+   print_generic_expr (dump_file, result);
+ else
+   fprintf (dump_file, " (none)");
+ fprintf (dump_file, "\n");
+ if (!allowed)
+   fprintf (dump_file, "rejected because early\n");
+   }
+  /* Early we want only to allow some generated tree codes. */
+  if (allowed && result)
+   {
+ if (loc != UNKNOWN_LOCATION)
+   annotate_all_with_location (seq1, loc);
+ gimple_seq_add_seq_without_update (seq, seq1);
+ return result;
}
 }
   gimple_seq_discard (seq1);
@@ -855,6 +876,8 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
 
   if (!result)
 return false;
+  if (dump_file && (dump_flags & TDF_FOLDING))
+fprintf (dump_file, "accepted the phiopt match-simplify.\n");
 
   auto_bitmap exprs_maybe_dce;
 
@@ -881,11 +904,6 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
  if (name && TREE_CODE (name) == SSA_NAME)
bitmap_set_bit (exprs_maybe_dce, SSA_NAME_VERSION (name));
}
-  if (dump_file && (dump_flags & TDF_FOLDING))
-   {
- fprintf (dump_file, "Folded into the sequence:\n");
- print_gimple_seq (dump_file, seq, 0, TDF_VOPS|TDF_MEMSYMS);
-   }
 gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING);
   }
 
-- 
2.31.1



[PATCH] Fix phi-opt-34.c testcase

2023-08-25 Thread Andrew Pinski via Gcc-patches
Somehow when I was testing the new testcase, it was working, but
when I re-ran the full testsuite it was not. Anyway, the issue
was just a missing space before the `}` of the dg-options directive.

Committed as obvious.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi-opt-34.c: Fix dg-options directive.
---
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c
index 157c3ea9a0b..61054231b4c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* Disable early phiopt1 as  early ccp1 does not export non-zero bits
so at the point of phiopt1, we don't know that a is [0,1] range */
-/* { dg-options "-O1 -fdisable-tree-phiopt1 -fdump-tree-phiopt2-folding"} */
+/* { dg-options "-O1 -fdisable-tree-phiopt1 -fdump-tree-phiopt2-folding" } */
 
 unsigned f(unsigned a)
 {
-- 
2.31.1



Re: [PATCH 3/3] PHIOPT: Allow BIT_AND and BIT_IOR in early phiopt

2023-08-25 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 24, 2023 at 11:47 PM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > Now that MIN/MAX can sometimes be transformed into BIT_AND/BIT_IOR,
> > we should allow BIT_AND and BIT_IOR in the early phiopt.
> > Also we produce BIT_AND/BIT_IOR for things like `bool0 ? bool1 : 0`
> > which seems like a good thing to allow early on too.
>
> Hum.
>
> I think if we allow AND/IOR we should also allow XOR and NOT.

Yes, XOR and NOT most likely should be added too. Maybe even
comparisons without a conversion too.

>
> Can you add dumping for replacements we disallow?  I'm esp. curious
> for those otherwise being "singleton".  I know when doing early phiopt
> I wanted to be very conservative (also to reduce testsuite fallout), and
> I was mostly interested in MIN/MAX which I then extended to similar
> things like ABS.  But maybe we can revisit this if we understand which
> cases we definitely do not want to do early?

I have a patch which prints out the dumping of the result and will
submit it later today. In the next couple of days I will look into the
dump when compiling GCC to see if there are others that seem fine. It
might be the case that we want to reject only the non-single-statement
ones (except for MIN/MAX, where we are allowing 2 there too).

Thanks,
Andrew

>
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-phiopt.cc (phiopt_early_allow): Allow
> > BIT_AND_EXPR and BIT_IOR_EXPR.
> > ---
> >  gcc/tree-ssa-phiopt.cc | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index 54706f4c7e7..7e63fb115db 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -469,6 +469,9 @@ phiopt_early_allow (gimple_seq , gimple_match_op 
> > )
> >  {
> >case MIN_EXPR:
> >case MAX_EXPR:
> > +  /* MIN/MAX could be convert into these. */
> > +  case BIT_IOR_EXPR:
> > +  case BIT_AND_EXPR:
> >case ABS_EXPR:
> >case ABSU_EXPR:
> >case NEGATE_EXPR:
> > --
> > 2.31.1
> >


[COMMITTEDv2] MATCH: Move `a ? one_zero : one_zero` matching after min/max matching

2023-08-25 Thread Andrew Pinski via Gcc-patches
In PR 106677, I noticed that on the trunk we were producing:
```
  _25 = SR.116_117 == 0;
  _27 = (unsigned char) _25;
  _32 = _27 | SR.116_117;
```
From `SR.115_117 != 0 ? SR.115_117 : 1`
Rather than:
```
  _119 = MAX_EXPR <1, SR.115_117>;
```
Or (rather)
```
  _119 = SR.115_117 | 1;
```
Due to the order of the patterns.

Committed as approved with the new comment and testcase.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* match.pd (`a ? one_zero : one_zero`): Move
below detection of minmax.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi-opt-34.c: New test.
---
 gcc/match.pd   | 42 --
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c | 23 
 2 files changed, 47 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c

diff --git a/gcc/match.pd b/gcc/match.pd
index d9f35e9e25b..fa598d5ca2e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4961,24 +4961,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  )
 )
 
-(simplify
- (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
- (switch
-  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
-  (if (integer_zerop (@2))
-   (bit_and (convert @0) @1))
-  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
-  (if (integer_zerop (@1))
-   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
-  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
-  (if (integer_onep (@1))
-   (bit_ior (convert @0) @2))
-  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
-  (if (integer_onep (@2))
-   (bit_ior (bit_xor (convert @0) @2) @1))
- )
-)
-
 /* Optimize
# x_5 in range [cst1, cst2] where cst2 = cst1 + 1
x_5 ? cstN ? cst4 : cst3
@@ -5309,6 +5291,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node, @3, 
@1)))
   (max @2 @4))
 
+#if GIMPLE
+/* These patterns should be after min/max detection as simplifications
+   of `(type)(zero_one ==/!= 0)` to `(type)(zero_one)`
+   and `(type)(zero_one^1)` are not done yet.  See PR 110637.
+   Even without those, reaching min/max/and/ior faster is better.  */
+(simplify
+ (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
+ (switch
+  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
+  (if (integer_zerop (@2))
+   (bit_and (convert @0) @1))
+  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
+  (if (integer_zerop (@1))
+   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
+  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
+  (if (integer_onep (@1))
+   (bit_ior (convert @0) @2))
+  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
+  (if (integer_onep (@2))
+   (bit_ior (bit_xor (convert @0) @2) @1))
+ )
+)
+#endif
+
 /* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
 (simplify
  (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c
new file mode 100644
index 000..157c3ea9a0b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-34.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* Disable early phiopt1 as  early ccp1 does not export non-zero bits
+   so at the point of phiopt1, we don't know that a is [0,1] range */
+/* { dg-options "-O1 -fdisable-tree-phiopt1 -fdump-tree-phiopt2-folding"} */
+
+unsigned f(unsigned a)
+{
+  a &= 1;
+  if (a > 0)
+return a;
+  return 1;
+}
+/* PHIOPT2 should be able to change this into just return 1;
+   (which was `MAX` or `a | 1` but since a is known to be a
+   range of [0,1], it should be folded into 1)
+   And not fold it into `(a == 0) | a`. */
+/* { dg-final { scan-tree-dump-not " == " "phiopt2" } } */
+/* { dg-final { scan-tree-dump-not " if " "phiopt2" } } */
+/* { dg-final { scan-tree-dump-not "Folded into the sequence:" "phiopt2" } } */
+/* { dg-final { scan-tree-dump "return 1;" "phiopt2" } } */
+/* We want to make sure that phiopt2 is happening and not some other pass
+   before it does the transformation. */
+/* { dg-final { scan-tree-dump "Removing basic block" "phiopt2" } } */
-- 
2.31.1



Re: [PATCH 1/3] MATCH: Move `a ? one_zero : one_zero` matching after min/max matching

2023-08-25 Thread Andrew Pinski via Gcc-patches
On Fri, Aug 25, 2023 at 11:11 AM Andrew Pinski  wrote:
>
> On Thu, Aug 24, 2023 at 11:39 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
> >  wrote:
> > >
> > > In PR 106677, I noticed that on the trunk we were producing:
> > > ```
> > >   _25 = SR.116_117 == 0;
> > >   _27 = (unsigned char) _25;
> > >   _32 = _27 | SR.116_117;
> > > ```
> > > From `SR.115_117 != 0 ? SR.115_117 : 1`
> > > Rather than:
> > > ```
> > >   _119 = MAX_EXPR <1, SR.115_117>;
> > > ```
> > > Or (rather)
> > > ```
> > >   _119 = SR.115_117 | 1;
> > > ```
> > > Due to the order of the patterns.
> >
> > Hmm, that means the former when present in source isn't optimized?
>
> That is correct; they are not optimized down to 1 at the gimple level.
> It is sometimes (but not on all targets) optimized at the RTL level
> though.

I forgot to mention that this is recorded as PR 110637 already.

Thanks,
Andrew

>
> >
> > > OK? Bootstrapped and tested on x86_64-linux-gnu with no
> > > regressions.
> >
> > OK, but please add a comment indicating the ordering requirement.
> >
> > Can you also add a testcase?
>
> Yes and yes. Will send out a new patch in a few minutes with the added
> comment and testcase.
>
> Thanks,
> Andrew
>
> >
> > Richard.
> >
> > > gcc/ChangeLog:
> > >
> > > * match.pd (`a ? one_zero : one_zero`): Move
> > > below detection of minmax.
> > > ---
> > >  gcc/match.pd | 38 --
> > >  1 file changed, 20 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index 890f050cbad..c87a0795667 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -4950,24 +4950,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >   )
> > >  )
> > >
> > > -(simplify
> > > - (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
> > > - (switch
> > > -  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
> > > -  (if (integer_zerop (@2))
> > > -   (bit_and (convert @0) @1))
> > > -  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
> > > -  (if (integer_zerop (@1))
> > > -   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
> > > -  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
> > > -  (if (integer_onep (@1))
> > > -   (bit_ior (convert @0) @2))
> > > -  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
> > > -  (if (integer_onep (@2))
> > > -   (bit_ior (bit_xor (convert @0) @2) @1))
> > > - )
> > > -)
> > > -
> > >  /* Optimize
> > > # x_5 in range [cst1, cst2] where cst2 = cst1 + 1
> > > x_5 ? cstN ? cst4 : cst3
> > > @@ -5298,6 +5280,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >&& integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node, 
> > > @3, @1)))
> > >(max @2 @4))
> > >
> > > +#if GIMPLE
> > > +(simplify
> > > + (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
> > > + (switch
> > > +  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
> > > +  (if (integer_zerop (@2))
> > > +   (bit_and (convert @0) @1))
> > > +  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
> > > +  (if (integer_zerop (@1))
> > > +   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
> > > +  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
> > > +  (if (integer_onep (@1))
> > > +   (bit_ior (convert @0) @2))
> > > +  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
> > > +  (if (integer_onep (@2))
> > > +   (bit_ior (bit_xor (convert @0) @2) @1))
> > > + )
> > > +)
> > > +#endif
> > > +
> > >  /* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
> > >  (simplify
> > >   (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
> > > --
> > > 2.31.1
> > >


Re: [PATCH 1/3] MATCH: Move `a ? one_zero : one_zero` matching after min/max matching

2023-08-25 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 24, 2023 at 11:39 PM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > In PR 106677, I noticed that on the trunk we were producing:
> > ```
> >   _25 = SR.116_117 == 0;
> >   _27 = (unsigned char) _25;
> >   _32 = _27 | SR.116_117;
> > ```
> > From `SR.115_117 != 0 ? SR.115_117 : 1`
> > Rather than:
> > ```
> >   _119 = MAX_EXPR <1, SR.115_117>;
> > ```
> > Or (rather)
> > ```
> >   _119 = SR.115_117 | 1;
> > ```
> > Due to the order of the patterns.
>
> Hmm, that means the former when present in source isn't optimized?

That is correct; they are not optimized down to 1 at the gimple level.
It is sometimes (but not on all targets) optimized at the RTL level
though.

>
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no
> > regressions.
>
> OK, but please add a comment indicating the ordering requirement.
>
> Can you also add a testcase?

Yes and yes. Will send out a new patch in a few minutes with the added
comment and testcase.

Thanks,
Andrew

>
> Richard.
>
> > gcc/ChangeLog:
> >
> > * match.pd (`a ? one_zero : one_zero`): Move
> > below detection of minmax.
> > ---
> >  gcc/match.pd | 38 --
> >  1 file changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 890f050cbad..c87a0795667 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -4950,24 +4950,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >   )
> >  )
> >
> > -(simplify
> > - (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
> > - (switch
> > -  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
> > -  (if (integer_zerop (@2))
> > -   (bit_and (convert @0) @1))
> > -  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
> > -  (if (integer_zerop (@1))
> > -   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
> > -  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
> > -  (if (integer_onep (@1))
> > -   (bit_ior (convert @0) @2))
> > -  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
> > -  (if (integer_onep (@2))
> > -   (bit_ior (bit_xor (convert @0) @2) @1))
> > - )
> > -)
> > -
> >  /* Optimize
> > # x_5 in range [cst1, cst2] where cst2 = cst1 + 1
> > x_5 ? cstN ? cst4 : cst3
> > @@ -5298,6 +5280,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >&& integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node, 
> > @3, @1)))
> >(max @2 @4))
> >
> > +#if GIMPLE
> > +(simplify
> > + (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
> > + (switch
> > +  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
> > +  (if (integer_zerop (@2))
> > +   (bit_and (convert @0) @1))
> > +  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
> > +  (if (integer_zerop (@1))
> > +   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
> > +  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
> > +  (if (integer_onep (@1))
> > +   (bit_ior (convert @0) @2))
> > +  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
> > +  (if (integer_onep (@2))
> > +   (bit_ior (bit_xor (convert @0) @2) @1))
> > + )
> > +)
> > +#endif
> > +
> >  /* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
> >  (simplify
> >   (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
> > --
> > 2.31.1
> >


[PATCH] MATCH: Move `(X & ~Y) | (~X & Y)` over to use bitwise_inverted_equal_p

2023-08-25 Thread Andrew Pinski via Gcc-patches
This moves the pattern `(X & ~Y) | (~X & Y)` to use bitwise_inverted_equal_p
so that the case where X and Y are defined by comparisons is simplified earlier.
We were able to optimize to (!X)^(!Y) in the end due to the pattern added in
r14-3110-g7fb65f102851248bafa0815 and the older pattern r13-4620-g4d9db4bdd458.
But folding it earlier is better.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Note pr87009.c now gets `return x ^ s;` in one case where the test had been
expecting `return s ^ x;`. Both are valid and are expected to be the same; we
just now choose a slightly different order of simplification, which causes the
order of the operands to be different.

gcc/ChangeLog:

* match.pd (`(X & ~Y) | (~X & Y)`): Use bitwise_inverted_equal_p
instead of specifically checking for ~X.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/cmpbit-3.c: New test.
* gcc.dg/pr87009.c: Update test.
---
 gcc/match.pd | 13 +-
 gcc/testsuite/gcc.dg/pr87009.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-3.c | 33 
 3 files changed, 41 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 70884bd48eb..e41403664d0 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1228,12 +1228,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* Simplify (X & ~Y) |^+ (~X & Y) -> X ^ Y.  */
 (for op (bit_ior bit_xor plus)
  (simplify
-  (op (bit_and:c @0 (bit_not @1)) (bit_and:c (bit_not @0) @1))
-   (bit_xor @0 @1))
- (simplify
-  (op:c (bit_and @0 INTEGER_CST@2) (bit_and (bit_not @0) INTEGER_CST@1))
-  (if (~wi::to_wide (@2) == wi::to_wide (@1))
-   (bit_xor @0 @1
+  (op (bit_and:c @0 @2) (bit_and:c @3 @1))
+  (with { bool wascmp0, wascmp1; }
+   (if (bitwise_inverted_equal_p (@2, @1, wascmp0)
+&& bitwise_inverted_equal_p (@0, @3, wascmp1)
+   && ((!wascmp0 && !wascmp1)
+   || element_precision (type) == 1))
+   (bit_xor @0 @1)
 
 /* PR53979: Transform ((a ^ b) | a) -> (a | b) */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/pr87009.c b/gcc/testsuite/gcc.dg/pr87009.c
index eb8a4ecd920..6f0341d17cc 100644
--- a/gcc/testsuite/gcc.dg/pr87009.c
+++ b/gcc/testsuite/gcc.dg/pr87009.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O -fdump-tree-original" } */
-/* { dg-final { scan-tree-dump-times "return s \\^ x;" 4 "original" } } */
+/* { dg-final { scan-tree-dump-times "return s \\^ x;|return x \\^ s;" 4 
"original" } } */
 
 int f1 (int x, int s)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-3.c
new file mode 100644
index 000..936c0934a10
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpbit-3.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw -fdump-tree-dse1-raw 
-fdump-tree-forwprop1" } */
+
+_Bool f(int a, int b)
+{
+  _Bool X = a==1, Y = b == 2;
+return (X & !Y) | (!X & Y);
+}
+
+
+_Bool f1(int a, int b)
+{
+  _Bool X = a==1, Y = b == 2;
+  _Bool c = (X & !Y);
+  _Bool d = (!X & Y);
+  return c | d;
+}
+
+/* Both of these should be optimized to (a==1) ^ (b==2) or (a != 1) ^ (b != 2) 
*/
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "ne_expr|eq_expr, " 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "gimple_assign " 6 "optimized" } } */
+
+/* Both of these should be optimized early in the pipeline after forwprop1 */
+/* { dg-final { scan-tree-dump-times "ne_expr|eq_expr, " 4 "forwprop1" { xfail 
*-*-* } } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 2 "forwprop1" { xfail 
*-*-* } } } */
+/* { dg-final { scan-tree-dump-times "gimple_assign " 6 "forwprop1" { xfail 
*-*-* } } } */
+/* Note forwprop1 does not remove all unused statements sometimes so test dse1 
also. */
+/* { dg-final { scan-tree-dump-times "ne_expr|eq_expr, " 4 "dse1" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 2 "dse1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_assign " 6 "dse1" } } */
-- 
2.31.1



Re: [PATCH 2/3] MATCH: `a | C -> C` when we know that `a & ~C == 0`

2023-08-25 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 24, 2023 at 11:37 PM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Aug 24, 2023 at 9:16 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > Even though this is handled by other code inside both VRP and CCP,
> > sometimes we want to optimize this outside of VRP and CCP.
> > An example is given in PR 106677 where phiopt will happen
> > after VRP (which removes a cast for a comparison) and then
> > phiopt will optimize the phi to be `a | 1` which can then
> > be optimized to `1` due to this patch.
>
> Also works for xor, no?

No, because IOR is a saturating operation while XOR is not. So if you
know that the nonzero bits of A are a subset of those of C
(nonzero(A) & ~nonzero(C) == 0), then A^C is not the constant C but rather
just (~A & C), and we already have a pattern that turns that back into A^C:
/* Simplify (~X & Y) to X ^ Y if we know that (X & ~Y) is 0.  */

The only thing you can do for XOR is that if you have `A ^ B` and A
and B are known not to share any bits in common (that is nonzero(A) &
nonzero(B) == 0), you can convert it to `A | B` (that is what
simplify-rtx.cc does). It looks like we don't do that at the gimple
level.

>
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> OK with or without adding XOR.

Ok.

Thanks,
Andrew

>
> Richard.
>
> > Note Similar code already exists in simplify_rtx for the RTL level;
> > it was moved from combine to simplify_rtx in r0-72539-gbd1ef757767f6d.
> > gcc/ChangeLog:
> >
> > * match.pd (`a | C -> C`): New pattern.
> > ---
> >  gcc/match.pd | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index c87a0795667..3bbeceb37b4 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -1456,6 +1456,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> >&& wi::bit_and_not (get_nonzero_bits (@0), wi::to_wide (@1)) == 0)
> >@0))
> > +/* x | C -> C if we know that x & ~C == 0.  */
> > +(simplify
> > + (bit_ior SSA_NAME@0 INTEGER_CST@1)
> > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +  && wi::bit_and_not (get_nonzero_bits (@0), wi::to_wide (@1)) == 0)
> > +  @1))
> >  #endif
> >
> >  /* ~(~X - Y) -> X + Y and ~(~X + Y) -> X - Y.  */
> > --
> > 2.31.1
> >


[PATCH 3/3] PHIOPT: Allow BIT_AND and BIT_IOR in early phiopt

2023-08-24 Thread Andrew Pinski via Gcc-patches
Now that MIN/MAX can sometimes be transformed into BIT_AND/BIT_IOR,
we should allow BIT_AND and BIT_IOR in the early phiopt.
Also we produce BIT_AND/BIT_IOR for things like `bool0 ? bool1 : 0`
which seems like a good thing to allow early on too.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (phiopt_early_allow): Allow
BIT_AND_EXPR and BIT_IOR_EXPR.
---
 gcc/tree-ssa-phiopt.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 54706f4c7e7..7e63fb115db 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -469,6 +469,9 @@ phiopt_early_allow (gimple_seq , gimple_match_op )
 {
   case MIN_EXPR:
   case MAX_EXPR:
+  /* MIN/MAX could be convert into these. */
+  case BIT_IOR_EXPR:
+  case BIT_AND_EXPR:
   case ABS_EXPR:
   case ABSU_EXPR:
   case NEGATE_EXPR:
-- 
2.31.1



[PATCH 1/3] MATCH: Move `a ? one_zero : one_zero` matching after min/max matching

2023-08-24 Thread Andrew Pinski via Gcc-patches
In PR 106677, I noticed that on the trunk we were producing:
```
  _25 = SR.116_117 == 0;
  _27 = (unsigned char) _25;
  _32 = _27 | SR.116_117;
```
From `SR.115_117 != 0 ? SR.115_117 : 1`
Rather than:
```
  _119 = MAX_EXPR <1, SR.115_117>;
```
Or (rather)
```
  _119 = SR.115_117 | 1;
```
Due to the order of the patterns.

OK? Bootstrapped and tested on x86_64-linux-gnu with no
regressions.

gcc/ChangeLog:

* match.pd (`a ? one_zero : one_zero`): Move
below detection of minmax.
---
 gcc/match.pd | 38 --
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 890f050cbad..c87a0795667 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4950,24 +4950,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  )
 )
 
-(simplify
- (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
- (switch
-  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
-  (if (integer_zerop (@2))
-   (bit_and (convert @0) @1))
-  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
-  (if (integer_zerop (@1))
-   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
-  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
-  (if (integer_onep (@1))
-   (bit_ior (convert @0) @2))
-  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
-  (if (integer_onep (@2))
-   (bit_ior (bit_xor (convert @0) @2) @1))
- )
-)
-
 /* Optimize
# x_5 in range [cst1, cst2] where cst2 = cst1 + 1
x_5 ? cstN ? cst4 : cst3
@@ -5298,6 +5280,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && integer_nonzerop (fold_build2 (GE_EXPR, boolean_type_node, @3, 
@1)))
   (max @2 @4))
 
+#if GIMPLE
+(simplify
+ (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
+ (switch
+  /* bool0 ? bool1 : 0 -> bool0 & bool1 */
+  (if (integer_zerop (@2))
+   (bit_and (convert @0) @1))
+  /* bool0 ? 0 : bool2 -> (bool0^1) & bool2 */
+  (if (integer_zerop (@1))
+   (bit_and (bit_xor (convert @0) { build_one_cst (type); } ) @2))
+  /* bool0 ? 1 : bool2 -> bool0 | bool2 */
+  (if (integer_onep (@1))
+   (bit_ior (convert @0) @2))
+  /* bool0 ? bool1 : 1 -> (bool0^1) | bool1 */
+  (if (integer_onep (@2))
+   (bit_ior (bit_xor (convert @0) @2) @1))
+ )
+)
+#endif
+
 /* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
 (simplify
  (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
-- 
2.31.1



[PATCH 2/3] MATCH: `a | C -> C` when we know that `a & ~C == 0`

2023-08-24 Thread Andrew Pinski via Gcc-patches
Even though this is handled by other code inside both VRP and CCP,
sometimes we want to optimize this outside of VRP and CCP.
An example is given in PR 106677 where phiopt will happen
after VRP (which removes a cast for a comparison) and then
phiopt will optimize the phi to be `a | 1` which can then
be optimized to `1` due to this patch.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Note: similar code already exists in simplify_rtx for the RTL level;
it was moved from combine to simplify_rtx in r0-72539-gbd1ef757767f6d.

gcc/ChangeLog:

* match.pd (`a | C -> C`): New pattern.
---
 gcc/match.pd | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index c87a0795667..3bbeceb37b4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1456,6 +1456,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
   && wi::bit_and_not (get_nonzero_bits (@0), wi::to_wide (@1)) == 0)
   @0))
+/* x | C -> C if we know that x & ~C == 0.  */
+(simplify
+ (bit_ior SSA_NAME@0 INTEGER_CST@1)
+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+  && wi::bit_and_not (get_nonzero_bits (@0), wi::to_wide (@1)) == 0)
+  @1))
 #endif
 
 /* ~(~X - Y) -> X + Y and ~(~X + Y) -> X - Y.  */
-- 
2.31.1



[PATCH] MATCH: remove negate for 1bit types

2023-08-23 Thread Andrew Pinski via Gcc-patches
For 1bit types, negate is either undefined or doesn't change the value.
In either case we want to remove it.
This patch adds a match pattern to do that.
Also, when converting to a 1bit type we can remove the negate just like we already do
for `&1`, so this patch adds that too.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Notes on the testcases:
This patch is the last part to fix PR 95929; cond-bool-2.c testcase.
bit1neg-1.c is a 1bit-field testcase where we could remove the assignment
all the way in one case (which happened on the RTL level for some targets
but not all).
cond-bool-2.c is the reduced testcase of PR 95929.

PR tree-optimization/95929

gcc/ChangeLog:

* match.pd (convert?(-a)): New pattern
for 1bit integer types.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bit1neg-1.c: New test.
* gcc.dg/tree-ssa/cond-bool-1.c: New test.
* gcc.dg/tree-ssa/cond-bool-2.c: New test.
---
 gcc/match.pd| 12 ++
 gcc/testsuite/gcc.dg/tree-ssa/bit1neg-1.c   | 23 ++
 gcc/testsuite/gcc.dg/tree-ssa/cond-bool-1.c | 21 +
 gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c | 26 +
 4 files changed, 82 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bit1neg-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cond-bool-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a2e56d5a4e8..3bbeceb37b4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -9090,6 +9090,18 @@ and,
  (if (!TYPE_OVERFLOW_SANITIZED (type))
   (bit_and @0 @1)))
 
+/* `-a` is just `a` if the type is 1bit wide or when converting
+   to a 1bit type; similar to the above transformation of `(-x)&1`.
+   This is used mostly with the transformation of
+   `a ? ~b : b` into `(-a)^b`.
+   It also can show up with bitfields.  */
+(simplify
+ (convert? (negate @0))
+ (if (INTEGRAL_TYPE_P (type)
+  && TYPE_PRECISION (type) == 1
+  && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)))
+  (convert @0)))
+
 /* Optimize
c1 = VEC_PERM_EXPR (a, a, mask)
c2 = VEC_PERM_EXPR (b, b, mask)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bit1neg-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bit1neg-1.c
new file mode 100644
index 000..2f123fbb9b5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bit1neg-1.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+struct f
+{
+  int a:1;
+};
+
+void g(struct f *a)
+{
+ int t = a->a;
+ t = -t;
+ a->a = t;
+}
+void g1(struct f *a, int b)
+{
+ int t = b;
+ t = -t;
+ a->a = t;
+}
+/* the 2 negates should have been removed as this is basically the same
+   as (-a) & 1. */
+/* { dg-final { scan-tree-dump-not " = -" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-1.c
new file mode 100644
index 000..752a3030ad1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+_Bool f1(int a, int b)
+{
+  _Bool _1 = b != 0;
+  _Bool _2 = a != 0;
+  _Bool _8 = a == 0;
+  _Bool _13;
+  if (_1) _13 = _8; else _13 = _2;
+  return _13;
+}
+
+/* We should be able to optimize this to (a != 0) ^ (b != 0) */
+/* There should be no negate_expr nor gimple_cond here. */
+
+/* { dg-final { scan-tree-dump-not "negate_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "ne_expr, " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "gimple_assign " 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c
new file mode 100644
index 000..b3e7e25dec6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cond-bool-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+/* PR tree-optimization/95929 */
+
+
+static inline _Bool nand(_Bool a, _Bool b)
+{
+return !(a && b);
+}
+
+_Bool f(int a, int b)
+{
+return nand(nand(b, nand(a, a)), nand(a, nand(b, b)));
+}
+
+/* We should be able to optimize this to (a != 0) ^ (b != 0) */
+/* There should be no negate_expr nor gimple_cond here. */
+
+/* { dg-final { scan-tree-dump-not "negate_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "ne_expr, " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "cond_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "gimple_assign " 3 

[PATCH] MATCH: [PR111109] Fix bit_ior(cond, cond) when comparisons are fp

2023-08-23 Thread Andrew Pinski via Gcc-patches
The patterns that were added in r13-4620-g4d9db4bdd458 missed that
(a > b) and (a <= b) are not inverses of each other for floating-point
comparisons (if NaNs are supported). Even though there was a check for
integral types, it was only for the result of the cond rather than for the
type of what is being compared. The fix is to check whether cmp and
icmp are inverses of each other by using the invert_tree_comparison function.

OK for trunk and GCC 13 branch? Bootstrapped and tested on x86_64-linux-gnu 
with no regressions.

I added the testcase to execute/ieee as it requires support for NAN.
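
To make the NaN problem concrete, here is a minimal sketch in C that mirrors the shape of f() in the new test. The names select_by_cmp, select_assuming_inverse and check_nan_case are made up for illustration; the second function is only an approximation of what results when `fa >= fb` is treated as the exact inverse of `fa < fb`.

```c
#include <assert.h>
#include <math.h>

/* Same shape as f() in the test: c and c1 look like inverses, but both
   are false when either input is a NaN.  */
int select_by_cmp (int a, int b, float fa, float fb)
{
  const _Bool c = fa < fb;
  const _Bool c1 = fa >= fb;
  return (c * a) | (c1 * b);
}

/* Hypothetical: the result if c1 is assumed to be exactly !c.  */
int select_assuming_inverse (int a, int b, float fa, float fb)
{
  return (fa < fb) ? a : b;
}

/* With a NaN operand the two forms disagree; without one they agree.  */
void check_nan_case (void)
{
  assert (select_by_cmp (-1, -1, NAN, 0.0f) == 0);
  assert (select_assuming_inverse (-1, -1, NAN, 0.0f) == -1);
  assert (select_by_cmp (3, 5, 1.0f, 2.0f)
          == select_assuming_inverse (3, 5, 1.0f, 2.0f));
}
```

The sketch assumes IEEE comparison semantics, i.e. that it is not built with -ffast-math or -ffinite-math-only.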

PR tree-optimization/111109

gcc/ChangeLog:

* match.pd (ior(cond,cond), ior(vec_cond,vec_cond)):
Add check to make sure cmp and icmp are inverse.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/ieee/fp-cmp-cond-1.c: New test.
---
 gcc/match.pd  | 11 ++-
 .../execute/ieee/fp-cmp-cond-1.c  | 78 +++
 2 files changed, 86 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-cond-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 85b7d323a19..b666d73b189 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2087,6 +2087,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(bit_and:c (convert? (cmp@0  @01 @02)) @3)
(bit_and:c (convert? (icmp@4 @01 @02)) @5))
 (if (INTEGRAL_TYPE_P (type)
+&& invert_tree_comparison (cmp, HONOR_NANS (@01)) == icmp
 /* The scalar version has to be canonicalized after vectorization
because it makes unconditional loads conditional ones, which
means we lose vectorization because the loads may trap.  */
@@ -2101,6 +2102,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(cond (cmp@0  @01 @02) @3 zerop)
(cond (icmp@4 @01 @02) @5 zerop))
 (if (INTEGRAL_TYPE_P (type)
+&& invert_tree_comparison (cmp, HONOR_NANS (@01)) == icmp
 /* The scalar version has to be canonicalized after vectorization
because it makes unconditional loads conditional ones, which
means we lose vectorization because the loads may trap.  */
@@ -2113,13 +2115,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (bit_ior
(bit_and:c (vec_cond:s (cmp@0 @6 @7) @4 @5) @2)
(bit_and:c (vec_cond:s (icmp@1 @6 @7) @4 @5) @3))
-(if (integer_zerop (@5))
+(if (integer_zerop (@5)
+&& invert_tree_comparison (cmp, HONOR_NANS (@6)) == icmp)
  (switch
   (if (integer_onep (@4))
(bit_and (vec_cond @0 @2 @3) @4))
(if (integer_minus_onep (@4))
 (vec_cond @0 @2 @3)))
-(if (integer_zerop (@4))
+(if (integer_zerop (@4)
+&& invert_tree_comparison (cmp, HONOR_NANS (@6)) == icmp)
  (switch
   (if (integer_onep (@5))
(bit_and (vec_cond @0 @3 @2) @5))
@@ -2132,7 +2136,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (bit_ior
(vec_cond:s (cmp@0 @4 @5) @2 integer_zerop)
(vec_cond:s (icmp@1 @4 @5) @3 integer_zerop))
-(vec_cond @0 @2 @3)))
+  (if (invert_tree_comparison (cmp, HONOR_NANS (@4)) == icmp)
+   (vec_cond @0 @2 @3))))
 
 /* Transform X & -Y into X * Y when Y is { 0 or 1 }.  */
 (simplify
diff --git a/gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-cond-1.c b/gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-cond-1.c
new file mode 100644
index 000..4a3c4b0eee2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-cond-1.c
@@ -0,0 +1,78 @@
+/* PR tree-optimization/111109 */
+
+/*
+   f should return 0 if either fa or fb is a NaN,
+   rather than the value of a or b.
+*/
+__attribute__((noipa))
+int f(int a, int b, float fa, float fb) {
+  const _Bool c = fa < fb;
+  const _Bool c1 = fa >= fb;
+  return (c * a) | (c1 * b);
+}
+
+/*
+   f1 should return 0 if either fa or fb is a NaN,
+   rather than the value of a&1 or b&1.
+*/
+__attribute__((noipa))
+int f1(int a, int b, float fa, float fb) {
+  const _Bool c = fa < fb;
+  const _Bool c1 = fa >= fb;
+  return (c & a) | (c1 & b);
+}
+
+#if __SIZEOF_INT__ == __SIZEOF_FLOAT__
+typedef int v4si __attribute__ ((vector_size (1*sizeof(int))));
+typedef float v4sf __attribute__ ((vector_size (1*sizeof(float))));
+/*
+   vf0 should return {0} if either fa or fb is a NaN,
+   rather than the value of a or b.
+*/
+__attribute__((noipa))
+v4si vf0(v4si a, v4si b, v4sf fa, v4sf fb) {
+  const v4si c = fa < fb;
+  const v4si c1 = fa >= fb;
+  return (c & a) | (c1 & b);
+}
+
+
+#endif
+
+int main(void)
+{
+  float a = __builtin_nan("");
+
+  if (f(-1,-1, a, a) != 0)
+__builtin_abort();
+  if (f(-1,-1, a, 0) != 0)
+__builtin_abort();
+  if (f(-1,-1, 0, a) != 0)
+__builtin_abort();
+  if (f(-1,-1, 0, 0) != -1)
+__builtin_abort();
+
+
+  if (f1(1,1, a, a) != 0)
+__builtin_abort();
+  if (f1(1,1, a, 0) != 0)
+__builtin_abort();
+  if (f1(1,1, 0, a) != 0)
+__builtin_abort();
+  if (f1(1,1, 0, 0) != 1)
+__builtin_abort();
+
+#if __SIZEOF_INT__ == __SIZEOF_FLOAT__
+  

Re: Patch ping Re: [PATCH 0/12] GCC _BitInt support [PR102989]

2023-08-22 Thread Andrew Pinski via Gcc-patches
On Mon, Aug 21, 2023 at 8:25 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> On Wed, Aug 09, 2023 at 08:14:14PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > Jakub Jelinek (12):
> >   expr: Small optimization [PR102989]
> >   lto-streamer-in: Adjust assert [PR102989]
> >   phiopt: Fix phiopt ICE on vops [PR102989]
> >   Middle-end _BitInt support [PR102989]
> >   _BitInt lowering support [PR102989]
> >   i386: Enable _BitInt on x86-64 [PR102989]
> >   ubsan: _BitInt -fsanitize=undefined support [PR102989]
> >   libgcc: Generated tables for _BitInt <-> _Decimal* conversions [PR102989]
> >   libgcc _BitInt support [PR102989]
> >   C _BitInt support [PR102989]
> >   testsuite part 1 for _BitInt support [PR102989]
> >   testsuite part 2 for _BitInt support [PR102989]
>
> +   C _BitInt incremental fixes [PR102989]
>
> I'd like to ping this patch series.
> First 3 patches are committed, the rest awaits patch review.
>
> Joseph, could I ask now at least for an overall design review of the
> C patches (8-10,13) whether its interfaces with middle-end are ok,
> so that Richi can review the middle-end parts?

On a related note, does it make sense to add this to the C++ front-end
as an extension too?
I noticed clang supports it for C++.

Thanks,
Andrew

>
> Thanks.
>
> Jakub
>


Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-21 Thread Andrew Pinski via Gcc-patches
On Mon, Aug 21, 2023 at 10:42 PM Lehua Ding  wrote:
>
> Hi,
>
> This patch adds conditional unary neg/abs/not autovec patterns to the RISC-V
> backend.
> Consider this C code:
>
> void
> test_3 (float *__restrict a, float *__restrict b, int *__restrict pred, int n)
> {
>   for (int i = 0; i < n; i += 1)
> {
>   a[i] = pred[i] ? __builtin_fabsf (b[i]) : a[i];
> }
> }
>
> Before this patch:
> ...
> vsetvli a7,zero,e32,m1,ta,ma
> vfabs.v v2,v2
> vmerge.vvm  v1,v1,v2,v0
> ...
>
> After this patch:
> ...
> vsetvli a7,zero,e32,m1,ta,mu
> vfabs.v v1,v2,v0.t
> ...
>
> For int neg/abs/not and FP neg patterns, defining the corresponding cond_xxx
> patterns is enough.

Maybe we should add optabs and IFN support for conditional ABS too.
I added it for conditional not with r14-3257-ga32de58c9e63 to fix up a
regression I had introduced with SVE code.

Thanks,
Andrew

> For the FP abs pattern, we need to change the definition of the `abs2` and
> `@vcond_mask_` patterns from define_expand to define_insn_and_split
> in order to fuse them into a new pattern `*cond_abs` at the combine pass.
> After changing the pattern of neg, a vlmax copysign + neg fusion pattern needs
> to be added.
>
> A fusion process similar to the one below:
>
> (insn 30 29 31 4 (set (reg:RVVM1SF 152 [ vect_iftmp.15 ])
> (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))) "float.c":15:56 discrim 1 12799 {absrvvm1sf2}
>  (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
> (nil)))
>
> (insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
> (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
> (reg:RVVM1SF 152 [ vect_iftmp.15 ])
> (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 12707 {vcond_mask_rvvm1sfrvvmf32bi}
>  (expr_list:REG_DEAD (reg:RVVM1SF 152 [ vect_iftmp.15 ])
> (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
> (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
> (nil)))))
> ==>
>
> (insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
> (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
> (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))
> (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 13444 {*cond_absrvvm1sf}
>  (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
> (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
> (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
> (nil)))))
>
> Best,
> Lehua
>
> gcc/ChangeLog:
>
> * config/riscv/autovec-opt.md (*cond_abs): New combine pattern.
> (*copysign_neg): Ditto.
> * config/riscv/autovec.md (@vcond_mask_): Adjust.
> (2): Ditto.
> (cond_): New.
> (cond_len_): Ditto.
> * config/riscv/riscv-protos.h (enum insn_type): New.
> (expand_cond_len_unop): New helper func.
> * config/riscv/riscv-v.cc (shuffle_merge_patterns): Adjust.
> (expand_cond_len_unop): New helper func.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c: New test.
> * gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c: New test.
>
> ---
>  gcc/config/riscv/autovec-opt.md   | 39 
>  gcc/config/riscv/autovec.md   | 97 +--
>  gcc/config/riscv/riscv-protos.h   |  7 +-
>  gcc/config/riscv/riscv-v.cc   | 56 ++-
>  .../riscv/rvv/autovec/cond/cond_unary_1.c | 43 
>  .../riscv/rvv/autovec/cond/cond_unary_1_run.c | 27 ++
>  .../riscv/rvv/autovec/cond/cond_unary_2.c | 47 +
>  .../riscv/rvv/autovec/cond/cond_unary_2_run.c | 28 ++
>  .../riscv/rvv/autovec/cond/cond_unary_3.c | 43 
>  .../riscv/rvv/autovec/cond/cond_unary_3_run.c | 27 ++
>  .../riscv/rvv/autovec/cond/cond_unary_4.c | 43 
>  

Re: [PATCH 2/2] VR-VALUES: Rewrite test_for_singularity using range_op_handler

2023-08-21 Thread Andrew Pinski via Gcc-patches
On Fri, Aug 11, 2023 at 8:08 AM Andrew MacLeod via Gcc-patches
 wrote:
>
>
> On 8/11/23 05:51, Richard Biener wrote:
> > On Fri, Aug 11, 2023 at 11:17 AM Andrew Pinski via Gcc-patches
> >  wrote:
> >> So it turns out there was a simpler way of starting to
> >> improve VRP to start to fix PR 110131, PR 108360, and PR 108397.
> >> That was to rewrite test_for_singularity to use range_op_handler
> >> and Value_Range.
> >>
> >> This patch implements that.
> >>
> >> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > I'm hoping Andrew/Aldy can have a look here.
> >
> > Richard.
> >
> >> gcc/ChangeLog:
> >>
> >>  * vr-values.cc (test_for_singularity): Add edge argument
> >>  and rewrite using range_op_handler.
> >>  (simplify_compare_using_range_pairs): Use Value_Range
> >>  instead of value_range and update test_for_singularity call.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>  * gcc.dg/tree-ssa/vrp124.c: New test.
> >>  * gcc.dg/tree-ssa/vrp125.c: New test.
> >> ---
> >>   gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 +
> >>   gcc/testsuite/gcc.dg/tree-ssa/vrp125.c | 44 +
> >>   gcc/vr-values.cc   | 91 --
> >>   3 files changed, 114 insertions(+), 65 deletions(-)
> >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >> new file mode 100644
> >> index 000..6ccbda35d1b
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >> @@ -0,0 +1,44 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> >> +
> >> +/* Should be optimized to a == -100 */
> >> +int g(int a)
> >> +{
> >> +  if (a == -100 || a >= 0)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 0;
> >> +}
> >> +
> >> +/* Should optimize to a == 0 */
> >> +int f(int a)
> >> +{
> >> +  if (a == 0 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 50;
> >> +}
> >> +
> >> +/* Should be optimized to a == 0. */
> >> +int f2(int a)
> >> +{
> >> +  if (a == 0 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 100;
> >> +}
> >> +
> >> +/* Should optimize to a == 100 */
> >> +int f1(int a)
> >> +{
> >> +  if (a < 0 || a == 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a > 50;
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
> >> new file mode 100644
> >> index 000..f6c2f8e35f1
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
> >> @@ -0,0 +1,44 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> >> +
> >> +/* Should be optimized to a == -100 */
> >> +int g(int a)
> >> +{
> >> +  if (a == -100 || a == -50 || a >= 0)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < -50;
> >> +}
> >> +
> >> +/* Should optimize to a == 0 */
> >> +int f(int a)
> >> +{
> >> +  if (a == 0 || a == 50 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 50;
> >> +}
> >> +
> >> +/* Should be optimized to a == 0. */
> >> +int f2(int a)
> >> +{
> >> +  if (a == 0 || a == 50 || a > 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a < 25;
> >> +}
> >> +
> >> +/* Should optimize to a == 100 */
> >> +int f1(int a)
> >> +{
> >> +  if (a < 0 || a == 50 || a == 100)
> >> +;
> >> +  else
> >> +return 0;
> >> +  return a > 50;
> >>

[PATCH] MATCH: [PR111002] Sink view_convert for vec_cond

2023-08-20 Thread Andrew Pinski via Gcc-patches
Like convert, we can sink view_convert into vec_cond, but
only if the conversion between the element types is a nop_conversion.
This allows conversions between signed and unsigned types only,
rather than between integer and float types, which mess up the vec_cond
so that isel no longer recognizes that `a?-1:0` is still that.
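
As a scalar analogy for why only nop conversions are allowed, here is a small C sketch. The helper names are hypothetical, memcpy stands in for VIEW_CONVERT_EXPR, and float is assumed to be 32-bit IEEE binary32. Reinterpreting the `-1`/`0` arms as unsigned keeps an all-ones/all-zeros mask, while reinterpreting them as float turns the all-ones arm into a NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (float) == sizeof (int32_t),
                "sketch assumes a 32-bit float");

/* Bit-for-bit reinterpretation as unsigned (a nop conversion).  */
uint32_t view_as_u32 (int32_t x)
{
  uint32_t r;
  memcpy (&r, &x, sizeof r);
  return r;
}

/* Bit-for-bit reinterpretation as float (not a nop conversion).  */
float view_as_float (int32_t x)
{
  float r;
  memcpy (&r, &x, sizeof r);
  return r;
}

void check_view_convert_sink (void)
{
  for (int c = 0; c <= 1; c++)
    {
      int32_t sel = c ? -1 : 0;
      /* View of the select equals select of the views, and the arms are
         still all-ones / all-zeros.  */
      assert (view_as_u32 (sel) == (c ? view_as_u32 (-1) : view_as_u32 (0)));
      assert (view_as_u32 (sel) == (c ? UINT32_MAX : 0u));
    }
  /* The all-ones bit pattern viewed as a float is a NaN.  */
  float f = view_as_float (-1);
  assert (f != f);
}
```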

OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

PR tree-optimization/111002

gcc/ChangeLog:

* match.pd (view_convert(vec_cond(a,b,c))): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/cond_convert_8.c: New test.
---
 gcc/match.pd  |  9 
 .../gcc.target/aarch64/sve/cond_convert_8.c   | 22 +++
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_convert_8.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 851f1af6eac..81666f28465 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4718,6 +4718,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && types_match (TREE_TYPE (@0), truth_type_for (type)))
   (vec_cond @0 (convert! @1) (convert! @2))))
 
+/* Likewise for view_convert of nop_conversions. */
+(simplify
+ (view_convert (vec_cond:s @0 @1 @2))
+ (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@1))
+  && known_eq (TYPE_VECTOR_SUBPARTS (type),
+  TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))
+  && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@1)))))
+  (vec_cond @0 (view_convert! @1) (view_convert! @2))))
+
 /* Sink binary operation to branches, but only if we can fold it.  */
 (for op (tcc_comparison plus minus mult bit_and bit_ior bit_xor
 lshift rshift rdiv trunc_div ceil_div floor_div round_div
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_8.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_8.c
new file mode 100644
index 000..d8b96e5fcfb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_8.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=256 -fdump-tree-optimized" } */
+/* PR tree-optimization/111002 */
+
+/* We should be able to remove the neg. */
+
+void __attribute__ ((noipa))
+f (int *__restrict r,
+   int *__restrict a,
+   short *__restrict pred)
+{
+  for (int i = 0; i < 1024; ++i)
+r[i] = pred[i] != 0 ? -1 : 0;
+}
+
+
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, p[0-7]+/z, #-1} 1 } } */
+/* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.[hs], p[0-7]+/z, #1} } } */
+
+/* { dg-final { scan-tree-dump-not "VIEW_CONVERT_EXPR " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " = -" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " = \\\(vector" "optimized" } } */
-- 
2.31.1



[PATCHv2/COMMITTED] MATCH: Sink convert for vec_cond

2023-08-20 Thread Andrew Pinski via Gcc-patches
Convert can be sunk into a vec_cond if both sides
fold. Unlike other unary operations, we need to check that the type of
this vec_cond's first operand is the same as the new truth type.

I tried a few different versions of this patch:
view_convert to the new truth_type, but that does not work as we always support
all vec_cond afterwards.
using expand_vec_cond_expr_p, but that would allow too much.

I also tried to see if view_convert can be handled here, but we end up with:
  _3 = VEC_COND_EXPR <_2, {  Nan(-1),  Nan(-1),  Nan(-1),  Nan(-1) }, { 0.0, 0.0, 0.0, 0.0 }>;
which isel does not know how to handle as just being a view_convert from `vector(4) `
to `vector(4) float`, and which causes a regression with `g++.target/i386/pr88152.C`.

Note, in the case of the SVE testcase, we will sink negate after the convert and be able
to remove a few extra instructions in the end.
Also, with this change gcc.target/aarch64/sve/cond_unary_5.c will now pass.
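
For readers less familiar with the transform, a scalar sketch in C of what sinking the convert means for the loop body in the new test. The function names are invented for illustration, and this is only the per-element semantics, not what the vectorizer emits.

```c
#include <assert.h>
#include <stdint.h>

/* Before: select in int, then narrow the result (convert of vec_cond).  */
uint16_t convert_of_select (int pred)
{
  int p = pred ? -1 : 0;
  return (uint16_t) p;
}

/* After: narrow each constant arm first, then select (vec_cond of converts).
   Both arms fold to constants, which is what `convert!` requires.  */
uint16_t select_of_converts (int pred)
{
  return pred ? (uint16_t) -1 : (uint16_t) 0;
}

void check_convert_sink (void)
{
  for (int pred = -2; pred <= 2; pred++)
    assert (convert_of_select (pred) == select_of_converts (pred));
  assert (select_of_converts (1) == UINT16_MAX);
  assert (select_of_converts (0) == 0);
}
```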

Committed as approved after bootstrapping and testing on x86_64-linux-gnu and
aarch64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/111006
PR tree-optimization/110986
* match.pd: (op(vec_cond(a,b,c))): Handle convert for op.

gcc/testsuite/ChangeLog:

PR tree-optimization/111006
* gcc.target/aarch64/sve/cond_convert_7.c: New test.
---
 gcc/match.pd  |  8 +++
 .../gcc.target/aarch64/sve/cond_convert_7.c   | 23 +++
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 6b2d3a11776..851f1af6eac 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4710,6 +4710,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (op (vec_cond:s @0 @1 @2))
   (vec_cond @0 (op! @1) (op! @2
 
+/* Sink unary conversions to branches, but only if we do fold both
+   and the target's truth type is the same as we already have.  */
+(simplify
+ (convert (vec_cond:s @0 @1 @2))
+ (if (VECTOR_TYPE_P (type)
+  && types_match (TREE_TYPE (@0), truth_type_for (type)))
+  (vec_cond @0 (convert! @1) (convert! @2))))
+
 /* Sink binary operation to branches, but only if we can fold it.  */
 (for op (tcc_comparison plus minus mult bit_and bit_ior bit_xor
 lshift rshift rdiv trunc_div ceil_div floor_div round_div
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c
new file mode 100644
index 000..4bb95b92195
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=256 -fdump-tree-optimized" } */
+
+/* This is a modified reduced version of cond_unary_5.c */
+
+void __attribute__ ((noipa))
+f0 (unsigned short *__restrict r,
+   int *__restrict a,
+   int *__restrict pred)
+{
+  for (int i = 0; i < 1024; ++i)
+  {
+int p = pred[i]?-1:0;
+r[i] = p ;
+  }
+}
+
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, p[0-7]+/z, #-1} 1 } } */
+/* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.[hs], p[0-7]+/z, #1} } } */
+
+/* { dg-final { scan-tree-dump-not "VIEW_CONVERT_EXPR " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " = -" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " = \\\(vector" "optimized" } } */
-- 
2.31.1



[PATCH] Document cond_neg, cond_one_cmpl, cond_len_neg and cond_len_one_cmpl standard patterns

2023-08-17 Thread Andrew Pinski via Gcc-patches
When I added `cond_one_cmpl` (and the corresponding IFN), I noticed that the
cond_neg standard named pattern was not documented. This adds the documentation
for all 4 named patterns.

OK? Tested by building the manual.
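
As a rough illustration of the vector semantics the new text documents for cond_one_cmpl, here is a minimal reference sketch in C. The name cond_one_cmpl_ref, the _Bool-array mask and the explicit element count are assumptions made for illustration; in GCC the mask has the mode returned by TARGET_VECTORIZE_GET_MASK_MODE and the count comes from the vector mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* op0[i] = op1[i] ? ~op2[i] : op3[i], element by element.  */
void cond_one_cmpl_ref (int32_t *op0, const _Bool *op1,
                        const int32_t *op2, const int32_t *op3,
                        size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    op0[i] = op1[i] ? ~op2[i] : op3[i];
}

void check_cond_one_cmpl (void)
{
  const _Bool mask[4] = { 1, 0, 1, 0 };
  const int32_t in[4] = { 0, 1, -1, 5 };
  const int32_t fallback[4] = { 10, 20, 30, 40 };
  int32_t out[4];

  cond_one_cmpl_ref (out, mask, in, fallback, 4);
  assert (out[0] == -1);  /* ~0 */
  assert (out[1] == 20);  /* mask false: fallback */
  assert (out[2] == 0);   /* ~(-1) */
  assert (out[3] == 40);  /* mask false: fallback */
}
```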

gcc/ChangeLog:

* doc/md.texi (Standard patterns): Document cond_neg, cond_one_cmpl,
cond_len_neg and cond_len_one_cmpl.
---
 gcc/doc/md.texi | 62 +
 1 file changed, 62 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 70590e68ffe..89562fdb43c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7194,6 +7194,40 @@ move operand 2 or (operands 2 + operand 3) into operand 0 according to the
 comparison in operand 1.  If the comparison is false, operand 2 is moved into
 operand 0, otherwise (operand 2 + operand 3) is moved.
 
+@cindex @code{cond_neg@var{mode}} instruction pattern
+@cindex @code{cond_one_cmpl@var{mode}} instruction pattern
+@item @samp{cond_neg@var{mode}}
+@itemx @samp{cond_one_cmpl@var{mode}}
+When operand 1 is true, perform an operation on operand 2 and
+store the result in operand 0, otherwise store operand 3 in operand 0.
+The operation works elementwise if the operands are vectors.
+
+The scalar case is equivalent to:
+
+@smallexample
+op0 = op1 ? @var{op} op2 : op3;
+@end smallexample
+
+while the vector case is equivalent to:
+
+@smallexample
+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)
+  op0[i] = op1[i] ? @var{op} op2[i] : op3[i];
+@end smallexample
+
+where, for example, @var{op} is @code{~} for @samp{cond_one_cmpl@var{mode}}.
+
+When defined for floating-point modes, the contents of @samp{op2[i]}
+are not interpreted if @samp{op1[i]} is false, just like they would not
+be in a normal C @samp{?:} condition.
+
+Operands 0, 2, and 3 all have mode @var{m}.  Operand 1 is a scalar
+integer if @var{m} is scalar, otherwise it has the mode returned by
+@code{TARGET_VECTORIZE_GET_MASK_MODE}.
+
+@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional
+form of @samp{@var{op}@var{mode}2}.
+
 @cindex @code{cond_add@var{mode}} instruction pattern
 @cindex @code{cond_sub@var{mode}} instruction pattern
 @cindex @code{cond_mul@var{mode}} instruction pattern
@@ -7281,6 +7315,34 @@ for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)
   op0[i] = op1[i] ? fma (op2[i], op3[i], op4[i]) : op5[i];
 @end smallexample
 
+@cindex @code{cond_len_neg@var{mode}} instruction pattern
+@cindex @code{cond_len_one_cmpl@var{mode}} instruction pattern
+@item @samp{cond_len_neg@var{mode}}
+@itemx @samp{cond_len_one_cmpl@var{mode}}
+When operand 1 is true and element index < operand 4 + operand 5, perform an operation on operand 2 and
+store the result in operand 0, otherwise store operand 3 in operand 0.
+The operation only works when the operands are vectors.
+
+@smallexample
+for (i = 0; i < ops[4] + ops[5]; i++)
+  op0[i] = op1[i] ? @var{op} op2[i] : op3[i];
+@end smallexample
+
+where, for example, @var{op} is @code{~} for @samp{cond_len_one_cmpl@var{mode}}.
+
+When defined for floating-point modes, the contents of @samp{op2[i]}
+are not interpreted if @samp{op1[i]} is false, just like they would not
+be in a normal C @samp{?:} condition.
+
+Operands 0, 2, and 3 all have mode @var{m}.  Operand 1 is a scalar
+integer if @var{m} is scalar, otherwise it has the mode returned by
+@code{TARGET_VECTORIZE_GET_MASK_MODE}.  Operand 4 has whichever
+integer mode the target prefers.
+
+@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional
+form of @samp{@var{op}@var{mode}2}.
+
+
 @cindex @code{cond_len_add@var{mode}} instruction pattern
 @cindex @code{cond_len_sub@var{mode}} instruction pattern
 @cindex @code{cond_len_mul@var{mode}} instruction pattern
-- 
2.31.1



[PATCH] MATCH: Sink convert for vec_cond

2023-08-16 Thread Andrew Pinski via Gcc-patches
Convert can be sunk into a vec_cond if both sides
fold. Unlike other unary operations, we need to check that the type of
this vec_cond's first operand is the same as the new truth type.

I tried a few different versions of this patch:
view_convert to the new truth_type, but that does not work as we always support
all vec_cond afterwards.
using expand_vec_cond_expr_p, but that would allow too much.

I also tried to see if view_convert can be handled here, but we end up with:
  _3 = VEC_COND_EXPR <_2, {  Nan(-1),  Nan(-1),  Nan(-1),  Nan(-1) }, { 0.0, 0.0, 0.0, 0.0 }>;
which isel does not know how to handle as just being a view_convert from `vector(4) `
to `vector(4) float`, and which causes a regression with `g++.target/i386/pr88152.C`.

Note, in the case of the SVE testcase, we will sink negate after the convert and be able
to remove a few extra instructions in the end.
Also, with this change gcc.target/aarch64/sve/cond_unary_5.c will now pass.

OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/111006
PR tree-optimization/110986
* match.pd: (op(vec_cond(a,b,c))): Handle convert for op.

gcc/testsuite/ChangeLog:

PR tree-optimization/111006
* gcc.target/aarch64/sve/cond_convert_7.c: New test.
---
 gcc/match.pd  |  9 
 .../gcc.target/aarch64/sve/cond_convert_7.c   | 23 +++
 2 files changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c

diff --git a/gcc/match.pd b/gcc/match.pd
index acd2a964917..ca5ab6f289d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4704,6 +4704,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (op (vec_cond:s @0 @1 @2))
   (vec_cond @0 (op! @1) (op! @2
 
+/* Sink unary conversions to branches, but only if we do fold both
+   and the target's truth type is the same as we already have.  */
+(for op (convert)
+ (simplify
+  (op (vec_cond:s @0 @1 @2))
+  (if (VECTOR_TYPE_P (type)
+   && types_match (TREE_TYPE (@0), truth_type_for (type)))
+   (vec_cond @0 (op! @1) (op! @2)))))
+
 /* Sink binary operation to branches, but only if we can fold it.  */
 (for op (tcc_comparison plus minus mult bit_and bit_ior bit_xor
 lshift rshift rdiv trunc_div ceil_div floor_div round_div
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c
new file mode 100644
index 000..4bb95b92195
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=256 -fdump-tree-optimized" } */
+
+/* This is a modified reduced version of cond_unary_5.c */
+
+void __attribute__ ((noipa))
+f0 (unsigned short *__restrict r,
+   int *__restrict a,
+   int *__restrict pred)
+{
+  for (int i = 0; i < 1024; ++i)
+  {
+int p = pred[i]?-1:0;
+r[i] = p ;
+  }
+}
+
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, p[0-7]+/z, #-1} 1 } } */
+/* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.[hs], p[0-7]+/z, #1} } } */
+
+/* { dg-final { scan-tree-dump-not "VIEW_CONVERT_EXPR " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " = -" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " = \\\(vector" "optimized" } } */
-- 
2.31.1



Re: [PATCH] Add support for vector conitional not

2023-08-16 Thread Andrew Pinski via Gcc-patches
On Mon, Aug 14, 2023 at 2:54 PM Andrew Pinski  wrote:
>
> On Mon, Aug 14, 2023 at 2:37 PM Richard Sandiford via Gcc-patches
>  wrote:
> >
> > Andrew Pinski via Gcc-patches  writes:
> > > Like the support for conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71),
> > > this just adds conditional not too.
> > > Also we should be able to turn `(a ? -1 : 0) ^ b` into a conditional
> > > not.
> > >
> > > OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.
> > >
> > > gcc/ChangeLog:
> > >
> > >   * internal-fn.def (COND_NOT): New internal function.
> > >   * match.pd (UNCOND_UNARY, COND_UNARY): Add bit_not/not
> > >   to the lists.
> > >   (`vec (a ? -1 : 0) ^ b`): New pattern to convert
> > >   into conditional not.
> > >   * optabs.def (cond_one_cmpl): New optab.
> > >   (cond_len_one_cmpl): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   PR target/110986
> > >   * gcc.target/aarch64/sve/cond_unary_9.c: New test.
> > > ---
> > >  gcc/internal-fn.def   |  2 ++
> > >  gcc/match.pd  | 15 --
> > >  gcc/optabs.def|  2 ++
> > >  .../gcc.target/aarch64/sve/cond_unary_9.c | 20 +++
> > >  4 files changed, 37 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_unary_9.c
> > >
> > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > > index b3c410f4b6a..3e8693dfddb 100644
> > > --- a/gcc/internal-fn.def
> > > +++ b/gcc/internal-fn.def
> > > @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
> > >   lround2.
> > >
> > > - cond_binary: a conditional binary optab, such as cond_add
> > > +   - cond_unary: a conditional unary optab, such as cond_neg
> > > - cond_ternary: a conditional ternary optab, such as 
> > > cond_fma_rev
> > >
> > > - fold_left: for scalar = FN (scalar, vector), keyed off the vector 
> > > mode
> > > @@ -276,6 +277,7 @@ DEF_INTERNAL_COND_FN (FNMA, ECF_CONST, fnma, ternary)
> > >  DEF_INTERNAL_COND_FN (FNMS, ECF_CONST, fnms, ternary)
> > >
> > >  DEF_INTERNAL_COND_FN (NEG, ECF_CONST, neg, unary)
> > > +DEF_INTERNAL_COND_FN (NOT, ECF_CONST, one_cmpl, unary)
> > >
> > >  DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index 6791060891d..2ee6d24ccee 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -84,9 +84,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >
> > >  /* Unary operations and their associated IFN_COND_* function.  */
> > >  (define_operator_list UNCOND_UNARY
> > > -  negate)
> > > +  negate bit_not)
> > >  (define_operator_list COND_UNARY
> > > -  IFN_COND_NEG)
> > > +  IFN_COND_NEG IFN_COND_NOT)
> > >
> > >  /* Binary operations and their associated IFN_COND_* function.  */
> > >  (define_operator_list UNCOND_BINARY
> > > @@ -8482,6 +8482,17 @@ and,
> > >  && is_truth_type_for (op_type, TREE_TYPE (@0)))
> > >   (cond_op (bit_not @0) @2 @1)
> > >
> > > +/* `(a ? -1 : 0) ^ b` can be converted into a conditional not.  */
> > > +(simplify
> > > + (bit_xor:c (vec_cond @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2) @3)
> > > + (if (canonicalize_math_after_vectorization_p ()
> > > +  && vectorized_internal_fn_supported_p (IFN_COND_NOT, type)
> > > +  && is_truth_type_for (type, TREE_TYPE (@0)))
> > > + (if (integer_all_onesp (@1) && integer_zerop (@2))
> > > +  (IFN_COND_NOT @0 @3 @3))
> > > +  (if (integer_all_onesp (@2) && integer_zerop (@1))
> > > +   (vec_cond (bit_not @0) @3 @3))))
> >
> > Looks like this should be IFN_COND_NOT rather than vec_cond.
>
> Yes, that should have been IFN_COND_NOT. When I was converting it to be
> explicitly IFN_COND_NOT rather than depending on vec_cond, I
> missed that part of the conversion.
> Thanks for noticing that.
>
> >
> > LGTM otherwise, but please give Richi 24hrs to comment.
>
> Will do.

Committed now with the above change (bootstrapped and tested to make
sure it worked after the change).

Thanks,
Andrew

>
> Thanks,
> A

Re: [PATCH] RISC-V: Add rotate immediate regression test

2023-08-16 Thread Andrew Pinski via Gcc-patches
On Wed, Aug 16, 2023 at 4:15 PM Patrick O'Neill  wrote:
>
> This adds new regression tests to ensure half-register rotations are
> correctly optimized into rori instructions.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zbb-rol-ror-04.c: Add half-register rotation
> cases.
> * gcc.target/riscv/zbb-rol-ror-05.c: Add half-register rotation
> case.

My suggestion is to add a new file instead of appending the testcase.
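
For readers following the thread, the idiom these tests exercise can be sketched in portable C. This is only an illustration, not one of the proposed testcases; the helper names are hypothetical. Rotating by half the register width swaps the two halves, so the left and right rotates coincide:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: generic rotate right of a 32-bit value.  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* The half-register form from the 32-bit tests: shift both ways by 16.  */
static uint32_t
swap_halves32 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

/* The 64-bit counterpart: shift both ways by 32.  */
static uint64_t
swap_halves64 (uint64_t x)
{
  return (x << 32) | (x >> 32);
}
```

Both `swap_halves32 (x)` and `rotr32 (x, 16)` give the same result, which is why a single rotate-immediate instruction is the expected code.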

Thanks,
Andrew Pinski

>
> Co-authored-by: Charlie Jenkins 
> Signed-off-by: Patrick O'Neill 
> ---
> Trunk optimized these added testcases correctly.
> GCC 13.2 and earlier do not optimize these cases correctly.
>
> Expands on testcases added in:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;f=gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c;h=0ccf520d349a82dafca0deb3d307a1080e8589a0
> ---
>  .../gcc.target/riscv/zbb-rol-ror-04.c | 20 +++
>  .../gcc.target/riscv/zbb-rol-ror-05.c | 10 ++
>  2 files changed, 30 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c 
> b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
> index 7ef4c29dd5b..dcd7be874ab 100644
> --- a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
> @@ -51,3 +51,23 @@ unsigned int foo5(unsigned int rs1, unsigned int rs2)
>  {
>  return (rs1 >> rs2) | (rs1 << (32 - rs2));
>  }
> +
> +/*
> +**foo6:
> +** rori    a0,a0,32
> +** ret
> +*/
> +unsigned long foo6(unsigned long rotate)
> +{
> +return (rotate << 32) | (rotate >> 32);
> +}
> +
> +/*
> +**foo7:
> +** roriw   a0,a0,16
> +** ret
> +*/
> +unsigned int foo7(unsigned int rotate)
> +{
> +return (rotate << 16) | (rotate >> 16);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c 
> b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
> index 2108ccc3e77..5ae1d4a92d9 100644
> --- a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
> @@ -23,3 +23,13 @@ unsigned int foo2(unsigned int rs1)
>  {
>  return (rs1 << 10) | (rs1 >> 22);
>  }
> +
> +/*
> +**foo3:
> +** rori    a0,a0,16
> +** ret
> +*/
> +unsigned int foo3(unsigned int rs1)
> +{
> +return (rs1 << 16) | (rs1 >> 16);
> +}
> --
> 2.34.1
>
>


Re: [PATCH v2 1/2] libstdc++: Implement more maintainable header

2023-08-16 Thread Andrew Pinski via Gcc-patches
On Wed, Aug 16, 2023 at 3:36 PM David Edelsohn via Gcc-patches
 wrote:
>
> Was the dependency added to the dependencies in contrib/gcc_update?
> Otherwise the timestamp can get out of sync in a Git checkout.

I checked in https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627667.html
which just added it to gcc_update.

Thanks,
Andrew

>
> Thanks, David
>
>
> On Wed, Aug 16, 2023 at 6:20 PM Jonathan Wakely  wrote:
>
> > On Wed, 16 Aug 2023 at 22:56, Jonathan Wakely  wrote:
> > >
> > > On Wed, 16 Aug 2023 at 22:39, David Edelsohn  wrote:
> > > >
> > > > Hi, Arsen
> > > >
> > > > This patch broke bootstrap because it has introduced a new GCC build
> > requirement for autogen that is not a previous requirement to build GCC.
> > Previously the repository has included post-processed files.
> > >
> > > The repo does include the generated bits/version.h file. autogen
> > > should only be needed if you modify version.dep
> >
> > And I've just checked again with an x86_64-pc-linux-gnu bootstrap on a
> > box without autogen, and it worked.
> >
> > >
> > > >
> > > > +# AutoGen .
> > > > +.PHONY: update-version
> > > > +update-version:
> > > > + cd ${bits_srcdir} && \
> > > > + autogen version.def
> > > > +
> > > >
> > > >
> > > > Thanks, David
> > > >
> > > >
> >
> >


[PATCH] Add libstdc++-v3/include/bits/version.h to gcc_update touch part

2023-08-16 Thread Andrew Pinski via Gcc-patches
This adds libstdc++-v3/include/bits/version.h so it has the correct timestamp.

Committed as obvious after running contrib/gcc_update --touch

contrib/ChangeLog:

* gcc_update: Add libstdc++-v3/include/bits/version.h.
---
 contrib/gcc_update | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/gcc_update b/contrib/gcc_update
index 1bfc67ac91a..1d7bfab4935 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -182,6 +182,7 @@ libphobos/config.h.in: libphobos/configure.ac 
libphobos/aclocal.m4
 libphobos/configure: libphobos/configure.ac libphobos/aclocal.m4
 libphobos/src/Makefile.in: libphobos/src/Makefile.am libphobos/aclocal.m4
 libphobos/testsuite/Makefile.in: libphobos/testsuite/Makefile.am 
libphobos/aclocal.m4
+libstdc++-v3/include/bits/version.h: libstdc++-v3/include/bits/version.def 
libstdc++-v3/include/bits/version.tpl
 # Top level
 Makefile.in: Makefile.tpl Makefile.def
 configure: configure.ac config/acx.m4
-- 
2.31.1



Re: [PATCH] Add support for vector conditional not

2023-08-14 Thread Andrew Pinski via Gcc-patches
On Mon, Aug 14, 2023 at 2:37 PM Richard Sandiford via Gcc-patches
 wrote:
>
> Andrew Pinski via Gcc-patches  writes:
> > Like the support for conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71),
> > this adds support for conditional not.
> > We should also be able to turn `(a ? -1 : 0) ^ b` into a conditional
> > not.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.
> >
> > gcc/ChangeLog:
> >
> >   * internal-fn.def (COND_NOT): New internal function.
> >   * match.pd (UNCOND_UNARY, COND_UNARY): Add bit_not/not
> >   to the lists.
> >   (`vec (a ? -1 : 0) ^ b`): New pattern to convert
> >   into conditional not.
> >   * optabs.def (cond_one_cmpl): New optab.
> >   (cond_len_one_cmpl): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR target/110986
> >   * gcc.target/aarch64/sve/cond_unary_9.c: New test.
> > ---
> >  gcc/internal-fn.def   |  2 ++
> >  gcc/match.pd  | 15 --
> >  gcc/optabs.def|  2 ++
> >  .../gcc.target/aarch64/sve/cond_unary_9.c | 20 +++
> >  4 files changed, 37 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_unary_9.c
> >
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index b3c410f4b6a..3e8693dfddb 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
> >   lround2.
> >
> > - cond_binary: a conditional binary optab, such as cond_add
> > +   - cond_unary: a conditional unary optab, such as cond_neg
> > - cond_ternary: a conditional ternary optab, such as cond_fma_rev
> >
> > - fold_left: for scalar = FN (scalar, vector), keyed off the vector mode
> > @@ -276,6 +277,7 @@ DEF_INTERNAL_COND_FN (FNMA, ECF_CONST, fnma, ternary)
> >  DEF_INTERNAL_COND_FN (FNMS, ECF_CONST, fnms, ternary)
> >
> >  DEF_INTERNAL_COND_FN (NEG, ECF_CONST, neg, unary)
> > +DEF_INTERNAL_COND_FN (NOT, ECF_CONST, one_cmpl, unary)
> >
> >  DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 6791060891d..2ee6d24ccee 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -84,9 +84,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >
> >  /* Unary operations and their associated IFN_COND_* function.  */
> >  (define_operator_list UNCOND_UNARY
> > -  negate)
> > +  negate bit_not)
> >  (define_operator_list COND_UNARY
> > -  IFN_COND_NEG)
> > +  IFN_COND_NEG IFN_COND_NOT)
> >
> >  /* Binary operations and their associated IFN_COND_* function.  */
> >  (define_operator_list UNCOND_BINARY
> > @@ -8482,6 +8482,17 @@ and,
> >  && is_truth_type_for (op_type, TREE_TYPE (@0)))
> >   (cond_op (bit_not @0) @2 @1)
> >
> > +/* `(a ? -1 : 0) ^ b` can be converted into a conditional not.  */
> > +(simplify
> > + (bit_xor:c (vec_cond @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2) 
> > @3)
> > + (if (canonicalize_math_after_vectorization_p ()
> > +  && vectorized_internal_fn_supported_p (IFN_COND_NOT, type)
> > +  && is_truth_type_for (type, TREE_TYPE (@0)))
> > + (if (integer_all_onesp (@1) && integer_zerop (@2))
> > +  (IFN_COND_NOT @0 @3 @3))
> > +  (if (integer_all_onesp (@2) && integer_zerop (@1))
> > +   (vec_cond (bit_not @0) @3 @3
>
> Looks like this should be IFN_COND_NOT rather than vec_cond.

Yes, that should have been IFN_COND_NOT; when I was converting it to be
explicitly IFN_COND_NOT rather than depending on vec_cond, I
missed that part of the conversion.
Thanks for noticing that.

>
> LGTM otherwise, but please give Richi 24hrs to comment.

Will do.

Thanks,
Andrew


>
> Thanks,
> Richard
>
> > +
> >  /* Simplify:
> >
> >   a = a1 op a2
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index 1ea1947b3b5..a58819bc665 100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -254,6 +254,7 @@ OPTAB_D (cond_fms_optab, "cond_fms$a")
> >  OPTAB_D (cond_fnma_optab, "cond_fnma$a")
> >  OPTAB_D (cond_fnms_optab, "cond_fnms$a")
> >  OPTAB_D (cond_neg_optab, "cond_neg$a")
> > +OPTAB_D (cond_one_cmpl_optab, "cond_one_cmpl$a")
> >  OPTAB_D (cond_len_add_optab, "cond_len_add$a")
> >  OPTAB_D (cond_l

Re: [PATCH] gcc/reload.h: Change type of x_spill_indirect_levels

2023-08-13 Thread Andrew Pinski via Gcc-patches
On Sun, Aug 13, 2023 at 12:20 PM Eddy Young  wrote:
>
> This patch changes the type of the `x_spill_indirect_levels` member of
> `struct target_reload` from `bool` to `unsigned char`.
>
> Without this change, the build of esp-open-sdk fails with GCC 11 and
> above.

This was done back in d57c99458933 for GCC 6.
https://gcc.gnu.org/r6-535-gd57c99458933a2 .
Why are you posting a patch against a branch which has not been
supported for years now?
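
For context, here is a small C sketch of why the field type matters. It assumes the field is used as a small counter, as its comment about "the level of indirect addressing supported" suggests; the helper names are hypothetical and this is not the reload code itself. In C a bool saturates at 1, and C++17 removed `++` on bool altogether:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: add one to a bool LEVELS times.
   `n = n + 1` has the same effect as `n++` on a bool in C:
   the result is converted back to bool, so it never exceeds 1.  */
static int
count_levels_bool (int levels)
{
  bool n = false;
  for (int i = 0; i < levels; ++i)
    n = n + 1;
  return n;
}

/* Hypothetical helper: the same loop with an unsigned char counter,
   which keeps the real count (for small LEVELS).  */
static int
count_levels_uchar (int levels)
{
  unsigned char n = 0;
  for (int i = 0; i < levels; ++i)
    n = n + 1;
  return n;
}
```

With two levels the bool version reports 1 while the unsigned char version reports 2.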

Thanks,
Andrew Pinski


>
> (Please bear with me, this is my first patch submission.)
>
> Cheers,
> Eddy
>
> ---
>  ChangeLog| 5 +
>  gcc/reload.h | 2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/ChangeLog b/ChangeLog
> index 3dd1ce544af..442aa9192a9 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-08-13 Eddy Young 
> +
> +   * gcc/reload.h: Change type of x_spill_indirect_levels of struct
> +   target_reload to support C++17 build.
> +
>  2015-06-23  Release Manager
>
> * GCC 4.8.5 released.
> diff --git a/gcc/reload.h b/gcc/reload.h
> index 7a13ad30e82..1e94d8ea93b 100644
> --- a/gcc/reload.h
> +++ b/gcc/reload.h
> @@ -166,7 +166,7 @@ struct target_reload {
>   value indicates the level of indirect addressing supported, e.g., two
>   means that (MEM (MEM (REG n))) is also valid if (REG n) does not get
>   a hard register.  */
> -  bool x_spill_indirect_levels;
> +  unsigned char x_spill_indirect_levels;
>
>/* True if caller-save has been reinitialized.  */
>bool x_caller_save_initialized_p;
> --
> 2.39.2
>


Re:

2023-08-13 Thread Andrew Pinski via Gcc-patches
On Sun, Aug 13, 2023 at 12:05 PM Eddy Young Tie Yang
 wrote:
>
> From d57ac4f9a095a2f616863efd524ac2d87276becb Mon Sep 17 00:00:00 2001
> From: Eddy Young 
> Date: Sun, 13 Aug 2023 19:59:12 +0100
> Subject: [PATCH] gcc/reload.h: Change type of x_spill_indirect_levels
>
> ---
>  ChangeLog| 5 +
>  gcc/reload.h | 2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/ChangeLog b/ChangeLog
> index 3dd1ce544af..442aa9192a9 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-08-13 Eddy Young 
> +
> +   * gcc/reload.h: Change type of x_spill_indirect_levels of struct
> +   target_reload to support C++17 build.

This was done back in d57c99458933 for GCC 6.
https://gcc.gnu.org/r6-535-gd57c99458933a2 .
Why are you posting a patch against a branch which has not been
supported for years now?

Thanks,
Andrew Pinski

> +
>  2015-06-23  Release Manager
>
> * GCC 4.8.5 released.
> diff --git a/gcc/reload.h b/gcc/reload.h
> index 7a13ad30e82..1e94d8ea93b 100644
> --- a/gcc/reload.h
> +++ b/gcc/reload.h
> @@ -166,7 +166,7 @@ struct target_reload {
>   value indicates the level of indirect addressing supported, e.g., two
>   means that (MEM (MEM (REG n))) is also valid if (REG n) does not get
>   a hard register.  */
> -  bool x_spill_indirect_levels;
> +  unsigned char x_spill_indirect_levels;
>
>/* True if caller-save has been reinitialized.  */
>bool x_caller_save_initialized_p;
> --
> 2.39.2
>


[PATCH] Add support for vector conditional not

2023-08-12 Thread Andrew Pinski via Gcc-patches
Like the support for conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71),
this adds support for conditional not.
We should also be able to turn `(a ? -1 : 0) ^ b` into a conditional
not.
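
The identity behind that transform can be checked with a scalar model of a single vector lane. This is only a sketch; the function names are hypothetical and nothing here uses GCC internals:

```c
#include <assert.h>
#include <stdint.h>

/* One lane of `(a ? -1 : 0) ^ b`: the mask is all ones when the
   condition holds and zero otherwise.  */
static int16_t
xor_with_mask (int cond, int16_t b)
{
  int16_t mask = cond ? -1 : 0;
  return (int16_t) (mask ^ b);
}

/* The same lane written as a conditional not.  */
static int16_t
cond_not (int cond, int16_t b)
{
  return cond ? (int16_t) ~b : b;
}

/* Hypothetical self-check: both forms agree for every 16-bit value.  */
static void
check_cond_not (void)
{
  for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v)
    {
      assert (xor_with_mask (1, (int16_t) v) == cond_not (1, (int16_t) v));
      assert (xor_with_mask (0, (int16_t) v) == cond_not (0, (int16_t) v));
    }
}
```

XOR with all ones flips every bit and XOR with zero leaves the value alone, which is exactly "not if the condition holds, else unchanged".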

OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

gcc/ChangeLog:

* internal-fn.def (COND_NOT): New internal function.
* match.pd (UNCOND_UNARY, COND_UNARY): Add bit_not/not
to the lists.
(`vec (a ? -1 : 0) ^ b`): New pattern to convert
into conditional not.
* optabs.def (cond_one_cmpl): New optab.
(cond_len_one_cmpl): Likewise.

gcc/testsuite/ChangeLog:

PR target/110986
* gcc.target/aarch64/sve/cond_unary_9.c: New test.
---
 gcc/internal-fn.def   |  2 ++
 gcc/match.pd  | 15 --
 gcc/optabs.def|  2 ++
 .../gcc.target/aarch64/sve/cond_unary_9.c | 20 +++
 4 files changed, 37 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_unary_9.c

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index b3c410f4b6a..3e8693dfddb 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
  lround2.
 
- cond_binary: a conditional binary optab, such as cond_add
+   - cond_unary: a conditional unary optab, such as cond_neg
- cond_ternary: a conditional ternary optab, such as cond_fma_rev
 
- fold_left: for scalar = FN (scalar, vector), keyed off the vector mode
@@ -276,6 +277,7 @@ DEF_INTERNAL_COND_FN (FNMA, ECF_CONST, fnma, ternary)
 DEF_INTERNAL_COND_FN (FNMS, ECF_CONST, fnms, ternary)
 
 DEF_INTERNAL_COND_FN (NEG, ECF_CONST, neg, unary)
+DEF_INTERNAL_COND_FN (NOT, ECF_CONST, one_cmpl, unary)
 
 DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
 
diff --git a/gcc/match.pd b/gcc/match.pd
index 6791060891d..2ee6d24ccee 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -84,9 +84,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Unary operations and their associated IFN_COND_* function.  */
 (define_operator_list UNCOND_UNARY
-  negate)
+  negate bit_not)
 (define_operator_list COND_UNARY
-  IFN_COND_NEG)
+  IFN_COND_NEG IFN_COND_NOT)
 
 /* Binary operations and their associated IFN_COND_* function.  */
 (define_operator_list UNCOND_BINARY
@@ -8482,6 +8482,17 @@ and,
 && is_truth_type_for (op_type, TREE_TYPE (@0)))
  (cond_op (bit_not @0) @2 @1)
 
+/* `(a ? -1 : 0) ^ b` can be converted into a conditional not.  */
+(simplify
+ (bit_xor:c (vec_cond @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2) @3)
+ (if (canonicalize_math_after_vectorization_p ()
+  && vectorized_internal_fn_supported_p (IFN_COND_NOT, type)
+  && is_truth_type_for (type, TREE_TYPE (@0)))
+ (if (integer_all_onesp (@1) && integer_zerop (@2))
+  (IFN_COND_NOT @0 @3 @3))
+  (if (integer_all_onesp (@2) && integer_zerop (@1))
+   (vec_cond (bit_not @0) @3 @3
+
 /* Simplify:
 
  a = a1 op a2
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 1ea1947b3b5..a58819bc665 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -254,6 +254,7 @@ OPTAB_D (cond_fms_optab, "cond_fms$a")
 OPTAB_D (cond_fnma_optab, "cond_fnma$a")
 OPTAB_D (cond_fnms_optab, "cond_fnms$a")
 OPTAB_D (cond_neg_optab, "cond_neg$a")
+OPTAB_D (cond_one_cmpl_optab, "cond_one_cmpl$a")
 OPTAB_D (cond_len_add_optab, "cond_len_add$a")
 OPTAB_D (cond_len_sub_optab, "cond_len_sub$a")
 OPTAB_D (cond_len_smul_optab, "cond_len_mul$a")
@@ -278,6 +279,7 @@ OPTAB_D (cond_len_fms_optab, "cond_len_fms$a")
 OPTAB_D (cond_len_fnma_optab, "cond_len_fnma$a")
 OPTAB_D (cond_len_fnms_optab, "cond_len_fnms$a")
 OPTAB_D (cond_len_neg_optab, "cond_len_neg$a")
+OPTAB_D (cond_len_one_cmpl_optab, "cond_len_one_cmpl$a")
 OPTAB_D (cmov_optab, "cmov$a6")
 OPTAB_D (cstore_optab, "cstore$a4")
 OPTAB_D (ctrap_optab, "ctrap$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_9.c 
b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_9.c
new file mode 100644
index 000..d6bc0409630
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_9.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=256 
-fdump-tree-optimized" } */
+
+/* This is a reduced version of cond_unary_5.c */
+
+void __attribute__ ((noipa))
+f (short *__restrict r,
+   short *__restrict a,
+   short *__restrict pred)
+{
+  for (int i = 0; i < 1024; ++i)
+r[i] = pred[i] != 0 ? ~(a[i]) : a[i];
+}
+
+/* { dg-final { scan-assembler-times {\tnot\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+
+/* { dg-final { scan-assembler-not {\teor\tz} } } */
+/* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.h, p[0-7]/m, #-1} } } */
+
+/* { dg-final { scan-tree-dump-times ".COND_NOT " 1 "optimized" } } */
-- 
2.31.1



[PATCH 2/2] VR-VALUES: Rewrite test_for_singularity using range_op_handler

2023-08-11 Thread Andrew Pinski via Gcc-patches
So it turns out there is a simpler way to start
improving VRP toward fixing PR 110131, PR 108360, and PR 108397:
rewrite test_for_singularity to use range_op_handler
and Value_Range.

This patch implements that.
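
As a rough standalone model of what a singularity test over range pairs computes (this is not the range_op_handler API; `struct interval` and `singleton_below` are hypothetical names), the sketch below intersects the sub-ranges of a variable with the values satisfying `x < bound` and reports whether exactly one value remains:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical closed interval [lo, hi]; not a GCC type.  */
struct interval { long lo, hi; };

/* Return true and set *VALUE if exactly one value in the N sub-ranges
   PAIRS is strictly below BOUND.  */
static bool
singleton_below (const struct interval *pairs, size_t n, long bound,
                 long *value)
{
  bool found = false;
  for (size_t i = 0; i < n; ++i)
    {
      /* Skip sub-ranges that lie entirely at or above BOUND.  */
      if (pairs[i].lo >= bound)
        continue;
      /* Clip the upper end to the largest value below BOUND.  */
      long hi = pairs[i].hi < bound ? pairs[i].hi : bound - 1;
      /* More than one surviving value means no singularity.  */
      if (pairs[i].lo != hi || found)
        return false;
      *value = hi;
      found = true;
    }
  return found;
}
```

For the range `[0,0][101,INF]` and the comparison `a < 50`, only 0 survives, which matches the `a == 0` expected by the new tests.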

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* vr-values.cc (test_for_singularity): Add edge argument
and rewrite using range_op_handler.
(simplify_compare_using_range_pairs): Use Value_Range
instead of value_range and update test_for_singularity call.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp124.c: New test.
* gcc.dg/tree-ssa/vrp125.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 +
 gcc/testsuite/gcc.dg/tree-ssa/vrp125.c | 44 +
 gcc/vr-values.cc   | 91 --
 3 files changed, 114 insertions(+), 65 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp125.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
new file mode 100644
index 000..6ccbda35d1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Should be optimized to a == -100 */
+int g(int a)
+{
+  if (a == -100 || a >= 0)
+;
+  else
+return 0;
+  return a < 0;
+}
+
+/* Should optimize to a == 0 */
+int f(int a)
+{
+  if (a == 0 || a > 100)
+;
+  else
+return 0;
+  return a < 50;
+}
+
+/* Should be optimized to a == 0. */
+int f2(int a)
+{
+  if (a == 0 || a > 100)
+;
+  else
+return 0;
+  return a < 100;
+}
+
+/* Should optimize to a == 100 */
+int f1(int a)
+{
+  if (a < 0 || a == 100)
+;
+  else
+return 0;
+  return a > 50;
+}
+
+/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
new file mode 100644
index 000..f6c2f8e35f1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp125.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Should be optimized to a == -100 */
+int g(int a)
+{
+  if (a == -100 || a == -50 || a >= 0)
+;
+  else
+return 0;
+  return a < -50;
+}
+
+/* Should optimize to a == 0 */
+int f(int a)
+{
+  if (a == 0 || a == 50 || a > 100)
+;
+  else
+return 0;
+  return a < 50;
+}
+
+/* Should be optimized to a == 0. */
+int f2(int a)
+{
+  if (a == 0 || a == 50 || a > 100)
+;
+  else
+return 0;
+  return a < 25;
+}
+
+/* Should optimize to a == 100 */
+int f1(int a)
+{
+  if (a < 0 || a == 50 || a == 100)
+;
+  else
+return 0;
+  return a > 50;
+}
+
+/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index a4fddd62841..7004b0224bd 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -907,66 +907,30 @@ simplify_using_ranges::simplify_bit_ops_using_ranges
a known value range VR.
 
If there is one and only one value which will satisfy the
-   conditional, then return that value.  Else return NULL.
-
-   If signed overflow must be undefined for the value to satisfy
-   the conditional, then set *STRICT_OVERFLOW_P to true.  */
+   conditional on the EDGE, then return that value.
+   Else return NULL.  */
 
 static tree
 test_for_singularity (enum tree_code cond_code, tree op0,
- tree op1, const value_range *vr)
+ tree op1, Value_Range vr, bool edge)
 {
-  tree min = NULL;
-  tree max = NULL;
-
-  /* Extract minimum/maximum values which satisfy the conditional as it was
- written.  */
-  if (cond_code == LE_EXPR || cond_code == LT_EXPR)
+  /* This is already a singularity.  */
+  if (cond_code == NE_EXPR || cond_code == EQ_EXPR)
+return NULL;
+  auto range_op = range_op_handler (cond_code);
+  int_range<2> op1_range (TREE_TYPE (op0));
+  wide_int w = wi::to_wide (op1);
+  op1_range.set (TREE_TYPE (op1), w, w);
+  Value_Range vr1(TREE_TYPE (op0));
+  if (range_op.op1_range (vr1, TREE_TYPE (op0),
+ edge ? range_true () : range_false (),
+ op1_range))
 {
-  min = TYPE_MIN_VALUE (TREE_TYPE (op0));
-
-  max = op1;
-  if (cond_code == LT_EXPR)
-   {
- tree one = build_int_cst (TREE_TYPE (op0), 1);
- max = fold_build2 (MINUS_EXPR, TREE_TYPE (op0), max, one);
- /* Signal to compare_values_warnv this expr doesn't overflow.  */
- if (EXPR_P (max))
-   suppress_warning (max, OPT_Woverflow);
-   }
-}
-  else if (cond_code == GE_EXPR || cond_code == GT_EXPR)
-{
-  max = TYPE_MAX_VALUE (TREE_TYPE (op0));
-
-  min = op1;
-  if (cond_code == GT_EXPR)
-   {
- tree one = build_int_cst (TREE_TYPE (op0), 1);
- min = fold_build2 (PLUS_EXPR, 

[PATCH 1/2] PHI-OPT [PR 110984]: Add support for NE_EXPR/EQ_EXPR with casts to spaceship_replacement

2023-08-11 Thread Andrew Pinski via Gcc-patches
So with my next VRP patch, VRP causes:
```
  # c$_M_value_18 = PHI <-1(3), 0(2), 1(4)>
  _11 = (unsigned int) c$_M_value_18;
  _16 = _11 <= 1;
```
To be changed to:
```
  # c$_M_value_18 = PHI <-1(3), 0(2), 1(4)>
  _11 = (unsigned int) c$_M_value_18;
  _16 = _11 != 4294967295;
```

So let's add support for the above.
A few changes were needed. The first is to extend
the range check on the rhs of the comparison so that it also
accepts integer_all_onesp.

The next is to add support for the cast and EQ/NE case.
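
To see why the two forms are interchangeable here, a small C sketch (hypothetical function names, assuming a 32-bit unsigned type as in the dump) compares them over the PHI's possible values -1, 0 and 1:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The comparison before VRP: `_11 <= 1` on the unsigned cast.  */
static bool
cmp_before (int32_t c)
{
  return (uint32_t) c <= 1u;
}

/* The comparison after VRP: `_11 != 4294967295` on the unsigned cast.  */
static bool
cmp_after (int32_t c)
{
  return (uint32_t) c != UINT32_MAX;
}
```

The two agree only because the value is known to be one of -1, 0 or 1; for any other input, such as 2, they differ, so the replacement has to accept the all-ones constant as well.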

Note on the testcases: pr110984-1.c is basically pr94589-2.c but
with what the C++ code is doing with the signed char type;
pr110984-2.c is pr110984-1.c with the cast added to give an
explicit testcase to test against.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

PR tree-optimization/110984

gcc/ChangeLog:

* tree-ssa-phiopt.cc (spaceship_replacement): Add support for
NE/EQ for the cast case.

gcc/testsuite/ChangeLog:

* gcc.dg/pr110984-1.c: New test.
* gcc.dg/pr110984-2.c: New test.
---
 gcc/testsuite/gcc.dg/pr110984-1.c | 37 +++
 gcc/testsuite/gcc.dg/pr110984-2.c | 21 ++
 gcc/tree-ssa-phiopt.cc| 19 +---
 3 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr110984-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr110984-2.c

diff --git a/gcc/testsuite/gcc.dg/pr110984-1.c 
b/gcc/testsuite/gcc.dg/pr110984-1.c
new file mode 100644
index 000..85b19eb8279
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr110984-1.c
@@ -0,0 +1,37 @@
+/* PR tree-optimization/110984 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 
\[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 
5\\.0" 14 "optimized" } } */
+
+/* This is similar to pr94589-2.c except use signed char as the type for the 
[-1,2] case */
+
+#define A __attribute__((noipa))
+A int f1 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c == 0; }
+A int f2 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c != 0; }
+A int f3 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c > 0; }
+A int f4 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c < 0; }
+A int f5 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 0; }
+A int f6 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c <= 0; }
+A int f7 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c == -1; }
+A int f8 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c != -1; }
+A int f9 (double i, double j) { signed char c; if (i == j) c = 0; else if (i < 
j) c = -1; else if (i > j) c = 1; else c = 2; return c > -1; }
+A int f10 (double i, double j) { signed char c; if (i == j) c = 0; else if (i 
< j) c = -1; else if (i > j) c = 1; else c = 2; return c <= -1; }
+A int f11 (double i, double j) { signed char c; if (i == j) c = 0; else if (i 
< j) c = -1; else if (i > j) c = 1; else c = 2; return c == 1; }
+A int f12 (double i, double j) { signed char c; if (i == j) c = 0; else if (i 
< j) c = -1; else if (i > j) c = 1; else c = 2; return c != 1; }
+A int f13 (double i, double j) { signed char c; if (i == j) c = 0; else if (i 
< j) c = -1; else if (i > j) c = 1; else c = 2; return c < 1; }
+A int f14 (double i, double j) { signed char c; if (i == j) c = 0; else if (i 
< j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 1; }
+A int f15 (double i) { signed char c; if (i == 5.0) c = 0; else if (i < 5.0) c 
= -1; else if (i > 5.0) c = 1; else c = 2; return c == 0; }
+A int f16 (double i) { signed char c; if (i == 5.0) c = 0; else if (i < 5.0) c 
= -1; else if (i > 5.0) c = 1; else c = 2; return c != 0; }
+A int f17 (double i) { signed char c; if (i == 5.0) c = 0; else if (i < 5.0) c 
= -1; else if (i > 5.0) c = 1; else c = 2; return c > 0; }
+A int f18 (double i) { signed char c; if (i == 5.0) c = 0; else if (i < 5.0) c 
= -1; else if (i > 5.0) c = 1; else c = 2; return c < 0; }
+A int f19 (double i) { signed char c; if (i == 5.0) c = 0; else if (i < 5.0) c 
= -1; else if (i > 5.0) c = 1; else c = 2; return c >= 0; }
+A int f20 (double i) { signed char c; if (i == 5.0) c = 0; else if (i < 5.0) c 
= -1; else if (i > 5.0) c = 1; else c = 2; return c <= 0; }
+A int 

Re: [PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-10 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 10, 2023 at 12:08 PM Andrew Pinski  wrote:
>
> On Thu, Aug 10, 2023 at 12:18 AM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches
> >  wrote:
> > >
> > > > If `A` has a range of `[0,0][100,INF]` and the comparison
> > > > is `A < 50`, this should be optimized to `A <= 0` (which then
> > > > will be optimized to just `A == 0`).
> > > > This patch implements this via a new function which checks whether
> > > > the constant of a comparison falls between 2 range pairs
> > > > and changes the constant to either the upper bound of the first pair
> > > > or the lower bound of the second pair, depending on the comparison.
> > > >
> > > > This is the first step in fixing the following PRs:
> > > > PR 110131, PR 108360, and PR 108397.
> > >
> > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> >
> >
> > > gcc/ChangeLog:
> > >
> > > * vr-values.cc (simplify_compare_using_range_pairs): New function.
> > > (simplify_using_ranges::simplify_compare_using_ranges_1): Call
> > > it.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/tree-ssa/vrp124.c: New test.
> > > * gcc.dg/pr21643.c: Disable VRP.
> > > ---
> > >  gcc/testsuite/gcc.dg/pr21643.c |  6 ++-
> > >  gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 +
> > >  gcc/vr-values.cc   | 65 ++
> > >  3 files changed, 114 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> > >
> > > diff --git a/gcc/testsuite/gcc.dg/pr21643.c 
> > > b/gcc/testsuite/gcc.dg/pr21643.c
> > > index 4e7f93d351a..7f121d7006f 100644
> > > --- a/gcc/testsuite/gcc.dg/pr21643.c
> > > +++ b/gcc/testsuite/gcc.dg/pr21643.c
> > > @@ -1,6 +1,10 @@
> > >  /* PR tree-optimization/21643 */
> > >  /* { dg-do compile } */
> > > -/* { dg-options "-O2 -fdump-tree-reassoc1-details --param 
> > > logical-op-non-short-circuit=1" } */
> > > +/* Note VRP is able to transform `c >= 0x20` in f7
> > > +   to `c >= 0x21` since we want to test
> > > +   reassociation and not VRP, turn it off. */
> > > +
> > > +/* { dg-options "-O2 -fdump-tree-reassoc1-details --param 
> > > logical-op-non-short-circuit=1 -fno-tree-vrp" } */
> > >
> > >  int
> > >  f1 (unsigned char c)
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c 
> > > b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> > > new file mode 100644
> > > index 000..6ccbda35d1b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> > > @@ -0,0 +1,44 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > > +
> > > +/* Should be optimized to a == -100 */
> > > +int g(int a)
> > > +{
> > > +  if (a == -100 || a >= 0)
> > > +;
> > > +  else
> > > +return 0;
> > > +  return a < 0;
> > > +}
> > > +
> > > +/* Should optimize to a == 0 */
> > > +int f(int a)
> > > +{
> > > +  if (a == 0 || a > 100)
> > > +;
> > > +  else
> > > +return 0;
> > > +  return a < 50;
> > > +}
> > > +
> > > +/* Should be optimized to a == 0. */
> > > +int f2(int a)
> > > +{
> > > +  if (a == 0 || a > 100)
> > > +;
> > > +  else
> > > +return 0;
> > > +  return a < 100;
> > > +}
> > > +
> > > +/* Should optimize to a == 100 */
> > > +int f1(int a)
> > > +{
> > > +  if (a < 0 || a == 100)
> > > +;
> > > +  else
> > > +return 0;
> > > +  return a > 50;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
> > > diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> > > index a4fddd62841..1262e7cf9f0 100644
> > > --- a/gcc/vr-values.cc
> > > +++ b/gcc/vr-values.cc
> > > @@ -968,9 +968,72 @@ test_for_singularity (enum tree_code cond_code, tree 
> > > op0,
> > >if (operand_equal_p (min, max, 0) && is_gimple_min_invariant (min))
> > > return min;
> > 

[PATCHv2] Fix PR 110954: wrong code with cmp | !cmp

2023-08-10 Thread Andrew Pinski via Gcc-patches
This was an oversight on my part: I forgot that
cmp might have a true value other than all ones;
in most cases its true value is 1.
This means that for `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.

This is version 2 of the patch.
I decided to go down a different route than just checking whether
the precision was 1 inside bitwise_inverted_equal_p.
Instead, bitwise_inverted_equal_p gets passed an argument
that is set if the operands being compared were comparisons,
and the user of bitwise_inverted_equal_p decides what needs to be done.
In most uses of bitwise_inverted_equal_p, the check will be
`!wascmp || element_precision (type) == 1`.
But in the case of `a & ~a` and `a ^| ~a` we can handle the
wascmp case by using constant_boolean_node instead.
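
The difference between the two results can be reproduced in plain C. This is only an illustration with hypothetical function names, assuming the usual two's complement int: a comparison yields 0 or 1, so its logical inverse is not its bitwise inverse unless the type has 1-bit precision.

```c
#include <assert.h>

/* `cmp | !cmp`: the inverse of a comparison is another 0/1 value,
   so the OR is always 1.  */
static int
or_with_logical_not (int f)
{
  int cmp = f < 0;
  return cmp | !cmp;
}

/* `cmp | ~cmp`: a true bitwise inverse sets every bit, so the OR
   is always -1 (all ones).  */
static int
or_with_bitwise_not (int f)
{
  int cmp = f < 0;
  return cmp | ~cmp;
}
```

Treating `!(f < 0)` as if it were `~(f < 0)` is what produced -1 in place of 1.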

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR 110954

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): Add
wascmp argument and set it accordingly.
* gimple-match-head.cc (bitwise_inverted_equal_p): Add
wascmp argument to the macro.
(gimple_bitwise_inverted_equal_p): Add
wascmp argument and set it accordingly.
* match.pd (`a & ~a`, `a ^| ~a`): Update call
to bitwise_inverted_equal_p and handle wascmp case.
(`(~x | y) & x`, `(~x | y) & x`, `a?~t:t`): Update
call to bitwise_inverted_equal_p and check to see
if was !wascmp or if precision was 1.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110954-1.c: New test.
---
 gcc/generic-match-head.cc |  4 +-
 gcc/gimple-match-head.cc  |  8 ++--
 gcc/match.pd  | 41 +++
 .../gcc.c-torture/execute/pr110954-1.c| 10 +
 4 files changed, 43 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110954-1.c

diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
index ddaf22f2179..f40a35c48a6 100644
--- a/gcc/generic-match-head.cc
+++ b/gcc/generic-match-head.cc
@@ -127,10 +127,11 @@ bitwise_equal_p (tree expr1, tree expr2)
The types can differ through nop conversions.  */
 
 static inline bool
-bitwise_inverted_equal_p (tree expr1, tree expr2)
+bitwise_inverted_equal_p (tree expr1, tree expr2, bool &wascmp)
 {
   STRIP_NOPS (expr1);
   STRIP_NOPS (expr2);
+  wascmp = false;
   if (expr1 == expr2)
 return false;
   if (!tree_nop_conversion_p (TREE_TYPE (expr1), TREE_TYPE (expr2)))
@@ -150,6 +151,7 @@ bitwise_inverted_equal_p (tree expr1, tree expr2)
 {
   tree op10 = TREE_OPERAND (expr1, 0);
   tree op20 = TREE_OPERAND (expr2, 0);
+  wascmp = true;
   if (!operand_equal_p (op10, op20))
return false;
   tree op11 = TREE_OPERAND (expr1, 1);
diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
index a097a494c39..ea6387a1099 100644
--- a/gcc/gimple-match-head.cc
+++ b/gcc/gimple-match-head.cc
@@ -267,8 +267,8 @@ gimple_bitwise_equal_p (tree expr1, tree expr2, tree 
(*valueize) (tree))
 /* Return true if EXPR1 and EXPR2 have the bitwise opposite value,
but not necessarily same type.
The types can differ through nop conversions.  */
-#define bitwise_inverted_equal_p(expr1, expr2) \
-  gimple_bitwise_inverted_equal_p (expr1, expr2, valueize)
+#define bitwise_inverted_equal_p(expr1, expr2, wascmp) \
+  gimple_bitwise_inverted_equal_p (expr1, expr2, wascmp, valueize)
 
 
 bool gimple_bit_not_with_nop (tree, tree *, tree (*) (tree));
@@ -277,8 +277,9 @@ bool gimple_maybe_cmp (tree, tree *, tree (*) (tree));
 /* Helper function for bitwise_equal_p macro.  */
 
 static inline bool
-gimple_bitwise_inverted_equal_p (tree expr1, tree expr2, tree (*valueize) 
(tree))
+gimple_bitwise_inverted_equal_p (tree expr1, tree expr2, bool &wascmp, tree 
(*valueize) (tree))
 {
+  wascmp = false;
   if (expr1 == expr2)
 return false;
   if (!tree_nop_conversion_p (TREE_TYPE (expr1), TREE_TYPE (expr2)))
@@ -331,6 +332,7 @@ gimple_bitwise_inverted_equal_p (tree expr1, tree expr2, 
tree (*valueize) (tree)
   tree op21 = do_valueize (valueize, gimple_assign_rhs2 (a2));
   if (!operand_equal_p (op11, op21))
 return false;
+  wascmp = true;
   if (invert_tree_comparison (gimple_assign_rhs_code (a1),
  HONOR_NANS (op10))
== gimple_assign_rhs_code (a2))
diff --git a/gcc/match.pd b/gcc/match.pd
index fc630b63563..6791060891d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1179,9 +1179,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* Simplify ~X & X as zero.  */
 (simplify
  (bit_and (convert? @0) (convert? @1))
- (if (types_match (TREE_TYPE (@0), TREE_TYPE (@1))
-  && bitwise_inverted_equal_p (@0, @1))
-  { build_zero_cst (type); }))
+ (with { bool wascmp; }
+  (if (types_match (TREE_TYPE (@0), TREE_TYPE (@1))
+   && bitwise_inverted_equal_p (@0, @1, wascmp))
+   { wascmp ? constant_boolean_node (false, type) : build_zero_cst 

Re: [PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-10 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 10, 2023 at 12:18 AM Richard Biener via Gcc-patches
 wrote:
>
> On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > If `A` has a range of `[0,0][100,INF]` and the comparison
> > of `A < 50`. This should be optimized to `A <= 0` (which then
> > will be optimized to just `A == 0`).
> > This patch implement this via a new function which sees if
> > the constant of a comparison is in the middle of 2 range pairs
> > and change the constant to the either upper bound of the first pair
> > or the lower bound of the second pair depending on the comparison.
> >
> > This is the first step in fixing the following PRS:
> > PR 110131, PR 108360, and PR 108397.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
>
>
> > gcc/ChangeLog:
> >
> > * vr-values.cc (simplify_compare_using_range_pairs): New function.
> > (simplify_using_ranges::simplify_compare_using_ranges_1): Call
> > it.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/vrp124.c: New test.
> > * gcc.dg/pr21643.c: Disable VRP.
> > ---
> >  gcc/testsuite/gcc.dg/pr21643.c |  6 ++-
> >  gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 +
> >  gcc/vr-values.cc   | 65 ++
> >  3 files changed, 114 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/pr21643.c b/gcc/testsuite/gcc.dg/pr21643.c
> > index 4e7f93d351a..7f121d7006f 100644
> > --- a/gcc/testsuite/gcc.dg/pr21643.c
> > +++ b/gcc/testsuite/gcc.dg/pr21643.c
> > @@ -1,6 +1,10 @@
> >  /* PR tree-optimization/21643 */
> >  /* { dg-do compile } */
> > -/* { dg-options "-O2 -fdump-tree-reassoc1-details --param 
> > logical-op-non-short-circuit=1" } */
> > +/* Note VRP is able to transform `c >= 0x20` in f7
> > +   to `c >= 0x21` since we want to test
> > +   reassociation and not VRP, turn it off. */
> > +
> > +/* { dg-options "-O2 -fdump-tree-reassoc1-details --param 
> > logical-op-non-short-circuit=1 -fno-tree-vrp" } */
> >
> >  int
> >  f1 (unsigned char c)
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c 
> > b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> > new file mode 100644
> > index 000..6ccbda35d1b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
> > @@ -0,0 +1,44 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > +
> > +/* Should be optimized to a == -100 */
> > +int g(int a)
> > +{
> > +  if (a == -100 || a >= 0)
> > +;
> > +  else
> > +return 0;
> > +  return a < 0;
> > +}
> > +
> > +/* Should optimize to a == 0 */
> > +int f(int a)
> > +{
> > +  if (a == 0 || a > 100)
> > +;
> > +  else
> > +return 0;
> > +  return a < 50;
> > +}
> > +
> > +/* Should be optimized to a == 0. */
> > +int f2(int a)
> > +{
> > +  if (a == 0 || a > 100)
> > +;
> > +  else
> > +return 0;
> > +  return a < 100;
> > +}
> > +
> > +/* Should optimize to a == 100 */
> > +int f1(int a)
> > +{
> > +  if (a < 0 || a == 100)
> > +;
> > +  else
> > +return 0;
> > +  return a > 50;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
> > diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> > index a4fddd62841..1262e7cf9f0 100644
> > --- a/gcc/vr-values.cc
> > +++ b/gcc/vr-values.cc
> > @@ -968,9 +968,72 @@ test_for_singularity (enum tree_code cond_code, tree 
> > op0,
> >if (operand_equal_p (min, max, 0) && is_gimple_min_invariant (min))
> > return min;
> >  }
> > +
> >return NULL;
> >  }
> >
> > +/* Simplify integer comparisons such that the constant is one of the range 
> > pairs.
> > +   For an example,
> > +   A has a range of [0,0][100,INF]
> > +   and the comparison of `A < 50`.
> > +   This should be optimized to `A <= 0`
> > +   and then test_for_singularity can optimize it to `A == 0`.   */
> > +
> > +static bool
> > +simplify_compare_using_range_pairs (tree_code _code, tree , tree 
> > ,
> > +   const value_range *vr)
> > +{
> &

Re: [PATCH] Fix PR 110954: wrong code with cmp | !cmp

2023-08-10 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 10, 2023 at 12:32 AM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Aug 10, 2023 at 2:21 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > This was an oversight on my part not realizing that
> > comparisons in generic can have a non-boolean type.
> > This means if we have `(f < 0) | !(f < 0)` we would
> > optimize this to -1 rather than just 1.
> > This patch just adds the check for the type of the comparisons
> > to be boolean type to keep the optimization in that case.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > PR 110954
> >
> > gcc/ChangeLog:
> >
> > * generic-match-head.cc (bitwise_inverted_equal_p): Check
> > the type of the comparison to be boolean too.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.c-torture/execute/pr110954-1.c: New test.
> > ---
> >  gcc/generic-match-head.cc|  3 ++-
> >  gcc/testsuite/gcc.c-torture/execute/pr110954-1.c | 10 ++
> >  2 files changed, 12 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
> >
> > diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
> > index ddaf22f2179..ac2119bfdd0 100644
> > --- a/gcc/generic-match-head.cc
> > +++ b/gcc/generic-match-head.cc
> > @@ -146,7 +146,8 @@ bitwise_inverted_equal_p (tree expr1, tree expr2)
> >&& bitwise_equal_p (expr1, TREE_OPERAND (expr2, 0)))
> >  return true;
> >if (COMPARISON_CLASS_P (expr1)
> > -  && COMPARISON_CLASS_P (expr2))
> > +  && COMPARISON_CLASS_P (expr2)
> > +  && TREE_CODE (TREE_TYPE (expr1)) == BOOLEAN_TYPE)
>
> in other places we restrict this to single-bit integral types instead which
> covers a few more cases and also would handle BOOLEAN_TYPE
> with either padding or non-padding extra bits correctly (IIRC fortran
> has only padding bits but Ada has BOOLEAN_TYPEs with possibly
> > 1 bit precision and arbitrary signedness - maybe even with custom
> true/false values).

Yes and I have a better patch which still allows us to optimize the
case of `cmp & ~cmp` and `cmp | ~cmp`. I will post it after it
finishes bootstrapping.

Thanks,
Andrew

>
> Richard.
>
> >  {
> >tree op10 = TREE_OPERAND (expr1, 0);
> >tree op20 = TREE_OPERAND (expr2, 0);
> > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c 
> > b/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
> > new file mode 100644
> > index 000..8aad758e10f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
> > @@ -0,0 +1,10 @@
> > +
> > +#define comparison (f < 0)
> > +int main() {
> > +  int f = 0;
> > +  int d = comparison | !comparison;
> > +  if (d != 1)
> > +__builtin_abort();
> > +  return 0;
> > +}
> > +
> > --
> > 2.31.1
> >


Re: [PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-10 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 10, 2023 at 6:39 AM Christophe Lyon via Gcc-patches
 wrote:
>
> Hi Andrew,
>
>
> On Wed, 9 Aug 2023 at 21:20, Andrew Pinski via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> > This adds a simple match pattern for this case.
> > I noticed it a couple of different places.
> > One while I was looking at code generation of a parser and
> > also while I was looking at locations where bitwise_inverted_equal_p
> > should be used more.
> >
> > Committed as approved after bootstrapped and tested on x86_64-linux-gnu
> > with no regressions.
> >
> > PR tree-optimization/110937
> > PR tree-optimization/100798
> >
> > gcc/ChangeLog:
> >
> > * match.pd (`a ? ~b : b`): Handle this
> > case.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/bool-14.c: New test.
> > * gcc.dg/tree-ssa/bool-15.c: New test.
> > * gcc.dg/tree-ssa/phi-opt-33.c: New test.
> > * gcc.dg/tree-ssa/20030709-2.c: Update testcase
> > so `a ? -1 : 0` is not used to hit the match
> > pattern.
> >
>
> Our CI noticed that your patch introduced regressions as follows on aarch64:
>
>  Running gcc:gcc.target/aarch64/aarch64.exp ...
> FAIL: gcc.target/aarch64/cond_op_imm_1.c scan-assembler csinv\tw[0-9]*.*
> FAIL: gcc.target/aarch64/cond_op_imm_1.c scan-assembler csinv\tx[0-9]*.*
>
> Running gcc:gcc.target/aarch64/sve/aarch64-sve.exp ...
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-not \\tmov\\tz
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> \\tneg\\tz[0-9]+\\.b, p[0-7]/m, 3
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> \\tneg\\tz[0-9]+\\.h, p[0-7]/m, 2
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> \\tneg\\tz[0-9]+\\.s, p[0-7]/m, 1
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> \\tnot\\tz[0-9]+\\.b, p[0-7]/m, 3
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> \\tnot\\tz[0-9]+\\.h, p[0-7]/m, 2
> FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> \\tnot\\tz[0-9]+\\.s, p[0-7]/m, 1
>
> Hopefully you'll just need to update the testcases (I didn't check
> manually, I think you can easily reproduce this on aarch64?)

I have a few ideas of how to fix this properly inside isel without
changing the testcases. I will start working on that starting
tomorrow.
In the meantime can you file a bug report? So we don't lose track of
the regression?

Thanks,
Andrew

>
> Thanks,
>
> Christophe
>
>
>
>
> > ---
> >  gcc/match.pd   | 14 ++
> >  gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
> >  gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
> >  gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
> >  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
> >  5 files changed, 63 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 9b4819e5be7..fc630b63563 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -6460,6 +6460,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >(if (cmp == NE_EXPR)
> > { constant_boolean_node (true, type); })))
> >
> > +#if GIMPLE
> > +/* a?~t:t -> (-(a))^t */
> > +(simplify
> > + (cond @0 @1 @2)
> > + (if (INTEGRAL_TYPE_P (type)
> > +  && bitwise_inverted_equal_p (@1, @2))
> > +  (with {
> > +auto prec = TYPE_PRECISION (type);
> > +auto unsign = TYPE_UNSIGNED (type);
> > +tree inttype = build_nonstandard_integer_type (prec, unsign);
> > +   }
> > +   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype
> > @2))
> > +#endif
> > +
> >  /* Simplify pointer equality compares using PTA.  */
> >  (for neeq (ne eq)
> >   (simplify
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > index 5009cd69cfe..78938f919d4 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > @@ -29,15 +29,16 @@ union tree_node
> >  };
> >  int make_decl_rtl (tree, int);
> >  void *
> > -get_alias_set (t)
> > +get_alias_set (t, t1)
> >   tree t;
> > + void *t1;
&

Re: [PATCH 0/12] GCC _BitInt support [PR102989]

2023-08-10 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 10, 2023 at 12:13 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Thu, Aug 10, 2023 at 06:55:05AM +, Richard Biener wrote:
> > On Wed, 9 Aug 2023, Joseph Myers wrote:
> >
> > > On Wed, 9 Aug 2023, Jakub Jelinek via Gcc-patches wrote:
> > >
> > > > - _Complex _BitInt(N) isn't supported; again mainly because none of the 
> > > > psABIs
> > > >   mention how those should be passed/returned; in a limited way they are
> > > >   supported internally because the internal functions into which
> > > >   __builtin_{add,sub,mul}_overflow{,_p} is lowered return COMPLEX_TYPE 
> > > > as a
> > > >   hack to return 2 values without using references/pointers
> > >
> > > What happens when the usual arithmetic conversions are applied to
> > > operands, one of which is a complex integer type and the other of which is
> > > a wider _BitInt type?  I don't see anything in the code to disallow this
> > > case (which would produce an expression with a _Complex _BitInt type), or
> > > any testcases for it.
> > >
> > > Other testcases I think should be present (along with any corresponding
> > > changes needed to the code itself):
> > >
> > > * Verifying that the new integer constant suffix is rejected for C++.
> > >
> > > * Verifying appropriate pedwarn-if-pedantic for the new constant suffix
> > > for versions of C before C2x (and probably for use of _BitInt type
> > > specifiers before C2x as well) - along with the expected -Wc11-c2x-compat
> > > handling (in C2x mode) / -pedantic -Wno-c11-c2x-compat in older modes.
> >
> > Can we go as far as deprecating our _Complex int extension for
> > C17 and make it unavailable for C2x, side-stepping the issue?
> > Or maybe at least considering that for C2x?
>
> I can just sorry at it for now.  And now that I search through the x86-64
> psABI again, it doesn't mention complex integers at all, so we are there on
> our own.  And it seems we don't have anything for complex integers on the
> library side and the complex lowering is before bitint lowering, so it might
> just work with < 10 lines of changes in code + testsuite, but if we do
> enable it, let's do it incrementally.

_Complex int division also has issues, which is another reason to
deprecate/remove it; see PR 104937 for that, and
https://gcc.gnu.org/legacy-ml/gcc/2001-11/msg00790.html (which was the
first attempt to deprecate _Complex int; see also
https://gcc.gnu.org/legacy-ml/gcc/2001-11/msg00863.html).

Thanks,
Andrew


>
> Jakub
>


[PATCH] Fix PR 110954: wrong code with cmp | !cmp

2023-08-09 Thread Andrew Pinski via Gcc-patches
This was an oversight on my part not realizing that
comparisons in generic can have a non-boolean type.
This means if we have `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.
This patch just adds a check that the type of the comparisons
is a boolean type, so the optimization is kept in that case.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR 110954

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): Check
the type of the comparison to be boolean too.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110954-1.c: New test.
---
 gcc/generic-match-head.cc|  3 ++-
 gcc/testsuite/gcc.c-torture/execute/pr110954-1.c | 10 ++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110954-1.c

diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
index ddaf22f2179..ac2119bfdd0 100644
--- a/gcc/generic-match-head.cc
+++ b/gcc/generic-match-head.cc
@@ -146,7 +146,8 @@ bitwise_inverted_equal_p (tree expr1, tree expr2)
   && bitwise_equal_p (expr1, TREE_OPERAND (expr2, 0)))
 return true;
   if (COMPARISON_CLASS_P (expr1)
-  && COMPARISON_CLASS_P (expr2))
+  && COMPARISON_CLASS_P (expr2)
+  && TREE_CODE (TREE_TYPE (expr1)) == BOOLEAN_TYPE)
 {
   tree op10 = TREE_OPERAND (expr1, 0);
   tree op20 = TREE_OPERAND (expr2, 0);
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
new file mode 100644
index 000..8aad758e10f
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110954-1.c
@@ -0,0 +1,10 @@
+
+#define comparison (f < 0)
+int main() {
+  int f = 0;
+  int d = comparison | !comparison;
+  if (d != 1)
+__builtin_abort();
+  return 0;
+}
+
-- 
2.31.1



Re: [PATCH 3/12] phiopt: Fix phiopt ICE on vops [PR102989]

2023-08-09 Thread Andrew Pinski via Gcc-patches
On Wed, Aug 9, 2023 at 1:01 PM Jakub Jelinek  wrote:
>
> On Wed, Aug 09, 2023 at 11:27:48AM -0700, Andrew Pinski wrote:
> > Maybe it is better to punt for VOPS after the call to
> > single_non_singleton_phi_for_edges since none of functions called
> > afterwards support VOPs.
> > That is something like:
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index ff36bb0119b..d0b659042a7 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -4165,6 +4165,10 @@ pass_phiopt::execute (function *)
> >arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> >arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> >
> > +  /* Can't do anything with a VOP here.  */
> > +  if (SSA_NAME_IS_VIRTUAL_OPERAND (arg0))
> > +   continue;
> > +
>
> That would ICE if arg0 isn't SSA_NAME (e.g. is INTEGER_CST).
> I think more canonical test for virtual phis is
> if (virtual_operand_p (gimple_phi_result (phi)))
>
> Shall already single_non_singleton_phi_for_edges punt if there is
> a virtual phi with different arguments from the edges (or if there
> is a single virtual phi)?

That was my next thought; returning NULL from
single_non_singleton_phi_for_edges if it would return a virtual OP
might be even better.
Either version of these patches is OK with me (though I am not the
maintainer here).

Thanks,
Andrew

>
> Jakub
>


[PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-09 Thread Andrew Pinski via Gcc-patches
This adds a simple match pattern for this case.
I noticed it in a couple of different places:
once while I was looking at the code generation of a parser, and
again while I was looking at locations where bitwise_inverted_equal_p
should be used more.
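
The identity behind the pattern can be sketched in plain C (an illustration only, with hypothetical function names; the real transform is done on GIMPLE by match.pd). Negating a 0/1 condition gives either 0 or an all-ones mask, and XOR with all ones is bitwise NOT:

```c
#include <stdint.h>

/* Branchy form: a ? ~t : t.  */
int32_t select_form (int a, int32_t t)
{
  return a ? ~t : t;
}

/* Branchless form: t ^ -(a), with a first normalized to 0 or 1.
   -(1) is all ones, so the XOR flips every bit of t; -(0) is 0,
   so the XOR leaves t unchanged.  */
int32_t xor_form (int a, int32_t t)
{
  int32_t mask = -(int32_t) (a != 0);
  return t ^ mask;
}
```

In this sketch the normalization `a != 0` stands in for the fact that the condition operand of the cond expression is already a boolean.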

Committed as approved after bootstrapping and testing on x86_64-linux-gnu with no 
regressions.

PR tree-optimization/110937
PR tree-optimization/100798

gcc/ChangeLog:

* match.pd (`a ? ~b : b`): Handle this
case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-14.c: New test.
* gcc.dg/tree-ssa/bool-15.c: New test.
* gcc.dg/tree-ssa/phi-opt-33.c: New test.
* gcc.dg/tree-ssa/20030709-2.c: Update testcase
so `a ? -1 : 0` is not used to hit the match
pattern.
---
 gcc/match.pd   | 14 ++
 gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
 5 files changed, 63 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 9b4819e5be7..fc630b63563 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6460,6 +6460,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (cmp == NE_EXPR)
{ constant_boolean_node (true, type); })))
 
+#if GIMPLE
+/* a?~t:t -> (-(a))^t */
+(simplify
+ (cond @0 @1 @2)
+ (if (INTEGRAL_TYPE_P (type)
+  && bitwise_inverted_equal_p (@1, @2))
+  (with {
+auto prec = TYPE_PRECISION (type);
+auto unsign = TYPE_UNSIGNED (type);
+tree inttype = build_nonstandard_integer_type (prec, unsign);
+   }
+   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype @2))))))
+#endif
+
 /* Simplify pointer equality compares using PTA.  */
 (for neeq (ne eq)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
index 5009cd69cfe..78938f919d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
@@ -29,15 +29,16 @@ union tree_node
 };
 int make_decl_rtl (tree, int);
 void *
-get_alias_set (t)
+get_alias_set (t, t1)
  tree t;
+ void *t1;
 {
   long set;
   if (t->decl.rtl)
 return (t->decl.rtl->fld[1].rtmem 
? 0
: (((t->decl.rtl ? t->decl.rtl: (make_decl_rtl (t, 0), 
t->decl.rtl)))->fld[1]).rtmem);
-  return (void*)-1;
+  return t1;
 }
 
 /* There should be precisely one load of ->decl.rtl.  If there is
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
new file mode 100644
index 000..0149380a63b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/110937 */
+
+_Bool f2(_Bool a, _Bool b)
+{
+if (a)
+  return !b;
+return b;
+}
+
+/* We should be able to remove the conditional and convert it to an xor. */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
new file mode 100644
index 000..1f496663863
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/110937 */
+
+_Bool f2(int x, int y, int w, int z)
+{
+  _Bool a = x == y;
+  _Bool b = w == z;
+  if (a)
+return !b;
+  return b;
+}
+
+/* We should be able to remove the conditional and convert it to an xor. */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "ne_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
new file mode 100644
index 000..b79fe44187a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/100798 */
+
+int f(int a, int t)
+{
+  return (a=='s' ? ~t : t);
+}
+
+/* This should be convert into t^-(a=='s').  */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "negate_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "bit_not_expr, " "optimized" } } */

Re: [PATCH 3/12] phiopt: Fix phiopt ICE on vops [PR102989]

2023-08-09 Thread Andrew Pinski via Gcc-patches
On Wed, Aug 9, 2023 at 11:17 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> I've ran into ICE on gcc.dg/torture/bitint-42.c with -O1 or -Os
> when enabling expensive tests, and unfortunately I can't reproduce without
> _BitInt.  The IL before phiopt3 has:
>[local count: 203190070]:
>   # .MEM_428 = VDEF <.MEM_367>
>   bitint.159 = VIEW_CONVERT_EXPR(*.LC3);
>   goto ; [100.00%]
>
>[local count: 203190070]:
>   # .MEM_427 = VDEF <.MEM_367>
>   bitint.159 = VIEW_CONVERT_EXPR(*.LC4);
>
>[local count: 406380139]:
>   # .MEM_368 = PHI <.MEM_428(87), .MEM_427(88)>
>   # VUSE <.MEM_368>
>   _123 = VIEW_CONVERT_EXPR(r495[i_107].D.2780)[0];
> and factor_out_conditional_operation is called on the vop PHI, it
> sees it has exactly two operands and defining statements of both
> PHI arguments are converts (VCEs in this case), so it thinks it is
> a good idea to try to optimize that and while doing that it constructs
> void type SSA_NAMEs and the like.

Maybe it is better to punt for VOPs after the call to
single_non_singleton_phi_for_edges, since none of the functions called
afterwards support VOPs.
That is, something like:
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index ff36bb0119b..d0b659042a7 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -4165,6 +4165,10 @@ pass_phiopt::execute (function *)
   arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
   arg1 = gimple_phi_arg_def (phi, e2->dest_idx);

+  /* Can't do anything with a VOP here.  */
+  if (SSA_NAME_IS_VIRTUAL_OPERAND (arg0))
+   continue;
+
   /* Something is wrong if we cannot find the arguments in the PHI
  node.  */
   gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);

Thanks,
Andrew Pinski

>
> 2023-08-09  
>
> PR c/102989
> * tree-ssa-phiopt.cc (factor_out_conditional_operation): Punt for
> vops.
>
> --- gcc/tree-ssa-phiopt.cc.jj   2023-08-08 15:55:09.508122417 +0200
> +++ gcc/tree-ssa-phiopt.cc  2023-08-09 15:55:23.762314103 +0200
> @@ -241,6 +241,7 @@ factor_out_conditional_operation (edge e
>  }
>
>if (TREE_CODE (arg0) != SSA_NAME
> +  || SSA_NAME_IS_VIRTUAL_OPERAND (arg0)
>|| (TREE_CODE (arg1) != SSA_NAME
>   && TREE_CODE (arg1) != INTEGER_CST))
>  return NULL;
>
> Jakub
>


[PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-09 Thread Andrew Pinski via Gcc-patches
If `A` has a range of `[0,0][100,INF]` and the comparison
is `A < 50`, it should be optimized to `A <= 0` (which will
then be optimized to just `A == 0`).
This patch implements this via a new function which checks whether
the constant of a comparison lies between 2 range pairs
and, if so, changes the constant to either the upper bound of the first pair
or the lower bound of the second pair, depending on the comparison.

This is the first step in fixing the following PRs:
PR 110131, PR 108360, and PR 108397.
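
The idea can be modeled in standalone C roughly as follows. This is a simplified sketch under stated assumptions: `struct range_pair`, `enum cmp_code` and `snap_compare` are hypothetical names, plain `long` stands in for wide_int, and the array of sorted, disjoint, inclusive pairs stands in for the value range.

```c
#include <stdbool.h>

/* Hypothetical model of a multi-range: sorted, disjoint, inclusive pairs.  */
struct range_pair { long lo, hi; };

enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* If *cst lies between two adjacent pairs, rewrite the comparison
   `x CODE cst` so that the constant is a bound of one of the pairs.
   Returns true if the comparison was changed.  */
bool snap_compare (const struct range_pair *r, unsigned n,
                   enum cmp_code *code, long *cst)
{
  for (unsigned i = 1; i < n; i++)
    {
      long lower = r[i - 1].hi;  /* upper bound of the earlier pair */
      long upper = r[i].lo;      /* lower bound of the later pair */
      long c = *cst;
      if (c < lower || c > upper)
        continue;
      /* x < c or x <= c can only hold for values up to LOWER.  */
      if ((*code == CMP_LT && c != lower)
          || (*code == CMP_LE && c != lower && c != upper))
        {
          *code = CMP_LE;
          *cst = lower;
          return true;
        }
      /* x > c or x >= c can only hold for values from UPPER on.  */
      if ((*code == CMP_GT && c != upper)
          || (*code == CMP_GE && c != lower && c != upper))
        {
          *code = CMP_GE;
          *cst = upper;
          return true;
        }
    }
  return false;
}
```

With the pairs `{0, 0}` and `{101, 1000000}`, the comparison `x < 50` becomes `x <= 0`, which a later singularity check can then turn into `x == 0`.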

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* vr-values.cc (simplify_compare_using_range_pairs): New function.
(simplify_using_ranges::simplify_compare_using_ranges_1): Call
it.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp124.c: New test.
* gcc.dg/pr21643.c: Disable VRP.
---
 gcc/testsuite/gcc.dg/pr21643.c |  6 ++-
 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 44 +
 gcc/vr-values.cc   | 65 ++
 3 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c

diff --git a/gcc/testsuite/gcc.dg/pr21643.c b/gcc/testsuite/gcc.dg/pr21643.c
index 4e7f93d351a..7f121d7006f 100644
--- a/gcc/testsuite/gcc.dg/pr21643.c
+++ b/gcc/testsuite/gcc.dg/pr21643.c
@@ -1,6 +1,10 @@
 /* PR tree-optimization/21643 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-reassoc1-details --param 
logical-op-non-short-circuit=1" } */
+/* Note VRP is able to transform `c >= 0x20` in f7
+   to `c >= 0x21` since we want to test
+   reassociation and not VRP, turn it off. */
+
+/* { dg-options "-O2 -fdump-tree-reassoc1-details --param 
logical-op-non-short-circuit=1 -fno-tree-vrp" } */
 
 int
 f1 (unsigned char c)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
new file mode 100644
index 000..6ccbda35d1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Should be optimized to a == -100 */
+int g(int a)
+{
+  if (a == -100 || a >= 0)
+;
+  else
+return 0;
+  return a < 0;
+}
+
+/* Should optimize to a == 0 */
+int f(int a)
+{
+  if (a == 0 || a > 100)
+;
+  else
+return 0;
+  return a < 50;
+}
+
+/* Should be optimized to a == 0. */
+int f2(int a)
+{
+  if (a == 0 || a > 100)
+;
+  else
+return 0;
+  return a < 100;
+}
+
+/* Should optimize to a == 100 */
+int f1(int a)
+{
+  if (a < 0 || a == 100)
+;
+  else
+return 0;
+  return a > 50;
+}
+
+/* { dg-final { scan-tree-dump-not "goto " "optimized" } } */
diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index a4fddd62841..1262e7cf9f0 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -968,9 +968,72 @@ test_for_singularity (enum tree_code cond_code, tree op0,
   if (operand_equal_p (min, max, 0) && is_gimple_min_invariant (min))
return min;
 }
+
   return NULL;
 }
 
+/* Simplify integer comparisons such that the constant is one of the range 
pairs.
+   For an example, 
+   A has a range of [0,0][100,INF]
+   and the comparison of `A < 50`.
+   This should be optimized to `A <= 0`
+   and then test_for_singularity can optimize it to `A == 0`.   */
+
+static bool
+simplify_compare_using_range_pairs (tree_code &cond_code, tree &op0, tree &op1,
+   const value_range *vr)
+{
+  if (TREE_CODE (op1) != INTEGER_CST
+  || vr->num_pairs () < 2)
+return false;
+  auto val_op1 = wi::to_wide (op1);
+  tree type = TREE_TYPE (op0);
+  auto sign = TYPE_SIGN (type);
+  auto p = vr->num_pairs ();
+  /* Find the value range pair where op1
+ is in the middle of if one exist. */
+  for (unsigned i = 1; i < p; i++)
+{
+  auto lower = vr->upper_bound (i - 1);
+  auto upper = vr->lower_bound (i);
+  if (wi::lt_p (val_op1, lower, sign))
+   continue;
+  if (wi::gt_p (val_op1, upper, sign))
+   continue;
+  if (cond_code == LT_EXPR
+  && val_op1 != lower)
+{
+ op1 = wide_int_to_tree (type, lower);
+ cond_code = LE_EXPR;
+ return true;
+}
+  if (cond_code == LE_EXPR
+  && val_op1 != upper
+  && val_op1 != lower)
+{
+ op1 = wide_int_to_tree (type, lower);
+ cond_code = LE_EXPR;
+ return true;
+}
+  if (cond_code == GT_EXPR
+  && val_op1 != upper)
+{
+ op1 = wide_int_to_tree (type, upper);
+ cond_code = GE_EXPR;
+ return true;
+}
+  if (cond_code == GE_EXPR
+  && val_op1 != lower
+  && val_op1 != upper)
+{
+ op1 = wide_int_to_tree (type, upper);
+ cond_code = GE_EXPR;
+ return true;
+}
+}
+  return false;
+}
+
 /* Return whether the value range *VR fits in an integer type specified
by PRECISION and UNSIGNED_P.  */
 
@@ -1235,6 

Re: RISC-V: Added support for CRC.

2023-08-08 Thread Andrew Pinski via Gcc-patches
On Tue, Aug 8, 2023 at 4:17 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 8/8/23 10:38, Alexander Monakov wrote:
> >
> > On Tue, 8 Aug 2023, Jeff Law wrote:
> >
> >> That was my thinking at one time.  Then we started looking at the
> >> distros and found enough crc implementations in there to change my
> >> mind about the overall utility.
> >
> > The ones I'm familiar with are all table-based and look impossible to
> > pattern-match (and hence already fairly efficient comparable to bitwise
> > loop in Coremark).
> We found dozens that were the usual looking loops and, IIRC ~200 table
> lookups after analyzing about half of the packages in Fedora.

I will note that we do already handle table lookups when detecting count
trailing zeros; see check_ctz_array in tree-ssa-forwprop.cc for that detection
(that was even done to improve a SPEC benchmark).
So if the tables are statically defined at compile time, there is
already an example of how they can be detected too.
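
For readers who have not seen this kind of code, here is a sketch of a table-driven count-trailing-zeros. It uses the classic de Bruijn multiply idiom; that this is the general shape of code a table detector looks for is my assumption, not a statement of exactly what forwprop matches. The names `debruijn_ctz_table`, `ctz32_table` and `ctz32_loop` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Statically defined lookup table for the de Bruijn constant 0x077CB531.  */
static const unsigned char debruijn_ctz_table[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Table-based ctz for nonzero X: isolate the lowest set bit, multiply by
   the de Bruijn constant, and use the top five bits as the table index.  */
static unsigned
ctz32_table (uint32_t x)
{
  uint32_t lowest = x & (UINT32_C (0) - x);
  return debruijn_ctz_table[(uint32_t) (lowest * UINT32_C (0x077CB531)) >> 27];
}

/* Reference implementation for nonzero X: shift until the low bit is set.  */
static unsigned
ctz32_loop (uint32_t x)
{
  unsigned n = 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Compare both versions for every possible position of the lowest set bit.  */
static int
check_ctz_table (void)
{
  for (unsigned bit = 0; bit < 32; bit++)
    {
      uint32_t x = (UINT32_C (1) << bit) | UINT32_C (0x80000000);
      if (ctz32_table (x) != ctz32_loop (x))
        return 0;
    }
  return 1;
}
```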

Thanks,
Andrew Pinski

>
>
> >
> > So... just provide a library? A library code is easier to develop and audit,
> > it can be released independently, people can use it with their compiler of
> > choice. Not everything needs to be in libgcc.
> If the compiler can identify a CRC and collapse it down to a table or
> clmul, that's a major win and such code does exist in the real world.
> That was the whole point behind the Fedora experiment -- to determine if
> these things are showing up in the real world or if this is just a
> benchmarking exercise.
>
> And just to be clear, we're not proposing anything for libgcc.
>
> >
> > I'm talking about factoring a long chain into multiple independent chains
> > for latency hiding.
> And that could potentially be an extension.  But even without this a
> standard looking CRC loop will be much faster using table lookups or
> simple generation with clmul.
>
> Also note that latency of clmuls is improving on modern hardware.  4c
> isn't hard to achieve and I wouldn't be surprised to see 2c clmuls in
> the near future.
>
>
> >
> > Useful to whom? The Linux kernel? zlib, bzip2, xz-utils? ffmpeg?
> > These consumers need high-performance blockwise CRC, offering them
> > a latency-bound elementwise CRC primitive is a disservice. And what
> > should they use as a fallback when __builtin_crc is unavailable?
> The point is builtin_crc would always be available.  If there is no
> clmul, then the RTL backend can expand to a table lookup version.
>
> >
> >> while at the same time putting one side of the infrastructure we need for
> >> automatic detection of CRC loops and turning them into table lookups or
> >> CLMULs.
> >>
> >> With that in mind I'm certain Mariam & I would love feedback on a
> >> builtin API that would be more useful.
> >
> > I think offering a conventional library for CRC has substantial advantages.
> That's not what I asked.  If you think there's room for improvement to a
> builtin API, I'd love to hear it.
>
> But it seems you don't think this is worth the effort at all.  That's
> unfortunate, but if that's the consensus, then so be it.
>
> I'll note LLVM is likely going forward with CRC detection and
> optimization at some point in the next ~6 months (effectively moving the
> implementation from the hexagon port into the generic parts of their
> loop optimizer).
>
>
>
> Jeff


Re: [PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-08 Thread Andrew Pinski via Gcc-patches
On Tue, Aug 8, 2023 at 12:44 AM Richard Biener via Gcc-patches
 wrote:
>
> On Tue, Aug 8, 2023 at 2:55 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > This adds a simple match pattern for this case.
> > I noticed it in a couple of different places.
> > One while I was looking at code generation of a parser and
> > also while I was looking at locations where bitwise_inverted_equal_p
> > should be used more.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > PR tree-optimization/110937
> > PR tree-optimization/100798
> >
> > gcc/ChangeLog:
> >
> > * match.pd (`a ? ~b : b`): Handle this
> > case.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/bool-14.c: New test.
> > * gcc.dg/tree-ssa/bool-15.c: New test.
> > * gcc.dg/tree-ssa/phi-opt-33.c: New test.
> > * gcc.dg/tree-ssa/20030709-2.c: Update testcase
> > so `a ? -1 : 0` is not used to hit the match
> > pattern.
> > ---
> >  gcc/match.pd   | 13 +
> >  gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
> >  gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
> >  gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
> >  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
> >  5 files changed, 62 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 9b4819e5be7..f887c517c81 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -6460,6 +6460,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >(if (cmp == NE_EXPR)
> > { constant_boolean_node (true, type); })))
> >
> > +#if GIMPLE
> > +/* a?~t:t -> (-(a))^t */
> > +(simplify
> > + (cond @0 @1 @2)
> > + (if (bitwise_inverted_equal_p (@1, @2))
>
> I'm not sure if that can ever match a not INTEGRAL_TYPE_P
> but we can have vector typed @1 and @2 and then the
> TYPE_PRECISION ask below would be wrong.  So can you
> add
>
>   INTEGRAL_TYPE_P (type)
>   && bitwise_in...
>
> if only for clarity?
>
> > +  (with {
> > +auto prec = TYPE_PRECISION (type);
> > +auto unsign = TYPE_UNSIGNED (type);
> > +tree inttype = build_nonstandard_integer_type (prec, unsign);
> > +   }
> > +   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype @2))
>
> so we don't get to know which of @1 or @2 is "simpler" (the not
> explicitly inverted operand), I suppose that's the disadvantage of using
> bitwise_inverted_equal_p.  I'll note that if you make
> bitwise_inverted_equal_p a match you'd need a :c on the 'cond' but
> otherwise complexity would be the same as match patterns are not "inlined".

Right, the disadvantage is definitely not knowing which is "simpler".
I found a testcase that shows this, but I suspect we can fix it.
```
int f(int a, int t)
{
  int t1 = ~t;
 return (a=='s' ? t : t1);
}
```
Basically we are missing the transformation of
~(-(cast)(cmp)) into -(cast)(cmp`), where cmp` is the inverted comparison.
Filed as PR 110949.
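
As a quick sanity check of that identity, here is a small sketch. The function names are made up for the example, and it compares the two forms at the C level only; it says nothing about what GIMPLE the compiler actually produces.

```c
#include <assert.h>

/* ~(-(cast)(cmp)): the mask is all ones when the comparison is false.  */
static int
inverted_mask (int a)
{
  return ~(-(int) (a == 's'));
}

/* -(cast)(cmp`): negate the inverted comparison instead.  */
static int
folded_mask (int a)
{
  return -(int) (a != 's');
}

/* Both forms must agree for every input, including a == 's'.  */
static int
check_inverted_mask (void)
{
  for (int a = 0; a < 256; a++)
    if (inverted_mask (a) != folded_mask (a))
      return 0;
  return 1;
}
```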

>
> In any case, OK with the INTEGRAL_TYPE_P check.
Will update the patch and commit it after a bootstrap/test.

Thanks,
Andrew

>
> Thanks,
> Richard.
>
> > +#endif
> > +
> >  /* Simplify pointer equality compares using PTA.  */
> >  (for neeq (ne eq)
> >   (simplify
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c 
> > b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > index 5009cd69cfe..78938f919d4 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> > @@ -29,15 +29,16 @@ union tree_node
> >  };
> >  int make_decl_rtl (tree, int);
> >  void *
> > -get_alias_set (t)
> > +get_alias_set (t, t1)
> >   tree t;
> > + void *t1;
> >  {
> >long set;
> >if (t->decl.rtl)
> >  return (t->decl.rtl->fld[1].rtmem
> > ? 0
> > : (((t->decl.rtl ? t->decl.rtl: (make_decl_rtl (t, 0), 
> > t->decl.rtl)))->fld[1]).rtmem);
> > -  return (void*)-1;
> > +  return t1;
> >  }
> >
> >  /* There should be precisely one load of ->decl.rtl.  If there is
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-14

[PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-07 Thread Andrew Pinski via Gcc-patches
This adds a simple match pattern for this case.
I noticed it in a couple of different places.
One while I was looking at code generation of a parser and
also while I was looking at locations where bitwise_inverted_equal_p
should be used more.
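
To make the identity concrete, a minimal sketch follows. It assumes a condition that is 0 or 1 (as a C comparison is), so negating it gives either 0 or an all-ones mask; `select_form` and `xor_form` are names invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The source form: a ? ~t : t.  */
static int
select_form (int a, int t)
{
  return a == 's' ? ~t : t;
}

/* The target form: t ^ -(a).  The negated comparison is 0 or -1, and
   xor with -1 is the same as bitwise not.  */
static int
xor_form (int a, int t)
{
  return t ^ -(a == 's');
}

/* Compare the two forms over a few interesting values of t.  */
static int
check_select_vs_xor (void)
{
  const int samples[] = { 0, 1, -1, 42, INT_MAX, INT_MIN };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (select_form ('s', samples[i]) != xor_form ('s', samples[i])
        || select_form ('x', samples[i]) != xor_form ('x', samples[i]))
      return 0;
  return 1;
}
```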

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110937
PR tree-optimization/100798

gcc/ChangeLog:

* match.pd (`a ? ~b : b`): Handle this
case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-14.c: New test.
* gcc.dg/tree-ssa/bool-15.c: New test.
* gcc.dg/tree-ssa/phi-opt-33.c: New test.
* gcc.dg/tree-ssa/20030709-2.c: Update testcase
so `a ? -1 : 0` is not used to hit the match
pattern.
---
 gcc/match.pd   | 13 +
 gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
 5 files changed, 62 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 9b4819e5be7..f887c517c81 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6460,6 +6460,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (cmp == NE_EXPR)
{ constant_boolean_node (true, type); })))
 
+#if GIMPLE
+/* a?~t:t -> (-(a))^t */
+(simplify
+ (cond @0 @1 @2)
+ (if (bitwise_inverted_equal_p (@1, @2))
+  (with {
+auto prec = TYPE_PRECISION (type);
+auto unsign = TYPE_UNSIGNED (type);
+tree inttype = build_nonstandard_integer_type (prec, unsign);
+   }
+   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype @2))
+#endif
+
 /* Simplify pointer equality compares using PTA.  */
 (for neeq (ne eq)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
index 5009cd69cfe..78938f919d4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
@@ -29,15 +29,16 @@ union tree_node
 };
 int make_decl_rtl (tree, int);
 void *
-get_alias_set (t)
+get_alias_set (t, t1)
  tree t;
+ void *t1;
 {
   long set;
   if (t->decl.rtl)
 return (t->decl.rtl->fld[1].rtmem 
? 0
: (((t->decl.rtl ? t->decl.rtl: (make_decl_rtl (t, 0), 
t->decl.rtl)))->fld[1]).rtmem);
-  return (void*)-1;
+  return t1;
 }
 
 /* There should be precisely one load of ->decl.rtl.  If there is
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
new file mode 100644
index 000..0149380a63b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/110937 */
+
+_Bool f2(_Bool a, _Bool b)
+{
+if (a)
+  return !b;
+return b;
+}
+
+/* We should be able to remove the conditional and convert it to an xor. */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c 
b/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
new file mode 100644
index 000..1f496663863
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/110937 */
+
+_Bool f2(int x, int y, int w, int z)
+{
+  _Bool a = x == y;
+  _Bool b = w == z;
+  if (a)
+return !b;
+  return b;
+}
+
+/* We should be able to remove the conditional and convert it to an xor. */
+/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "ne_expr, " "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
new file mode 100644
index 000..b79fe44187a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/100798 */
+
+int f(int a, int t)
+{
+  return (a=='s' ? ~t : t);
+}
+
+/* This should be convert into t^-(a=='s').  */
+/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "negate_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "bit_not_expr, " "optimized" } } */
-- 
2.31.1



[PATCH] VR-VALUES [PR28794]: optimize compare assignments also

2023-08-07 Thread Andrew Pinski via Gcc-patches
This patch fixes an oldish (2006) bug where VRP optimized
comparisons only in GIMPLE_COND statements and not in
assignments.
It also happens to solve PR 103281, because `c < 1` can now
be optimized to `c == 0`, which gives
`(c == 0) == c` (already handled by r14-2501-g285c9d04).
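
The range reasoning behind both rewrites can be checked directly. This is a sketch of the arithmetic only, with made-up function names; it assumes the value is already known to be nonnegative, which is what the range gives VRP.

```c
#include <assert.h>

/* Comparisons as written in the source.  */
static int gt_zero (int x) { return x > 0; }
static int lt_one (int c) { return c < 1; }

/* Forms that are equivalent once the operand is known to be >= 0.  */
static int ne_zero (int x) { return x != 0; }
static int eq_zero (int c) { return c == 0; }

/* On the range [0, 1000] each relational compare agrees with its
   equality counterpart.  */
static int
check_nonnegative_compares (void)
{
  for (int x = 0; x <= 1000; x++)
    if (gt_zero (x) != ne_zero (x) || lt_one (x) != eq_zero (x))
      return 0;
  return 1;
}
```

Outside that range the forms differ (for example `-1 < 1` is true while `-1 == 0` is not), which is why the rewrite needs the range information.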

OK? Bootstrapped and tested on x86_64-linux-gnu with no
regressions.

PR tree-optimization/103281
PR tree-optimization/28794

gcc/ChangeLog:

* vr-values.cc (simplify_using_ranges::simplify_cond_using_ranges_1):
Split out majority to ...
(simplify_using_ranges::simplify_compare_using_ranges_1): Here.
(simplify_using_ranges::simplify_casted_cond): Rename to ...
(simplify_using_ranges::simplify_casted_compare): This
and change arguments to take op0 and op1.
(simplify_using_ranges::simplify_compare_assign_using_ranges_1): New method.
(simplify_using_ranges::simplify): For tcc_comparison assignments call
simplify_compare_assign_using_ranges_1.
* vr-values.h (simplify_using_ranges): Add
new methods, simplify_compare_using_ranges_1 and
simplify_compare_assign_using_ranges_1.
Rename simplify_casted_cond to simplify_casted_compare and
update argument types.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103281-1.c: New test.
* gcc.dg/tree-ssa/vrp-compare-1.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c|  19 +++
 gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c |  13 ++
 gcc/vr-values.cc  | 160 +++---
 gcc/vr-values.h   |   4 +-
 4 files changed, 134 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c
new file mode 100644
index 000..09964d0b46b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optmization/103281 */
+
+void foo(void);
+
+static unsigned b;
+
+int main() {
+  for (; b < 3; b++) {
+char c = b;
+char a = c ? c : c << 1;
+if (!(a < 1 ^ b))
+  foo();
+  }
+}
+
+/* the call to foo should be optimized away. */
+/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c
new file mode 100644
index 000..9889cf34706
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp-details" } */
+/* PR tree-optimization/28794 */
+
+void g(int);
+void f1(int x)
+{
+  if (x < 0)  return;
+  g(x>0);
+}
+
+/* `x > 0` should be optimized to just `x != 0`  */
+/* { dg-final { scan-tree-dump-times "Simplified relational" 1 "evrp" } } */
diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index ac4a83c6097..a4fddd62841 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -1139,6 +1139,87 @@ simplify_using_ranges::simplify_cond_using_ranges_1 
(gcond *stmt)
   if (fold_cond (stmt))
 return true;
 
+  if (simplify_compare_using_ranges_1 (cond_code, op0, op1, stmt))
+{
+  if (dump_file)
+   {
+ fprintf (dump_file, "Simplified relational ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ fprintf (dump_file, " into ");
+   }
+
+  gimple_cond_set_code (stmt, cond_code);
+  gimple_cond_set_lhs (stmt, op0);
+  gimple_cond_set_rhs (stmt, op1);
+
+  update_stmt (stmt);
+
+   if (dump_file)
+   {
+ print_gimple_stmt (dump_file, stmt, 0);
+ fprintf (dump_file, "\n");
+   }
+  return true;
+}
+  return false;
+}
+
+/* Like simplify_cond_using_ranges_1 but for assignments rather
+   than GIMPLE_COND. */
+
+bool
+simplify_using_ranges::simplify_compare_assign_using_ranges_1
+   (gimple_stmt_iterator *gsi,
+gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree op0 = gimple_assign_rhs1 (stmt);
+  tree op1 = gimple_assign_rhs2 (stmt);
+  gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison);
+  bool happened = false;
+
+  if (simplify_compare_using_ranges_1 (code, op0, op1, stmt))
+{
+  if (dump_file)
+   {
+ fprintf (dump_file, "Simplified relational ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ fprintf (dump_file, " into ");
+   }
+
+  gimple_assign_set_rhs_code (stmt, code);
+  gimple_assign_set_rhs1 (stmt, op0);
+  gimple_assign_set_rhs2 (stmt, op1);
+
+  update_stmt (stmt);
+
+   if (dump_file)
+   {
+ print_gimple_stmt (dump_file, stmt, 0);
+ fprintf (dump_file, "\n");
+   }
+  happened = 

[PATCH] MATCH: [PR109959] `(uns <= 1) & uns` could be optimized to `uns == 1`

2023-08-06 Thread Andrew Pinski via Gcc-patches
I noticed while looking into some code generation of bitmap_single_bit_set_p,
that sometimes:
```
  if (uns > 1)
return 0;
  return uns == 1;
```
Would not optimize down to just:
```
return uns == 1;
```

In this case, VRP likes to change `a == 1` into `(bool)a` when
a has a range of [0,1], as it does on the `a <= 1` side of the branch.
We can end up with similar code even without VRP;
in builtin-sprintf-warn-23.c (and Wrestrict.c), we had:
```
if (s < 0 || 1 < s)
  s = 0;
```
Which is the same as `s = ((unsigned)s) <= 1 ? s : 0`;
So we should be able to catch that also.

This adds 2 patterns to catch `(uns <= 1) & uns` and
`(uns > 1) ? 0 : uns` and convert those into:
`(convert) uns == 1`.
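
Both equivalences are easy to confirm by brute force. The following sketch uses invented function names and checks the three forms against each other for unsigned inputs.

```c
#include <assert.h>
#include <limits.h>

/* (uns <= 1) & uns  */
static unsigned
and_form (unsigned u)
{
  return (u <= 1) & u;
}

/* (uns > 1) ? 0 : uns  */
static unsigned
cond_form (unsigned u)
{
  return u > 1 ? 0 : u;
}

/* uns == 1, the form both patterns should simplify to.  */
static unsigned
eq_form (unsigned u)
{
  return u == 1;
}

/* All three forms must agree on small values and on the largest one.  */
static int
check_uns_eq_one (void)
{
  for (unsigned u = 0; u <= 1000; u++)
    if (and_form (u) != eq_form (u) || cond_form (u) != eq_form (u))
      return 0;
  return and_form (UINT_MAX) == eq_form (UINT_MAX)
         && cond_form (UINT_MAX) == eq_form (UINT_MAX);
}
```

The unsigned requirement matters: with a signed operand, -1 satisfies `a <= 1` and `(a <= 1) & a` is then 1, while `a == 1` is 0.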

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/109959

gcc/ChangeLog:

* match.pd (`(a > 1) ? 0 : (cast)a`, `(a <= 1) & (cast)a`):
New patterns.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Remove xfail.
* c-c++-common/Wrestrict.c: Update test and remove some xfail.
* gcc.dg/tree-ssa/cmpeq-1.c: New test.
* gcc.dg/tree-ssa/cmpeq-2.c: New test.
* gcc.dg/tree-ssa/cmpeq-3.c: New test.
---
 gcc/match.pd  | 20 +++
 gcc/testsuite/c-c++-common/Wrestrict.c| 11 +++---
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-23.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-1.c   | 36 +++
 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-2.c   | 32 +
 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-3.c   | 22 
 6 files changed, 117 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpeq-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index de54b17abba..9b4819e5be7 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4902,6 +4902,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  )
 )
 
+/* (a > 1) ? 0 : (cast)a is the same as (cast)(a == 1)
+   for unsigned types. */
+(simplify
+ (cond (gt @0 integer_onep@1) integer_zerop (convert? @2))
+ (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+  && bitwise_equal_p (@0, @2))
+  (convert (eq @0 @1))
+ )
+)
+
+/* (a <= 1) & (cast)a is the same as (cast)(a == 1)
+   for unsigned types. */
+(simplify
+ (bit_and:c (convert1? (le @0 integer_onep@1)) (convert2? @2))
+ (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+  && bitwise_equal_p (@0, @2))
+  (convert (eq @0 @1))
+ )
+)
+
 (simplify
  (cond @0 zero_one_valued_p@1 zero_one_valued_p@2)
  (switch
diff --git a/gcc/testsuite/c-c++-common/Wrestrict.c 
b/gcc/testsuite/c-c++-common/Wrestrict.c
index 9eb02bdbfcb..4d005a618b3 100644
--- a/gcc/testsuite/c-c++-common/Wrestrict.c
+++ b/gcc/testsuite/c-c++-common/Wrestrict.c
@@ -681,7 +681,7 @@ void test_strcpy_range (void)
   ptrdiff_t r;
 
   r = SR (0, 1);
-  T (8, "0", a + r, a);   /* { dg-warning "accessing between 1 and 2 bytes at 
offsets \\\[0, 1] and 0 overlaps up to 2 bytes at offset \\\[0, 1]" "strcpy" { 
xfail *-*-*} } */
+  T (8, "0", a + r, a);   /* { dg-warning "accessing 2 bytes at offsets \\\[0, 
1] and 0 overlaps between 1 and 2 bytes at offset \\\[0, 1]" "strcpy" } */
 
   r = SR (2, 5);
   T (8, "01",  a + r, a);/* { dg-warning "accessing 3 bytes at 
offsets \\\[2, 5] and 0 may overlap 1 byte at offset 2" } */
@@ -860,10 +860,11 @@ void test_strncpy_range (char *d, size_t n)
 
   i = SR (0, 1);
   T ("0123", a, a + i, 0);
-  T ("0123", a, a + i, 1);
-  /* Offset in the range [0, i] is represented as a PHI (,  + i)
- that the implementation isn't equipped to handle yet.  */
-  T ("0123", a, a + i, 2);   /* { dg-warning "accessing 2 bytes at offsets 0 
and \\\[0, 1] may overlap 1 byte at offset 1" "strncpy" { xfail *-*-* } } */
+  T ("0123", a, a + i, 1); /* { dg-warning "accessing 1 byte at offsets 0 and 
\\\[0, 1] may overlap 1 byte at offset 0" } */
+  /* When i == 1 the following overlaps at least 1 byte: the nul at a[1]
+ (if a + 1 is the empty string).  If a + 1 is not empty then it overlaps
+ it plus as many non-nul characters after it, up to the total of 2.  */
+  T ("0123", a, a + i, 2);   /* { dg-warning "accessing 2 bytes at offsets 0 
and \\\[0, 1] overlaps between 1 and 2 bytes at offset \\\[0, 1]" "strncpy" } */
 
   i = SR (1, 5);
   T ("0123", a, a + i, 0);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c 
b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
index 112b08afc44..051c58892e6 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
@@ -719,5 +719,5 @@ void test_overlap_with_precision (char *d, int i, int j)
   T (d, "%.*s", i, d + 0);/* { dg-warning "may overlap" } */
   T (d, "%.*s", i, d + 1);/* { dg-warning "may overlap" } */
   T (d, "%.*s", i, d + 2);
-  T (d, "%.*s", i, d + i);/* { dg-warning 

[PATCH] MATCH: Extend min_value/max_value to pointer types

2023-08-05 Thread Andrew Pinski via Gcc-patches
Since we already had the infrastructure to optimize
`(x == 0) && (x > y)` to false for integer types,
this extends the same to pointer types as indirectly
requested by PR 96695.
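
The logic being extended can be sketched with `uintptr_t` standing in for pointer values. That substitution is an assumption made so the example stays well defined in ISO C (relational comparison of unrelated pointers is not), and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x == 0 && x > y: nothing is below the minimum value, so always false.  */
static int
null_and_gt (uintptr_t x, uintptr_t y)
{
  return x == 0 && x > y;
}

/* x > y && x != 0 is the same as x > y.  */
static int
gt_and_nonnull (uintptr_t x, uintptr_t y)
{
  return x > y && x != 0;
}

/* x < y && x != MAX is the same as x < y.  */
static int
lt_and_not_max (uintptr_t x, uintptr_t y)
{
  return x < y && x != UINTPTR_MAX;
}

/* Check the three simplifications on values around both extremes.  */
static int
check_min_max_value (void)
{
  const uintptr_t samples[] = { 0, 1, 2, UINTPTR_MAX - 1, UINTPTR_MAX };
  const size_t n = sizeof samples / sizeof samples[0];

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        uintptr_t x = samples[i], y = samples[j];
        if (null_and_gt (x, y) != 0
            || gt_and_nonnull (x, y) != (x > y)
            || lt_and_not_max (x, y) != (x < y))
          return 0;
      }
  return 1;
}
```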

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/96695
* match.pd (min_value, max_value): Extend to
pointer types too.

gcc/testsuite/ChangeLog:

PR tree-optimization/96695
* gcc.dg/pr96695-1.c: New test.
* gcc.dg/pr96695-10.c: New test.
* gcc.dg/pr96695-11.c: New test.
* gcc.dg/pr96695-12.c: New test.
* gcc.dg/pr96695-2.c: New test.
* gcc.dg/pr96695-3.c: New test.
* gcc.dg/pr96695-4.c: New test.
* gcc.dg/pr96695-5.c: New test.
* gcc.dg/pr96695-6.c: New test.
* gcc.dg/pr96695-7.c: New test.
* gcc.dg/pr96695-8.c: New test.
* gcc.dg/pr96695-9.c: New test.
---
 gcc/match.pd  |  6 --
 gcc/testsuite/gcc.dg/pr96695-1.c  | 18 ++
 gcc/testsuite/gcc.dg/pr96695-10.c | 20 
 gcc/testsuite/gcc.dg/pr96695-11.c | 18 ++
 gcc/testsuite/gcc.dg/pr96695-12.c | 18 ++
 gcc/testsuite/gcc.dg/pr96695-2.c  | 18 ++
 gcc/testsuite/gcc.dg/pr96695-3.c  | 20 
 gcc/testsuite/gcc.dg/pr96695-4.c  | 21 +
 gcc/testsuite/gcc.dg/pr96695-5.c  | 19 +++
 gcc/testsuite/gcc.dg/pr96695-6.c  | 20 
 gcc/testsuite/gcc.dg/pr96695-7.c  | 19 +++
 gcc/testsuite/gcc.dg/pr96695-8.c  | 19 +++
 gcc/testsuite/gcc.dg/pr96695-9.c  | 20 
 13 files changed, 234 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-10.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-11.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-12.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-3.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-4.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-5.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-6.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-8.c
 create mode 100644 gcc/testsuite/gcc.dg/pr96695-9.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 2278029d608..de54b17abba 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2733,12 +2733,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 (match min_value
  INTEGER_CST
- (if (INTEGRAL_TYPE_P (type)
+ (if ((INTEGRAL_TYPE_P (type)
+   || POINTER_TYPE_P(type))
   && wi::eq_p (wi::to_wide (t), wi::min_value (type)
 
 (match max_value
  INTEGER_CST
- (if (INTEGRAL_TYPE_P (type)
+ (if ((INTEGRAL_TYPE_P (type)
+   || POINTER_TYPE_P(type))
   && wi::eq_p (wi::to_wide (t), wi::max_value (type)
 
 /* x >  y  &&  x != XXX_MIN  -->  x > y
diff --git a/gcc/testsuite/gcc.dg/pr96695-1.c b/gcc/testsuite/gcc.dg/pr96695-1.c
new file mode 100644
index 000..d4287ab4c8c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr96695-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine" } */
+
+#include 
+
+_Bool and1(unsigned *x, unsigned *y)
+{
+  /* x > y && x != 0 --> x > y */
+  return x > y && x != 0;
+}
+
+_Bool and2(unsigned *x, unsigned *y)
+{
+  /* x < y && x != -1 --> x < y */
+  return x < y && x != (unsigned*)-1;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr96695-10.c 
b/gcc/testsuite/gcc.dg/pr96695-10.c
new file mode 100644
index 000..dfe752526f0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr96695-10.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+
+_Bool or1(unsigned *x, unsigned *y)
+{
+  /* x <= y || x != 0 --> true */
+  return x <= y || x != 0;
+}
+
+_Bool or2(unsigned *x, unsigned *y)
+{
+  /* x >= y || x != -1 --> true */
+  return x >= y || x != (unsigned*)-1;
+}
+
+/* { dg-final { scan-tree-dump-not " != " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " <= " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " >= " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/pr96695-11.c 
b/gcc/testsuite/gcc.dg/pr96695-11.c
new file mode 100644
index 000..d3c36168b98
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr96695-11.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ifcombine" } */
+
+#include 
+
+_Bool or1(unsigned *x, unsigned *y)
+{
+  /* x <= y || x == 0 --> x <= y */
+  return x <= y || x == 0;
+}
+
+_Bool or2(unsigned *x, unsigned *y)
+{
+  /* x >= y || x == -1 --> x >= y */
+  return x >= y || x == (unsigned*)-1;
+}
+
+/* { dg-final { scan-tree-dump-not " == " "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/pr96695-12.c 
b/gcc/testsuite/gcc.dg/pr96695-12.c
new file mode 100644
index 

Re: [committed][RISC-V] Remove errant hunk of code

2023-08-04 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 3, 2023 at 10:31 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 8/3/23 17:38, Vineet Gupta wrote:
>
> >> ;-)  Actually if you wanted to poke at zicond, the most interesting
> >> unexplored area I've come across is the COND_EXPR handling in gimple.
> >> When we expand a COND_EXPR into RTL the first approach we take is to
> >> try movcc in RTL.
> >>
> >> Unfortunately we don't create COND_EXPRs all that often in gimple.
> >> Some simple match.pd patterns would likely really help here.
> >>
> >> The problem is RTL expansion when movcc FAILs is usually poor at
> >> best.  So if we're going to add those match.pd patterns, we probably
> >> need to beef up the RTL expansion code to do a better job when the
> >> target doesn't have a movcc RTL pattern.
> >
> > Ok, I'll add that to my todo list.
> You might want to reach out to Andrew Pinski if you do poke at this.  I
> made a reference to this issue in a BZ he recently commented on.  It was
> an x86 issue with cmov generation, but the same core issue applies --
> we're not generating COND_EXPRs very aggressively in gimple.

Yes, I have some ideas for producing COND_EXPR more aggressively in
either isel or the last phiopt.
There is also a canonical form issue: `bool * b` represents
`bool ? b : 0`, and isel could select between the COND_EXPR and the
multiply there too.
This is the issue Jeff is talking about too.
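
To spell out the forms involved, here is a small sketch showing that the multiply, the conditional, and the masked form all compute the same value when the first operand is 0 or 1. The function names are invented, and this is only the source-level identity, not a claim about what isel does today.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* bool * b  */
static int
mul_form (_Bool c, int b)
{
  return c * b;
}

/* bool ? b : 0  */
static int
cond_form (_Bool c, int b)
{
  return c ? b : 0;
}

/* b & -bool: the negated bool is either 0 or an all-ones mask.  */
static int
mask_form (_Bool c, int b)
{
  return b & -(int) c;
}

/* All three forms must agree for both values of the bool.  */
static int
check_bool_mul (void)
{
  const int samples[] = { 0, 1, -1, 12345, INT_MAX, INT_MIN };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (int c = 0; c <= 1; c++)
      if (mul_form (c, samples[i]) != cond_form (c, samples[i])
          || mask_form (c, samples[i]) != cond_form (c, samples[i]))
        return 0;
  return 1;
}
```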

Thanks,
Andrew

>
> jeff


Re: RISC-V: Added support for CRC.

2023-08-03 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 3, 2023 at 12:38 PM Mariam Harutyunyan via Gcc-patches
 wrote:
>
> This patch adds CRC support for the RISC-V architecture. It adds internal
> functions and built-ins specifically designed to handle CRC computations
> efficiently.
>
> If the target is ZBC, the clmul instruction is used for the CRC code
> generation; otherwise, table-based CRC is generated.  A table with 256
> elements is used to store precomputed CRCs.
>
> These CRC calculation algorithms have higher performance than the naive CRC
> calculation algorithm.
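
For context, the difference between the naive and the table-based calculation can be sketched as follows. This is not the patch's implementation: it assumes the common reflected CRC-32 polynomial 0xEDB88320 with the usual initial value and final xor, chosen only for illustration, and all names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_REFLECTED UINT32_C (0xEDB88320)

/* Naive version: process the input one bit at a time.  */
static uint32_t
crc32_bitwise (const unsigned char *buf, size_t len)
{
  uint32_t crc = UINT32_C (0xFFFFFFFF);
  for (size_t i = 0; i < len; i++)
    {
      crc ^= buf[i];
      for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_REFLECTED : 0);
    }
  return crc ^ UINT32_C (0xFFFFFFFF);
}

/* 256 precomputed CRCs, one per possible byte value.  */
static uint32_t crc32_table[256];

static void
crc32_init_table (void)
{
  for (uint32_t byte = 0; byte < 256; byte++)
    {
      uint32_t crc = byte;
      for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_REFLECTED : 0);
      crc32_table[byte] = crc;
    }
}

/* Table version: one lookup per input byte instead of eight shift/xor steps.
   crc32_init_table must have been called first.  */
static uint32_t
crc32_table_based (const unsigned char *buf, size_t len)
{
  uint32_t crc = UINT32_C (0xFFFFFFFF);
  for (size_t i = 0; i < len; i++)
    crc = (crc >> 8) ^ crc32_table[(crc ^ buf[i]) & 0xFF];
  return crc ^ UINT32_C (0xFFFFFFFF);
}

/* Both versions must produce the standard check value for "123456789".  */
static int
check_crc32 (void)
{
  const unsigned char msg[] = "123456789";
  crc32_init_table ();
  return crc32_bitwise (msg, 9) == UINT32_C (0xCBF43926)
         && crc32_table_based (msg, 9) == UINT32_C (0xCBF43926);
}
```

In a compiler-generated version the table would be emitted as constant data rather than filled in at run time; the runtime initialisation here only keeps the sketch short.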

A few things about this patch:
You created a generic (non-target specific) builtin but didn't
document it in doc/extend.texi.
You created a generic builtin with no fallback in libgcc.
You created a new named (RTL) pattern, crc, and didn't document it in
the `Standard Names` section of doc/md.texi.

Thanks,
Andrew Pinski

>
>   gcc/ChangeLog:
>*builtin-types.def (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE): Define.
>(BT_FN_UINT16_UINT16_UINT8_CONST_SIZE): Likewise.
>(BT_FN_UINT16_UINT16_UINT16_CONST_SIZE): Likewise.
>(BT_FN_UINT32_UINT32_UINT8_CONST_SIZE): Likewise.
>(BT_FN_UINT32_UINT32_UINT16_CONST_SIZE): Likewise.
>(BT_FN_UINT32_UINT32_UINT32_CONST_SIZE): Likewise.
>(BT_FN_UINT64_UINT64_UINT8_CONST_SIZE): Likewise.
>(BT_FN_UINT64_UINT64_UINT16_CONST_SIZE): Likewise.
>(BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise.
>(BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise.
>* builtins.cc (associated_internal_fn): Handle
> BUILT_IN_CRC8_DATA8,
>BUILT_IN_CRC16_DATA8, BUILT_IN_CRC16_DATA16,
>BUILT_IN_CRC32_DATA8, BUILT_IN_CRC32_DATA16,
> BUILT_IN_CRC32_DATA32,
>BUILT_IN_CRC64_DATA8, BUILT_IN_CRC64_DATA16,
> BUILT_IN_CRC64_DATA32,
>BUILT_IN_CRC64_DATA64.
>* builtins.def (BUILT_IN_CRC8_DATA8): New builtin.
>(BUILT_IN_CRC16_DATA8): Likewise.
>(BUILT_IN_CRC16_DATA16): Likewise.
>(BUILT_IN_CRC32_DATA8): Likewise.
>(BUILT_IN_CRC32_DATA16): Likewise.
>(BUILT_IN_CRC32_DATA32): Likewise.
>(BUILT_IN_CRC64_DATA8): Likewise.
>(BUILT_IN_CRC64_DATA16): Likewise.
>(BUILT_IN_CRC64_DATA32): Likewise.
>(BUILT_IN_CRC64_DATA64): Likewise.
>* config/riscv/bitmanip.md (crc4): New
> expander.
>* config/riscv/riscv-protos.h (expand_crc_table_based): Declare.
>(expand_crc_using_clmul): Likewise.
>* config/riscv/riscv.cc (gf2n_poly_long_div_quotient): New
> function.
>(generate_crc): Likewise.
>(generate_crc_table): Likewise.
>(expand_crc_table_based): Likewise.
>(expand_crc_using_clmul): Likewise.
>* config/riscv/riscv.md (UNSPEC_CRC): New unspec for CRC.
>* internal-fn.cc (crc_direct): Define.
>(expand_crc_optab_fn): New function.
>(direct_crc_optab_supported_p): Define.
>* internal-fn.def (CRC): New internal optab function.
>* optabs.def (crc_optab): New optab.
>
>  gcc/testsuite/ChangeLog:
>* gcc.target/riscv/crc-builtin-table-target32.c: New test.
>* gcc.target/riscv/crc-builtin-table-target64.c: New test.
>* gcc.target/riscv/crc-builtin-zbc32.c: New test.
>* gcc.target/riscv/crc-builtin-zbc64.c: New test.


[PATCHv2] Fix PR 110874: infinite loop in gimple_bitwise_inverted_equal_p with fre

2023-08-03 Thread Andrew Pinski via Gcc-patches
This changes gimple_bitwise_inverted_equal_p to use two different match patterns
to try to match a bit_not wrapped with a possible nop_convert and a comparison
also wrapped with a possible nop_convert. This avoids the recursion.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/110874
* gimple-match-head.cc (gimple_bit_not_with_nop): New declaration.
(gimple_maybe_cmp): Likewise.
(gimple_bitwise_inverted_equal_p): Rewrite to use gimple_bit_not_with_nop
and gimple_maybe_cmp instead of being recursive.
* match.pd (bit_not_with_nop): New match pattern.
(maybe_cmp): Likewise.

gcc/testsuite/ChangeLog:

PR tree-optimization/110874
* gcc.c-torture/compile/pr110874-a.c: New test.
---
 gcc/gimple-match-head.cc  | 87 ++-
 gcc/match.pd  | 17 
 .../gcc.c-torture/compile/pr110874-a.c| 17 
 3 files changed, 79 insertions(+), 42 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr110874-a.c

diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
index b1e96304d7c..a097a494c39 100644
--- a/gcc/gimple-match-head.cc
+++ b/gcc/gimple-match-head.cc
@@ -270,6 +270,10 @@ gimple_bitwise_equal_p (tree expr1, tree expr2, tree (*valueize) (tree))
 #define bitwise_inverted_equal_p(expr1, expr2) \
   gimple_bitwise_inverted_equal_p (expr1, expr2, valueize)
 
+
+bool gimple_bit_not_with_nop (tree, tree *, tree (*) (tree));
+bool gimple_maybe_cmp (tree, tree *, tree (*) (tree));
+
 /* Helper function for bitwise_equal_p macro.  */
 
 static inline bool
@@ -285,52 +289,51 @@ gimple_bitwise_inverted_equal_p (tree expr1, tree expr2, tree (*valueize) (tree)
 return false;
 
   tree other;
-  if (gimple_nop_convert (expr1, &other, valueize)
-  && gimple_bitwise_inverted_equal_p (other, expr2, valueize))
-return true;
-
-  if (gimple_nop_convert (expr2, &other, valueize)
-  && gimple_bitwise_inverted_equal_p (expr1, other, valueize))
-return true;
-
-  if (TREE_CODE (expr1) != SSA_NAME
-  || TREE_CODE (expr2) != SSA_NAME)
-return false;
-
-  gimple *d1 = get_def (valueize, expr1);
-  gassign *a1 = safe_dyn_cast <gassign *> (d1);
-  gimple *d2 = get_def (valueize, expr2);
-  gassign *a2 = safe_dyn_cast <gassign *> (d2);
-  if (a1
-  && gimple_assign_rhs_code (a1) == BIT_NOT_EXPR
-  && gimple_bitwise_equal_p (do_valueize (valueize,
- gimple_assign_rhs1 (a1)),
-expr2, valueize))
+  /* Try if EXPR1 was defined as ~EXPR2. */
+  if (gimple_bit_not_with_nop (expr1, &other, valueize))
+{
+  if (operand_equal_p (other, expr2, 0))
return true;
-  if (a2
-  && gimple_assign_rhs_code (a2) == BIT_NOT_EXPR
-  && gimple_bitwise_equal_p (expr1,
-do_valueize (valueize,
- gimple_assign_rhs1 (a2)),
-valueize))
+  tree expr4;
+  if (gimple_nop_convert (expr2, &expr4, valueize)
+ && operand_equal_p (other, expr4, 0))
return true;
-
-  if (a1 && a2
-  && TREE_CODE_CLASS (gimple_assign_rhs_code (a1)) == tcc_comparison
-  && TREE_CODE_CLASS (gimple_assign_rhs_code (a2)) == tcc_comparison)
+}
+  /* Try if EXPR2 was defined as ~EXPR1. */
+  if (gimple_bit_not_with_nop (expr2, &other, valueize))
 {
-  tree op10 = do_valueize (valueize, gimple_assign_rhs1 (a1));
-  tree op20 = do_valueize (valueize, gimple_assign_rhs1 (a2));
-  if (!operand_equal_p (op10, op20))
-return false;
-  tree op11 = do_valueize (valueize, gimple_assign_rhs2 (a1));
-  tree op21 = do_valueize (valueize, gimple_assign_rhs2 (a2));
-  if (!operand_equal_p (op11, op21))
-return false;
-  if (invert_tree_comparison (gimple_assign_rhs_code (a1),
- HONOR_NANS (op10))
- == gimple_assign_rhs_code (a2))
+  if (operand_equal_p (other, expr1, 0))
+   return true;
+  tree expr3;
+  if (gimple_nop_convert (expr1, &expr3, valueize)
+ && operand_equal_p (other, expr3, 0))
return true;
 }
+
+  /* If neither are defined by BIT_NOT, try to see if
+ both are defined by comparisons and see if they are
+ complementary (inversion) of each other. */
+  tree newexpr1, newexpr2;
+  if (!gimple_maybe_cmp (expr1, &newexpr1, valueize))
+return false;
+  if (!gimple_maybe_cmp (expr2, &newexpr2, valueize))
+return false;
+
+  gimple *d1 = get_def (valueize, newexpr1);
+  gassign *a1 = dyn_cast <gassign *> (d1);
+  gimple *d2 = get_def (valueize, newexpr2);
+  gassign *a2 = dyn_cast <gassign *> (d2);
+  tree op10 = do_valueize (valueize, gimple_assign_rhs1 (a1));
+  tree op20 = do_valueize (valueize, gimple_assign_rhs1 (a2));
+  if (!operand_equal_p (op10, op20))
+return false;
+  tree op11 = do_valueize (valueize, gimple_assign_rhs2 (a1));
+  tree op21 = 

Re: [COMMITTEDv3] tree-optimization: [PR100864] `(a&!b) | b` is not optimized to `a | b` for comparisons

2023-08-03 Thread Andrew Pinski via Gcc-patches
On Thu, Aug 3, 2023 at 4:58 AM Mikael Morin  wrote:
>
> Hello,
>
> On 31/07/2023 at 19:07, Andrew Pinski via Gcc-patches wrote:
> > diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
> > index a71c0727b0b..ddaf22f2179 100644
> > --- a/gcc/generic-match-head.cc
> > +++ b/gcc/generic-match-head.cc
> > @@ -121,3 +121,45 @@ bitwise_equal_p (tree expr1, tree expr2)
> >   return wi::to_wide (expr1) == wi::to_wide (expr2);
> > return operand_equal_p (expr1, expr2, 0);
> >   }
> > +
> > +/* Return true if EXPR1 and EXPR2 have the bitwise opposite value,
> > +   but not necessarily same type.
> > +   The types can differ through nop conversions.  */
> > +
> > +static inline bool
> > +bitwise_inverted_equal_p (tree expr1, tree expr2)
> > +{
> > +  STRIP_NOPS (expr1);
> > +  STRIP_NOPS (expr2);
> > +  if (expr1 == expr2)
> > +return false;
> > +  if (!tree_nop_conversion_p (TREE_TYPE (expr1), TREE_TYPE (expr2)))
> > +return false;
> > +  if (TREE_CODE (expr1) == INTEGER_CST && TREE_CODE (expr2) == INTEGER_CST)
> > +return wi::to_wide (expr1) == ~wi::to_wide (expr2);
> > +  if (operand_equal_p (expr1, expr2, 0))
> > +return false;
> > +  if (TREE_CODE (expr1) == BIT_NOT_EXPR
> > +  && bitwise_equal_p (TREE_OPERAND (expr1, 0), expr2))
> > +return true;
> > +  if (TREE_CODE (expr2) == BIT_NOT_EXPR
> > +  && bitwise_equal_p (expr1, TREE_OPERAND (expr2, 0)))
> > +return true;
> > +  if (COMPARISON_CLASS_P (expr1)
> > +  && COMPARISON_CLASS_P (expr2))
> > +{
> > +  tree op10 = TREE_OPERAND (expr1, 0);
> > +  tree op20 = TREE_OPERAND (expr2, 0);
> > +  if (!operand_equal_p (op10, op20))
> > + return false;
> > +  tree op11 = TREE_OPERAND (expr1, 1);
> > +  tree op21 = TREE_OPERAND (expr2, 1);
> > +  if (!operand_equal_p (op11, op21))
> > + return false;
> > +  if (invert_tree_comparison (TREE_CODE (expr1),
> > +   HONOR_NANS (op10))
> > +   == TREE_CODE (expr2))
> > + return true;
>
> So this is trying to match a == b against a != b, or a < b against a >=
> b, or similar; correct?
> Shouldn't this be completed with "crossed" checks, that is match a == b
> against b != a, or a < b against b <= a, etc?  Or is there some
> canonicalization making that redundant?

Canonicalization already takes care of that, so the crossed checks are
not needed.
tree_swap_operands_p defines the operand order: lower-numbered SSA
names always come first and constants always come last.
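
As a rough illustration of why a canonical operand order makes the crossed checks redundant, the sketch below uses a toy `Cmp` record with integer operand ids. The types and the `swap_code`, `invert`, `canonicalize` and `inverted_cmp_p` helpers are hypothetical, not GCC functions, and "smaller id first" is only a stand-in for whatever tree_swap_operands_p actually decides.

```cpp
#include <cassert>
#include <utility>

enum class Code { EQ, NE, LT, LE, GT, GE };

// Toy comparison; operands are plain ids (hypothetical).
struct Cmp {
  Code code;
  int lhs, rhs;
};

// Code to use once the two operands are exchanged: a < b is b > a.
static Code swap_code(Code c) {
  switch (c) {
    case Code::LT: return Code::GT;
    case Code::LE: return Code::GE;
    case Code::GT: return Code::LT;
    case Code::GE: return Code::LE;
    default: return c;  // EQ and NE are symmetric
  }
}

// Logical negation for integer operands: !(a < b) is a >= b.
static Code invert(Code c) {
  switch (c) {
    case Code::EQ: return Code::NE;
    case Code::NE: return Code::EQ;
    case Code::LT: return Code::GE;
    case Code::LE: return Code::GT;
    case Code::GT: return Code::LE;
    case Code::GE: return Code::LT;
  }
  return c;  // not reached
}

// Put the smaller id first, adjusting the code to keep the meaning.
static Cmp canonicalize(Cmp c) {
  if (c.lhs > c.rhs) {
    std::swap(c.lhs, c.rhs);
    c.code = swap_code(c.code);
  }
  return c;
}

// Same-order check only; sufficient when both inputs are canonical.
static bool inverted_cmp_p(const Cmp &x, const Cmp &y) {
  return x.lhs == y.lhs && x.rhs == y.rhs && invert(x.code) == y.code;
}
```

With ids 1 and 2 standing for a and b, `b <= a` canonicalizes to `a >= b`, which the same-order check pairs with `a < b`. The toy `invert` is only valid for integers; the real code passes HONOR_NANS to invert_tree_comparison because `!(a < b)` is not `a >= b` when NaNs are possible.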

Thanks,
Andrew


>
> I have given up determining whether these cases were already covered by
> the test or not.
>
> Mikael
>
>

