[PATCH] Fix PR31096

2016-03-31 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that fixes the PR31096.
Should the optimization be extended to addition and other
operations as well?

Please review the patch and let me know if its okay?

Regression tested on X86_64.

Thanks,
Naveen

2016-03-31  Naveen H.S  

* match.pd (cmp (mult:cs @0 @1) (mult:cs @2 @1)) : New Pattern.

* gcc.dg/pr31096.c: New testcase.diff --git a/gcc/match.pd b/gcc/match.pd
index c0ed305..e1e1b04 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -894,7 +894,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0
 
-
+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (tcc_comparison)
+ (simplify
+  (cmp (mult:cs @0 @1) (mult:cs @2 @1))
+   (cmp @0 @2)))
 
 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096.c b/gcc/testsuite/gcc.dg/pr31096.c
new file mode 100644
index 000..5476db1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096.c
@@ -0,0 +1,17 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+int
+f (int a, int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f1 (int a, int b)
+{
+  return a == b;
+}
+
+/* { dg-final { scan-assembler-not "addl" } } */


Re: [PATCH] Fix PR31096

2016-03-31 Thread Marc Glisse

On Thu, 31 Mar 2016, Hurugalawadi, Naveen wrote:


Hi,

Please find attached the patch that fixes the PR31096.
Should the optimization be extended to addition and other
operations as well?

Please review the patch and let me know if its okay?

Regression tested on X86_64.

Thanks,
Naveen

2016-03-31  Naveen H.S  

   * match.pd (cmp (mult:cs @0 @1) (mult:cs @2 @1)) : New Pattern.

   * gcc.dg/pr31096.c: New testcase.


Looks like you are turning x*-1 < y*-1 into x

Re: [PATCH] Fix PR31096

2016-03-31 Thread Ramana Radhakrishnan


On 31/03/16 09:55, Hurugalawadi, Naveen wrote:
> +/* { dg-final { scan-assembler-not "addl" } } */


addl is not the mnemonic for add on all architectures  


Ramana


Re: [PATCH] Fix PR31096

2016-04-05 Thread Hurugalawadi, Naveen
Hi,

>> Looks like you are turning x*-1 < y*-1 into xdiff --git a/gcc/match.pd b/gcc/match.pd
index c0ed305..e073e9f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -894,7 +894,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0
 
-
+/* Fold A * 10 == B * 10 into A == B.  naveen*/
+(for cmp (tcc_comparison)
+ (simplify
+  (cmp (mult:cs @0 INTEGER_CST@1) (mult:cs @2 INTEGER_CST@1))
+   (cmp @0 @2)))
 
 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096.c b/gcc/testsuite/gcc.dg/pr31096.c
new file mode 100644
index 000..1c464db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096.c
@@ -0,0 +1,17 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f1 (int a, int b)
+{
+  return a == b;
+}
+
+/* { dg-final { scan-tree-dump-not " * 10" "optimized" } } */


Re: [PATCH] Fix PR31096

2016-04-05 Thread Marc Glisse

On Tue, 5 Apr 2016, Hurugalawadi, Naveen wrote:


Hi,


Looks like you are turning x*-1 < y*-1 into x

Please find attached the modified patch that works on integer
constant values.

Please review the patch and let me know if this is okay?


-1 is an integer constant, so that's still invalid. It is also invalid for 
unsigned. The :s are useless since the output is a single insn.


--
Marc Glisse


Re: [PATCH] Fix PR31096

2016-04-07 Thread Hurugalawadi, Naveen
Hi,

Thanks for the review, views and comments on the issue.

>> -1 is an integer constant, so that's still invalid. It is also invalid for
>> unsigned. The :s are useless since the output is a single insn.

The patch is modified as per your review comments.

Currently the following conditions had been taken care in the patch
as per your suggestions:-

a * c op b * c -> Optimizes for all cases
a * c op b * c -> Does not optimize when  c = 0

a * -c eq/ne b * -c -> Optimizes for all cases
a * -c lt/gt/ge/le b * -c -> Optimizes when c is positive
a * -c lt/gt/ge/le b * -c -> Optimizes and becomes b lt/gt/ge/le when c is 
negative

Have added a minimal testcase which covers all the above instances.
Please review the patch and let me know if its okay?

Thanks,
Naveendiff --git a/gcc/match.pd b/gcc/match.pd
index 75aa601..9386172 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -894,7 +894,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0
 
+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (TYPE_OVERFLOW_UNDEFINED (type)
+   && !integer_zerop (@1))
+   (cmp @0 @2
 
+/* Fold A * 10 < B * 10 into A < B.  */
+(for cmp (lt gt le ge)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (TYPE_OVERFLOW_UNDEFINED (type)
+   && !integer_zerop (@1))
+   (if (tree_expr_nonnegative_p (@1))
+(cmp @0 @2))
+   (if (!tree_expr_nonnegative_p (@1))
+(cmp @2 @0)
 
 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096-1.c b/gcc/testsuite/gcc.dg/pr31096-1.c
new file mode 100644
index 000..8489724
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096-1.c
@@ -0,0 +1,29 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a > b;
+}
+
+int
+f1 (int a, int b)
+{
+  return a * 10 >= b * 10;
+}
+
+int
+f2 (int a, int b)
+{
+  return a * -42 <  b * -42;
+}
+
+int
+f3 (int a, int b)
+{
+  return a * 0 <= b * 0;
+}
+
+/* { dg-final { scan-tree-dump-not "\(D\) * " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/pr31096.c b/gcc/testsuite/gcc.dg/pr31096.c
new file mode 100644
index 000..05536ad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096.c
@@ -0,0 +1,29 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a == b;
+}
+
+int
+f1 (int a, int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f2 (int a, int b)
+{
+  return a * -42 !=  b * -42;
+}
+
+int
+f3 (int a, int b)
+{
+  return a * 0 != b * 0;
+}
+
+/* { dg-final { scan-tree-dump-not "\(D\) * " "optimized" } } */


Re: [PATCH] Fix PR31096

2016-04-07 Thread Marc Glisse

On Thu, 7 Apr 2016, Hurugalawadi, Naveen wrote:


+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (TYPE_OVERFLOW_UNDEFINED (type)


type is the return type of the comparison. The relevant type here is
TREE_TYPE (@0). Maybe add a testcase with unsigned, to check that it
does not transform?


+   && !integer_zerop (@1))
+   (cmp @0 @2


!integer_zerop is not a promise that the variable is not zero, it just
says that we don't know for sure that it is zero. integer_nonzerop would
work. Or writing (cmp (mult @0 INTEGER_CST@1) (mult @2 @1)), but
then !wi::eq_p (@1, 0) would be a better test.

To make it more general (not limited to constants), you could probably
use tree_expr_nonzero_p, and at some point someone would enhance
tree_single_nonzero_warnv_p by checking VRP information for SSA_NAME.

The other transformation has similar issues.

--
Marc Glisse


Re: [PATCH] Fix PR31096

2016-04-12 Thread Hurugalawadi, Naveen
Hi,

>> type is the return type of the comparison. The relevant type here is
TREE_TYPE (@0). 
Done.

>>Maybe add a testcase with unsigned, to check that it
does not transform?
Added the testcase

>> you could probably use tree_expr_nonzero_p
Done.
I had !wi::eq_p (@1, 0) for INTEGER_CST, but when tried to
use it for general, then used the integer_zerop

Please find attached the modified patch with all the modifications
and let me know if its okay?

Thanks,
Naveendiff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 0f4bf7e..5922dbd 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9177,7 +9177,7 @@ tree_expr_nonzero_warnv_p (tree t, bool *strict_overflow_p)
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
-static bool
+bool
 tree_expr_nonzero_p (tree t)
 {
   bool ret, strict_overflow_p;
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 02f4270..8579622 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -167,6 +167,7 @@ extern tree size_diffop_loc (location_t, tree, tree);
 #define non_lvalue(T) non_lvalue_loc (UNKNOWN_LOCATION, T)
 extern tree non_lvalue_loc (location_t, tree);
 
+extern bool tree_expr_nonzero_p (tree);
 extern bool tree_expr_nonnegative_p (tree);
 extern bool tree_expr_nonnegative_warnv_p (tree, bool *, int = 0);
 extern tree make_range (tree, int *, tree *, tree *, bool *);
diff --git a/gcc/match.pd b/gcc/match.pd
index 75aa601..479a3a3 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
zerop
CONSTANT_CLASS_P
tree_expr_nonnegative_p
+   tree_expr_nonzero_p
integer_valued_real_p
integer_pow2p
HONOR_NANS)
@@ -894,7 +895,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0
 
+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (cmp @0 @2
 
+/* Fold A * 10 < B * 10 into A < B.  */
+(for cmp (lt gt le ge)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (if (tree_expr_nonnegative_p (@1))
+(cmp @0 @2)
+   (if (!tree_expr_nonnegative_p (@1))
+(cmp @2 @0))
 
 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096-1.c b/gcc/testsuite/gcc.dg/pr31096-1.c
new file mode 100644
index 000..5f62ddc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096-1.c
@@ -0,0 +1,29 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a > b;
+}
+
+int
+f1 (int a, int b)
+{
+  return a * 10 >= b * 10;
+}
+
+int
+f2 (int a, int b)
+{
+  return a * -42 <  b * -42;
+}
+
+int
+f3 (int a, int b)
+{
+  return a * 0 <= b * 0;
+}
+
+/* { dg-final { scan-tree-dump-not "\\(D\\) \\*" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/pr31096-2.c b/gcc/testsuite/gcc.dg/pr31096-2.c
new file mode 100644
index 000..ec51817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096-2.c
@@ -0,0 +1,29 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (unsigned int a, unsigned int b)
+{
+  return a == b;
+}
+
+int
+f1 (unsigned int a, unsigned int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f2 (unsigned int a, unsigned int b)
+{
+  return a * -42 <  b * -42;
+}
+
+int
+f3 (unsigned int a, unsigned int b)
+{
+  return a * 0 <= b * 0;
+}
+
+/* { dg-final { scan-tree-dump-times "\\(D\\) \\*" 4 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/pr31096.c b/gcc/testsuite/gcc.dg/pr31096.c
new file mode 100644
index 000..90cb71b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096.c
@@ -0,0 +1,29 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a == b;
+}
+
+int
+f1 (int a, int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f2 (int a, int b)
+{
+  return a * -42 !=  b * -42;
+}
+
+int
+f3 (int a, int b)
+{
+  return a * 0 != b * 0;
+}
+
+/* { dg-final { scan-tree-dump-not "\\(D\\) \\*" "optimized" } } */


Re: [PATCH] Fix PR31096

2016-04-12 Thread Marc Glisse

On Tue, 12 Apr 2016, Hurugalawadi, Naveen wrote:

+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (cmp @0 @2

(another case we could handle here (but that's for another time) is when 
TYPE_OVERFLOW_WRAPS and @1 is odd)


+/* Fold A * 10 < B * 10 into A < B.  */
+(for cmp (lt gt le ge)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (if (tree_expr_nonnegative_p (@1))
+(cmp @0 @2)

Up to here it looks good to me (maybe someone has a better suggestion
than tree_expr_nonzero_p && tree_expr_nonnegative_p? I think we should
handle at least INTEGER_CST and SSA_NAME with VRP, and it seems natural
to add a VRP check in those 2 functions, expr_not_equal_to in the same
file already has one).

+   (if (!tree_expr_nonnegative_p (@1))
+(cmp @2 @0))

No, same issue as before. !tree_expr_nonnegative_p means that we don't
know for sure that @1 is >=0, so it might be that we know @1 is
negative, or it might be that we simply have no idea, you cannot deduce
anything from it.  Ideally, you would call tree_expr_nonpositive_p,
except that that function doesn't exist yet. So for now, I guess we
could restrict to something like (untested)

if (TREE_CODE (@1) == INTEGER_CST && wi::lt_p (@1, 0))

+int
+f (int a, int b)
+{
+  return a > b;
+}

+int
+f3 (int a, int b)
+{
+  return a * 0 <= b * 0;
+}

Not sure what you are testing with those 2. The second one is optimized
to 'true' before it even reaches this pattern. Multiplying by a third
variable 'c' (where the compiler has no idea what the sign of c is)
could be nicer.

Then you'll need to wait for a real reviewer to show up.

--
Marc Glisse


Re: [PATCH] Fix PR31096

2016-11-11 Thread Hurugalawadi, Naveen
Hi,

Sorry for a very late reply as the mail was missed or overlooked.

>> could now move the test  tree_expr_nonzero_p next to 
>> tree_expr_nonnegative_p (it is redundant for  the last case). 

Done.

>> Often just a comment can really help here. 

Comments updated as per the suggestion

>> when C is zero and verify this transformation doesn't fire on that case.

Updated test to check with zero.

>> verifying that the operand orders change appropriately when dealing 
>> with a negative constant.

Done.

>> verify nothing happens with floating point or vector types.

Done.

Please review the patch and let me know if any modifications are required.
Regression tested on X86 and AArch64.

Thanks,
Naveen

2016-11-11  Naveen H.S  
gcc
* fold-const.c (tree_expr_nonzero_p) : Make non-static.
* fold-const.h (tree_expr_nonzero_p) : Declare.
* match.pd (cmp (mult:c @0 @1) (mult:c @2 @1) : New Pattern.
* match.pd (cmp (mult:c @0 @1) (mult:c @2 @1) : New Pattern.
gcc/testsuite
* gcc.dg/pr31096.c: New testcase.
* gcc.dg/pr31096-1.c: New testcase.diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index e14471e..8f13807 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9015,7 +9015,7 @@ tree_expr_nonzero_warnv_p (tree t, bool *strict_overflow_p)
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
-static bool
+bool
 tree_expr_nonzero_p (tree t)
 {
   bool ret, strict_overflow_p;
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 46dcd28..fbe1328 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -169,6 +169,7 @@ extern tree size_diffop_loc (location_t, tree, tree);
 #define non_lvalue(T) non_lvalue_loc (UNKNOWN_LOCATION, T)
 extern tree non_lvalue_loc (location_t, tree);
 
+extern bool tree_expr_nonzero_p (tree);
 extern bool tree_expr_nonnegative_p (tree);
 extern bool tree_expr_nonnegative_warnv_p (tree, bool *, int = 0);
 extern tree make_range (tree, int *, tree *, tree *, bool *);
diff --git a/gcc/match.pd b/gcc/match.pd
index 29ddcd8..eecfe23 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
zerop
CONSTANT_CLASS_P
tree_expr_nonnegative_p
+   tree_expr_nonzero_p
integer_valued_real_p
integer_pow2p
HONOR_NANS)
@@ -1017,7 +1018,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0
 
+/* For integral types with undefined overflow and C != 0 fold
+   x * C EQ/NE y * C into x EQ/NE y.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (cmp @0 @2
+
+/* For integral types with undefined overflow and C != 0 fold
+   x * C RELOP y * C into:
 
+   x RELOP y for nonnegative C
+   y RELOP x for negative C  */
+(for cmp (lt gt le ge)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+   (if (tree_expr_nonnegative_p (@1) && tree_expr_nonzero_p (@1))
+(cmp @0 @2)
+   (if (TREE_CODE (@1) == INTEGER_CST
+	&& wi::lt_p (@1, 0, TYPE_SIGN (TREE_TYPE (@1
+(cmp @2 @0))
 
 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096-1.c b/gcc/testsuite/gcc.dg/pr31096-1.c
new file mode 100644
index 000..e681f0f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096-1.c
@@ -0,0 +1,51 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define zero(name, op) \
+int name (int a, int b) \
+{ return a * 0 op b * 0; }
+
+zero(zeq, ==) zero(zne, !=) zero(zlt, <)
+zero(zgt, >)  zero(zge, >=) zero(zle, <=)
+
+#define unsign_pos(name, op) \
+int name (unsigned a, unsigned b) \
+{ return a * 4 op b * 4; }
+
+unsign_pos(upeq, ==) unsign_pos(upne, !=) unsign_pos(uplt, <)
+unsign_pos(upgt, >)  unsign_pos(upge, >=) unsign_pos(uple, <=)
+
+#define unsign_neg(name, op) \
+int name (unsigned a, unsigned b) \
+{ return a * -2 op b * -2; }
+
+unsign_neg(uneq, ==) unsign_neg(unne, !=) unsign_neg(unlt, <)
+unsign_neg(ungt, >)  unsign_neg(unge, >=) unsign_neg(unle, <=)
+
+#define float(name, op) \
+int name (float a, float b) \
+{ return a * 5 op b * 5; }
+
+float(feq, ==) float(fne, !=) float(flt, <)
+float(fgt, >)  float(fge, >=) float(fle, <=)
+
+#define float_val(name, op) \
+int name (int a, int b) \
+{ return a * 54.0 op b * 54.0; }
+
+float_val(fveq, ==) float_val(fvne, !=) float_val(fvlt, <)
+float_val(fvgt, >)  float_val(fvge, >=) float_val(fvle, <=)
+
+#define vec(name, op) \
+int name (int a, int b) \
+{ int c[10]; return a * c[1] op b * c[1]; }
+
+vec(veq, ==) vec(vne, !=) vec(vlt, <)
+vec(vgt, >)  vec(vge, >=) vec(vle, <=)
+
+/* { dg-f

[PING] [PATCH] Fix PR31096

2016-11-22 Thread Hurugalawadi, Naveen
Hi,

Please consider this as a personal reminder to review the patch
at following link and let me know your comments on the same.

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01049.html

Thanks,
Naveen

Re: [PATCH] Fix PR31096

2016-11-23 Thread Richard Biener
On Fri, Nov 11, 2016 at 11:19 AM, Hurugalawadi, Naveen
 wrote:
> Hi,
>
> Sorry for a very late reply as the mail was missed or overlooked.
>
>>> could now move the test  tree_expr_nonzero_p next to
>>> tree_expr_nonnegative_p (it is redundant for  the last case).
>
> Done.
>
>>> Often just a comment can really help here.
>
> Comments updated as per the suggestion
>
>>> when C is zero and verify this transformation doesn't fire on that case.
>
> Updated test to check with zero.
>
>>> verifying that the operand orders change appropriately when dealing
>>> with a negative constant.
>
> Done.
>
>>> verify nothing happens with floating point or vector types.
>
> Done.
>
> Please review the patch and let me know if any modifications are required.
> Regression tested on X86 and AArch64.

Ok with using wi::neg_p (@1, TYPE_SIGN (TREE_TYPE (@1))) instead of wi::lt_p.

Thanks,
Richard.

> Thanks,
> Naveen
>
> 2016-11-11  Naveen H.S  
> gcc
> * fold-const.c (tree_expr_nonzero_p) : Make non-static.
> * fold-const.h (tree_expr_nonzero_p) : Declare.
> * match.pd (cmp (mult:c @0 @1) (mult:c @2 @1) : New Pattern.
> * match.pd (cmp (mult:c @0 @1) (mult:c @2 @1) : New Pattern.
> gcc/testsuite
> * gcc.dg/pr31096.c: New testcase.
> * gcc.dg/pr31096-1.c: New testcase.


Re: [PATCH] Fix PR31096

2016-07-13 Thread Jeff Law

On 04/14/2016 12:45 AM, Hurugalawadi, Naveen wrote:

Hi,


>> I think we should handle at least INTEGER_CST and SSA_NAME
>> with VRP, and it seems natural to add a VRP check

The check should be added in the tree_single_nonzero_warnv_p
for SSA_NAME case for tree_expr_nonzero_p.
However, for tree_expr_nonnegative_p, its been handled in a
different way. Should we combine this check with the existing one?

+   (if (!tree_expr_nonnegative_p (@1))
+(cmp @2 @0))


>> Ideally, you would call tree_expr_nonpositive_p, except that that
>> function doesn't exist yet. So for now, I guess we

Would the tree_expr_nonpositive_p function be helpful for other cases
as well, I would try to add it if its useful.

Please find attached the modified patch as per the suggestions and
let me know if its fine?

Thanks,
Naveen


pr31096-4.patch


diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 0f4bf7e..5922dbd 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9177,7 +9177,7 @@ tree_expr_nonzero_warnv_p (tree t, bool 
*strict_overflow_p)
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */

-static bool
+bool
 tree_expr_nonzero_p (tree t)
 {
   bool ret, strict_overflow_p;
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 02f4270..8579622 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -167,6 +167,7 @@ extern tree size_diffop_loc (location_t, tree, tree);
 #define non_lvalue(T) non_lvalue_loc (UNKNOWN_LOCATION, T)
 extern tree non_lvalue_loc (location_t, tree);

+extern bool tree_expr_nonzero_p (tree);
 extern bool tree_expr_nonnegative_p (tree);
 extern bool tree_expr_nonnegative_warnv_p (tree, bool *, int = 0);
 extern tree make_range (tree, int *, tree *, tree *, bool *);
diff --git a/gcc/match.pd b/gcc/match.pd
index 75aa601..6655a3c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
zerop
CONSTANT_CLASS_P
tree_expr_nonnegative_p
+   tree_expr_nonzero_p
integer_valued_real_p
integer_pow2p
HONOR_NANS)
@@ -894,7 +895,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0

+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (cmp @0 @2

Rather than refer to an explicit constant (10), I'd write the comment as

/* For integral types with undefined overflow and C != 0 fold
   x * C EQ/NE y * C into x EQ/NE y.  */

We commonly use "C" to refer to an arbitrary constant in comments 
throughout GCC.   I think my version is significantly clearer.






+/* Fold A * 10 < B * 10 into A < B.  */
I think we want to do a similar kind of fix to the comment here.  Except 
you want to lay out the different transformations based on the value of 
the constant.   So something like;


/* For integral types with undefined overflow and C != 0 fold
   x * C RELOP y * C into:

   x RELOP y for nonnegative C
   y RELOP x for negative C  */






 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096.c b/gcc/testsuite/gcc.dg/pr31096.c
new file mode 100644
index 000..72446bc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096.c
@@ -0,0 +1,41 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a * 67 == b * 67;
+}
+
+int
+f1 (int a, int b)
+{
+  return a * -42 !=  b * -42;
+}
+
+int
+f2 (int a, int b)
+{
+  return a * 10 >= b * 10;
+}
+
+int
+f3 (int a, int b)
+{
+  return a * -4 <  b * -4;
+}
+
+int
+f4 (unsigned int a, unsigned int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f5 (unsigned int a, unsigned int b)
+{
+  return a * -42 <  b * -42;
+}
+
+/* { dg-final { scan-tree-dump-times "\\(D\\) \\*" 4 "optimized" } } */
So the problem I see here is it's not obvious what your scanning for. 
Often just a comment can really help here.


I would suggest tests when C is zero and verify this transformation 
doesn't fire on that case.


I would suggest verifying that the operand orders change appropriately 
when dealing with a negative constant.


You might want to verify nothing happens with floating point or vector 
types.


If you wanted to be extra thorough you could iterate over the operators. 
 ie, testing == and !=, then <, <=, >, >=


It sounds a bit like overkill, but we've often found subtle cases where 
we wouldn't optimize one case when we expected it to be optimized.


So overall, I think the transformations are fine and just need updated 
comments.  The tests need a bit more work.   Can you please update and 
resubmit -- I think this is pretty close to ready.


Thanks for your patience,
jeff


I

Re: [PATCH] Fix PR31096

2016-04-13 Thread Hurugalawadi, Naveen
Hi,

>> I think we should handle at least INTEGER_CST and SSA_NAME 
>> with VRP, and it seems natural to add a VRP check 

The check should be added in the tree_single_nonzero_warnv_p
for SSA_NAME case for tree_expr_nonzero_p.
However, for tree_expr_nonnegative_p, its been handled in a 
different way. Should we combine this check with the existing one?

+   (if (!tree_expr_nonnegative_p (@1))
+(cmp @2 @0))

>> Ideally, you would call tree_expr_nonpositive_p, except that that
>> function doesn't exist yet. So for now, I guess we

Would the tree_expr_nonpositive_p function be helpful for other cases
as well, I would try to add it if its useful.

Please find attached the modified patch as per the suggestions and
let me know if its fine?

Thanks,
Naveendiff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 0f4bf7e..5922dbd 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9177,7 +9177,7 @@ tree_expr_nonzero_warnv_p (tree t, bool *strict_overflow_p)
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
-static bool
+bool
 tree_expr_nonzero_p (tree t)
 {
   bool ret, strict_overflow_p;
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 02f4270..8579622 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -167,6 +167,7 @@ extern tree size_diffop_loc (location_t, tree, tree);
 #define non_lvalue(T) non_lvalue_loc (UNKNOWN_LOCATION, T)
 extern tree non_lvalue_loc (location_t, tree);
 
+extern bool tree_expr_nonzero_p (tree);
 extern bool tree_expr_nonnegative_p (tree);
 extern bool tree_expr_nonnegative_warnv_p (tree, bool *, int = 0);
 extern tree make_range (tree, int *, tree *, tree *, bool *);
diff --git a/gcc/match.pd b/gcc/match.pd
index 75aa601..6655a3c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
zerop
CONSTANT_CLASS_P
tree_expr_nonnegative_p
+   tree_expr_nonzero_p
integer_valued_real_p
integer_pow2p
HONOR_NANS)
@@ -894,7 +895,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (convert (bit_and (bit_not @1) @0
 
+/* Fold A * 10 == B * 10 into A == B.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (cmp @0 @2
 
+/* Fold A * 10 < B * 10 into A < B.  */
+(for cmp (lt gt le ge)
+ (simplify
+  (cmp (mult:c @0 @1) (mult:c @2 @1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
+   && tree_expr_nonzero_p (@1))
+   (if (tree_expr_nonnegative_p (@1))
+(cmp @0 @2)
+   (if (TREE_CODE (@1) == INTEGER_CST
+	&& wi::lt_p (@1, 0, TYPE_SIGN (TREE_TYPE (@1
+(cmp @2 @0))
 
 /* ((X inner_op C0) outer_op C1)
With X being a tree where value_range has reasoned certain bits to always be
diff --git a/gcc/testsuite/gcc.dg/pr31096.c b/gcc/testsuite/gcc.dg/pr31096.c
new file mode 100644
index 000..72446bc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr31096.c
@@ -0,0 +1,41 @@
+/* PR middle-end/31096 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" }  */
+
+int
+f (int a, int b)
+{
+  return a * 67 == b * 67;
+}
+
+int
+f1 (int a, int b)
+{
+  return a * -42 !=  b * -42;
+}
+
+int
+f2 (int a, int b)
+{
+  return a * 10 >= b * 10;
+}
+
+int
+f3 (int a, int b)
+{
+  return a * -4 <  b * -4;
+}
+
+int
+f4 (unsigned int a, unsigned int b)
+{
+  return a * 10 == b * 10;
+}
+
+int
+f5 (unsigned int a, unsigned int b)
+{
+  return a * -42 <  b * -42;
+}
+
+/* { dg-final { scan-tree-dump-times "\\(D\\) \\*" 4 "optimized" } } */


Re: [PATCH] Fix PR31096

2016-04-15 Thread Marc Glisse

On Thu, 14 Apr 2016, Hurugalawadi, Naveen wrote:


I think we should handle at least INTEGER_CST and SSA_NAME
with VRP, and it seems natural to add a VRP check


The check should be added in the tree_single_nonzero_warnv_p
for SSA_NAME case for tree_expr_nonzero_p.


I think so.


However, for tree_expr_nonnegative_p, its been handled in a
different way.


Ah, indeed.


Should we combine this check with the existing one?


What do you mean exactly?


+   (if (!tree_expr_nonnegative_p (@1))
+(cmp @2 @0))


Ideally, you would call tree_expr_nonpositive_p, except that that
function doesn't exist yet. So for now, I guess we


Would the tree_expr_nonpositive_p function be helpful for other cases
as well, I would try to add it if its useful.


My "ideally" was wrong. Ideally for this pattern, we would have a function 
negative_p (since we don't want the zero value). But I don't know how 
useful it would be elsewhere.


I was playing with this at some point, but I don't know if that's a good 
idea...


(match negative_p
 INTEGER_CST@0
 (if (wi::lt_p (@0, 0, TYPE_SIGN (TREE_TYPE (@0))
(match negative_p
 SSA_NAME@0
 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  (with
   {
 wide_int min, max;
 value_range_type rtype = get_range_info (@0, &min, &max);
   }
   (if (rtype == VR_RANGE && wi::lt_p (max, 0, TYPE_SIGN (TREE_TYPE (@0


Please find attached the modified patch as per the suggestions and
let me know if its fine?


Looks ok to me (not a reviewer), though you could now move the test 
tree_expr_nonzero_p next to tree_expr_nonnegative_p (it is redundant for 
the last case).


--
Marc Glisse


Re: [PING] [PATCH] Fix PR31096

2016-11-22 Thread Marc Glisse

On Wed, 23 Nov 2016, Hurugalawadi, Naveen wrote:


Please consider this as a personal reminder to review the patch
at following link and let me know your comments on the same.

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01049.html


Hi,

the gcc part looks good to me (I didn't look at the testsuite), but you'll 
need a real reviewer to approve the patch...


--
Marc Glisse


Re: [PING] [PATCH] Fix PR31096

2016-11-29 Thread Jeff Law

On 11/22/2016 10:25 PM, Hurugalawadi, Naveen wrote:

Hi,

Please consider this as a personal reminder to review the patch
at following link and let me know your comments on the same.

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01049.html
I believe Richi asked for a small change after which you can consider 
the patch approved:


https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02320.html

jeff


Re: [PING] [PATCH] Fix PR31096

2016-11-29 Thread Hurugalawadi, Naveen
Hi Jeff,

>> I believe Richi asked for a small change after which you can consider 
>> the patch approved:

Yeah. Thanks for all the comments and reviews.
Patch committed after the modification as:-

https://gcc.gnu.org/ml/gcc-cvs/2016-11/msg01019.html

Thanks,
Naveen