Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On Fri, 24 Apr 2015, Jeff Law wrote:

 On 02/17/2015 07:58 AM, Richard Biener wrote:
 [ Restarting a old thread... ]
 
  On a closer look the record_const_or_copy_1 hunk is redundant
  (record_equality is really a bit obfuscated...).
 Agreed.  I'm not entirely sure how it got to this point.
 
  And record_equality is where the SSA_NAME_VALUEs might end up
  forming chains?  At least because we might record a SSA_NAME_VALUE
  for sth that might be the target of a SSA_NAME_VALUE, as in
  
a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
if (b_2 == i_3(D))
  ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
  
  thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
  SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
  can't easily avoid that because we don't keep track of a
  reverse mapping (nor would we want to update that).
 Right.  In general, the fact that we're in SSA form means that we ought not
 get loops in the SSA_NAME_VALUE chain since there's a single static assignment
 and we'll visit it once.
 
 That nice model breaks down in two ways.  First we derive equivalences from
 equality conditions like you've shown above.  Second, when threading we can
 thread through a loop latch and start reprocessing a block.  The interaction
 between those two can result in loops in the value chain.
 
 What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
 is independent of the dominator walk (and optimizer). Instead of walking
 forward through the dominator tree recording key nuggets, then removing those
 nuggets as we pop back up the tree, we instead we start with the conditional
 and walk up the use-def chains, simplifying as we go, with the goal of
 simplifying to a constant, of course.
 
 You can see the beginnings of that with the FSM bits from Sebastian.
 
 Obviously until those bits are in place, we should try to clean up any warts
 in the current implementation.
 
  
  Btw, in record_equality, the == part of the = check for the loop
  depth will just randomly swap x and y while we should prefer
  IL canonical order.  If we can't rely on the input being in IL
  canonical order we should prepend the whole function with
  
if (tree_swap_operands_p (x, y, false))
  std::swap (x, y);
  
  and change = to  for the loop-depth check.
 Agreed.  Makes perfect sense.

I'm now re-bootstrapping and testing the following.

Richard.

2015-04-27  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

Index: gcc/tree-ssa-dom.c
===
*** gcc/tree-ssa-dom.c  (revision 222360)
--- gcc/tree-ssa-dom.c  (working copy)
*** record_equivalences_from_phis (basic_blo
*** 1519,1524 
--- 1519,1531 
  if (lhs == t)
continue;
  
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+ 
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
*** record_equality (tree x, tree y)
*** 1752,1757 
--- 1759,1767 
  {
tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+ std::swap (x, y);
+ 
if (TREE_CODE (x) == SSA_NAME)
  prev_x = SSA_NAME_VALUE (x);
if (TREE_CODE (y) == SSA_NAME)
*** record_equality (tree x, tree y)
*** 1766,1772 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) = loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x)  loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
*** record_equivalences_from_stmt (gimple st
*** 2128,2145 
if (may_optimize_p
   (TREE_CODE (rhs) == SSA_NAME
  || is_gimple_min_invariant (rhs)))
!   {
!   if (dump_file  (dump_flags  TDF_DETAILS))
! {
!   fprintf (dump_file,  ASGN );
!   print_generic_expr (dump_file, lhs, 0);
!   fprintf (dump_file,  = );
!   print_generic_expr (dump_file, rhs, 0);
!   fprintf 

Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Richard Biener wrote:

 On Fri, 24 Apr 2015, Jeff Law wrote:
 
  On 02/17/2015 07:58 AM, Richard Biener wrote:
  [ Restarting a old thread... ]
  
   On a closer look the record_const_or_copy_1 hunk is redundant
   (record_equality is really a bit obfuscated...).
  Agreed.  I'm not entirely sure how it got to this point.
  
   And record_equality is where the SSA_NAME_VALUEs might end up
   forming chains?  At least because we might record a SSA_NAME_VALUE
   for sth that might be the target of a SSA_NAME_VALUE, as in
   
 a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
 if (b_2 == i_3(D))
   ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
   
   thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
   SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
   can't easily avoid that because we don't keep track of a
   reverse mapping (nor would we want to update that).
  Right.  In general, the fact that we're in SSA form means that we ought not
  get loops in the SSA_NAME_VALUE chain since there's a single static 
  assignment
  and we'll visit it once.
  
  That nice model breaks down in two ways.  First we derive equivalences from
  equality conditions like you've shown above.  Second, when threading we can
  thread through a loop latch and start reprocessing a block.  The interaction
  between those two can result in loops in the value chain.
  
  What I'm hoping to do in GCC6 is eliminate all this stuff with a threader 
  that
  is independent of the dominator walk (and optimizer). Instead of walking
  forward through the dominator tree recording key nuggets, then removing 
  those
  nuggets as we pop back up the tree, we instead we start with the conditional
  and walk up the use-def chains, simplifying as we go, with the goal of
  simplifying to a constant, of course.
  
  You can see the beginnings of that with the FSM bits from Sebastian.
  
  Obviously until those bits are in place, we should try to clean up any warts
  in the current implementation.
  
   
   Btw, in record_equality, the == part of the = check for the loop
   depth will just randomly swap x and y while we should prefer
   IL canonical order.  If we can't rely on the input being in IL
   canonical order we should prepend the whole function with
   
 if (tree_swap_operands_p (x, y, false))
   std::swap (x, y);
   
   and change = to  for the loop-depth check.
  Agreed.  Makes perfect sense.
 
 I'm now re-bootstrapping and testing the following.

Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

* gcc.target/i386/pr65217.c: XFAIL.

Index: gcc/tree-ssa-dom.c
===
*** gcc/tree-ssa-dom.c  (revision 222360)
--- gcc/tree-ssa-dom.c  (working copy)
*** record_equivalences_from_phis (basic_blo
*** 1519,1524 
--- 1519,1531 
  if (lhs == t)
continue;
  
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+ 
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
*** record_equality (tree x, tree y)
*** 1752,1757 
--- 1759,1767 
  {
tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+ std::swap (x, y);
+ 
if (TREE_CODE (x) == SSA_NAME)
  prev_x = SSA_NAME_VALUE (x);
if (TREE_CODE (y) == SSA_NAME)
*** record_equality (tree x, tree y)
*** 1766,1772 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) = loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x)  loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
*** record_equivalences_from_stmt (gimple st
*** 2128,2145 
if (may_optimize_p
   (TREE_CODE (rhs) == SSA_NAME
  || is_gimple_min_invariant (rhs)))
!   {
!   if (dump_file  (dump_flags  TDF_DETAILS))
! {
!   

Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Jeff Law

On 04/27/2015 06:46 AM, Richard Biener wrote:

On Mon, 27 Apr 2015, Richard Biener wrote:


On Fri, 24 Apr 2015, Jeff Law wrote:


On 02/17/2015 07:58 AM, Richard Biener wrote:
[ Restarting a old thread... ]


On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).

Agreed.  I'm not entirely sure how it got to this point.


And record_equality is where the SSA_NAME_VALUEs might end up
forming chains?  At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in

   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
   if (b_2 == i_3(D))
 ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).

Right.  In general, the fact that we're in SSA form means that we ought not
get loops in the SSA_NAME_VALUE chain since there's a single static assignment
and we'll visit it once.

That nice model breaks down in two ways.  First we derive equivalences from
equality conditions like you've shown above.  Second, when threading we can
thread through a loop latch and start reprocessing a block.  The interaction
between those two can result in loops in the value chain.

What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
is independent of the dominator walk (and optimizer). Instead of walking
forward through the dominator tree recording key nuggets, then removing those
nuggets as we pop back up the tree, we instead we start with the conditional
and walk up the use-def chains, simplifying as we go, with the goal of
simplifying to a constant, of course.

You can see the beginnings of that with the FSM bits from Sebastian.

Obviously until those bits are in place, we should try to clean up any warts
in the current implementation.



Btw, in record_equality, the == part of the = check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order.  If we can't rely on the input being in IL
canonical order we should prepend the whole function with

   if (tree_swap_operands_p (x, y, false))
 std::swap (x, y);

and change = to  for the loop-depth check.

Agreed.  Makes perfect sense.


I'm now re-bootstrapping and testing the following.


Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

* gcc.target/i386/pr65217.c: XFAIL.
Looks good.  I'd spun the same record_equality changes over the weekend 
and have state on regressing 65217.


65217 is an interesting test.


  if ((n  -n) != n)
__builtin_unreachable();
  return n;

n  -n will (of course) get computed into an SSA_NAME.  We then 
propagate that name for the use of n in the return statement rather 
than using the effectively zero cost n.


If we put some smarts into tree_swap_operands_p to order sensibly in 
this kind of case, we end up regressing a different case that I'll be 
looking at today.


jeff




Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Jeff Law

On 04/27/2015 12:29 PM, Richard Biener wrote:

On April 27, 2015 6:24:47 PM GMT+02:00, Jeff Law l...@redhat.com

n  -n will (of course) get computed into an SSA_NAME.  We then
propagate that name for the use of n in the return statement
rather than using the effectively zero cost n.

If we put some smarts into tree_swap_operands_p to order sensibly
in this kind of case, we end up regressing a different case that
I'll be looking at today.


In this case the temporary we propagate has a single use (in the
comparison).  Might be interesting to disallow this case by extra
checks in record equality.  I wouldn't change tree_swap_operands_p.

Yea, I keep going back and forth over whether or not the heuristic 
should go in tree_swap_operand_p or in record_equality.


It's a fairly narrow issue, so maybe record_equality would be a better spot.

jeff


Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On April 27, 2015 6:24:47 PM GMT+02:00, Jeff Law l...@redhat.com wrote:
On 04/27/2015 06:46 AM, Richard Biener wrote:
 On Mon, 27 Apr 2015, Richard Biener wrote:

 On Fri, 24 Apr 2015, Jeff Law wrote:

 On 02/17/2015 07:58 AM, Richard Biener wrote:
 [ Restarting a old thread... ]

 On a closer look the record_const_or_copy_1 hunk is redundant
 (record_equality is really a bit obfuscated...).
 Agreed.  I'm not entirely sure how it got to this point.

 And record_equality is where the SSA_NAME_VALUEs might end up
 forming chains?  At least because we might record a SSA_NAME_VALUE
 for sth that might be the target of a SSA_NAME_VALUE, as in

a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
if (b_2 == i_3(D))
  ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

 thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
 SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
 can't easily avoid that because we don't keep track of a
 reverse mapping (nor would we want to update that).
 Right.  In general, the fact that we're in SSA form means that we
ought not
 get loops in the SSA_NAME_VALUE chain since there's a single static
assignment
 and we'll visit it once.

 That nice model breaks down in two ways.  First we derive
equivalences from
 equality conditions like you've shown above.  Second, when
threading we can
 thread through a loop latch and start reprocessing a block.  The
interaction
 between those two can result in loops in the value chain.

 What I'm hoping to do in GCC6 is eliminate all this stuff with a
threader that
 is independent of the dominator walk (and optimizer). Instead of
walking
 forward through the dominator tree recording key nuggets, then
removing those
 nuggets as we pop back up the tree, we instead we start with the
conditional
 and walk up the use-def chains, simplifying as we go, with the goal
of
 simplifying to a constant, of course.

 You can see the beginnings of that with the FSM bits from
Sebastian.

 Obviously until those bits are in place, we should try to clean up
any warts
 in the current implementation.


 Btw, in record_equality, the == part of the = check for the loop
 depth will just randomly swap x and y while we should prefer
 IL canonical order.  If we can't rely on the input being in IL
 canonical order we should prepend the whole function with

if (tree_swap_operands_p (x, y, false))
  std::swap (x, y);

 and change = to  for the loop-depth check.
 Agreed.  Makes perfect sense.

 I'm now re-bootstrapping and testing the following.

 Committed as follows, with XFAILing and re-opening PR65217.

 Richard.

 2015-04-27  Richard Biener  rguent...@suse.de

  * tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
  (record_equivalences_from_stmt): Valueize rhs.
  (record_equality): Canonicalize x and y order via
  tree_swap_operands_p.  Do not swap operands for same loop depth.

  * gcc.target/i386/pr65217.c: XFAIL.
Looks good.  I'd spun the same record_equality changes over the weekend

and have state on regressing 65217.

65217 is an interesting test.


   if ((n  -n) != n)
 __builtin_unreachable();
   return n;

n  -n will (of course) get computed into an SSA_NAME.  We then 
propagate that name for the use of n in the return statement rather 
than using the effectively zero cost n.

If we put some smarts into tree_swap_operands_p to order sensibly in 
this kind of case, we end up regressing a different case that I'll be 
looking at today.

In this case the temporary we propagate has a single use (in the comparison).  
Might be interesting to disallow this case by extra checks in record equality.  
I wouldn't change tree_swap_operands_p.

Richard.


jeff




Re: [PATCH] Properly valueize values we value-number to

2015-04-24 Thread Jeff Law

On 02/17/2015 07:58 AM, Richard Biener wrote:
[ Restarting a old thread... ]


On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).

Agreed.  I'm not entirely sure how it got to this point.


And record_equality is where the SSA_NAME_VALUEs might end up
forming chains?  At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in

  a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
  if (b_2 == i_3(D))
... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).
Right.  In general, the fact that we're in SSA form means that we ought 
not get loops in the SSA_NAME_VALUE chain since there's a single static 
assignment and we'll visit it once.


That nice model breaks down in two ways.  First we derive equivalences 
from equality conditions like you've shown above.  Second, when 
threading we can thread through a loop latch and start reprocessing a 
block.  The interaction between those two can result in loops in the 
value chain.


What I'm hoping to do in GCC6 is eliminate all this stuff with a 
threader that is independent of the dominator walk (and optimizer). 
Instead of walking forward through the dominator tree recording key 
nuggets, then removing those nuggets as we pop back up the tree, we 
instead we start with the conditional and walk up the use-def chains, 
simplifying as we go, with the goal of simplifying to a constant, of course.


You can see the beginnings of that with the FSM bits from Sebastian.

Obviously until those bits are in place, we should try to clean up any 
warts in the current implementation.




Btw, in record_equality, the == part of the = check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order.  If we can't rely on the input being in IL
canonical order we should prepend the whole function with

  if (tree_swap_operands_p (x, y, false))
std::swap (x, y);

and change = to  for the loop-depth check.

Agreed.  Makes perfect sense.

jeff



Re: [PATCH] Properly valueize values we value-number to

2015-02-23 Thread Jeff Law

On 02/17/15 07:03, Richard Biener wrote:


This is something I noticed some time ago and that I remembered when
you added that looping SSA_NAME_VALUE to simplify_control_stmt_condition.
Currently DOM doesn't make sure that when setting
SSA_NAME_VALUE (x) = y that SSA_NAME_VALUE (y) == y, thus you could
get SSA_NAME_VALUE forming a chain until you reach the final value.

Thus the following patch which fixes all occurances and removes the
looping from simplify_control_stmt_condition.  Did you have any
testcase when you added that looping?

pr61607, which probably looks familiar :-)







Note that this is purely by code inspection and I don't have any
testcase where a SSA_NAME_VALUE chain causes missed optimizations
(you'd have to disable a lot of other optimizations before dom1
to be able to create a reasonably simple one).

Anyway - the tree-ssa-dom.c part bootstrapped and tested ok on
x86_64-unknown-linux-gnu, the tree-ssa-threadedge.c part bootstrapped
ok ontop of that and testing is still in progress.

Ok?

Thanks,
Richard.

2015-02-17  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_const_or_copy_1): Valueize y.
(record_equivalences_from_stmt): Valueize rhs.
* tree-ssa-threadedge.c (simplify_control_stmt_condition):
Remove repeated valueization.

Given the regression was fixed elsewhere, let's table this until gcc-6.

I want to rip out large hunks of this code, at least on the threading 
side and replace it with a backwards walk similar to what Sebastian did 
for the FSM optimization.  As a nice side effect, that ought to 
completely disentangle DOM and the threader.


At the same time I want to see what effect we'd have by disabling these 
context sensitive propagations in DOM.  They add a lot of hair and I'm 
just not sure they're worth it.  If we really need them, perhaps we can 
get away with something simpler, outside of DOM.


jeff



[PATCH] Properly valueize values we value-number to

2015-02-17 Thread Richard Biener

This is something I noticed some time ago and that I remembered when
you added that looping SSA_NAME_VALUE to simplify_control_stmt_condition.
Currently DOM doesn't make sure that when setting
SSA_NAME_VALUE (x) = y that SSA_NAME_VALUE (y) == y, thus you could
get SSA_NAME_VALUE forming a chain until you reach the final value.

Thus the following patch which fixes all occurances and removes the
looping from simplify_control_stmt_condition.  Did you have any
testcase when you added that looping?

Note that this is purely by code inspection and I don't have any
testcase where a SSA_NAME_VALUE chain causes missed optimizations
(you'd have to disable a lot of other optimizations before dom1
to be able to create a reasonably simple one).

Anyway - the tree-ssa-dom.c part bootstrapped and tested ok on
x86_64-unknown-linux-gnu, the tree-ssa-threadedge.c part bootstrapped
ok ontop of that and testing is still in progress.

Ok?

Thanks,
Richard.

2015-02-17  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_const_or_copy_1): Valueize y.
(record_equivalences_from_stmt): Valueize rhs.
* tree-ssa-threadedge.c (simplify_control_stmt_condition):
Remove repeated valueization.

Index: gcc/tree-ssa-dom.c
===
--- gcc/tree-ssa-dom.c  (revision 220751)
+++ gcc/tree-ssa-dom.c  (working copy)
@@ -1223,6 +1223,13 @@ record_equivalences_from_phis (basic_blo
  if (lhs == t)
continue;
 
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
@@ -1566,6 +1573,13 @@ record_conditions (struct edge_info *edg
 static void
 record_const_or_copy_1 (tree x, tree y, tree prev_x)
 {
+  /* Valueize y.  */
+  if (TREE_CODE (y) == SSA_NAME)
+{
+  tree tmp = SSA_NAME_VALUE (y);
+  y = tmp ? tmp : y;
+}
+
   set_ssa_name_value (x, y);
 
   if (dump_file  (dump_flags  TDF_DETAILS))
@@ -2184,18 +2198,25 @@ record_equivalences_from_stmt (gimple st
   if (may_optimize_p
   (TREE_CODE (rhs) == SSA_NAME
  || is_gimple_min_invariant (rhs)))
-  {
-   if (dump_file  (dump_flags  TDF_DETAILS))
- {
-   fprintf (dump_file,  ASGN );
-   print_generic_expr (dump_file, lhs, 0);
-   fprintf (dump_file,  = );
-   print_generic_expr (dump_file, rhs, 0);
-   fprintf (dump_file, \n);
- }
+   {
+ /* Valueize rhs.  */
+ if (TREE_CODE (rhs) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (rhs);
+ rhs = tmp ? tmp : rhs;
+   }
 
-   set_ssa_name_value (lhs, rhs);
-  }
+ if (dump_file  (dump_flags  TDF_DETAILS))
+   {
+ fprintf (dump_file,  ASGN );
+ print_generic_expr (dump_file, lhs, 0);
+ fprintf (dump_file,  = );
+ print_generic_expr (dump_file, rhs, 0);
+ fprintf (dump_file, \n);
+   }
+
+ set_ssa_name_value (lhs, rhs);
+   }
 }
 
   /* Make sure we can propagate x + CST.  */
Index: gcc/tree-ssa-threadedge.c
===
--- gcc/tree-ssa-threadedge.c   (revision 220751)
+++ gcc/tree-ssa-threadedge.c   (working copy)
@@ -591,29 +591,13 @@ simplify_control_stmt_condition (edge e,
   cond_code = gimple_cond_code (stmt);
 
   /* Get the current value of both operands.  */
-  if (TREE_CODE (op0) == SSA_NAME)
-   {
- for (int i = 0; i  2; i++)
-   {
- if (TREE_CODE (op0) == SSA_NAME
-  SSA_NAME_VALUE (op0))
-   op0 = SSA_NAME_VALUE (op0);
- else
-   break;
-   }
-   }
-
-  if (TREE_CODE (op1) == SSA_NAME)
-   {
- for (int i = 0; i  2; i++)
-   {
- if (TREE_CODE (op1) == SSA_NAME
-  SSA_NAME_VALUE (op1))
-   op1 = SSA_NAME_VALUE (op1);
- else
-   break;
-   }
-   }
+  if (TREE_CODE (op0) == SSA_NAME
+  SSA_NAME_VALUE (op0))
+   op0 = SSA_NAME_VALUE (op0);
+
+  if (TREE_CODE (op1) == SSA_NAME
+  SSA_NAME_VALUE (op1))
+   op1 = SSA_NAME_VALUE (op1);
 
   if (handle_dominating_asserts)
{
@@ -682,22 +666,11 @@ simplify_control_stmt_condition (edge e,
   tree original_lhs = cond;
   cached_lhs = cond;
 
-  /* Get the variable's current value from the equivalence chains.
-
-It is possible to get loops in the SSA_NAME_VALUE chains
-(consider threading the backedge of a loop where we have
-a loop invariant SSA_NAME used in the 

Re: [PATCH] Properly valueize values we value-number to

2015-02-17 Thread Richard Biener
On Tue, 17 Feb 2015, Richard Biener wrote:

 
 This is something I noticed some time ago and that I remembered when
 you added that looping SSA_NAME_VALUE to simplify_control_stmt_condition.
 Currently DOM doesn't make sure that when setting
 SSA_NAME_VALUE (x) = y that SSA_NAME_VALUE (y) == y, thus you could
 get SSA_NAME_VALUE forming a chain until you reach the final value.
 
 Thus the following patch which fixes all occurances and removes the
 looping from simplify_control_stmt_condition.  Did you have any
 testcase when you added that looping?
 
 Note that this is purely by code inspection and I don't have any
 testcase where a SSA_NAME_VALUE chain causes missed optimizations
 (you'd have to disable a lot of other optimizations before dom1
 to be able to create a reasonably simple one).
 
 Anyway - the tree-ssa-dom.c part bootstrapped and tested ok on
 x86_64-unknown-linux-gnu, the tree-ssa-threadedge.c part bootstrapped
 ok ontop of that and testing is still in progress.

On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).

And record_equality is where the SSA_NAME_VALUEs might end up
forming chains?  At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in

 a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
 if (b_2 == i_3(D))
   ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).

Btw, in record_equality, the == part of the = check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order.  If we can't rely on the input being in IL
canonical order we should prepend the whole function with

 if (tree_swap_operands_p (x, y, false))
   std::swap (x, y);

and change = to  for the loop-depth check.

For PR62217 it is that loop depth check which causes the equivalence
to be recorded that ends up being harmful (well, harmful is the
actual propagation of course).  Btw, we already inhibit propagation
into IV increments (cprop_operand) - so it looks like we may
want to inhibit BIV replacement completely instead?

Index: gcc/tree-ssa-dom.c
===
--- gcc/tree-ssa-dom.c  (revision 220755)
+++ gcc/tree-ssa-dom.c  (working copy)
@@ -2291,11 +2291,16 @@ cprop_operand (gimple stmt, use_operand_
   if (!may_propagate_copy (op, val))
return;
 
-  /* Do not propagate copies into simple IV increment statements.
- See PR23821 for how this can disturb IV analysis.  */
-  if (TREE_CODE (val) != INTEGER_CST
-  simple_iv_increment_p (stmt))
-   return;
+  /* Do not propagate copies into BIVs.
+ See PR23821 and PR62217 for how this can disturb IV and
+number of iteration analysis.  */
+  if (TREE_CODE (val) != INTEGER_CST)
+   {
+ gimple def = SSA_NAME_DEF_STMT (op);
+ if (gimple_code (def) == GIMPLE_PHI
+  gimple_bb (def)-loop_father-header == gimple_bb (def))
+   return;
+   }
 
   /* Dump details.  */
   if (dump_file  (dump_flags  TDF_DETAILS))

So it would just extend an existing heuristic.

Richard.

 Ok?
 
 Thanks,
 Richard.
 
 2015-02-17  Richard Biener  rguent...@suse.de
 
   * tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
   (record_const_or_copy_1): Valueize y.
   (record_equivalences_from_stmt): Valueize rhs.
   * tree-ssa-threadedge.c (simplify_control_stmt_condition):
   Remove repeated valueization.
 
 Index: gcc/tree-ssa-dom.c
 ===
 --- gcc/tree-ssa-dom.c(revision 220751)
 +++ gcc/tree-ssa-dom.c(working copy)
 @@ -1223,6 +1223,13 @@ record_equivalences_from_phis (basic_blo
 if (lhs == t)
   continue;
  
 +   /* Valueize t.  */
 +   if (TREE_CODE (t) == SSA_NAME)
 + {
 +   tree tmp = SSA_NAME_VALUE (t);
 +   t = tmp ? tmp : t;
 + }
 +
 /* If we have not processed an alternative yet, then set
RHS to this alternative.  */
 if (rhs == NULL)
 @@ -1566,6 +1573,13 @@ record_conditions (struct edge_info *edg
  static void
  record_const_or_copy_1 (tree x, tree y, tree prev_x)
  {
 +  /* Valueize y.  */
 +  if (TREE_CODE (y) == SSA_NAME)
 +{
 +  tree tmp = SSA_NAME_VALUE (y);
 +  y = tmp ? tmp : y;
 +}
 +
set_ssa_name_value (x, y);
  
if (dump_file  (dump_flags  TDF_DETAILS))
 @@ -2184,18 +2198,25 @@ record_equivalences_from_stmt (gimple st
if (may_optimize_p
  (TREE_CODE (rhs) == SSA_NAME
 || is_gimple_min_invariant (rhs)))
 -  {
 - if (dump_file  (dump_flags  TDF_DETAILS))
 -   {
 - fprintf (dump_file,