Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-08 Thread Jeff Law

On 05/08/2017 01:32 AM, Richard Biener wrote:


Note that I tried last stage3 (it ended up being too late) to get rid
of ASSERT_EXPRs
doing substitute-and-fold itself (basically copy-propagate them out at
this point rather
than as a separate thing later).  This is because the ASSERT_EXPR uses interfere
with the single_use checks in match.pd patterns and thus are actually harmful.

The barrier I ran into was the ASSERT_EXPR use by the threader ... so
now you're making
us rely even more on those :/
Perhaps, but it's a short term thing -- Andrew and I want to get rid of 
ASSERT_EXPRs too.  I really think we all share that common goal.


I can certainly see how they muck up the single_use checks.  They get in 
the way of other things as well by hiding equivalency information.


jeff


Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-08 Thread Richard Biener via gcc-patches
On Fri, May 5, 2017 at 10:53 PM, Jeff Law  wrote:
> On 05/04/2017 08:37 AM, Jeff Law wrote:
>>
>>
>> You understanding is slightly wrong however,  The ASSERT_EXPRs and
>> conditionals map 100% through propagation and into simplification.  It's
>> only during simplification that we lose the direct mapping as we change the
>> conditional in order to remove the unnecessary type conversion. Threading
>> runs after simplification.
>>
>> Another approach here would be to simplify the ASSERT_EXPR in the same
>> manner that we simplify the conditional.  That may even be better for
>> various reasons in the short term.  Let me poke at that.
>
> Now I remember why I didn't do that (simplify the ASSERT_EXPR).  It
> doesn't work :-)
>
> So given an ASSERT_EXPR like:
>
>   v1_378 = ASSERT_EXPR ;
>
> Let's assume for the sake of argument v1_179 was set by a type cast from
> xx_1.  We can simplify that ASSERT_EXPR into
>
>   v1_378 = ASSERT_EXPR ;
>
> But note we can not change operand 0 of the ASSERT_EXPR.  That would change
> the *type* of the ASSERT_EXPR and blows up all kinds of things later.  So
> the ASSERT_EXPR at best can morph into this form:
>
>   v1_378 = ASSERT_EXPR ;
>
> When the threader wants to look for an ASSERT_EXPR that creates a range for
> an object, it does lookups based on a match of operand 0 without digging
> into the expression in operand 1.
>
> That's something we may want to change in the future (it plays into issues
> that arise in patch #3) but in the immediate term it's still best to defer
> that one special case of tweaking a GIMPLE_COND.

Note that I tried last stage3 (it ended up being too late) to get rid
of ASSERT_EXPRs
doing substitute-and-fold itself (basically copy-propagate them out at
this point rather
than as a separate thing later).  This is because the ASSERT_EXPR uses interfere
with the single_use checks in match.pd patterns and thus are actually harmful.

The barrier I ran into was the ASSERT_EXPR use by the threader ... so
now you're making
us rely even more on those :/

Richard.

> Jeff


Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-05 Thread Jeff Law

On 05/04/2017 08:37 AM, Jeff Law wrote:


You understanding is slightly wrong however,  The ASSERT_EXPRs and 
conditionals map 100% through propagation and into simplification.  It's 
only during simplification that we lose the direct mapping as we change 
the conditional in order to remove the unnecessary type conversion. 
Threading runs after simplification.


Another approach here would be to simplify the ASSERT_EXPR in the same 
manner that we simplify the conditional.  That may even be better for 
various reasons in the short term.  Let me poke at that.

Now I remember why I didn't do that (simplify the ASSERT_EXPR).  It
doesn't work :-)

So given an ASSERT_EXPR like:

  v1_378 = ASSERT_EXPR ;

Let's assume for the sake of argument v1_179 was set by a type cast from 
xx_1.  We can simplify that ASSERT_EXPR into


  v1_378 = ASSERT_EXPR ;

But note we can not change operand 0 of the ASSERT_EXPR.  That would 
change the *type* of the ASSERT_EXPR and blows up all kinds of things 
later.  So the ASSERT_EXPR at best can morph into this form:


  v1_378 = ASSERT_EXPR ;

When the threader wants to look for an ASSERT_EXPR that creates a range 
for an object, it does lookups based on a match of operand 0 without 
digging into the expression in operand 1.


That's something we may want to change in the future (it plays into 
issues that arise in patch #3) but in the immediate term it's still best 
to defer that one special case of tweaking a GIMPLE_COND.


Jeff


Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-04 Thread Jeff Law

On 05/04/2017 04:59 AM, Richard Biener wrote:



I think this is a hack ;)  Basically the issue is that jump-threading
uses ASSERT_EXPRs
at all (which are an implementation detail of VRP).  As far as I
understand it does that
because VRP can do "fancy" things and create ASSERT_EXPRs that do not directly
map to the conditional but to its operand def stmts.

Agreed it's a hack, but perhaps for different reasons.

The only reason we have a threading pass inside VRP is so that we can 
get access to the internal state.  It pre-dates the ability to query any 
kind of range data outside of VRP by a decade.


The long term plan is to drop the threading passes from VRP once we can 
get accurate ranges outside VRP using Andrew's work.  Once we can do 
that, DOM or backward threader can query that data and the jump 
threading pass inside VRP becomes redundant/useless.


The first major milestone for that work will be the ability to drop 
ASSERT_EXPRs completely from the IL, but still get just as accurate 
range information.  ASSERT_EXPRs are really just an implementation 
detail inside VRP.


--


You understanding is slightly wrong however,  The ASSERT_EXPRs and 
conditionals map 100% through propagation and into simplification.  It's 
only during simplification that we lose the direct mapping as we change 
the conditional in order to remove the unnecessary type conversion. 
Threading runs after simplification.


Another approach here would be to simplify the ASSERT_EXPR in the same 
manner that we simplify the conditional.  That may even be better for 
various reasons in the short term.  Let me poke at that.





I have meanwhile factored this "fancieness" out into (ok, bad name...)
register_edge_assert_for which records all these fancy asserts in a
vec.  This is
now used from EVRP:

   gimple *stmt = last_stmt (pred_e->src);
   if (stmt
   && gimple_code (stmt) == GIMPLE_COND
   && (op0 = gimple_cond_lhs (stmt))
   && TREE_CODE (op0) == SSA_NAME
   && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)))
   || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file, "Visiting controlling predicate ");
   print_gimple_stmt (dump_file, stmt, 0, 0);
 }
   /* Entering a new scope.  Try to see if we can find a VR
  here.  */
   tree op1 = gimple_cond_rhs (stmt);
   if (TREE_OVERFLOW_P (op1))
 op1 = drop_tree_overflow (op1);
   tree_code code = gimple_cond_code (stmt);

   auto_vec asserts;
   register_edge_assert_for (op0, pred_e, code, op0, op1, asserts);
   if (TREE_CODE (op1) == SSA_NAME)
 register_edge_assert_for (op1, pred_e, code, op0, op1, asserts);

regular VRP transforms those into its own assert representation in
finish_register_edge_assert_for (though assert_info should be enough
for jump threading).

Yea, this looks like a much simplified form of what Andrew has done.

I wasn't aware of this stuff from evrp, it may be useful to simplify the 
3rd patch in this series (which is still evolving and tries to put in a 
real solution for the conditional equivalence problems).



So I think it should be possible for jump threading to use this new
machinery (and even DOM based jump threading or backwards threading
could make use of this!)

Roughly the plan.

I think Aldy is close to being ready to start review work on the new API 
class for querying ranges.  I think it addresses the big issues we both 
want to see addressed (no more ANTI ranges, no fixed number of intervals 
within a range representation).  Removal of ANTI ranges does 
significantly simplify things on the client side which IMHO should be 
the driver at this point.


In parallel I expect to carve out two more hunks of Andrew's work in the 
near future.  Essentially they're APIs that allow efficient computation 
and mapping of blocks that generate range data for a particular name and 
walkers.  They should allow simplification of some stuff in DOM and 
improve the most glaring weakness in the backwards threader.


jeff


Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-04 Thread Richard Biener
On Wed, May 3, 2017 at 6:32 PM, Jeff Law  wrote:
> [ With the patch attached... ]
>
>
> On 05/03/2017 10:31 AM, Jeff Law wrote:
>>
>> This is the first of 3-5 patches to address pr78496.
>>
>> The goal of these patches is to catch jump threads earlier in the pipeline
>> to avoid undesirable behavior in PRE and more generally be able to exploit
>> the secondary opportunities exposed by jump threading.
>>
>> One of the more serious issues I found while investigating 78496 was VRP
>> failing to find what should have been obvious jump threads.  The fundamental
>> issue is VRP will simplify conditionals which are fed by a typecast prior to
>> jump threading.   So something like this:
>>
>> x = (typecast) y;
>> if (x == 42)
>>
>> Can often be transformed into:
>>
>> if (y == 42)
>>
>>
>> The problem is any ASSERT_EXPRS after the conditional will reference "x"
>> rather than "y".  That in turn makes it impossible for VRP to use those
>> ASSERT_EXPRs to thread later jumps that use x == 
>>
>>
>> More concretely consider this gimple code:
>>
>>
>> ;;   basic block 5, loop depth 0, count 0, freq 1, maybe hot
>> ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED)
>> ;;pred:   3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
>> ;;4 [100.0%]  (FALLTHRU,EXECUTABLE)
>># iftmp.0_2 = PHI <1(3), 0(4)>
>>in_loop_7 = (unsigned char) iftmp.0_2;
>>if (in_loop_7 != 0)
>>  goto ; [33.00%]
>>else
>>  goto ; [67.00%]
>>
>> ;;succ:   6 [33.0%]  (TRUE_VALUE,EXECUTABLE)
>> ;;12 [67.0%]  (FALSE_VALUE,EXECUTABLE)
>>
>> ;;   basic block 12, loop depth 0, count 0, freq 6700, maybe hot
>> ;;prev block 5, next block 6, flags: (NEW)
>> ;;pred:   5 [67.0%]  (FALSE_VALUE,EXECUTABLE)
>>in_loop_15 = ASSERT_EXPR ;
>>goto ; [100.00%]
>> ;;succ:   7 [100.0%]  (FALLTHRU)
>>
>> ;;   basic block 6, loop depth 0, count 0, freq 3300, maybe hot
>> ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
>> ;;pred:   5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
>>in_loop_14 = ASSERT_EXPR ;
>>simple_iv ();
>> ;;succ:   7 [100.0%]  (FALLTHRU,EXECUTABLE)
>>
>> And later we have:
>>
>> ;;   basic block 9, loop depth 0, count 0, freq 8476, maybe hot
>> ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED)
>> ;;pred:   7 [84.8%]  (FALSE_VALUE,EXECUTABLE)
>>if (in_loop_7 == 0)
>>  goto ; [36.64%]
>>else
>>  goto ; [63.36%]
>>
>> VRP knows it can replace the uses of in_loop_7 in the conditionals in
>> blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading
>> (but well after ASSERT_EXPR insertion).
>>
>> As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6
>> (which reference in_loop_7) to thread the jump at bb9 (which now references
>> iftmp.0_2).
>>
>>
>> The cases in pr78496 are slightly more complex, but boil down to the same
>> core issue -- simplifying the conditional too early.
>>
>> Thankfully this is easy to fix.  We just split the conditional
>> simplification into two steps so that the transformation noted above occurs
>> after jump threading (the other simplifications we want to occur before jump
>> threading).
>>
>> This allows VRP1 to pick up 27 missed jump threads in the testcase from
>> 78496.  It could well be enough to address 78496, but since we don't have a
>> solid description of the desired end result I won't consider 78496 fixed
>> quite yet as there's significant further improvements we can make.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on the
>> trunk.

I think this is a hack ;)  Basically the issue is that jump-threading
uses ASSERT_EXPRs
at all (which are an implementation detail of VRP).  As far as I
understand it does that
because VRP can do "fancy" things and create ASSERT_EXPRs that do not directly
map to the conditional but to its operand def stmts.

I have meanwhile factored this "fancieness" out into (ok, bad name...)
register_edge_assert_for which records all these fancy asserts in a
vec.  This is
now used from EVRP:

  gimple *stmt = last_stmt (pred_e->src);
  if (stmt
  && gimple_code (stmt) == GIMPLE_COND
  && (op0 = gimple_cond_lhs (stmt))
  && TREE_CODE (op0) == SSA_NAME
  && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)))
  || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)
{
  if (dump_file && (dump_flags & TDF_DETAILS))
{
  fprintf (dump_file, "Visiting controlling predicate ");
  print_gimple_stmt (dump_file, stmt, 0, 0);
}
  /* Entering a new scope.  Try to see if we can find a VR
 here.  */
  tree op1 = gimple_cond_rhs (stmt);
  if (TREE_OVERFLOW_P (op1))
op1 = drop_tree_overflow (op1);
  tree_code code = gimple_cond_code (stmt);

 

Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-03 Thread Jeff Law

On 05/03/2017 10:55 AM, Bin.Cheng wrote:

On Wed, May 3, 2017 at 5:32 PM, Jeff Law  wrote:

[ With the patch attached... ]


On 05/03/2017 10:31 AM, Jeff Law wrote:


This is the first of 3-5 patches to address pr78496.

The goal of these patches is to catch jump threads earlier in the pipeline
to avoid undesirable behavior in PRE and more generally be able to exploit
the secondary opportunities exposed by jump threading.

One of the more serious issues I found while investigating 78496 was VRP
failing to find what should have been obvious jump threads.  The fundamental
issue is VRP will simplify conditionals which are fed by a typecast prior to
jump threading.   So something like this:

x = (typecast) y;
if (x == 42)

Can often be transformed into:

if (y == 42)


The problem is any ASSERT_EXPRS after the conditional will reference "x"
rather than "y".  That in turn makes it impossible for VRP to use those
ASSERT_EXPRs to thread later jumps that use x == 


More concretely consider this gimple code:


;;   basic block 5, loop depth 0, count 0, freq 1, maybe hot
;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
;;4 [100.0%]  (FALLTHRU,EXECUTABLE)
# iftmp.0_2 = PHI <1(3), 0(4)>
in_loop_7 = (unsigned char) iftmp.0_2;
if (in_loop_7 != 0)
  goto ; [33.00%]
else
  goto ; [67.00%]

;;succ:   6 [33.0%]  (TRUE_VALUE,EXECUTABLE)
;;12 [67.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 12, loop depth 0, count 0, freq 6700, maybe hot
;;prev block 5, next block 6, flags: (NEW)
;;pred:   5 [67.0%]  (FALSE_VALUE,EXECUTABLE)
in_loop_15 = ASSERT_EXPR ;
goto ; [100.00%]
;;succ:   7 [100.0%]  (FALLTHRU)

;;   basic block 6, loop depth 0, count 0, freq 3300, maybe hot
;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
;;pred:   5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
in_loop_14 = ASSERT_EXPR ;
simple_iv ();
;;succ:   7 [100.0%]  (FALLTHRU,EXECUTABLE)

And later we have:

;;   basic block 9, loop depth 0, count 0, freq 8476, maybe hot
;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED)
;;pred:   7 [84.8%]  (FALSE_VALUE,EXECUTABLE)
if (in_loop_7 == 0)
  goto ; [36.64%]
else
  goto ; [63.36%]

VRP knows it can replace the uses of in_loop_7 in the conditionals in
blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading
(but well after ASSERT_EXPR insertion).

As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6
(which reference in_loop_7) to thread the jump at bb9 (which now references
iftmp.0_2).


The cases in pr78496 are slightly more complex, but boil down to the same
core issue -- simplifying the conditional too early.

Thankfully this is easy to fix.  We just split the conditional
simplification into two steps so that the transformation noted above occurs
after jump threading (the other simplifications we want to occur before jump
threading).

This allows VRP1 to pick up 27 missed jump threads in the testcase from
78496.  It could well be enough to address 78496, but since we don't have a
solid description of the desired end result I won't consider 78496 fixed
quite yet as there's significant further improvements we can make.

Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on the
trunk.

Jeff




diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 92a4e395ba8..32cc81978df 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2017-05-03  Jeff Law  
+
+   PR tree-optimization/78496
+   * tree-vrp.c (simplify_cond_using_ranges_1): Renamed
+   from simplify_cond_using_ranges.  Split off code to walk
+   backwards through casts into ...
+   (simplify_cond_using_ranges_2): New function.
+   (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1.
+   (execute_vrp): After identifying jump threads, call
+   simplify_cond_using_ranges_2.
+
  2017-05-03  Jeff Downs  
 Rainer Orth  

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 596468d33e9..55a44e4635a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2017-05-03  Jeff Law  
+
+   PR tree-optimization/78496
+   * gcc.dg/tree-ssa/ssa-thread-15.c: New test.
+
  2017-05-03  Uros Bizjak  

 * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
new file mode 100644
index 000..f73268cacbb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-vrp1" } */
+
+/* We should thread the if 

Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-03 Thread Bin.Cheng
On Wed, May 3, 2017 at 5:32 PM, Jeff Law  wrote:
> [ With the patch attached... ]
>
>
> On 05/03/2017 10:31 AM, Jeff Law wrote:
>>
>> This is the first of 3-5 patches to address pr78496.
>>
>> The goal of these patches is to catch jump threads earlier in the pipeline
>> to avoid undesirable behavior in PRE and more generally be able to exploit
>> the secondary opportunities exposed by jump threading.
>>
>> One of the more serious issues I found while investigating 78496 was VRP
>> failing to find what should have been obvious jump threads.  The fundamental
>> issue is VRP will simplify conditionals which are fed by a typecast prior to
>> jump threading.   So something like this:
>>
>> x = (typecast) y;
>> if (x == 42)
>>
>> Can often be transformed into:
>>
>> if (y == 42)
>>
>>
>> The problem is any ASSERT_EXPRS after the conditional will reference "x"
>> rather than "y".  That in turn makes it impossible for VRP to use those
>> ASSERT_EXPRs to thread later jumps that use x == 
>>
>>
>> More concretely consider this gimple code:
>>
>>
>> ;;   basic block 5, loop depth 0, count 0, freq 1, maybe hot
>> ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED)
>> ;;pred:   3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
>> ;;4 [100.0%]  (FALLTHRU,EXECUTABLE)
>># iftmp.0_2 = PHI <1(3), 0(4)>
>>in_loop_7 = (unsigned char) iftmp.0_2;
>>if (in_loop_7 != 0)
>>  goto ; [33.00%]
>>else
>>  goto ; [67.00%]
>>
>> ;;succ:   6 [33.0%]  (TRUE_VALUE,EXECUTABLE)
>> ;;12 [67.0%]  (FALSE_VALUE,EXECUTABLE)
>>
>> ;;   basic block 12, loop depth 0, count 0, freq 6700, maybe hot
>> ;;prev block 5, next block 6, flags: (NEW)
>> ;;pred:   5 [67.0%]  (FALSE_VALUE,EXECUTABLE)
>>in_loop_15 = ASSERT_EXPR ;
>>goto ; [100.00%]
>> ;;succ:   7 [100.0%]  (FALLTHRU)
>>
>> ;;   basic block 6, loop depth 0, count 0, freq 3300, maybe hot
>> ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
>> ;;pred:   5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
>>in_loop_14 = ASSERT_EXPR ;
>>simple_iv ();
>> ;;succ:   7 [100.0%]  (FALLTHRU,EXECUTABLE)
>>
>> And later we have:
>>
>> ;;   basic block 9, loop depth 0, count 0, freq 8476, maybe hot
>> ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED)
>> ;;pred:   7 [84.8%]  (FALSE_VALUE,EXECUTABLE)
>>if (in_loop_7 == 0)
>>  goto ; [36.64%]
>>else
>>  goto ; [63.36%]
>>
>> VRP knows it can replace the uses of in_loop_7 in the conditionals in
>> blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading
>> (but well after ASSERT_EXPR insertion).
>>
>> As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6
>> (which reference in_loop_7) to thread the jump at bb9 (which now references
>> iftmp.0_2).
>>
>>
>> The cases in pr78496 are slightly more complex, but boil down to the same
>> core issue -- simplifying the conditional too early.
>>
>> Thankfully this is easy to fix.  We just split the conditional
>> simplification into two steps so that the transformation noted above occurs
>> after jump threading (the other simplifications we want to occur before jump
>> threading).
>>
>> This allows VRP1 to pick up 27 missed jump threads in the testcase from
>> 78496.  It could well be enough to address 78496, but since we don't have a
>> solid description of the desired end result I won't consider 78496 fixed
>> quite yet as there's significant further improvements we can make.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on the
>> trunk.
>>
>> Jeff
>>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 92a4e395ba8..32cc81978df 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,14 @@
> +2017-05-03  Jeff Law  
> +
> +   PR tree-optimization/78496
> +   * tree-vrp.c (simplify_cond_using_ranges_1): Renamed
> +   from simplify_cond_using_ranges.  Split off code to walk
> +   backwards through casts into ...
> +   (simplify_cond_using_ranges_2): New function.
> +   (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1.
> +   (execute_vrp): After identifying jump threads, call
> +   simplify_cond_using_ranges_2.
> +
>  2017-05-03  Jeff Downs  
> Rainer Orth  
>
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 596468d33e9..55a44e4635a 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2017-05-03  Jeff Law  
> +
> +   PR tree-optimization/78496
> +   * gcc.dg/tree-ssa/ssa-thread-15.c: New test.
> +
>  2017-05-03  Uros Bizjak  
>
> * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints.
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
> 

Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-03 Thread Jeff Law

[ With the patch attached... ]

On 05/03/2017 10:31 AM, Jeff Law wrote:

This is the first of 3-5 patches to address pr78496.

The goal of these patches is to catch jump threads earlier in the 
pipeline to avoid undesirable behavior in PRE and more generally be able 
to exploit the secondary opportunities exposed by jump threading.


One of the more serious issues I found while investigating 78496 was VRP 
failing to find what should have been obvious jump threads.  The 
fundamental issue is VRP will simplify conditionals which are fed by a 
typecast prior to jump threading.   So something like this:


x = (typecast) y;
if (x == 42)

Can often be transformed into:

if (y == 42)


The problem is any ASSERT_EXPRS after the conditional will reference "x" 
rather than "y".  That in turn makes it impossible for VRP to use those 
ASSERT_EXPRs to thread later jumps that use x == 



More concretely consider this gimple code:


;;   basic block 5, loop depth 0, count 0, freq 1, maybe hot
;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
;;4 [100.0%]  (FALLTHRU,EXECUTABLE)
   # iftmp.0_2 = PHI <1(3), 0(4)>
   in_loop_7 = (unsigned char) iftmp.0_2;
   if (in_loop_7 != 0)
 goto ; [33.00%]
   else
 goto ; [67.00%]

;;succ:   6 [33.0%]  (TRUE_VALUE,EXECUTABLE)
;;12 [67.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 12, loop depth 0, count 0, freq 6700, maybe hot
;;prev block 5, next block 6, flags: (NEW)
;;pred:   5 [67.0%]  (FALSE_VALUE,EXECUTABLE)
   in_loop_15 = ASSERT_EXPR ;
   goto ; [100.00%]
;;succ:   7 [100.0%]  (FALLTHRU)

;;   basic block 6, loop depth 0, count 0, freq 3300, maybe hot
;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
;;pred:   5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
   in_loop_14 = ASSERT_EXPR ;
   simple_iv ();
;;succ:   7 [100.0%]  (FALLTHRU,EXECUTABLE)

And later we have:

;;   basic block 9, loop depth 0, count 0, freq 8476, maybe hot
;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED)
;;pred:   7 [84.8%]  (FALSE_VALUE,EXECUTABLE)
   if (in_loop_7 == 0)
 goto ; [36.64%]
   else
 goto ; [63.36%]

VRP knows it can replace the uses of in_loop_7 in the conditionals in 
blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump 
threading (but well after ASSERT_EXPR insertion).


As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 
(which reference in_loop_7) to thread the jump at bb9 (which now 
references iftmp.0_2).



The cases in pr78496 are slightly more complex, but boil down to the 
same core issue -- simplifying the conditional too early.


Thankfully this is easy to fix.  We just split the conditional 
simplification into two steps so that the transformation noted above 
occurs after jump threading (the other simplifications we want to occur 
before jump threading).


This allows VRP1 to pick up 27 missed jump threads in the testcase from 
78496.  It could well be enough to address 78496, but since we don't 
have a solid description of the desired end result I won't consider 
78496 fixed quite yet as there's significant further improvements we can 
make.


Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on 
the trunk.


Jeff



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 92a4e395ba8..32cc81978df 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2017-05-03  Jeff Law  
+
+   PR tree-optimization/78496
+   * tree-vrp.c (simplify_cond_using_ranges_1): Renamed
+   from simplify_cond_using_ranges.  Split off code to walk
+   backwards through casts into ...
+   (simplify_cond_using_ranges_2): New function.
+   (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1.
+   (execute_vrp): After identifying jump threads, call
+   simplify_cond_using_ranges_2.
+
 2017-05-03  Jeff Downs  
Rainer Orth  
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 596468d33e9..55a44e4635a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2017-05-03  Jeff Law  
+
+   PR tree-optimization/78496
+   * gcc.dg/tree-ssa/ssa-thread-15.c: New test.
+
 2017-05-03  Uros Bizjak  
 
* g++.dg/lto/pr79671_0.C (foo): Fix asm constraints.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
new file mode 100644
index 000..f73268cacbb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fdump-tree-vrp1" } */
+
+/* We should thread the if (!in_loop) completely leaving
+   just two conditionals.  */
+/* { dg-final { scan-tree-dump-times 

[PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP

2017-05-03 Thread Jeff Law

This is the first of 3-5 patches to address pr78496.

The goal of these patches is to catch jump threads earlier in the 
pipeline to avoid undesirable behavior in PRE and more generally be able 
to exploit the secondary opportunities exposed by jump threading.


One of the more serious issues I found while investigating 78496 was VRP 
failing to find what should have been obvious jump threads.  The 
fundamental issue is VRP will simplify conditionals which are fed by a 
typecast prior to jump threading.   So something like this:


x = (typecast) y;
if (x == 42)

Can often be transformed into:

if (y == 42)


The problem is any ASSERT_EXPRS after the conditional will reference "x" 
rather than "y".  That in turn makes it impossible for VRP to use those 
ASSERT_EXPRs to thread later jumps that use x == 



More concretely consider this gimple code:


;;   basic block 5, loop depth 0, count 0, freq 1, maybe hot
;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
;;4 [100.0%]  (FALLTHRU,EXECUTABLE)
  # iftmp.0_2 = PHI <1(3), 0(4)>
  in_loop_7 = (unsigned char) iftmp.0_2;
  if (in_loop_7 != 0)
goto ; [33.00%]
  else
goto ; [67.00%]

;;succ:   6 [33.0%]  (TRUE_VALUE,EXECUTABLE)
;;12 [67.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 12, loop depth 0, count 0, freq 6700, maybe hot
;;prev block 5, next block 6, flags: (NEW)
;;pred:   5 [67.0%]  (FALSE_VALUE,EXECUTABLE)
  in_loop_15 = ASSERT_EXPR ;
  goto ; [100.00%]
;;succ:   7 [100.0%]  (FALLTHRU)

;;   basic block 6, loop depth 0, count 0, freq 3300, maybe hot
;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED)
;;pred:   5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
  in_loop_14 = ASSERT_EXPR ;
  simple_iv ();
;;succ:   7 [100.0%]  (FALLTHRU,EXECUTABLE)

And later we have:

;;   basic block 9, loop depth 0, count 0, freq 8476, maybe hot
;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED)
;;pred:   7 [84.8%]  (FALSE_VALUE,EXECUTABLE)
  if (in_loop_7 == 0)
goto ; [36.64%]
  else
goto ; [63.36%]

VRP knows it can replace the uses of in_loop_7 in the conditionals in 
blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump 
threading (but well after ASSERT_EXPR insertion).


As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 
(which reference in_loop_7) to thread the jump at bb9 (which now 
references iftmp.0_2).



The cases in pr78496 are slightly more complex, but boil down to the 
same core issue -- simplifying the conditional too early.


Thankfully this is easy to fix.  We just split the conditional 
simplification into two steps so that the transformation noted above 
occurs after jump threading (the other simplifications we want to occur 
before jump threading).


This allows VRP1 to pick up 27 missed jump threads in the testcase from 
78496.  It could well be enough to address 78496, but since we don't 
have a solid description of the desired end result I won't consider 
78496 fixed quite yet as there's significant further improvements we can 
make.


Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on 
the trunk.


Jeff