Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On 05/08/2017 01:32 AM, Richard Biener wrote: Note that I tried last stage3 (it ended up being too late) to get rid of ASSERT_EXPRs doing substitute-and-fold itself (basically copy-propagate them out at this point rather than as a separate thing later). This is because the ASSERT_EXPR uses interfere with the single_use checks in match.pd patterns and thus are actually harmful. The barrier I ran into was the ASSERT_EXPR use by the threader ... so now you're making us rely even more on those :/ Perhaps, but it's a short term thing -- Andrew and I want to get rid of ASSERT_EXPRs too. I really think we all share that common goal. I can certainly see how they muck up the single_use checks. They get in the way of other things as well by hiding equivalency information. jeff
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On Fri, May 5, 2017 at 10:53 PM, Jeff Lawwrote: > On 05/04/2017 08:37 AM, Jeff Law wrote: >> >> >> You understanding is slightly wrong however, The ASSERT_EXPRs and >> conditionals map 100% through propagation and into simplification. It's >> only during simplification that we lose the direct mapping as we change the >> conditional in order to remove the unnecessary type conversion. Threading >> runs after simplification. >> >> Another approach here would be to simplify the ASSERT_EXPR in the same >> manner that we simplify the conditional. That may even be better for >> various reasons in the short term. Let me poke at that. > > Now I remember why I didn't do that (simplify the ASSERT_EXPR). It > doesn't work :-) > > So given an ASSERT_EXPR like: > > v1_378 = ASSERT_EXPR ; > > Let's assume for the sake of argument v1_179 was set by a type cast from > xx_1. We can simplify that ASSERT_EXPR into > > v1_378 = ASSERT_EXPR ; > > But note we can not change operand 0 of the ASSERT_EXPR. That would change > the *type* of the ASSERT_EXPR and blows up all kinds of things later. So > the ASSERT_EXPR at best can morph into this form: > > v1_378 = ASSERT_EXPR ; > > When the threader wants to look for an ASSERT_EXPR that creates a range for > an object, it does lookups based on a match of operand 0 without digging > into the expression in operand 1. > > That's something we may want to change in the future (it plays into issues > that arise in patch #3) but in the immediate term it's still best to defer > that one special case of tweaking a GIMPLE_COND. Note that I tried last stage3 (it ended up being too late) to get rid of ASSERT_EXPRs doing substitute-and-fold itself (basically copy-propagate them out at this point rather than as a separate thing later). This is because the ASSERT_EXPR uses interfere with the single_use checks in match.pd patterns and thus are actually harmful. The barrier I ran into was the ASSERT_EXPR use by the threader ... so now you're making us rely even more on those :/ Richard. > Jeff
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On 05/04/2017 08:37 AM, Jeff Law wrote: You understanding is slightly wrong however, The ASSERT_EXPRs and conditionals map 100% through propagation and into simplification. It's only during simplification that we lose the direct mapping as we change the conditional in order to remove the unnecessary type conversion. Threading runs after simplification. Another approach here would be to simplify the ASSERT_EXPR in the same manner that we simplify the conditional. That may even be better for various reasons in the short term. Let me poke at that. Now I remember why I didn't do that (simplify the ASSERT_EXPR). It doesn't work :-) So given an ASSERT_EXPR like: v1_378 = ASSERT_EXPR; Let's assume for the sake of argument v1_179 was set by a type cast from xx_1. We can simplify that ASSERT_EXPR into v1_378 = ASSERT_EXPR ; But note we can not change operand 0 of the ASSERT_EXPR. That would change the *type* of the ASSERT_EXPR and blows up all kinds of things later. So the ASSERT_EXPR at best can morph into this form: v1_378 = ASSERT_EXPR ; When the threader wants to look for an ASSERT_EXPR that creates a range for an object, it does lookups based on a match of operand 0 without digging into the expression in operand 1. That's something we may want to change in the future (it plays into issues that arise in patch #3) but in the immediate term it's still best to defer that one special case of tweaking a GIMPLE_COND. Jeff
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On 05/04/2017 04:59 AM, Richard Biener wrote: I think this is a hack ;) Basically the issue is that jump-threading uses ASSERT_EXPRs at all (which are an implementation detail of VRP). As far as I understand it does that because VRP can do "fancy" things and create ASSERT_EXPRs that do not directly map to the conditional but to its operand def stmts. Agreed it's a hack, but perhaps for different reasons. The only reason we have a threading pass inside VRP is so that we can get access to the internal state. It pre-dates the ability to query any kind of range data outside of VRP by a decade. The long term plan is to drop the threading passes from VRP once we can get accurate ranges outside VRP using Andrew's work. Once we can do that, DOM or backward threader can query that data and the jump threading pass inside VRP becomes redundant/useless. The first major milestone for that work will be the ability to drop ASSERT_EXPRs completely from the IL, but still get just as accurate range information. ASSERT_EXPRs are really just an implementation detail inside VRP. -- You understanding is slightly wrong however, The ASSERT_EXPRs and conditionals map 100% through propagation and into simplification. It's only during simplification that we lose the direct mapping as we change the conditional in order to remove the unnecessary type conversion. Threading runs after simplification. Another approach here would be to simplify the ASSERT_EXPR in the same manner that we simplify the conditional. That may even be better for various reasons in the short term. Let me poke at that. I have meanwhile factored this "fancieness" out into (ok, bad name...) register_edge_assert_for which records all these fancy asserts in a vec. This is now used from EVRP: gimple *stmt = last_stmt (pred_e->src); if (stmt && gimple_code (stmt) == GIMPLE_COND && (op0 = gimple_cond_lhs (stmt)) && TREE_CODE (op0) == SSA_NAME && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))) || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt) { if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "Visiting controlling predicate "); print_gimple_stmt (dump_file, stmt, 0, 0); } /* Entering a new scope. Try to see if we can find a VR here. */ tree op1 = gimple_cond_rhs (stmt); if (TREE_OVERFLOW_P (op1)) op1 = drop_tree_overflow (op1); tree_code code = gimple_cond_code (stmt); auto_vecasserts; register_edge_assert_for (op0, pred_e, code, op0, op1, asserts); if (TREE_CODE (op1) == SSA_NAME) register_edge_assert_for (op1, pred_e, code, op0, op1, asserts); regular VRP transforms those into its own assert representation in finish_register_edge_assert_for (though assert_info should be enough for jump threading). Yea, this looks like a much simplified form of what Andrew has done. I wasn't aware of this stuff from evrp, it may be useful to simplify the 3rd patch in this series (which is still evolving and tries to put in a real solution for the conditional equivalence problems). So I think it should be possible for jump threading to use this new machinery (and even DOM based jump threading or backwards threading could make use of this!) Roughly the plan. I think Aldy is close to being ready to start review work on the new API class for querying ranges. I think it addresses the big issues we both want to see addressed (no more ANTI ranges, no fixed number of intervals within a range representation). Removal of ANTI ranges does significantly simplify things on the client side which IMHO should be the driver at this point. In parallel I expect to carve out two more hunks of Andrew's work in the near future. Essentially they're APIs that allow efficient computation and mapping of blocks that generate range data for a particular name and walkers. They should allow simplification of some stuff in DOM and improve the most glaring weakness in the backwards threader. jeff
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On Wed, May 3, 2017 at 6:32 PM, Jeff Lawwrote: > [ With the patch attached... ] > > > On 05/03/2017 10:31 AM, Jeff Law wrote: >> >> This is the first of 3-5 patches to address pr78496. >> >> The goal of these patches is to catch jump threads earlier in the pipeline >> to avoid undesirable behavior in PRE and more generally be able to exploit >> the secondary opportunities exposed by jump threading. >> >> One of the more serious issues I found while investigating 78496 was VRP >> failing to find what should have been obvious jump threads. The fundamental >> issue is VRP will simplify conditionals which are fed by a typecast prior to >> jump threading. So something like this: >> >> x = (typecast) y; >> if (x == 42) >> >> Can often be transformed into: >> >> if (y == 42) >> >> >> The problem is any ASSERT_EXPRS after the conditional will reference "x" >> rather than "y". That in turn makes it impossible for VRP to use those >> ASSERT_EXPRs to thread later jumps that use x == >> >> >> More concretely consider this gimple code: >> >> >> ;; basic block 5, loop depth 0, count 0, freq 1, maybe hot >> ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED) >> ;;pred: 3 [50.0%] (TRUE_VALUE,EXECUTABLE) >> ;;4 [100.0%] (FALLTHRU,EXECUTABLE) >># iftmp.0_2 = PHI <1(3), 0(4)> >>in_loop_7 = (unsigned char) iftmp.0_2; >>if (in_loop_7 != 0) >> goto ; [33.00%] >>else >> goto ; [67.00%] >> >> ;;succ: 6 [33.0%] (TRUE_VALUE,EXECUTABLE) >> ;;12 [67.0%] (FALSE_VALUE,EXECUTABLE) >> >> ;; basic block 12, loop depth 0, count 0, freq 6700, maybe hot >> ;;prev block 5, next block 6, flags: (NEW) >> ;;pred: 5 [67.0%] (FALSE_VALUE,EXECUTABLE) >>in_loop_15 = ASSERT_EXPR ; >>goto ; [100.00%] >> ;;succ: 7 [100.0%] (FALLTHRU) >> >> ;; basic block 6, loop depth 0, count 0, freq 3300, maybe hot >> ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) >> ;;pred: 5 [33.0%] (TRUE_VALUE,EXECUTABLE) >>in_loop_14 = ASSERT_EXPR ; >>simple_iv (); >> ;;succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) >> >> And later we have: >> >> ;; basic block 9, loop depth 0, count 0, freq 8476, maybe hot >> ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) >> ;;pred: 7 [84.8%] (FALSE_VALUE,EXECUTABLE) >>if (in_loop_7 == 0) >> goto ; [36.64%] >>else >> goto ; [63.36%] >> >> VRP knows it can replace the uses of in_loop_7 in the conditionals in >> blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading >> (but well after ASSERT_EXPR insertion). >> >> As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 >> (which reference in_loop_7) to thread the jump at bb9 (which now references >> iftmp.0_2). >> >> >> The cases in pr78496 are slightly more complex, but boil down to the same >> core issue -- simplifying the conditional too early. >> >> Thankfully this is easy to fix. We just split the conditional >> simplification into two steps so that the transformation noted above occurs >> after jump threading (the other simplifications we want to occur before jump >> threading). >> >> This allows VRP1 to pick up 27 missed jump threads in the testcase from >> 78496. It could well be enough to address 78496, but since we don't have a >> solid description of the desired end result I won't consider 78496 fixed >> quite yet as there's significant further improvements we can make. >> >> Bootstrapped and regression tested on x86_64-linux-gnu. Installing on the >> trunk. I think this is a hack ;) Basically the issue is that jump-threading uses ASSERT_EXPRs at all (which are an implementation detail of VRP). As far as I understand it does that because VRP can do "fancy" things and create ASSERT_EXPRs that do not directly map to the conditional but to its operand def stmts. I have meanwhile factored this "fancieness" out into (ok, bad name...) register_edge_assert_for which records all these fancy asserts in a vec. This is now used from EVRP: gimple *stmt = last_stmt (pred_e->src); if (stmt && gimple_code (stmt) == GIMPLE_COND && (op0 = gimple_cond_lhs (stmt)) && TREE_CODE (op0) == SSA_NAME && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))) || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt) { if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "Visiting controlling predicate "); print_gimple_stmt (dump_file, stmt, 0, 0); } /* Entering a new scope. Try to see if we can find a VR here. */ tree op1 = gimple_cond_rhs (stmt); if (TREE_OVERFLOW_P (op1)) op1 = drop_tree_overflow (op1); tree_code code = gimple_cond_code (stmt);
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On 05/03/2017 10:55 AM, Bin.Cheng wrote: On Wed, May 3, 2017 at 5:32 PM, Jeff Lawwrote: [ With the patch attached... ] On 05/03/2017 10:31 AM, Jeff Law wrote: This is the first of 3-5 patches to address pr78496. The goal of these patches is to catch jump threads earlier in the pipeline to avoid undesirable behavior in PRE and more generally be able to exploit the secondary opportunities exposed by jump threading. One of the more serious issues I found while investigating 78496 was VRP failing to find what should have been obvious jump threads. The fundamental issue is VRP will simplify conditionals which are fed by a typecast prior to jump threading. So something like this: x = (typecast) y; if (x == 42) Can often be transformed into: if (y == 42) The problem is any ASSERT_EXPRS after the conditional will reference "x" rather than "y". That in turn makes it impossible for VRP to use those ASSERT_EXPRs to thread later jumps that use x == More concretely consider this gimple code: ;; basic block 5, loop depth 0, count 0, freq 1, maybe hot ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED) ;;pred: 3 [50.0%] (TRUE_VALUE,EXECUTABLE) ;;4 [100.0%] (FALLTHRU,EXECUTABLE) # iftmp.0_2 = PHI <1(3), 0(4)> in_loop_7 = (unsigned char) iftmp.0_2; if (in_loop_7 != 0) goto ; [33.00%] else goto ; [67.00%] ;;succ: 6 [33.0%] (TRUE_VALUE,EXECUTABLE) ;;12 [67.0%] (FALSE_VALUE,EXECUTABLE) ;; basic block 12, loop depth 0, count 0, freq 6700, maybe hot ;;prev block 5, next block 6, flags: (NEW) ;;pred: 5 [67.0%] (FALSE_VALUE,EXECUTABLE) in_loop_15 = ASSERT_EXPR ; goto ; [100.00%] ;;succ: 7 [100.0%] (FALLTHRU) ;; basic block 6, loop depth 0, count 0, freq 3300, maybe hot ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) ;;pred: 5 [33.0%] (TRUE_VALUE,EXECUTABLE) in_loop_14 = ASSERT_EXPR ; simple_iv (); ;;succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) And later we have: ;; basic block 9, loop depth 0, count 0, freq 8476, maybe hot ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) ;;pred: 7 [84.8%] (FALSE_VALUE,EXECUTABLE) if (in_loop_7 == 0) goto ; [36.64%] else goto ; [63.36%] VRP knows it can replace the uses of in_loop_7 in the conditionals in blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading (but well after ASSERT_EXPR insertion). As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 (which reference in_loop_7) to thread the jump at bb9 (which now references iftmp.0_2). The cases in pr78496 are slightly more complex, but boil down to the same core issue -- simplifying the conditional too early. Thankfully this is easy to fix. We just split the conditional simplification into two steps so that the transformation noted above occurs after jump threading (the other simplifications we want to occur before jump threading). This allows VRP1 to pick up 27 missed jump threads in the testcase from 78496. It could well be enough to address 78496, but since we don't have a solid description of the desired end result I won't consider 78496 fixed quite yet as there's significant further improvements we can make. Bootstrapped and regression tested on x86_64-linux-gnu. Installing on the trunk. Jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 92a4e395ba8..32cc81978df 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,14 @@ +2017-05-03 Jeff Law + + PR tree-optimization/78496 + * tree-vrp.c (simplify_cond_using_ranges_1): Renamed + from simplify_cond_using_ranges. Split off code to walk + backwards through casts into ... + (simplify_cond_using_ranges_2): New function. + (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1. + (execute_vrp): After identifying jump threads, call + simplify_cond_using_ranges_2. + 2017-05-03 Jeff Downs Rainer Orth diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 596468d33e9..55a44e4635a 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2017-05-03 Jeff Law + + PR tree-optimization/78496 + * gcc.dg/tree-ssa/ssa-thread-15.c: New test. + 2017-05-03 Uros Bizjak * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c new file mode 100644 index 000..f73268cacbb --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp1" } */ + +/* We should thread the if
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
On Wed, May 3, 2017 at 5:32 PM, Jeff Lawwrote: > [ With the patch attached... ] > > > On 05/03/2017 10:31 AM, Jeff Law wrote: >> >> This is the first of 3-5 patches to address pr78496. >> >> The goal of these patches is to catch jump threads earlier in the pipeline >> to avoid undesirable behavior in PRE and more generally be able to exploit >> the secondary opportunities exposed by jump threading. >> >> One of the more serious issues I found while investigating 78496 was VRP >> failing to find what should have been obvious jump threads. The fundamental >> issue is VRP will simplify conditionals which are fed by a typecast prior to >> jump threading. So something like this: >> >> x = (typecast) y; >> if (x == 42) >> >> Can often be transformed into: >> >> if (y == 42) >> >> >> The problem is any ASSERT_EXPRS after the conditional will reference "x" >> rather than "y". That in turn makes it impossible for VRP to use those >> ASSERT_EXPRs to thread later jumps that use x == >> >> >> More concretely consider this gimple code: >> >> >> ;; basic block 5, loop depth 0, count 0, freq 1, maybe hot >> ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED) >> ;;pred: 3 [50.0%] (TRUE_VALUE,EXECUTABLE) >> ;;4 [100.0%] (FALLTHRU,EXECUTABLE) >># iftmp.0_2 = PHI <1(3), 0(4)> >>in_loop_7 = (unsigned char) iftmp.0_2; >>if (in_loop_7 != 0) >> goto ; [33.00%] >>else >> goto ; [67.00%] >> >> ;;succ: 6 [33.0%] (TRUE_VALUE,EXECUTABLE) >> ;;12 [67.0%] (FALSE_VALUE,EXECUTABLE) >> >> ;; basic block 12, loop depth 0, count 0, freq 6700, maybe hot >> ;;prev block 5, next block 6, flags: (NEW) >> ;;pred: 5 [67.0%] (FALSE_VALUE,EXECUTABLE) >>in_loop_15 = ASSERT_EXPR ; >>goto ; [100.00%] >> ;;succ: 7 [100.0%] (FALLTHRU) >> >> ;; basic block 6, loop depth 0, count 0, freq 3300, maybe hot >> ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) >> ;;pred: 5 [33.0%] (TRUE_VALUE,EXECUTABLE) >>in_loop_14 = ASSERT_EXPR ; >>simple_iv (); >> ;;succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) >> >> And later we have: >> >> ;; basic block 9, loop depth 0, count 0, freq 8476, maybe hot >> ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) >> ;;pred: 7 [84.8%] (FALSE_VALUE,EXECUTABLE) >>if (in_loop_7 == 0) >> goto ; [36.64%] >>else >> goto ; [63.36%] >> >> VRP knows it can replace the uses of in_loop_7 in the conditionals in >> blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading >> (but well after ASSERT_EXPR insertion). >> >> As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 >> (which reference in_loop_7) to thread the jump at bb9 (which now references >> iftmp.0_2). >> >> >> The cases in pr78496 are slightly more complex, but boil down to the same >> core issue -- simplifying the conditional too early. >> >> Thankfully this is easy to fix. We just split the conditional >> simplification into two steps so that the transformation noted above occurs >> after jump threading (the other simplifications we want to occur before jump >> threading). >> >> This allows VRP1 to pick up 27 missed jump threads in the testcase from >> 78496. It could well be enough to address 78496, but since we don't have a >> solid description of the desired end result I won't consider 78496 fixed >> quite yet as there's significant further improvements we can make. >> >> Bootstrapped and regression tested on x86_64-linux-gnu. Installing on the >> trunk. >> >> Jeff >> > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 92a4e395ba8..32cc81978df 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,14 @@ > +2017-05-03 Jeff Law > + > + PR tree-optimization/78496 > + * tree-vrp.c (simplify_cond_using_ranges_1): Renamed > + from simplify_cond_using_ranges. Split off code to walk > + backwards through casts into ... > + (simplify_cond_using_ranges_2): New function. > + (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1. > + (execute_vrp): After identifying jump threads, call > + simplify_cond_using_ranges_2. > + > 2017-05-03 Jeff Downs > Rainer Orth > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 596468d33e9..55a44e4635a 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,8 @@ > +2017-05-03 Jeff Law > + > + PR tree-optimization/78496 > + * gcc.dg/tree-ssa/ssa-thread-15.c: New test. > + > 2017-05-03 Uros Bizjak > > * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints. > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c >
Re: [PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
[ With the patch attached... ] On 05/03/2017 10:31 AM, Jeff Law wrote: This is the first of 3-5 patches to address pr78496. The goal of these patches is to catch jump threads earlier in the pipeline to avoid undesirable behavior in PRE and more generally be able to exploit the secondary opportunities exposed by jump threading. One of the more serious issues I found while investigating 78496 was VRP failing to find what should have been obvious jump threads. The fundamental issue is VRP will simplify conditionals which are fed by a typecast prior to jump threading. So something like this: x = (typecast) y; if (x == 42) Can often be transformed into: if (y == 42) The problem is any ASSERT_EXPRS after the conditional will reference "x" rather than "y". That in turn makes it impossible for VRP to use those ASSERT_EXPRs to thread later jumps that use x == More concretely consider this gimple code: ;; basic block 5, loop depth 0, count 0, freq 1, maybe hot ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED) ;;pred: 3 [50.0%] (TRUE_VALUE,EXECUTABLE) ;;4 [100.0%] (FALLTHRU,EXECUTABLE) # iftmp.0_2 = PHI <1(3), 0(4)> in_loop_7 = (unsigned char) iftmp.0_2; if (in_loop_7 != 0) goto ; [33.00%] else goto ; [67.00%] ;;succ: 6 [33.0%] (TRUE_VALUE,EXECUTABLE) ;;12 [67.0%] (FALSE_VALUE,EXECUTABLE) ;; basic block 12, loop depth 0, count 0, freq 6700, maybe hot ;;prev block 5, next block 6, flags: (NEW) ;;pred: 5 [67.0%] (FALSE_VALUE,EXECUTABLE) in_loop_15 = ASSERT_EXPR; goto ; [100.00%] ;;succ: 7 [100.0%] (FALLTHRU) ;; basic block 6, loop depth 0, count 0, freq 3300, maybe hot ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) ;;pred: 5 [33.0%] (TRUE_VALUE,EXECUTABLE) in_loop_14 = ASSERT_EXPR ; simple_iv (); ;;succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) And later we have: ;; basic block 9, loop depth 0, count 0, freq 8476, maybe hot ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) ;;pred: 7 [84.8%] (FALSE_VALUE,EXECUTABLE) if (in_loop_7 == 0) goto ; [36.64%] else goto ; [63.36%] VRP knows it can replace the uses of in_loop_7 in the conditionals in blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading (but well after ASSERT_EXPR insertion). As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 (which reference in_loop_7) to thread the jump at bb9 (which now references iftmp.0_2). The cases in pr78496 are slightly more complex, but boil down to the same core issue -- simplifying the conditional too early. Thankfully this is easy to fix. We just split the conditional simplification into two steps so that the transformation noted above occurs after jump threading (the other simplifications we want to occur before jump threading). This allows VRP1 to pick up 27 missed jump threads in the testcase from 78496. It could well be enough to address 78496, but since we don't have a solid description of the desired end result I won't consider 78496 fixed quite yet as there's significant further improvements we can make. Bootstrapped and regression tested on x86_64-linux-gnu. Installing on the trunk. Jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 92a4e395ba8..32cc81978df 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,14 @@ +2017-05-03 Jeff Law + + PR tree-optimization/78496 + * tree-vrp.c (simplify_cond_using_ranges_1): Renamed + from simplify_cond_using_ranges. Split off code to walk + backwards through casts into ... + (simplify_cond_using_ranges_2): New function. + (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1. + (execute_vrp): After identifying jump threads, call + simplify_cond_using_ranges_2. + 2017-05-03 Jeff Downs Rainer Orth diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 596468d33e9..55a44e4635a 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2017-05-03 Jeff Law + + PR tree-optimization/78496 + * gcc.dg/tree-ssa/ssa-thread-15.c: New test. + 2017-05-03 Uros Bizjak * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c new file mode 100644 index 000..f73268cacbb --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp1" } */ + +/* We should thread the if (!in_loop) completely leaving + just two conditionals. */ +/* { dg-final { scan-tree-dump-times
[PATCH 1/4][PR tree-optimization/78496] Don't simplify conditionals too early in VRP
This is the first of 3-5 patches to address pr78496. The goal of these patches is to catch jump threads earlier in the pipeline to avoid undesirable behavior in PRE and more generally be able to exploit the secondary opportunities exposed by jump threading. One of the more serious issues I found while investigating 78496 was VRP failing to find what should have been obvious jump threads. The fundamental issue is VRP will simplify conditionals which are fed by a typecast prior to jump threading. So something like this: x = (typecast) y; if (x == 42) Can often be transformed into: if (y == 42) The problem is any ASSERT_EXPRS after the conditional will reference "x" rather than "y". That in turn makes it impossible for VRP to use those ASSERT_EXPRs to thread later jumps that use x == More concretely consider this gimple code: ;; basic block 5, loop depth 0, count 0, freq 1, maybe hot ;;prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED) ;;pred: 3 [50.0%] (TRUE_VALUE,EXECUTABLE) ;;4 [100.0%] (FALLTHRU,EXECUTABLE) # iftmp.0_2 = PHI <1(3), 0(4)> in_loop_7 = (unsigned char) iftmp.0_2; if (in_loop_7 != 0) goto ; [33.00%] else goto ; [67.00%] ;;succ: 6 [33.0%] (TRUE_VALUE,EXECUTABLE) ;;12 [67.0%] (FALSE_VALUE,EXECUTABLE) ;; basic block 12, loop depth 0, count 0, freq 6700, maybe hot ;;prev block 5, next block 6, flags: (NEW) ;;pred: 5 [67.0%] (FALSE_VALUE,EXECUTABLE) in_loop_15 = ASSERT_EXPR; goto ; [100.00%] ;;succ: 7 [100.0%] (FALLTHRU) ;; basic block 6, loop depth 0, count 0, freq 3300, maybe hot ;;prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) ;;pred: 5 [33.0%] (TRUE_VALUE,EXECUTABLE) in_loop_14 = ASSERT_EXPR ; simple_iv (); ;;succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) And later we have: ;; basic block 9, loop depth 0, count 0, freq 8476, maybe hot ;;prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) ;;pred: 7 [84.8%] (FALSE_VALUE,EXECUTABLE) if (in_loop_7 == 0) goto ; [36.64%] else goto ; [63.36%] VRP knows it can replace the uses of in_loop_7 in the conditionals in blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading (but well after ASSERT_EXPR insertion). As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 (which reference in_loop_7) to thread the jump at bb9 (which now references iftmp.0_2). The cases in pr78496 are slightly more complex, but boil down to the same core issue -- simplifying the conditional too early. Thankfully this is easy to fix. We just split the conditional simplification into two steps so that the transformation noted above occurs after jump threading (the other simplifications we want to occur before jump threading). This allows VRP1 to pick up 27 missed jump threads in the testcase from 78496. It could well be enough to address 78496, but since we don't have a solid description of the desired end result I won't consider 78496 fixed quite yet as there's significant further improvements we can make. Bootstrapped and regression tested on x86_64-linux-gnu. Installing on the trunk. Jeff