Re: [PATCH] Improve backwards threading
On Fri, 5 Aug 2016, Richard Biener wrote: > > This avoids regressing gcc.dg/tree-ssa/pr21417.c with the fix for > PR72772 where after it a forwarder block no longer is present. > It's easy to simply create it when FSM threading faces the situation > that the edge ending the path enters a loop. > > I also fixed the costs for obviously related anon SSA names (when > they are the same). > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, I'll apply > if that finishes successfully. Ok, so that re-introduces the c-c++-common/ubsan/pr71403-2.c miscompile where we mash two loops retaining a bogus loop->nb_iterations_upper_bound. I think those two testcases show that it should be the dominance relation between the entered loop header and the threaded path that decides whether we can thread or not -- the path may not become another latch of this loop. I can't seem to figure out a correct condition for this based on the threading path but the following works (maybe by accident as I only have those two testcases): Index: gcc/tree-ssa-threadbackward.c === --- gcc/tree-ssa-threadbackward.c (revision 239164) +++ gcc/tree-ssa-threadbackward.c (working copy) @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. #include "tree-pass.h" #include "gimple-ssa.h" #include "tree-phinodes.h" +#include "cfghooks.h" static int max_threaded_paths; @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths edge taken_edge = profitable_jump_thread_path (path, bbi, name, arg); if (taken_edge) { + /* If the taken edge is a loop entry avoid mashing two +loops into one with multiple latches by splitting +the edge. This only works if that block won't become +a latch of this loop. */ + if ((bb_loop_depth (taken_edge->src) + < bb_loop_depth (taken_edge->dest)) + && ! single_succ_p (bbi)) + split_edge (taken_edge); if (bb_loop_depth (taken_edge->src) >= bb_loop_depth (taken_edge->dest)) convert_and_register_jump_thread_path (path, taken_edge); note you need the PR72772 fix to trigger all this. Any idea? Thanks, Richard. > Richard. > > 2016-08-05 Richard Biener > > * tree-ssa-threadbackward.c: Include cfghooks.h. > (profitable_jump_thread_path): Treat same SSA names related. > (fsm_find_control_statement_thread_paths): When the taken edge > enters a loop split it instead of giving up. > > Index: gcc/tree-ssa-threadbackward.c > === > --- gcc/tree-ssa-threadbackward.c (revision 239164) > +++ gcc/tree-ssa-threadbackward.c (working copy) > @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. > #include "tree-pass.h" > #include "gimple-ssa.h" > #include "tree-phinodes.h" > +#include "cfghooks.h" > > static int max_threaded_paths; > > @@ -206,8 +207,9 @@ profitable_jump_thread_path (vec /* Note that if both NAME and DST are anonymous >SSA_NAMEs, then we do not have enough information >to consider them associated. */ > - if ((SSA_NAME_VAR (dst) != SSA_NAME_VAR (name) > -|| !SSA_NAME_VAR (dst)) > + if (dst != name > + && (SSA_NAME_VAR (dst) != SSA_NAME_VAR (name) > + || !SSA_NAME_VAR (dst)) > && !virtual_operand_p (dst)) > ++n_insns; > } > @@ -560,9 +562,13 @@ fsm_find_control_statement_thread_paths > edge taken_edge = profitable_jump_thread_path (path, bbi, name, arg); > if (taken_edge) > { > + /* If the taken edge is a loop entry avoid mashing two > + loops into one with multiple latches by splitting > + the edge. */ > if (bb_loop_depth (taken_edge->src) > - >= bb_loop_depth (taken_edge->dest)) > - convert_and_register_jump_thread_path (path, taken_edge); > + < bb_loop_depth (taken_edge->dest)) > + split_edge (taken_edge); > + convert_and_register_jump_thread_path (path, taken_edge); > path->pop (); > } > } > @@ -586,9 +592,13 @@ fsm_find_control_statement_thread_paths >name, arg); > if (taken_edge) > { > + /* If the taken edge is a loop entry avoid mashing two > + loops into one with multiple latches by splitting > + the edge. */ > if (bb_loop_depth (taken_edge->src) > - >= bb_loop_depth (taken_edge->dest)) > - convert_and_register_jump_thread_path (path, taken_edge); > + < bb_loop_depth (taken_edge->dest)) > +
Re: [PATCH] Improve backwards threading
On Fri, 5 Aug 2016, Richard Biener wrote: > On Fri, 5 Aug 2016, Richard Biener wrote: > > > > > This avoids regressing gcc.dg/tree-ssa/pr21417.c with the fix for > > PR72772 where after it a forwarder block no longer is present. > > It's easy to simply create it when FSM threading faces the situation > > that the edge ending the path enters a loop. > > > > I also fixed the costs for obviously related anon SSA names (when > > they are the same). > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, I'll apply > > if that finishes successfully. > > Ok, so that re-introduces the c-c++-common/ubsan/pr71403-2.c miscompile > where we mash two loops retaining a bogus loop->nb_iterations_upper_bound. > > I think those two testcases show that it should be the dominance > relation between the entered loop header and the threaded path > that decides whether we can thread or not -- the path may not > become another latch of this loop. I can't seem to figure out > a correct condition for this based on the threading path but the > following works (maybe by accident as I only have those two > testcases): > > Index: gcc/tree-ssa-threadbackward.c > === > --- gcc/tree-ssa-threadbackward.c (revision 239164) > +++ gcc/tree-ssa-threadbackward.c (working copy) > @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. > #include "tree-pass.h" > #include "gimple-ssa.h" > #include "tree-phinodes.h" > +#include "cfghooks.h" > > static int max_threaded_paths; > > @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths > edge taken_edge = profitable_jump_thread_path (path, bbi, name, > arg); > if (taken_edge) > { > + /* If the taken edge is a loop entry avoid mashing two > +loops into one with multiple latches by splitting > +the edge. This only works if that block won't become > +a latch of this loop. */ > + if ((bb_loop_depth (taken_edge->src) > + < bb_loop_depth (taken_edge->dest)) > + && ! single_succ_p (bbi)) > + split_edge (taken_edge); > if (bb_loop_depth (taken_edge->src) > >= bb_loop_depth (taken_edge->dest)) > convert_and_register_jump_thread_path (path, taken_edge); > > note you need the PR72772 fix to trigger all this. I think it also shows that a thread path ending not in the inner loop header but on the preheader block might be miscompiled as well given said block can become latch as well if other paths leading into it vanish. Thus there is a latent wrong-code issue with the backward threading (and maybe it really should invalidate nb_iterations_upper_bound of all possibly participating loops, which I guess is the loop the backedge we thread through plus all inner loops). > Any idea? > > Thanks, > Richard. > > > Richard. > > > > 2016-08-05 Richard Biener > > > > * tree-ssa-threadbackward.c: Include cfghooks.h. > > (profitable_jump_thread_path): Treat same SSA names related. > > (fsm_find_control_statement_thread_paths): When the taken edge > > enters a loop split it instead of giving up. > > > > Index: gcc/tree-ssa-threadbackward.c > > === > > --- gcc/tree-ssa-threadbackward.c (revision 239164) > > +++ gcc/tree-ssa-threadbackward.c (working copy) > > @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. > > #include "tree-pass.h" > > #include "gimple-ssa.h" > > #include "tree-phinodes.h" > > +#include "cfghooks.h" > > > > static int max_threaded_paths; > > > > @@ -206,8 +207,9 @@ profitable_jump_thread_path (vec > /* Note that if both NAME and DST are anonymous > > SSA_NAMEs, then we do not have enough information > > to consider them associated. */ > > - if ((SSA_NAME_VAR (dst) != SSA_NAME_VAR (name) > > - || !SSA_NAME_VAR (dst)) > > + if (dst != name > > + && (SSA_NAME_VAR (dst) != SSA_NAME_VAR (name) > > + || !SSA_NAME_VAR (dst)) > > && !virtual_operand_p (dst)) > > ++n_insns; > > } > > @@ -560,9 +562,13 @@ fsm_find_control_statement_thread_paths > > edge taken_edge = profitable_jump_thread_path (path, bbi, name, arg); > > if (taken_edge) > > { > > + /* If the taken edge is a loop entry avoid mashing two > > +loops into one with multiple latches by splitting > > +the edge. */ > > if (bb_loop_depth (taken_edge->src) > > - >= bb_loop_depth (taken_edge->dest)) > > - convert_and_register_jump_thread_path (path, taken_edge); > > + < bb_loop_depth (taken_edge->dest)) > > + split_edge (taken_edge); > > + convert_and_register_jum
Re: [PATCH] Improve backwards threading
On 08/05/2016 07:36 AM, Richard Biener wrote: @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths edge taken_edge = profitable_jump_thread_path (path, bbi, name, arg); if (taken_edge) { + /* If the taken edge is a loop entry avoid mashing two +loops into one with multiple latches by splitting +the edge. This only works if that block won't become +a latch of this loop. */ + if ((bb_loop_depth (taken_edge->src) + < bb_loop_depth (taken_edge->dest)) + && ! single_succ_p (bbi)) + split_edge (taken_edge); if (bb_loop_depth (taken_edge->src) >= bb_loop_depth (taken_edge->dest)) convert_and_register_jump_thread_path (path, taken_edge); note you need the PR72772 fix to trigger all this. I'm a little confused here. In the case where the taken edge goes into a deeper loop nest you're splitting the edge -- to what end? The backwards threader isn't going to register that jump thread. So if this is fixing something, then we've got the fix in the wrong place. FWIW, I the anonymous name fix ought to go in separately, it looks like the right thing to do independent of this stuff. jeff
Re: [PATCH] Improve backwards threading
On Fri, 5 Aug 2016, Jeff Law wrote: > On 08/05/2016 07:36 AM, Richard Biener wrote: > > @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths > > edge taken_edge = profitable_jump_thread_path (path, bbi, name, > > arg); > > if (taken_edge) > > { > > + /* If the taken edge is a loop entry avoid mashing two > > +loops into one with multiple latches by splitting > > +the edge. This only works if that block won't become > > +a latch of this loop. */ > > + if ((bb_loop_depth (taken_edge->src) > > + < bb_loop_depth (taken_edge->dest)) > > + && ! single_succ_p (bbi)) > > + split_edge (taken_edge); > > if (bb_loop_depth (taken_edge->src) > > >= bb_loop_depth (taken_edge->dest)) > > convert_and_register_jump_thread_path (path, taken_edge); > > > > note you need the PR72772 fix to trigger all this. > I'm a little confused here. In the case where the taken edge goes into a > deeper loop nest you're splitting the edge -- to what end? The backwards > threader isn't going to register that jump thread. So if this is fixing > something, then we've got the fix in the wrong place. Note that the case is not going "into" a deeper loop nest but as the threading path ends at taken_edge it is threading to a loop header of an inner loop. And yes, the fix doesn't work (not sure how I thought it could). But the condition also doesn't make sense to me and we miss a guard for the case I added a comment for - generating wrong-code because loop meta data is wrong after the transform (I think this may not only apply to the niter upper bound but also things like safelen). > FWIW, I the anonymous name fix ought to go in separately, it looks like the > right thing to do independent of this stuff. I have applied that part now. I'm not sure what to do with the pre-header creation fix (PR72772) now, but I am considering to put that into the tree regardless of the "fallout" (a few FAILs for transforms we no longer perform). I spent half a week now but am quite lost with the threading cases (which I think are latent issues). Richard.
Re: [PATCH] Improve backwards threading
On Fri, 5 Aug 2016, Jeff Law wrote: > On 08/05/2016 07:36 AM, Richard Biener wrote: > > @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths > > edge taken_edge = profitable_jump_thread_path (path, bbi, name, > > arg); > > if (taken_edge) > > { > > + /* If the taken edge is a loop entry avoid mashing two > > +loops into one with multiple latches by splitting > > +the edge. This only works if that block won't become > > +a latch of this loop. */ > > + if ((bb_loop_depth (taken_edge->src) > > + < bb_loop_depth (taken_edge->dest)) > > + && ! single_succ_p (bbi)) > > + split_edge (taken_edge); > > if (bb_loop_depth (taken_edge->src) > > >= bb_loop_depth (taken_edge->dest)) > > convert_and_register_jump_thread_path (path, taken_edge); > > > > note you need the PR72772 fix to trigger all this. > I'm a little confused here. In the case where the taken edge goes into a > deeper loop nest you're splitting the edge -- to what end? The backwards > threader isn't going to register that jump thread. So if this is fixing > something, then we've got the fix in the wrong place. Ok, so I've figured that splitting the edge is indeed pointless unless it does exactly the same as creating a forwarder. We may not blindly thread backwards to a loop header because of correctness issues in re-using old loop meta-data for that loop (and in the ubsan.exp=pr71403-2.c case miscompiling the testcase). What we need is a forwarder block we can thread to which eventually becomes the new loop header. Note this is also what we'd achieve with properly initializing loops in the threader - sth we should do anyway with looking at loop meta data. This is likely also why the old threader has (in DOM): /* We need to know loop structures in order to avoid destroying them in jump threading. Note that we still can e.g. thread through loop headers to an exit edge, or through loop header to the loop body, assuming that we update the loop info. TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due to several overly conservative bail-outs in jump threading, case gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is missing. We should improve jump threading in future then LOOPS_HAVE_PREHEADERS won't be needed here. */ loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); thus we run into exactly those cases now in the FSM threader. Thus the following patch fixes the two testcases with the PR72772 fix applied as well. It's also possible to create a forwarder on-demand at the place I splitted the edge and avoid creating preheaders for all loops but I as we should init the loop optimizer to be able to look at loop metadata anyway there's not much point in doing that. Bootstrap / regtest pending on x86_64-unknown-linux-gnu, ok for trunk? Thanks, Richard. 2016-08-09 Richard Biener * tree-ssa-threadbackward.c (pass_data_thread_jumps): Remove unconditional TODO_cleanup_cfg. (pass_thread_jumps::execute): Initialize loops, perform a CFG cleanup only if we threaded a jump. Index: gcc/tree-ssa-threadbackward.c === *** gcc/tree-ssa-threadbackward.c (revision 239276) --- gcc/tree-ssa-threadbackward.c (working copy) *** const pass_data pass_data_thread_jumps = *** 674,680 0, 0, 0, ! ( TODO_cleanup_cfg | TODO_update_ssa ), }; class pass_thread_jumps : public gimple_opt_pass --- 674,680 0, 0, 0, ! TODO_update_ssa, }; class pass_thread_jumps : public gimple_opt_pass *** pass_thread_jumps::gate (function *fun A *** 699,704 --- 699,706 unsigned int pass_thread_jumps::execute (function *fun) { + loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); + /* Try to thread each block with more than one successor. */ basic_block bb; FOR_EACH_BB_FN (bb, fun) *** pass_thread_jumps::execute (function *fu *** 706,713 if (EDGE_COUNT (bb->succs) > 1) find_jump_threads_backwards (bb); } ! thread_through_all_blocks (true); ! return 0; } } --- 708,717 if (EDGE_COUNT (bb->succs) > 1) find_jump_threads_backwards (bb); } ! bool changed = thread_through_all_blocks (true); ! ! loop_optimizer_finalize (); ! return changed ? TODO_cleanup_cfg : 0; } }
Re: [PATCH] Improve backwards threading
On Tue, 9 Aug 2016, Richard Biener wrote: > On Fri, 5 Aug 2016, Jeff Law wrote: > > > On 08/05/2016 07:36 AM, Richard Biener wrote: > > > @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths > > > edge taken_edge = profitable_jump_thread_path (path, bbi, name, > > > arg); > > > if (taken_edge) > > > { > > > + /* If the taken edge is a loop entry avoid mashing two > > > +loops into one with multiple latches by splitting > > > +the edge. This only works if that block won't become > > > +a latch of this loop. */ > > > + if ((bb_loop_depth (taken_edge->src) > > > + < bb_loop_depth (taken_edge->dest)) > > > + && ! single_succ_p (bbi)) > > > + split_edge (taken_edge); > > > if (bb_loop_depth (taken_edge->src) > > > >= bb_loop_depth (taken_edge->dest)) > > > convert_and_register_jump_thread_path (path, taken_edge); > > > > > > note you need the PR72772 fix to trigger all this. > > I'm a little confused here. In the case where the taken edge goes into a > > deeper loop nest you're splitting the edge -- to what end? The backwards > > threader isn't going to register that jump thread. So if this is fixing > > something, then we've got the fix in the wrong place. > > Ok, so I've figured that splitting the edge is indeed pointless unless > it does exactly the same as creating a forwarder. We may not blindly > thread backwards to a loop header because of correctness issues in > re-using old loop meta-data for that loop (and in the > ubsan.exp=pr71403-2.c case miscompiling the testcase). What we need > is a forwarder block we can thread to which eventually becomes the > new loop header. Note this is also what we'd achieve with properly > initializing loops in the threader - sth we should do anyway with > looking at loop meta data. This is likely also why the old threader has > (in DOM): > > /* We need to know loop structures in order to avoid destroying them > in jump threading. Note that we still can e.g. thread through loop > headers to an exit edge, or through loop header to the loop body, > assuming > that we update the loop info. > > TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due > to several overly conservative bail-outs in jump threading, case > gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is > missing. We should improve jump threading in future then > LOOPS_HAVE_PREHEADERS won't be needed here. */ > loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); > > thus we run into exactly those cases now in the FSM threader. Thus > the following patch fixes the two testcases with the PR72772 fix > applied as well. > > It's also possible to create a forwarder on-demand at the place I > splitted the edge and avoid creating preheaders for all loops but I > as we should init the loop optimizer to be able to look at loop > metadata anyway there's not much point in doing that. > > Bootstrap / regtest pending on x86_64-unknown-linux-gnu, ok for trunk? I have now applied this as follows (gcc.dg/tree-ssa/ssa-dom-thread-7.c needed adjustment). Bootstrapped / tested on x86_64-unknown-linux-gnu. Richard. 2016-08-11 Richard Biener * tree-ssa-threadbackward.c (pass_data_thread_jumps): Remove unconditional TODO_cleanup_cfg. (pass_thread_jumps::execute): Initialize loops, perform a CFG cleanup only if we threaded a jump. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust. Index: gcc/tree-ssa-threadbackward.c === *** gcc/tree-ssa-threadbackward.c (revision 239276) --- gcc/tree-ssa-threadbackward.c (working copy) *** const pass_data pass_data_thread_jumps = *** 674,680 0, 0, 0, ! ( TODO_cleanup_cfg | TODO_update_ssa ), }; class pass_thread_jumps : public gimple_opt_pass --- 674,680 0, 0, 0, ! TODO_update_ssa, }; class pass_thread_jumps : public gimple_opt_pass *** pass_thread_jumps::gate (function *fun A *** 699,704 --- 699,706 unsigned int pass_thread_jumps::execute (function *fun) { + loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); + /* Try to thread each block with more than one successor. */ basic_block bb; FOR_EACH_BB_FN (bb, fun) *** pass_thread_jumps::execute (function *fu *** 706,713 if (EDGE_COUNT (bb->succs) > 1) find_jump_threads_backwards (bb); } ! thread_through_all_blocks (true); ! return 0; } } --- 708,717 if (EDGE_COUNT (bb->succs) > 1) find_jump_threads_backwards (bb); } ! bool changed = thread_through_all_blocks (true); ! ! loop_optimizer_finalize (); ! return chan
Re: [PATCH] Improve backwards threading
On 08/09/2016 07:13 AM, Richard Biener wrote: On Fri, 5 Aug 2016, Jeff Law wrote: On 08/05/2016 07:36 AM, Richard Biener wrote: @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths edge taken_edge = profitable_jump_thread_path (path, bbi, name, arg); if (taken_edge) { + /* If the taken edge is a loop entry avoid mashing two +loops into one with multiple latches by splitting +the edge. This only works if that block won't become +a latch of this loop. */ + if ((bb_loop_depth (taken_edge->src) + < bb_loop_depth (taken_edge->dest)) + && ! single_succ_p (bbi)) + split_edge (taken_edge); if (bb_loop_depth (taken_edge->src) >= bb_loop_depth (taken_edge->dest)) convert_and_register_jump_thread_path (path, taken_edge); note you need the PR72772 fix to trigger all this. I'm a little confused here. In the case where the taken edge goes into a deeper loop nest you're splitting the edge -- to what end? The backwards threader isn't going to register that jump thread. So if this is fixing something, then we've got the fix in the wrong place. Ok, so I've figured that splitting the edge is indeed pointless unless it does exactly the same as creating a forwarder. We may not blindly thread backwards to a loop header because of correctness issues in re-using old loop meta-data for that loop (and in the ubsan.exp=pr71403-2.c case miscompiling the testcase). What we need is a forwarder block we can thread to which eventually becomes the new loop header. Note this is also what we'd achieve with properly initializing loops in the threader - sth we should do anyway with looking at loop meta data. This is likely also why the old threader has (in DOM): /* We need to know loop structures in order to avoid destroying them in jump threading. Note that we still can e.g. thread through loop headers to an exit edge, or through loop header to the loop body, assuming that we update the loop info. TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due to several overly conservative bail-outs in jump threading, case gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is missing. We should improve jump threading in future then LOOPS_HAVE_PREHEADERS won't be needed here. */ loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); thus we run into exactly those cases now in the FSM threader. Thus the following patch fixes the two testcases with the PR72772 fix applied as well. Right. I'm pretty sure there's a BZ around this issue in my queue :-) It's also possible to create a forwarder on-demand at the place I splitted the edge and avoid creating preheaders for all loops but I as we should init the loop optimizer to be able to look at loop metadata anyway there's not much point in doing that. Bootstrap / regtest pending on x86_64-unknown-linux-gnu, ok for trunk? Thanks, Richard. 2016-08-09 Richard Biener * tree-ssa-threadbackward.c (pass_data_thread_jumps): Remove unconditional TODO_cleanup_cfg. (pass_thread_jumps::execute): Initialize loops, perform a CFG cleanup only if we threaded a jump. OK. jeff