Re: Some backward threader refactoring
On 11/14/2016 02:39 AM, Jeff Law wrote:

I was looking at the possibility of dropping threading from VRP1/VRP2 or DOM1/DOM2 in favor of the backwards threader -- the obvious idea being to recover some compile time for gcc-7. Of the old-style threader passes (VRP1, VRP2, DOM1, DOM2), VRP2 is by far the least useful, but I can't see a path to removing it in the gcc-7 timeframe.

Looking at what is caught by the VRP and DOM threaders is quite interesting. VRP obviously catches stuff with ranges, some fairly complex. While you might think that querying range info in the backwards threader would work, the problem is we lose way too much information as we drop ASSERT_EXPRs. (Recall that the threader runs while we're still in VRP and thus has access to the ASSERT_EXPRs.) The DOM threaders catch stuff through state, simplifications, and bi-directional propagation of equivalences created by conditionals.

The most obvious limitation of the backwards-walking threader is that it only looks at PHIs, copies, and constant initializations. Other statements are ignored and stop the backwards walk. I've got a fair amount of support for walking through unary and limited-form binary expressions that I believe can be extended based on need, but that's not quite ready for stage1 close. However, some of the refactoring to make those changes easier to implement is ready.

This patch starts to break down fsm_find_control_statement_thread_paths into more manageable hunks. One such hunk is sub-path checking: essentially we're looking to add a range of blocks to the thread path as we move from one def site to another in the IL. There aren't any functional changes in that refactoring; it's really just to make f_f_c_s_t_p easier to grok. f_f_c_s_t_p has inline code to recursively walk backwards through PHI nodes as well as assignments that are copies and constant-initialization terminals. Pulling that handling out results in an f_f_c_s_t_p that fits on a page.
It's just a hell of a lot easier to see what's going on.

The handling of assignments is slightly improved in this patch. Previously we only considered a constant initialization using an INTEGER_CST as a proper terminal node, but certainly other constants are useful -- ADDR_EXPR in particular -- and those are now handled. I'll mirror that improvement in the PHI node routines tomorrow. Anyway, this is really just meant to make it easier to start extending the GIMPLE_ASSIGN handling.

Bootstrapped and regression tested on x86_64-linux-gnu. I've got function comments for the new routines on a local branch. I'll get those installed before committing.

Final version attached. The only changes were allowing tcc_constant rather than just INTEGER_CST in PHIs and the addition of comments. Bootstrapped and regression tested on x86, installing on the trunk.

Jeff

commit 4cbde473b184922d6c8423a7a63bdbb86de32b33
Author: Jeff Law
Date:   Tue Nov 15 09:16:26 2016 -0700

    * tree-ssa-threadbackward.c (fsm_find_thread_path): Remove unneeded
      parameter.  Callers changed.
      (check_subpath_and_update_thread_path): Extracted from
      fsm_find_control_statement_thread_paths.
      (handle_phi, handle_assignment, handle_assignment_p): Likewise.
      (handle_phi, handle_assignment): Allow any constant node, not
      just INTEGER_CST.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1e8475f..a54423a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2016-11-15  Jeff Law
+
+	* tree-ssa-threadbackward.c (fsm_find_thread_path): Remove unneeded
+	parameter.  Callers changed.
+	(check_subpath_and_update_thread_path): Extracted from
+	fsm_find_control_statement_thread_paths.
+	(handle_phi, handle_assignment, handle_assignment_p): Likewise.
+	(handle_phi, handle_assignment): Allow any constant node, not
+	just INTEGER_CST.
+
 2016-11-15  Claudiu Zissulescu
 
 	* config/arc/arc-arch.h: New file.
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index fd7d855..203e20e 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -62,14 +62,12 @@ get_gimple_control_stmt (basic_block bb)
 /* Return true if the CFG contains at least one path from START_BB to END_BB.
    When a path is found, record in PATH the blocks from END_BB to START_BB.
    VISITED_BBS is used to make sure we don't fall into an infinite loop.  Bound
-   the recursion to basic blocks belonging to LOOP.
-   SPEED_P indicate that we could increase code size to improve the code path */
+   the recursion to basic blocks belonging to LOOP.  */
 
 static bool
 fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
                       vec<basic_block, va_gc> *&path,
-                      hash_set<basic_block> *visited_bbs, loop_p loop,
-                      bool speed_p)
+                      hash_set<basic_block> *visited_bbs, loop_p loop)
 {
   if (loop !=
Some backward threader refactoring
I was looking at the possibility of dropping threading from VRP1/VRP2 or DOM1/DOM2 in favor of the backwards threader -- the obvious idea being to recover some compile time for gcc-7. Of the old-style threader passes (VRP1, VRP2, DOM1, DOM2), VRP2 is by far the least useful, but I can't see a path to removing it in the gcc-7 timeframe.

Looking at what is caught by the VRP and DOM threaders is quite interesting. VRP obviously catches stuff with ranges, some fairly complex. While you might think that querying range info in the backwards threader would work, the problem is we lose way too much information as we drop ASSERT_EXPRs. (Recall that the threader runs while we're still in VRP and thus has access to the ASSERT_EXPRs.) The DOM threaders catch stuff through state, simplifications, and bi-directional propagation of equivalences created by conditionals.

The most obvious limitation of the backwards-walking threader is that it only looks at PHIs, copies, and constant initializations. Other statements are ignored and stop the backwards walk. I've got a fair amount of support for walking through unary and limited-form binary expressions that I believe can be extended based on need, but that's not quite ready for stage1 close. However, some of the refactoring to make those changes easier to implement is ready.

This patch starts to break down fsm_find_control_statement_thread_paths into more manageable hunks. One such hunk is sub-path checking: essentially we're looking to add a range of blocks to the thread path as we move from one def site to another in the IL. There aren't any functional changes in that refactoring; it's really just to make f_f_c_s_t_p easier to grok. f_f_c_s_t_p has inline code to recursively walk backwards through PHI nodes as well as assignments that are copies and constant-initialization terminals. Pulling that handling out results in an f_f_c_s_t_p that fits on a page. It's just a hell of a lot easier to see what's going on.
The handling of assignments is slightly improved in this patch. Previously we only considered a constant initialization using an INTEGER_CST as a proper terminal node, but certainly other constants are useful -- ADDR_EXPR in particular -- and those are now handled. I'll mirror that improvement in the PHI node routines tomorrow. Anyway, this is really just meant to make it easier to start extending the GIMPLE_ASSIGN handling.

Bootstrapped and regression tested on x86_64-linux-gnu. I've got function comments for the new routines on a local branch. I'll get those installed before committing.

Jeff

	* tree-ssa-threadbackward.c (fsm_find_thread_path): Remove unneeded
	parameter.  Callers changed.
	(check_subpath_and_update_thread_path): Extracted from
	fsm_find_control_statement_thread_paths.
	(handle_phi, handle_assignment, handle_assignment_p): Likewise.
	(handle_assignment): Allow any constant node, not just INTEGER_CST.

diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index fd7d855..9ff3d75 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -62,14 +62,12 @@ get_gimple_control_stmt (basic_block bb)
 /* Return true if the CFG contains at least one path from START_BB to END_BB.
    When a path is found, record in PATH the blocks from END_BB to START_BB.
    VISITED_BBS is used to make sure we don't fall into an infinite loop.  Bound
-   the recursion to basic blocks belonging to LOOP.
-   SPEED_P indicate that we could increase code size to improve the code path */
+   the recursion to basic blocks belonging to LOOP.  */
 
 static bool
 fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
                       vec<basic_block, va_gc> *&path,
-                      hash_set<basic_block> *visited_bbs, loop_p loop,
-                      bool speed_p)
+                      hash_set<basic_block> *visited_bbs, loop_p loop)
 {
   if (loop != start_bb->loop_father)
     return false;
@@ -85,8 +83,7 @@ fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
   edge e;
   edge_iterator ei;
   FOR_EACH_EDGE (e, ei, start_bb->succs)
-    if (fsm_find_thread_path (e->dest, end_bb, path, visited_bbs, loop,
-                              speed_p))
+    if (fsm_find_thread_path (e->dest, end_bb, path, visited_bbs, loop))
       {
         vec_safe_push (path, start_bb);
         return true;
       }
@@ -427,6 +424,196 @@ convert_and_register_jump_thread_path (vec<basic_block, va_gc> *path,
   --max_threaded_paths;
 }
 
+/* While following a chain of SSA_NAME definitions, we jumped from a definition
+   in LAST_BB to a definition in VAR_BB (walking backwards).
+
+   Verify there is a single path between the blocks and none of the blocks
+   in the path is already in VISITED_BBS.  If so, then update VISISTED_BBS,
+   add the new blocks to PATH and return TRUE.  Otherwise return FALSE.
+
+   Store the length of the subpath in