Re: Some backward threader refactoring

2016-11-15 Thread Jeff Law

On 11/14/2016 02:39 AM, Jeff Law wrote:



I was looking at the possibility of dropping threading from VRP1/VRP2 or
DOM1/DOM2 in favor of the backwards threader -- the obvious idea being
to recover some compile-time for gcc-7.

Of the old-style threader passes (VRP1, VRP2, DOM1, DOM2), VRP2 is by
far the least useful.  But I can't see a path to removing it in the
gcc-7 timeframe.

Looking at what is caught by VRP and DOM threaders is quite interesting.
 VRP obviously catches stuff with ranges, some fairly complex.  While
you might think that querying range info in the backwards threader would
work, the problem is we lose way too much information as we drop
ASSERT_EXPRs.  (Recall that the threader runs while we're still in VRP
and thus has access to the ASSERT_EXPRs).

The DOM threaders catch stuff through state, simplifications and
bi-directional propagation of equivalences created by conditionals.

The most obvious limitation of the backwards walking threader is that it
only looks at PHIs, copies and constant initializations.  Other
statements are ignored and stop the backwards walk.

I've got a fair amount of support for walking through unary and limited
form binary expressions that I believe can be extended based on needs.
But that's not quite ready for stage1 close.  However, some of the
refactoring to make those changes easier to implement is ready.

This patch starts to break down fsm_find_control_statement_thread_paths
into more manageable hunks.

One such hunk is sub-path checking.  Essentially we're looking to add a
range of blocks to the thread path as we move from one def site to
another in the IL.  There aren't any functional changes in that
refactoring.  It's really just to make f_f_c_s_t_p easier to grok.

f_f_c_s_t_p has inline code to recursively walk backwards through PHI
nodes as well as assignments that are copies and constant initialization
terminals.  Pulling that handling out results in a f_f_c_s_t_p that fits
on a page.  It's just a hell of a lot easier to see what's going on.

The handling of assignments is slightly improved in this patch.
Essentially we only considered a const initialization using an
INTEGER_CST as a proper terminal node.  But certainly other constants
are useful -- ADDR_EXPR in particular and are now handled.  I'll mirror
that improvement in the PHI node routines tomorrow.

Anyway, this is really just meant to make it easier to start extending
the GIMPLE_ASSIGN handling.

Bootstrapped and regression tested on x86_64-linux-gnu.

I've got function comments for the new routines on a local branch.  I'll
get those installed before committing.
Final version attached.  Only change was allowing tcc_constant rather 
than just INTEGER_CST in PHIs and the addition of comments.


Bootstrapped and regression tested on x86, installing on the trunk.

Jeff
commit 4cbde473b184922d6c8423a7a63bdbb86de32b33
Author: Jeff Law 
Date:   Tue Nov 15 09:16:26 2016 -0700

* tree-ssa-threadbackward.c (fsm_find_thread_path): Remove unneeded
parameter.  Callers changed.
(check-subpath_and_update_thread_path): Extracted from
fsm_find_control_statement_thread_paths.
(handle_phi, handle_assignment, handle_assignment_p): Likewise.
(handle_phi, handle_assignment): Allow any constant node, not
just INTEGER_CST.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1e8475f..a54423a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2016-11-15  Jeff Law  
+
+   * tree-ssa-threadbackward.c (fsm_find_thread_path): Remove unneeded
+   parameter.  Callers changed.
+   (check-subpath_and_update_thread_path): Extracted from
+   fsm_find_control_statement_thread_paths.
+   (handle_phi, handle_assignment, handle_assignment_p): Likewise.
+   (handle_phi, handle_assignment): Allow any constant node, not
+   just INTEGER_CST.
+
 2016-11-15  Claudiu Zissulescu  
 
* config/arc/arc-arch.h: New file.
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index fd7d855..203e20e 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -62,14 +62,12 @@ get_gimple_control_stmt (basic_block bb)
 /* Return true if the CFG contains at least one path from START_BB to END_BB.
When a path is found, record in PATH the blocks from END_BB to START_BB.
VISITED_BBS is used to make sure we don't fall into an infinite loop.  Bound
-   the recursion to basic blocks belonging to LOOP.
-   SPEED_P indicate that we could increase code size to improve the code path 
*/
+   the recursion to basic blocks belonging to LOOP.  */
 
 static bool
 fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
  vec *,
- hash_set *visited_bbs, loop_p loop,
- bool speed_p)
+ hash_set *visited_bbs, loop_p loop)
 {
   if (loop != 

Some backward threader refactoring

2016-11-14 Thread Jeff Law



I was looking at the possibility of dropping threading from VRP1/VRP2 or 
DOM1/DOM2 in favor of the backwards threader -- the obvious idea being 
to recover some compile-time for gcc-7.


Of the old-style threader passes (VRP1, VRP2, DOM1, DOM2), VRP2 is by 
far the least useful.  But I can't see a path to removing it in the 
gcc-7 timeframe.


Looking at what is caught by VRP and DOM threaders is quite interesting. 
 VRP obviously catches stuff with ranges, some fairly complex.  While 
you might think that querying range info in the backwards threader would 
work, the problem is we lose way too much information as we drop 
ASSERT_EXPRs.  (Recall that the threader runs while we're still in VRP 
and thus has access to the ASSERT_EXPRs).


The DOM threaders catch stuff through state, simplifications and 
bi-directional propagation of equivalences created by conditionals.


The most obvious limitation of the backwards walking threader is that it 
only looks at PHIs, copies and constant initializations.  Other 
statements are ignored and stop the backwards walk.


I've got a fair amount of support for walking through unary and limited 
form binary expressions that I believe can be extended based on needs. 
But that's not quite ready for stage1 close.  However, some of the 
refactoring to make those changes easier to implement is ready.


This patch starts to break down fsm_find_control_statement_thread_paths 
into more manageable hunks.


One such hunk is sub-path checking.  Essentially we're looking to add a 
range of blocks to the thread path as we move from one def site to 
another in the IL.  There aren't any functional changes in that 
refactoring.  It's really just to make f_f_c_s_t_p easier to grok.


f_f_c_s_t_p has inline code to recursively walk backwards through PHI 
nodes as well as assignments that are copies and constant initialization 
terminals.  Pulling that handling out results in a f_f_c_s_t_p that fits 
on a page.  It's just a hell of a lot easier to see what's going on.


The handling of assignments is slightly improved in this patch. 
Essentially we only considered a const initialization using an 
INTEGER_CST as a proper terminal node.  But certainly other constants 
are useful -- ADDR_EXPR in particular and are now handled.  I'll mirror 
that improvement in the PHI node routines tomorrow.


Anyway, this is really just meant to make it easier to start extending 
the GIMPLE_ASSIGN handling.


Bootstrapped and regression tested on x86_64-linux-gnu.

I've got function comments for the new routines on a local branch.  I'll 
get those installed before committing.


Jeff
* tree-ssa-threadbackward.c (fsm_find_thread_path): Remove unneeded
parameter.  Callers changed.
(check-subpath_and_update_thread_path): Extracted from
fsm_find_control_statement_thread_paths.
(handle_phi, handle_assignment, handle_assignment_p): Likewise.
(handle_assignment): Allow any constant node, not just INTEGER_CST.


diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index fd7d855..9ff3d75 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -62,14 +62,12 @@ get_gimple_control_stmt (basic_block bb)
 /* Return true if the CFG contains at least one path from START_BB to END_BB.
When a path is found, record in PATH the blocks from END_BB to START_BB.
VISITED_BBS is used to make sure we don't fall into an infinite loop.  Bound
-   the recursion to basic blocks belonging to LOOP.
-   SPEED_P indicate that we could increase code size to improve the code path 
*/
+   the recursion to basic blocks belonging to LOOP.  */
 
 static bool
 fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
  vec *,
- hash_set *visited_bbs, loop_p loop,
- bool speed_p)
+ hash_set *visited_bbs, loop_p loop)
 {
   if (loop != start_bb->loop_father)
 return false;
@@ -85,8 +83,7 @@ fsm_find_thread_path (basic_block start_bb, basic_block 
end_bb,
   edge e;
   edge_iterator ei;
   FOR_EACH_EDGE (e, ei, start_bb->succs)
-   if (fsm_find_thread_path (e->dest, end_bb, path, visited_bbs, loop,
- speed_p))
+   if (fsm_find_thread_path (e->dest, end_bb, path, visited_bbs, loop))
  {
vec_safe_push (path, start_bb);
return true;
@@ -427,6 +424,196 @@ convert_and_register_jump_thread_path (vec *path,
   --max_threaded_paths;
 }
 
+/* While following a chain of SSA_NAME definitions, we jumped from a definition
+   in LAST_BB to a definition in VAR_BB (walking backwards).
+
+   Verify there is a single path between the blocks and none of the blocks
+   in the path is already in VISITED_BBS.  If so, then update VISISTED_BBS,
+   add the new blocks to PATH and return TRUE.  Otherwise return FALSE.
+
+   Store the length of the subpath in