Until now DOM has had to be very conservative with handling control
statements with known conditions. This as been an unfortunate side
effect of the interaction between removing edges and recycling names via
the SSA_NAME manager.
Essentially DOM would have to leave control statements alone. So you'd
see stuff like
if (0 == 0)
left around by DOM. The jump threader would thankfully come along and
optimize that as a jump thread. But that's terribly inefficient, not to
mention it creates unnecessary churn in the CFG and SSA_NAMEs.
By optimizing that directly in DOM, including removing whatever edges
are not executable, we no longer have to rely on jump threading to
handle that case. Less churn in the CFG & SSA_NAMEs. There's also
some chance for secondary optimizations with fewer edges left in the CFG
for DOM to consider.
Unfortunately, the churn caused by jump threading made it excessively
difficult to analyze before/after dumps. Sadly, you can have the same
code, but if the SSA_NAMEs have changed, that impacts coalescing as we
leave SSA. Churn in the CFG changes labels/jumps, often without
changing the actual structure, etc.
I did some tests with valgrind to evaluate branching behaviour
before/after effects on the resulting code and those effects were tiny,
in the I doubt you could measure them range. That was expected since
what we're really doing here is just capturing the optimization earlier.
I had a couple more tests, but they were lost in a bit of idiocy. The
test included is the one I had a second copy of lying around.
Bootstrapped and regression tested on x86_64-linux-gnu. Installed on
the trunk.
Jeff
* tree-ssa-dom.c (optimize_stmt): Collapse control flow statements
with constant conditions.
* tree-ssa-threadupdate.c (remove_jump_threads_starting_at): New.
(remove_ctrl_stmt_and_useless_edges): No longer static.
* tree-ssa-threadupdate.h (remove_jump_threads_starting_at): Prototype.
(remove_ctrl_stmt_and_useless_edges): Likewise.
* gcc.dg/tree-ssa/ssa-dom-branch-1.c: New test.
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 2c51e36..a8b7038 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1820,31 +1820,8 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator si,
if (is_gimple_assign (stmt))
record_equivalences_from_stmt (stmt, may_optimize_p, avail_exprs_stack);
- /* If STMT is a COND_EXPR and it was modified, then we may know
- where it goes. If that is the case, then mark the CFG as altered.
-
- This will cause us to later call remove_unreachable_blocks and
- cleanup_tree_cfg when it is safe to do so. It is not safe to
- clean things up here since removal of edges and such can trigger
- the removal of PHI nodes, which in turn can release SSA_NAMEs to
- the manager.
-
- That's all fine and good, except that once SSA_NAMEs are released
- to the manager, we must not call create_ssa_name until all references
- to released SSA_NAMEs have been eliminated.
-
- All references to the deleted SSA_NAMEs can not be eliminated until
- we remove unreachable blocks.
-
- We can not remove unreachable blocks until after we have completed
- any queued jump threading.
-
- We can not complete any queued jump threads until we have taken
- appropriate variables out of SSA form. Taking variables out of
- SSA form can call create_ssa_name and thus we lose.
-
- Ultimately I suspect we're going to need to change the interface
- into the SSA_NAME manager. */
+ /* If STMT is a COND_EXPR or SWITCH_EXPR and it was modified, then we may
+ know where it goes. */
if (gimple_modified_p (stmt) || modified_p)
{
tree val = NULL;
@@ -1858,8 +1835,27 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator si,
else if (gswitch *swtch_stmt = dyn_cast <gswitch *> (stmt))
val = gimple_switch_index (swtch_stmt);
- if (val && TREE_CODE (val) == INTEGER_CST && find_taken_edge (bb, val))
- cfg_altered = true;
+ if (val && TREE_CODE (val) == INTEGER_CST)
+ {
+ edge taken_edge = find_taken_edge (bb, val);
+ if (taken_edge)
+ {
+ /* Delete threads that start at BB. */
+ remove_jump_threads_starting_at (bb);
+
+ /* Now clean up the control statement at the end of
+ BB and remove unexecutable edges. */
+ remove_ctrl_stmt_and_useless_edges (bb, taken_edge->dest);
+
+ /* Fixup the flags on the single remaining edge. */
+ taken_edge->flags
+ &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE | EDGE_ABNORMAL);
+ taken_edge->flags |= EDGE_FALLTHRU;
+
+ /* Further simplifications may be possible. */
+ cfg_altered = true;
+ }
+ }
/* If we simplified a statement in such a way as to be shown that it
cannot trap, update the eh information and the cfg to match. */
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 6f21529..4a147bb 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -260,7 +260,7 @@ struct thread_stats_d thread_stats;
Also remove all outgoing edges except the edge which reaches DEST_BB.
If DEST_BB is NULL, then remove all outgoing edges. */
-static void
+void
remove_ctrl_stmt_and_useless_edges (basic_block bb, basic_block dest_bb)
{
gimple_stmt_iterator gsi;
@@ -2539,6 +2539,37 @@ valid_jump_thread_path (vec<jump_thread_edge *> *path)
return true;
}
+/* Remove any queued jump threads that start at BB. */
+
+void
+remove_jump_threads_starting_at (basic_block bb)
+{
+ if (!paths.exists ())
+ return;
+
+ for (unsigned i = 0; i < paths.length ();)
+ {
+ vec<jump_thread_edge *> *path = paths[i];
+
+ /* Sadly, FSM jump threads have a slightly different
+ representation than the rest of the jump threads. */
+ if ((*path)[0]->type == EDGE_FSM_THREAD
+ && (*path)[0]->e->src == bb)
+ {
+ delete_jump_thread_path (path);
+ paths.unordered_remove (i);
+ }
+ else if ((*path)[0]->type != EDGE_FSM_THREAD
+ && (*path)[0]->e->dest == bb)
+ {
+ delete_jump_thread_path (path);
+ paths.unordered_remove (i);
+ }
+ else
+ i++;
+ }
+}
+
/* Walk through all blocks and thread incoming edges to the appropriate
outgoing edge for each edge pair recorded in THREADED_EDGES.
diff --git a/gcc/tree-ssa-threadupdate.h b/gcc/tree-ssa-threadupdate.h
index 21a9ee3..30428e8 100644
--- a/gcc/tree-ssa-threadupdate.h
+++ b/gcc/tree-ssa-threadupdate.h
@@ -43,5 +43,7 @@ public:
};
extern void register_jump_thread (vec <class jump_thread_edge *> *);
+extern void remove_jump_threads_starting_at (basic_block);
extern void delete_jump_thread_path (vec <class jump_thread_edge *> *);
+extern void remove_ctrl_stmt_and_useless_edges (basic_block, basic_block);
#endif
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
new file mode 100644
index 0000000..4c7631c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -w -fdump-tree-dom1-details" } */
+
+typedef struct rtx_def *rtx;
+struct rtx_def
+{
+ int code;
+ rtx rt_rtx;
+};
+rtx
+try_combine (rtx i1, rtx newpat)
+{
+ rtx temp;
+ if (i1 && (temp = ((((((newpat->rt_rtx, ((((temp)->code) == 42)))))))))
+ && ((temp =
+ (((((((((((newpat)->rt_rtx),
+ ((((temp)->code) == 42) && arf ())))))))))))))
+ ;
+ else if (i1 && foo ());
+}
+
+/* There should be three tests against i1. Two from the hash table
+ dumps, one in the code itself. */
+/* { dg-final { scan-tree-dump-times "if .i1_" 3 "dom1"} } */
+
+/* There should be no actual jump threads realized by DOM. The
+ legitimize jump threads are handled in VRP and those discovered
+ by DOM are subsumed by collapsing a conditional. */
+/* { dg-final { scan-tree-dump-not "Threaded" "dom1"} } */
+
+