After shrink-wrapping has found the "tightest fit" for where to place
the prologue, it tries to move it earlier (so that frame saves are run
earlier) -- but without copying any more basic blocks.

Unfortunately, a candidate block we select can be inside a loop, and we
currently still allow it (because the loop always exits via our
previously chosen block).  Placing the prologue there works just fine
if we duplicate the block, but we do not want to do that here.

So we need to detect this situation.  We can place the prologue at an
earlier block PRE only if PRE dominates every block reachable from
it, because then we will never need to duplicate that block (it will
always be executed with the prologue).
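
To illustrate the condition outside of GCC, here is a minimal
standalone sketch.  It is not the code from the patch: the toy_cfg
type and the helpers add_edge, compute_dominators and
dominates_all_reachable are hypothetical names made up for this
example, and the dominator sets are computed with a simple iterative
dataflow pass rather than with GCC's dominance info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BLOCKS 32

/* Hypothetical toy CFG: succ[b] is a bitmask of the successors of
   block b.  Block 0 is the function entry.  */
struct toy_cfg
{
  int n;
  uint32_t succ[MAX_BLOCKS];
};

static void
add_edge (struct toy_cfg *g, int from, int to)
{
  assert (from >= 0 && from < g->n && to >= 0 && to < g->n);
  g->succ[from] |= UINT32_C (1) << to;
}

/* Set dom[b] to the bitmask of blocks that dominate b, using the
   classic iterative algorithm: dom(b) = {b} | (AND of dom(p) over all
   predecessors p of b).  */
static void
compute_dominators (const struct toy_cfg *g, uint32_t dom[])
{
  assert (g->n > 0 && g->n <= MAX_BLOCKS);
  uint32_t all = g->n == 32 ? UINT32_MAX : (UINT32_C (1) << g->n) - 1;

  dom[0] = 1;
  for (int b = 1; b < g->n; b++)
    dom[b] = all;

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int b = 1; b < g->n; b++)
        {
          uint32_t meet = all;
          for (int p = 0; p < g->n; p++)
            if (g->succ[p] & (UINT32_C (1) << b))
              meet &= dom[p];

          uint32_t next = meet | (UINT32_C (1) << b);
          if (next != dom[b])
            {
              dom[b] = next;
              changed = true;
            }
        }
    }
}

/* Return true if PRE dominates every block reachable from PRE.  */
static bool
dominates_all_reachable (const struct toy_cfg *g, int pre)
{
  uint32_t dom[MAX_BLOCKS];
  compute_dominators (g, dom);

  /* Worklist walk over everything reachable from PRE.  Each block is
     pushed at most once, so the stack cannot overflow.  */
  int stack[MAX_BLOCKS];
  int top = 0;
  uint32_t seen = UINT32_C (1) << pre;
  stack[top++] = pre;

  while (top > 0)
    {
      int bb = stack[--top];
      if (!(dom[bb] & (UINT32_C (1) << pre)))
        return false;

      for (int s = 0; s < g->n; s++)
        if ((g->succ[bb] & (UINT32_C (1) << s))
            && !(seen & (UINT32_C (1) << s)))
          {
            seen |= UINT32_C (1) << s;
            stack[top++] = s;
          }
    }

  return true;
}
```

For a CFG with edges 0->1, 1->2, 2->3, 3->1 (the loop back edge) and
3->4, the check fails for block 2, which sits inside the loop and does
not dominate the loop header 1, but succeeds for blocks 1 and 0.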

Tested on the two testcases from the PRs.  Also regression checked
on powerpc64-linux (actually, that is still running).

Is this okay for trunk?


Segher


2015-12-07  Segher Boessenkool  <seg...@kernel.crashing.org>

        PR rtl-optimization/67778
        PR rtl-optimization/68634
        * shrink-wrap.c (try_shrink_wrapping): Add a comment about why we want
        to put the prologue earlier.  When determining if an earlier block is
        suitable, make sure it dominates every block reachable from it.

---
 gcc/shrink-wrap.c | 42 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index 3a1df84..48f8211 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -744,36 +744,68 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
          vec.quick_push (e->dest);
     }
 
-  vec.release ();
-
   if (dump_file)
     fprintf (dump_file, "Avoiding non-duplicatable blocks, PRO is now %d\n",
             pro->index);
 
   /* If we can move PRO back without having to duplicate more blocks, do so.
+     We do this because putting the prologue earlier is better for scheduling.
      We can move back to a block PRE if every path from PRE will eventually
-     need a prologue, that is, PRO is a post-dominator of PRE.  */
+     need a prologue, that is, PRO is a post-dominator of PRE.  PRE needs
+     to dominate every block reachable from itself.  */
 
   if (pro != entry)
     {
       calculate_dominance_info (CDI_POST_DOMINATORS);
 
+      bitmap bb_tmp = BITMAP_ALLOC (NULL);
+      bitmap_copy (bb_tmp, bb_with);
       basic_block last_ok = pro;
+      vec.truncate (0);
+
       while (pro != entry)
        {
          basic_block pre = get_immediate_dominator (CDI_DOMINATORS, pro);
          if (!dominated_by_p (CDI_POST_DOMINATORS, pre, pro))
            break;
 
+         if (bitmap_set_bit (bb_tmp, pre->index))
+           vec.quick_push (pre);
+
+         bool ok = true;
+         while (!vec.is_empty ())
+           {
+             basic_block bb = vec.pop ();
+             bitmap_set_bit (bb_tmp, pre->index);
+
+             if (!dominated_by_p (CDI_DOMINATORS, bb, pre))
+               {
+                 ok = false;
+                 break;
+               }
+
+             FOR_EACH_EDGE (e, ei, bb->succs)
+               if (!bitmap_bit_p (bb_with, e->dest->index)
+                   && bitmap_set_bit (bb_tmp, e->dest->index))
+                 vec.quick_push (e->dest);
+           }
+
+         if (ok)
+           last_ok = pre;
+         else
+           break;
+
          pro = pre;
-         if (can_get_prologue (pro, prologue_clobbered))
-           last_ok = pro;
        }
+
       pro = last_ok;
 
+      BITMAP_FREE (bb_tmp);
       free_dominance_info (CDI_POST_DOMINATORS);
     }
 
+  vec.release ();
+
   if (dump_file)
     fprintf (dump_file, "Bumping back to anticipatable blocks, PRO is now %d\n",
             pro->index);
-- 
1.9.3
