Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
On Wed, Jan 06, 2016 at 06:36:19AM -0600, Segher Boessenkool wrote: > This fixes this problem. Tested on the 68909 testcase, and bootstrapped > and regression checked on powerpc64-linux. Is this okay for trunk? Also tested on x86_64-linux now. Segher > 2016-01-06 Segher Boessenkool > > PR rtl-optimization/67778 > PR rtl-optimization/68634 > PR rtl-optimization/68909 > * shrink-wrap.c (try_shrink_wrapping): Add comment. Don't pop > block from the stack until done with it. Remove a superfluous > bitmap set. Remove a superfluous bitmap test.
Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
On Mon, Jan 04, 2016 at 03:24:10PM +0100, Bernd Schmidt wrote: > >>This code is getting really quite confusing, > >>and at the least I think we > >>need more documentation of what exactly vec is supposed to contain at > >>the entry to the inner while loop here. > > > >Same as in the other loop: vec is a stack of blocks that still need to > >be looked at. I can duplicate the comment if you want? > > No, I think more is needed. Thanks for the review. New patch attached. > The inner loop looks like it should be > emptying the vec, but this is not true if we break out of it, and your > patch now even adds an explicit push. I added a big comment explaining the algorithm, and changed the push back to do a peek instead. > It also looks like it wants to use > the bb_tmp bitmap to cache results for future iterations of the outer > loop, but I'm not convinced this is actually correct. I can't follow > this behaviour anymore without clear a description of intent. See the new comment. > Also, it might be clearer to not modify "pro" in this loop - use a > "cand" variable, and modify "pro" instead of last_ok, getting rid of the > latter. I tried that, and it doesn't make things clearer in my opinion. Also tried assigning "pro = pre" earlier, to make it more similar to the previous loop; also not an improvement. The patch also removes a superfluous (and confusing) bitmap set as well as a bitmap test. How's this? Segher Subject: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909 If a candidate PRE cannot get the prologue because a block BB is reachable from it, but PRE does not dominate BB, we try again with the dominators of PRE. That "try again" needs to again consider BB though, we aren't done with it. This fixes this problem. Tested on the 68909 testcase, and bootstrapped and regression checked on powerpc64-linux. Is this okay for trunk? 2016-01-06 Segher Boessenkool PR rtl-optimization/67778 PR rtl-optimization/68634 PR rtl-optimization/68909 * shrink-wrap.c (try_shrink_wrapping): Add comment. Don't pop block from the stack until done with it. Remove a superfluous bitmap set. Remove a superfluous bitmap test. --- gcc/shrink-wrap.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index e51bd36..84abd6b 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -750,9 +750,21 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with, /* If we can move PRO back without having to duplicate more blocks, do so. We do this because putting the prologue earlier is better for scheduling. + We can move back to a block PRE if every path from PRE will eventually need a prologue, that is, PRO is a post-dominator of PRE. PRE needs - to dominate every block reachable from itself. */ + to dominate every block reachable from itself. We keep in BB_TMP a + bitmap of the blocks reachable from PRE that we already found, and in + VEC a stack of those we still need to consider. + + Any block reachable from PRE is also reachable from all predecessors + of PRE, so if we find we need to move PRE back further we can leave + everything not considered so far on the stack. Any block dominated + by PRE is also dominated by all other dominators of PRE, so anything + found good for some PRE does not need to be reconsidered later. + + We don't need to update BB_WITH because none of the new blocks found + can jump to a block that does not need the prologue. */ if (pro != entry) { @@ -775,18 +787,15 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with, bool ok = true; while (!vec.is_empty ()) { - basic_block bb = vec.pop (); - bitmap_set_bit (bb_tmp, pre->index); - - if (!dominated_by_p (CDI_DOMINATORS, bb, pre)) + if (!dominated_by_p (CDI_DOMINATORS, vec.last (), pre)) { ok = false; break; } + basic_block bb = vec.pop (); FOR_EACH_EDGE (e, ei, bb->succs) - if (!bitmap_bit_p (bb_with, e->dest->index) - && bitmap_set_bit (bb_tmp, e->dest->index)) + if (bitmap_set_bit (bb_tmp, e->dest->index)) vec.quick_push (e->dest); } -- 1.9.3
Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
On 12/20/2015 05:27 PM, Segher Boessenkool wrote: On Fri, Dec 18, 2015 at 02:19:37AM +0100, Bernd Schmidt wrote: On 12/17/2015 10:07 PM, Segher Boessenkool wrote: It turns out v4 wasn't quite complete anyway; so here "v5". If a candidate PRE cannot get the prologue because a block BB is reachable from it, but PRE does not dominate BB, we try again with the dominators of PRE. That "try again" needs to again consider BB though, we aren't done with it. This fixes this problem. Tested on the 68909 testcase, and bootstrapped and regression checked on powerpc64-linux. Is this okay for trunk? This code is getting really quite confusing, and at the least I think we need more documentation of what exactly vec is supposed to contain at the entry to the inner while loop here. Same as in the other loop: vec is a stack of blocks that still need to be looked at. I can duplicate the comment if you want? No, I think more is needed. The inner loop looks like it should be emptying the vec, but this is not true if we break out of it, and your patch now even adds an explicit push. It also looks like it wants to use the bb_tmp bitmap to cache results for future iterations of the outer loop, but I'm not convinced this is actually correct. I can't follow this behaviour anymore without clear a description of intent. Also, it might be clearer to not modify "pro" in this loop - use a "cand" variable, and modify "pro" instead of last_ok, getting rid of the latter. That would be a regression (from GCC 5); but I understand your worry. How about we disable it if any further problems show up? Let's see whether we can make sense of this code and decide then. bernd
Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
On Fri, Dec 18, 2015 at 02:19:37AM +0100, Bernd Schmidt wrote: > On 12/17/2015 10:07 PM, Segher Boessenkool wrote: > >It turns out v4 wasn't quite complete anyway; so here "v5". > > > >If a candidate PRE cannot get the prologue because a block BB is > >reachable from it, but PRE does not dominate BB, we try again with the > >dominators of PRE. That "try again" needs to again consider BB though, > >we aren't done with it. > > > >This fixes this problem. Tested on the 68909 testcase, and bootstrapped > >and regression checked on powerpc64-linux. Is this okay for trunk? > > This code is getting really quite confusing, Yes :-( I don't think stage 3 is the time to completely rewrite it though. > and at the least I think we > need more documentation of what exactly vec is supposed to contain at > the entry to the inner while loop here. Same as in the other loop: vec is a stack of blocks that still need to be looked at. I can duplicate the comment if you want? > I'm also beginning to think we should disable this part of the code for > gcc-6. That would be a regression (from GCC 5); but I understand your worry. How about we disable it if any further problems show up? Segher
Re: [PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
On 12/17/2015 10:07 PM, Segher Boessenkool wrote: It turns out v4 wasn't quite complete anyway; so here "v5". If a candidate PRE cannot get the prologue because a block BB is reachable from it, but PRE does not dominate BB, we try again with the dominators of PRE. That "try again" needs to again consider BB though, we aren't done with it. This fixes this problem. Tested on the 68909 testcase, and bootstrapped and regression checked on powerpc64-linux. Is this okay for trunk? This code is getting really quite confusing, and at the least I think we need more documentation of what exactly vec is supposed to contain at the entry to the inner while loop here. I'm also beginning to think we should disable this part of the code for gcc-6. Bernd
[PATCH] shrink-wrap: Once more PRs 67778, 68634, and now 68909
It turns out v4 wasn't quite complete anyway; so here "v5". If a candidate PRE cannot get the prologue because a block BB is reachable from it, but PRE does not dominate BB, we try again with the dominators of PRE. That "try again" needs to again consider BB though, we aren't done with it. This fixes this problem. Tested on the 68909 testcase, and bootstrapped and regression checked on powerpc64-linux. Is this okay for trunk? Segher 2015-12-17 Segher Boessenkool PR rtl-optimization/67778 PR rtl-optimization/68634 PR rtl-optimization/68909 * shrink-wrap.c (try_shrink_wrapping): If BB isn't dominated by PRE, push it back on VEC. --- gcc/shrink-wrap.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index f65b0c3..85e5a8b 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -781,6 +781,7 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with, if (!dominated_by_p (CDI_DOMINATORS, bb, pre)) { ok = false; + vec.quick_push (bb); break; } -- 1.9.3