Re: Control dependence vs. builtin_unreachable
On 01/07/2013 01:42 PM, Steven Bosscher wrote: For optimizations it's a bit more difficult. Somewhat. For what I'm suggesting, the way we'd miss an optimization would be if by removing the unreachable block and simplifying the CFG we ultimately remove a conditional which produced a result which we could use later. Something like this (tweaking your 2nd test): int foo (int b, int c, int d) { int res, r = 0; res = 5 * d + b; if (d) __builtin_unreachable (); if (c) r = res; return r + d; } What I'm suggesting would effectively remove the conditional and __builtin_unreacahble, so at the source level it'd look like: int foo (int b, int c, int d) { int res, r = 0; res = 5 * d + b; if (c) r = res; return r + d; } By removing the conditional, we've lost the fact that to reach the return statement, "d" must be zero, which is useful in optimizing the return statement. The situations I'm looking at, are all sinking related, and I suspect that most missed optimizations will be of this form: int foo (int b, int c, int d) { int res, r = 0; res = 5 * d + b; if (d) __builtin_unreachable (); if (c) r = res; return r; } I don't see how my proposal would inhibit sinking in this kind of code. In fact, it'd made it easier. I already showed how the control dependence graph looks Wrong with __builtin_unreachable calls in place. Any transformation using the CDG will also be less effective because of this. And what I'm suggesting would actually fix the post dominator tree, among other things. Where we potentially lose is the tweaked case above. However, if we do the optimization at the right time, we can get the benefits we're both looking for, I believe. jeff
Re: Control dependence vs. builtin_unreachable
On 01/07/2013 01:42 PM, Steven Bosscher wrote: typedef enum fruits { banana = 0, apple = 1, pear = 2, orange = 3 } fruit; extern void price_fruit_of_the_day (int); void discount_of_the_day (fruit f) { int p, c = (int) f; switch (f) { case banana: UNREACHABLE (); case apple: p = 3 * c + 4; break; case pear: case orange: p = 7; break; } price_fruit_of_the_day (p); } So if we look at this (I'll analyze the other cases separately). In reassoc2 we have: # BLOCK 2 freq:1 # PRED: ENTRY [100.0%] (fallthru,exec) c_3 = (int) f_2(D); switch (f_2(D)) , case 0: , case 1: , case 2 ... 3: > # SUCC: 6 [25.0%] (exec) 3 [25.0%] (exec) 4 [25.0%] (exec) 5 [25.0%] (exec) # BLOCK 3 freq:2500 # PRED: 2 [25.0%] (exec) : # VUSE <.MEM_8(D)> __builtin_unreachable (); # SUCC: # BLOCK 4 freq:2500 # PRED: 2 [25.0%] (exec) : D.1724_5 = c_3 * 3; p_6 = D.1724_5 + 4; goto (); # SUCC: 6 [100.0%] (fallthru,exec) # BLOCK 5 freq:2500 # PRED: 2 [25.0%] (exec) : # SUCC: 6 [100.0%] (fallthru,exec) # BLOCK 6 freq:7500 # PRED: 2 [25.0%] (exec) 4 [100.0%] (fallthru,exec) 5 [100.0%] (fallthru,exec) # p_1 = PHI : # .MEM_9 = VDEF <.MEM_8(D)> price_fruit_of_the_day (p_1); # VUSE <.MEM_9> return; # SUCC: EXIT [100.0%] } What I'm suggesting is not simply removing the directive at the source level, but at the appropriate time using the knowledge that the block is unreachable to simplify the CFG. In this specific case we would eliminate block #3 and the edge from 2->3. If we do that VRP will come along and do its thing. With the block/edge removals, it will discover that p_6 has the value 7, p_4 is undefined and ultimately it'll collapse the PHI into p_1 = 7. The net result (in this case) will be the same as using builtin_unreachable. Contrast that to simply defining away the directive per your test. In that case we'd still have an edge from 2->3. And when that's the case, VRP won't find a constant for p_6, which ultimately makes p_1 varying and triggers the warning because of the use of p_4 to compute p_1. Jeff
Re: Control dependence vs. builtin_unreachable
On Tue, Jan 8, 2013 at 5:16 PM, Jeff Law wrote: > On 01/08/2013 04:26 AM, Richard Biener wrote: > >> >> The issue is VRP - when you remove unreachable blocks you lose the >> conditional statement as it is no longer necessary and thus the predicate >> you can derive value-ranges from. > > Understood. Perhaps we could eliminate them after the first VRP pass, but > before the second. That ought to give us the majority of the benefit of > seeing the conditional and propagating based on the conditional, but also > give us the benefit of eliminating the branch generating straight-line code. > > Clearly it needs more investigation, but I think it's worth exploring. Sure - especially if we eventually move to preserve value-range information across passes. __builtin_constant_p is a similar thing - it guards one reachable and one unreachable path but we forcefully remove it only during the late fold-all-builtins pass. Richard. > Jeff
Re: Control dependence vs. builtin_unreachable
On Tue, Jan 8, 2013 at 12:32 PM, Richard Biener wrote: > Does it handle side-effects on the builtin-unreachable path correctly? > > int b; > int a; > extern void foo (); > int main() > { > if (!a) >{ > if (!b) >foo (); > __builtin_unreachable (); >} > } > > --- > > void foo () { puts("Hello"); exit(0); } > > ? Users are not _required_ to annotate noreturn functions. The BB with > __builtin_unreachable () is still empty otherwise. Not sure what you're expecting in this case, so guessing... There is a warning for "reaching end of non-void" blah, but that's correct on the a!=0 path. With that if(!a) commented out, I don't have that warning of course. The code is identical with and without patch. It's another example of how __builtin_unreachable() helps hide unnecessary warnings. Ciao! Steven
Re: Control dependence vs. builtin_unreachable
On 01/08/2013 04:26 AM, Richard Biener wrote: The issue is VRP - when you remove unreachable blocks you lose the conditional statement as it is no longer necessary and thus the predicate you can derive value-ranges from. Understood. Perhaps we could eliminate them after the first VRP pass, but before the second. That ought to give us the majority of the benefit of seeing the conditional and propagating based on the conditional, but also give us the benefit of eliminating the branch generating straight-line code. Clearly it needs more investigation, but I think it's worth exploring. Jeff
Re: Control dependence vs. builtin_unreachable
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher wrote: > On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote: >> On 01/03/2013 12:01 PM, Steven Bosscher wrote: >>> >>> Hello, >>> >>> Consider the following test case: >>> >>> void bar (void); >>> int foo (int b, int c, int d) >>> { >>>int r = 0; >>>if (b) >>> res = b * 2 + 4; >>>if (c) >>> { >>>if (d) >>> r = res; >>>else >>> __builtin_unreachable (); >>> } >>>return r; >>> } >>> >>> This is typical for code in GCC itself in places where >>> gcc_unreachable() is used. >> >> >>> >>> The corresponding CFG looks like this: >>> >>> +-+ >>> | bb0 | >>> +-+ >>>| >>>| >>>v >>> +-+ >>> | bb2 | -+ >>> +-+ | >>>| | >>>| | >>>v | >>> +-+ | >>> | bb3 | | >>> +-+ | >>>| | >>>| | >>>v | >>> +-+ +-+ | >>> | bb8 | <-- | bb4 | <+ >>> +-+ +-+ >>>| | >>>| | >>>| v >>>| +-+ +-+ >>>| | bb5 | --> | bb7 | >>>| +-+ +-+ >>>| | >>>| | >>>| v >>>| +-+ >>>| | bb6 | >>>| +-+ >>>| | >>>| | >>>| v >>>| +-+ >>>+---> | bb9 | >>> +-+ >>>| >>>| >>>v >>> +-+ >>> | bb1 | >>> +-+ >> >> Presumably BB7 was created in response to the builtin_unreachable? > > Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up > at the end of the GIMPLE passes pipeline, in the fold-all-builtins > pass (most __builtin_unreachable calls are, but not all). > > >> One >> could argue that an empty dead-end basic block should just be removed and >> the CFG appropriately simplified. > > The problem with this, is that __builtin_unreachable actually exposes > optimization opportunities: more const/copy props of "implicit sets" > in the predicate guarding the __builtin_unreachable call, more > optimistic value numbering, etc. It also helps improve maybe-unused > warnings accuracy. So simply removing these "dead ends" in the CFG is > probably not a good idea. > > >> You might want to look at a discussion from Oct/Nov 2011 "New pass to >> delete unexecutable paths in the CFG" which touches on some of this stuff. > > That's a really interesting discussion! I must have missed it at the time :-) > > >> It's not 100% the same, but the concept of eliminating edges from the CFG >> which we can never traverse in a conforming program applies to both your >> example and the stuff I was playing with. > > I think there is one important difference: In the thread you referred > to, you're removing paths in the CFG that are implicitly not > executable (for some languages anyway), whereas a > __builtin_unreachable call is an explicit marker for "this can never > happen". I think this difference is important: > > * The explicit marker may have been put there on purpose (e.g. to get > rid of a false-positive warning). The compiler should respect that. An > implicit unreachable path can be optimized away without regard for the > user's intent. > > * The explicit marker should not inhibit optimizations. For an > implicit unreachable path the compiler should be conservative. But for > a __builtin_unreachable call that is the only statement in a basic > block, the compiler should be allowed to work as if the block really > is never reached. > > The attached patch implements these ideas. During a tree-CFG cleanup, > basic blocks containing only a __builtin_unreachable call are marked > with a new flag BB_NEVER_REACHED. The flag is used in post-dominance: > A marked block is not considered in the computations of the immediate > post-dominator of the predecessor blocks. The result is a more > optimistic post-dominance tree: Without the patch all predecessors of > these BB_NEVER_REACHED blocks have the EXIT block as their > post-dominator, but with the patch this non-executable path in the CFG > is ignored and the post-dominators are those that would result if the > BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED > blocks themselves are only post-dominated by the EXIT block). > > I've also added a control dependence calculation function. It's not > currently used, but it shows how the BB_NEVER_REACHED flag is used in > this case to avoid the "false" control dependences that I showed in > the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html. > > Bootstrapped&tested on powerpc64-unknown-linux-gnu. > What do you think of this approach? Does it handle
Re: Control dependence vs. builtin_unreachable
On Mon, Jan 7, 2013 at 8:45 PM, Jeff Law wrote: > On 01/05/2013 01:10 PM, Steven Bosscher wrote: >>> >>> >>> Presumably BB7 was created in response to the builtin_unreachable? >> >> >> Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up >> at the end of the GIMPLE passes pipeline, in the fold-all-builtins >> pass (most __builtin_unreachable calls are, but not all). > > I think if you eliminate the block and the cleanup the CFG appropriately, > the right thing will just happen. > > >> The problem with this, is that __builtin_unreachable actually exposes >> optimization opportunities: more const/copy props of "implicit sets" >> in the predicate guarding the __builtin_unreachable call, more >> optimistic value numbering, etc. It also helps improve maybe-unused >> warnings accuracy. So simply removing these "dead ends" in the CFG is >> probably not a good idea. > > ?!? By removing the empty unreachable block that's precisely what you > enable. The block itself goes away and the branch leading to the block is > simplified appropriately. That in turn will create larger basic blocks, > enabling the const/copy propagations and similar optimizations. > > Finally removing unreachable paths was insired by the desire to eliminate > false positives from maybe-{unused,uninitialized} and similar warnings. > > I'd be very curious to see the conditions under which removing the empty > unreachable and appropriate cleanup of the CFG (including the underlying > control statement) results in less optimizations and less precision in our > may-warnings. The issue is VRP - when you remove unreachable blocks you lose the conditional statement as it is no longer necessary and thus the predicate you can derive value-ranges from. Maybe there are more examples. I can imagine false positive may-be warnings then become must-be false positives. There are clearly (as seen here) missed optimizations because we do keep the unreachable blocks around. Note that the whole point of the unreachable () excercise was to be able to implement an assert () that even with -DNDEBUG leaves the compiler with the same optimization opportunities (as of value-range analysis) as with -UNDEBUG. Richard.
Re: Control dependence vs. builtin_unreachable
On Mon, Jan 7, 2013 at 8:45 PM, Jeff Law wrote: > Before diving into the patch I think we should figure out why we see such > different effects of eliminating these paths from the CFG. Your assertion > is eliminating them will result in more false positives and less > optimization while my assertion is the opposite. Warnings: $ cat q.c #if 0 #define UNREACHABLE() __builtin_unreachable() #else #define UNREACHABLE() break #endif typedef enum fruits { banana = 0, apple = 1, pear = 2, orange = 3 } fruit; extern void price_fruit_of_the_day (int); void discount_of_the_day (fruit f) { int p, c = (int) f; switch (f) { case banana: UNREACHABLE (); case apple: p = 3 * c + 4; break; case pear: case orange: p = 7; break; } price_fruit_of_the_day (p); } $ # Compiled with UNREACHABLE defined as break: $ ./cc1 -quiet -W -Wall -Wextra q.c -fdump-tree-all -O1 q.c: In function 'discount_of_the_day': q.c:28:26: warning: 'p' may be used uninitialized in this function [-Wmaybe-uninitialized] price_fruit_of_the_day (p); ^ $ # Compiled with UNREACHABLE defined as builtin_unreachable: $ ./cc1 -quiet -W -Wall -Wextra q.c -fdump-tree-all -O1 $ For optimizations it's a bit more difficult. The situations I'm looking at, are all sinking related, and I suspect that most missed optimizations will be of this form: int foo (int b, int c, int d) { int res, r = 0; res = 5 * d + b; if (d) __builtin_unreachable (); if (c) r = res; return r; } The "res =" statement can be sunk into the conditional arm of "if(c)" and tree-ssa-sink.c is able to do so. But secondary effects are missed (and test cases for this are hard to reduce to mailing list acceptable size ;-) because tree-ssa-sink walks the post-dominator tree and, with the right amount of bad luck, it visits the "res=" statement _before_ the part of the (forward-)dominator tree below "if(d)" because the basic block containing "res=" has EXIT as its immediate post-dominator, because that's where __builtin_unreachable() leads to. The "optimistic" post-dominator of "res=" is the "if(c)" block, of course, because the builtin_unreachable is a kind-of assert, not something that can really ever happen. I already showed how the control dependence graph looks Wrong with __builtin_unreachable calls in place. Any transformation using the CDG will also be less effective because of this. Ciao! Steven
Re: Control dependence vs. builtin_unreachable
On 01/05/2013 01:10 PM, Steven Bosscher wrote: Presumably BB7 was created in response to the builtin_unreachable? Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up at the end of the GIMPLE passes pipeline, in the fold-all-builtins pass (most __builtin_unreachable calls are, but not all). I think if you eliminate the block and the cleanup the CFG appropriately, the right thing will just happen. The problem with this, is that __builtin_unreachable actually exposes optimization opportunities: more const/copy props of "implicit sets" in the predicate guarding the __builtin_unreachable call, more optimistic value numbering, etc. It also helps improve maybe-unused warnings accuracy. So simply removing these "dead ends" in the CFG is probably not a good idea. ?!? By removing the empty unreachable block that's precisely what you enable. The block itself goes away and the branch leading to the block is simplified appropriately. That in turn will create larger basic blocks, enabling the const/copy propagations and similar optimizations. Finally removing unreachable paths was insired by the desire to eliminate false positives from maybe-{unused,uninitialized} and similar warnings. I'd be very curious to see the conditions under which removing the empty unreachable and appropriate cleanup of the CFG (including the underlying control statement) results in less optimizations and less precision in our may-warnings. I think there is one important difference: In the thread you referred to, you're removing paths in the CFG that are implicitly not executable (for some languages anyway), whereas a __builtin_unreachable call is an explicit marker for "this can never happen". I think this difference is important: I don't see this as an important distinction. The difference is in the implicit "can't happen" case, eliminating the "can't happen" path may eliminate a user visible side effect (for example a segfault). * The explicit marker may have been put there on purpose (e.g. to get rid of a false-positive warning). The compiler should respect that. An implicit unreachable path can be optimized away without regard for the user's intent. Right, but if done right, eliminating the unreachable path correctly should eliminate false positive warnings. * The explicit marker should not inhibit optimizations. For an implicit unreachable path the compiler should be conservative. But for a __builtin_unreachable call that is the only statement in a basic block, the compiler should be allowed to work as if the block really is never reached. Right. The attached patch implements these ideas. During a tree-CFG cleanup, basic blocks containing only a __builtin_unreachable call are marked with a new flag BB_NEVER_REACHED. The flag is used in post-dominance: A marked block is not considered in the computations of the immediate post-dominator of the predecessor blocks. The result is a more optimistic post-dominance tree: Without the patch all predecessors of these BB_NEVER_REACHED blocks have the EXIT block as their post-dominator, but with the patch this non-executable path in the CFG is ignored and the post-dominators are those that would result if the BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED blocks themselves are only post-dominated by the EXIT block). I've also added a control dependence calculation function. It's not currently used, but it shows how the BB_NEVER_REACHED flag is used in this case to avoid the "false" control dependences that I showed in the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html. Bootstrapped&tested on powerpc64-unknown-linux-gnu. What do you think of this approach? Before diving into the patch I think we should figure out why we see such different effects of eliminating these paths from the CFG. Your assertion is eliminating them will result in more false positives and less optimization while my assertion is the opposite. Jeff
Re: Control dependence vs. builtin_unreachable
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher wrote: > Bootstrapped&tested on powerpc64-unknown-linux-gnu. And to be clear, bootstrapped with this patch on top: Index: system.h === --- system.h(revision 194924) +++ system.h(working copy) @@ -698,7 +698,7 @@ /* Use gcc_unreachable() to mark unreachable locations (like an unreachable default case of a switch. Do not use gcc_assert(0). */ -#if (GCC_VERSION >= 4005) && !ENABLE_ASSERT_CHECKING +#if (GCC_VERSION >= 4005) //&& !ENABLE_ASSERT_CHECKING #define gcc_unreachable() __builtin_unreachable() #else #define gcc_unreachable() (fancy_abort (__FILE__, __LINE__, __FUNCTION__)) Otherwise, __builtin_unreachable would be almost unused and bootstrap+test wouldn't prove much :-) Ciao! Steven
Re: Control dependence vs. builtin_unreachable
On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote: > On 01/03/2013 12:01 PM, Steven Bosscher wrote: >> >> Hello, >> >> Consider the following test case: >> >> void bar (void); >> int foo (int b, int c, int d) >> { >>int r = 0; >>if (b) >> res = b * 2 + 4; >>if (c) >> { >>if (d) >> r = res; >>else >> __builtin_unreachable (); >> } >>return r; >> } >> >> This is typical for code in GCC itself in places where >> gcc_unreachable() is used. > > >> >> The corresponding CFG looks like this: >> >> +-+ >> | bb0 | >> +-+ >>| >>| >>v >> +-+ >> | bb2 | -+ >> +-+ | >>| | >>| | >>v | >> +-+ | >> | bb3 | | >> +-+ | >>| | >>| | >>v | >> +-+ +-+ | >> | bb8 | <-- | bb4 | <+ >> +-+ +-+ >>| | >>| | >>| v >>| +-+ +-+ >>| | bb5 | --> | bb7 | >>| +-+ +-+ >>| | >>| | >>| v >>| +-+ >>| | bb6 | >>| +-+ >>| | >>| | >>| v >>| +-+ >>+---> | bb9 | >> +-+ >>| >>| >>v >> +-+ >> | bb1 | >> +-+ > > Presumably BB7 was created in response to the builtin_unreachable? Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up at the end of the GIMPLE passes pipeline, in the fold-all-builtins pass (most __builtin_unreachable calls are, but not all). > One > could argue that an empty dead-end basic block should just be removed and > the CFG appropriately simplified. The problem with this, is that __builtin_unreachable actually exposes optimization opportunities: more const/copy props of "implicit sets" in the predicate guarding the __builtin_unreachable call, more optimistic value numbering, etc. It also helps improve maybe-unused warnings accuracy. So simply removing these "dead ends" in the CFG is probably not a good idea. > You might want to look at a discussion from Oct/Nov 2011 "New pass to > delete unexecutable paths in the CFG" which touches on some of this stuff. That's a really interesting discussion! I must have missed it at the time :-) > It's not 100% the same, but the concept of eliminating edges from the CFG > which we can never traverse in a conforming program applies to both your > example and the stuff I was playing with. I think there is one important difference: In the thread you referred to, you're removing paths in the CFG that are implicitly not executable (for some languages anyway), whereas a __builtin_unreachable call is an explicit marker for "this can never happen". I think this difference is important: * The explicit marker may have been put there on purpose (e.g. to get rid of a false-positive warning). The compiler should respect that. An implicit unreachable path can be optimized away without regard for the user's intent. * The explicit marker should not inhibit optimizations. For an implicit unreachable path the compiler should be conservative. But for a __builtin_unreachable call that is the only statement in a basic block, the compiler should be allowed to work as if the block really is never reached. The attached patch implements these ideas. During a tree-CFG cleanup, basic blocks containing only a __builtin_unreachable call are marked with a new flag BB_NEVER_REACHED. The flag is used in post-dominance: A marked block is not considered in the computations of the immediate post-dominator of the predecessor blocks. The result is a more optimistic post-dominance tree: Without the patch all predecessors of these BB_NEVER_REACHED blocks have the EXIT block as their post-dominator, but with the patch this non-executable path in the CFG is ignored and the post-dominators are those that would result if the BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED blocks themselves are only post-dominated by the EXIT block). I've also added a control dependence calculation function. It's not currently used, but it shows how the BB_NEVER_REACHED flag is used in this case to avoid the "false" control dependences that I showed in the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html. Bootstrapped&tested on powerpc64-unknown-linux-gnu. What do you think of this approach? Ciao! Steven * cfg-flags.def (BB_NEVER_REACHED): New flag on basic_block. (BB_SUPERBLOCK): Add FIXME, document that this flag should be removed. * tree-cfgcleanup.c (is_only_builtin_unreachable_block): New function. (mark