Re: Control dependence vs. builtin_unreachable

2013-01-09 Thread Richard Biener
On Tue, Jan 8, 2013 at 5:16 PM, Jeff Law l...@redhat.com wrote:
 On 01/08/2013 04:26 AM, Richard Biener wrote:


 The issue is VRP - when you remove unreachable blocks you lose the
 conditional statement as it is no longer necessary and thus the predicate
 you can derive value-ranges from.

 Understood.  Perhaps we could eliminate them after the first VRP pass, but
 before the second.  That ought to give us the majority of the benefit of
 seeing the conditional and propagating based on the conditional, but also
 give us the benefit of eliminating the branch generating straight-line code.

 Clearly it needs more investigation, but I think it's worth exploring.

Sure - especially if we eventually move to preserve value-range information
across passes.  __builtin_constant_p is a similar thing - it guards one
reachable and one unreachable path but we forcefully remove it only
during the late fold-all-builtins pass.

Richard.

 Jeff


Re: Control dependence vs. builtin_unreachable

2013-01-09 Thread Jeff Law

On 01/07/2013 01:42 PM, Steven Bosscher wrote:

typedef enum fruits { banana = 0, apple = 1, pear = 2, orange = 3 } fruit;

extern void price_fruit_of_the_day (int);

void
discount_of_the_day (fruit f)
{
   int p, c = (int) f;
   switch (f)
   {
 case banana:
   UNREACHABLE ();
 case apple:
   p = 3 * c + 4;
   break;
 case pear:
 case orange:
   p = 7;
   break;
   }
   price_fruit_of_the_day (p);
}

So if we look at this (I'll analyze the other cases separately).

In reassoc2 we have:


  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  c_3 = (int) f_2(D);
  switch (f_2(D)) default: L4, case 0: L0, case 1: L1, case 2 
... 3: L2
  # SUCC: 6 [25.0%]  (exec) 3 [25.0%]  (exec) 4 [25.0%]  (exec) 5 
[25.0%]  (exec)


  # BLOCK 3 freq:2500
  # PRED: 2 [25.0%]  (exec)
L0:
  # VUSE .MEM_8(D)
  __builtin_unreachable ();
  # SUCC:

  # BLOCK 4 freq:2500
  # PRED: 2 [25.0%]  (exec)
L1:
  D.1724_5 = c_3 * 3;
  p_6 = D.1724_5 + 4;
  goto bb 6 (L4);
  # SUCC: 6 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:2500
  # PRED: 2 [25.0%]  (exec)
L2:
  # SUCC: 6 [100.0%]  (fallthru,exec)

  # BLOCK 6 freq:7500
  # PRED: 2 [25.0%]  (exec) 4 [100.0%]  (fallthru,exec) 5 [100.0%] 
(fallthru,exec)

  # p_1 = PHI p_4(D)(2), p_6(4), 7(5)
L4:
  # .MEM_9 = VDEF .MEM_8(D)
  price_fruit_of_the_day (p_1);
  # VUSE .MEM_9
  return;
  # SUCC: EXIT [100.0%]

}




What I'm suggesting is not simply removing the directive at the source 
level, but at the appropriate time using the knowledge that the block is 
unreachable to simplify the CFG.  In this specific case we would 
eliminate block #3 and the edge from 2-3.


If we do that VRP will come along and do its thing.  With the block/edge 
removals, it will discover that p_6 has the value 7, p_4 is undefined 
and ultimately it'll collapse the PHI into p_1 = 7.  The net result (in 
this case) will be the same as using builtin_unreachable.


Contrast that to simply defining away the directive per your test.  In 
that case we'd still have an edge from 2-3.  And when that's the case, 
VRP won't find a constant for p_6, which ultimately makes p_1 varying 
and triggers the warning because of the use of p_4 to compute p_1.


Jeff





Re: Control dependence vs. builtin_unreachable

2013-01-09 Thread Jeff Law

On 01/07/2013 01:42 PM, Steven Bosscher wrote:

For optimizations it's a bit more difficult.
Somewhat.  For what I'm suggesting, the way we'd miss an optimization 
would be if by removing the unreachable block and simplifying the CFG we 
ultimately remove a conditional which produced a result which we could 
use later.


Something like this (tweaking your 2nd test):


int foo (int b, int c, int d)
{
  int res, r = 0;
  res = 5 * d + b;
  if (d)
__builtin_unreachable ();
  if (c)
r = res;
  return r + d;
}


What I'm suggesting would effectively remove the conditional and 
__builtin_unreacahble, so at the source level it'd look like:



int foo (int b, int c, int d)
{
  int res, r = 0;
  res = 5 * d + b;
  if (c)
r = res;
  return r + d;
}

By removing the conditional, we've lost the fact that to reach the 
return statement, d must be zero, which is useful in optimizing the 
return statement.






 The situations I'm

looking at, are all sinking related, and I suspect that most missed
optimizations will be of this form:

int foo (int b, int c, int d)
{
   int res, r = 0;
   res = 5 * d + b;
   if (d)
 __builtin_unreachable ();
   if (c)
 r = res;
   return r;
}
I don't see how my proposal would inhibit sinking in this kind of code. 
 In fact, it'd made it easier.




I already showed how the control dependence graph looks Wrong with
__builtin_unreachable calls in place. Any transformation using the CDG
will also be less effective because of this.
And what I'm suggesting would actually fix the post dominator tree, 
among other things.  Where we potentially lose is the tweaked case 
above.  However, if we do the optimization at the right time, we can get 
the benefits we're both looking for, I believe.


jeff


Re: Control dependence vs. builtin_unreachable

2013-01-08 Thread Richard Biener
On Mon, Jan 7, 2013 at 8:45 PM, Jeff Law l...@redhat.com wrote:
 On 01/05/2013 01:10 PM, Steven Bosscher wrote:


 Presumably BB7 was created in response to the builtin_unreachable?


 Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up
 at the end of the GIMPLE passes pipeline, in the fold-all-builtins
 pass (most __builtin_unreachable calls are, but not all).

 I think if you eliminate the block and the cleanup the CFG appropriately,
 the right thing will just happen.


 The problem with this, is that __builtin_unreachable actually exposes
 optimization opportunities: more const/copy props of implicit sets
 in the predicate guarding the __builtin_unreachable call, more
 optimistic value numbering, etc. It also helps improve maybe-unused
 warnings accuracy. So simply removing these dead ends in the CFG is
 probably not a good idea.

 ?!?  By removing the empty unreachable block that's precisely what you
 enable.  The block itself goes away and the branch leading to the block is
 simplified appropriately.  That in turn will create larger basic blocks,
 enabling the const/copy propagations and similar optimizations.

 Finally removing unreachable paths was insired by the desire to eliminate
 false positives from maybe-{unused,uninitialized} and similar warnings.

 I'd be very curious to see the conditions under which removing the empty
 unreachable and appropriate cleanup of the CFG  (including the underlying
 control statement) results in less optimizations and less precision in our
 may-warnings.

The issue is VRP - when you remove unreachable blocks you lose the
conditional statement as it is no longer necessary and thus the predicate
you can derive value-ranges from.

Maybe there are more examples.  I can imagine false positive may-be
warnings then become must-be false positives.

There are clearly (as seen here) missed optimizations because we do
keep the unreachable blocks around.

Note that the whole point of the unreachable () excercise was to be
able to implement an assert () that even with -DNDEBUG leaves the
compiler with the same optimization opportunities (as of value-range
analysis) as with -UNDEBUG.

Richard.


Re: Control dependence vs. builtin_unreachable

2013-01-08 Thread Richard Biener
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote:
 On 01/03/2013 12:01 PM, Steven Bosscher wrote:

 Hello,

 Consider the following test case:

 void bar (void);
 int foo (int b, int c, int d)
 {
int r = 0;
if (b)
  res = b * 2 + 4;
if (c)
  {
if (d)
  r = res;
else
  __builtin_unreachable ();
  }
return r;
 }

 This is typical for code in GCC itself in places where
 gcc_unreachable() is used.



 The corresponding CFG looks like this:

  +-+
  | bb0 |
  +-+
|
|
v
  +-+
  | bb2 | -+
  +-+  |
|  |
|  |
v  |
  +-+  |
  | bb3 |  |
  +-+  |
|  |
|  |
v  |
 +-+ +-+  |
 | bb8 | -- | bb4 | +
 +-+ +-+
|   |
|   |
|   v
| +-+ +-+
| | bb5 | -- | bb7 |
| +-+ +-+
|   |
|   |
|   v
| +-+
| | bb6 |
| +-+
|   |
|   |
|   v
| +-+
+--- | bb9 |
  +-+
|
|
v
  +-+
  | bb1 |
  +-+

 Presumably BB7 was created in response to the builtin_unreachable?

 Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up
 at the end of the GIMPLE passes pipeline, in the fold-all-builtins
 pass (most __builtin_unreachable calls are, but not all).


 One
 could argue that an empty dead-end basic block should just be removed and
 the CFG appropriately simplified.

 The problem with this, is that __builtin_unreachable actually exposes
 optimization opportunities: more const/copy props of implicit sets
 in the predicate guarding the __builtin_unreachable call, more
 optimistic value numbering, etc. It also helps improve maybe-unused
 warnings accuracy. So simply removing these dead ends in the CFG is
 probably not a good idea.


 You might want to look  at a discussion from Oct/Nov 2011 New pass to
 delete unexecutable paths in the CFG which touches on some of this stuff.

 That's a really interesting discussion! I must have missed it at the time :-)


 It's not 100% the same, but the concept of eliminating edges from the CFG
 which we can never traverse in a conforming program applies to both your
 example and the stuff I was playing with.

 I think there is one important difference: In the thread you referred
 to, you're removing paths in the CFG that are implicitly not
 executable (for some languages anyway), whereas a
 __builtin_unreachable call is an explicit marker for this can never
 happen. I think this difference is important:

 * The explicit marker may have been put there on purpose (e.g. to get
 rid of a false-positive warning). The compiler should respect that. An
 implicit unreachable path can be optimized away without regard for the
 user's intent.

 * The explicit marker should not inhibit optimizations. For an
 implicit unreachable path the compiler should be conservative. But for
 a __builtin_unreachable call that is the only statement in a basic
 block, the compiler should be allowed to work as if the block really
 is never reached.

 The attached patch implements these ideas. During a tree-CFG cleanup,
 basic blocks containing only a __builtin_unreachable call are marked
 with a new flag BB_NEVER_REACHED. The flag is used in post-dominance:
 A marked block is not considered in the computations of the immediate
 post-dominator of the predecessor blocks. The result is a more
 optimistic post-dominance tree: Without the patch all predecessors of
 these BB_NEVER_REACHED blocks have the EXIT block as their
 post-dominator, but with the patch this non-executable path in the CFG
 is ignored and the post-dominators are those that would result if the
 BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED
 blocks themselves are only post-dominated by the EXIT block).

 I've also added a control dependence calculation function. It's not
 currently used, but it shows how the BB_NEVER_REACHED flag is used in
 this case to avoid the false control dependences that I showed in
 the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html.

 Bootstrappedtested on powerpc64-unknown-linux-gnu.
 What do you think of this approach?

Does it handle side-effects on the builtin-unreachable path correctly?

int b;
int a;
extern void foo ();
int main()
{
  if (!a)
   {
 if (!b)
   foo ();
 __builtin_unreachable ();
   }
}

---

void foo () { puts(Hello); exit(0); }

?  Users are not _required_ to annotate noreturn 

Re: Control dependence vs. builtin_unreachable

2013-01-08 Thread Jeff Law

On 01/08/2013 04:26 AM, Richard Biener wrote:



The issue is VRP - when you remove unreachable blocks you lose the
conditional statement as it is no longer necessary and thus the predicate
you can derive value-ranges from.
Understood.  Perhaps we could eliminate them after the first VRP pass, 
but before the second.  That ought to give us the majority of the 
benefit of seeing the conditional and propagating based on the 
conditional, but also give us the benefit of eliminating the branch 
generating straight-line code.


Clearly it needs more investigation, but I think it's worth exploring.

Jeff


Re: Control dependence vs. builtin_unreachable

2013-01-08 Thread Steven Bosscher
On Tue, Jan 8, 2013 at 12:32 PM, Richard Biener wrote:
 Does it handle side-effects on the builtin-unreachable path correctly?

 int b;
 int a;
 extern void foo ();
 int main()
 {
   if (!a)
{
  if (!b)
foo ();
  __builtin_unreachable ();
}
 }

 ---

 void foo () { puts(Hello); exit(0); }

 ?  Users are not _required_ to annotate noreturn functions.  The BB with
 __builtin_unreachable () is still empty otherwise.

Not sure what you're expecting in this case, so guessing...

There is a warning for reaching end of non-void blah, but that's
correct on the a!=0 path.
With that if(!a) commented out, I don't have that warning of course.

The code is identical with and without patch. It's another example of
how __builtin_unreachable() helps hide unnecessary warnings.

Ciao!
Steven


Re: Control dependence vs. builtin_unreachable

2013-01-07 Thread Jeff Law

On 01/05/2013 01:10 PM, Steven Bosscher wrote:


Presumably BB7 was created in response to the builtin_unreachable?


Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up
at the end of the GIMPLE passes pipeline, in the fold-all-builtins
pass (most __builtin_unreachable calls are, but not all).
I think if you eliminate the block and the cleanup the CFG 
appropriately, the right thing will just happen.



The problem with this, is that __builtin_unreachable actually exposes
optimization opportunities: more const/copy props of implicit sets
in the predicate guarding the __builtin_unreachable call, more
optimistic value numbering, etc. It also helps improve maybe-unused
warnings accuracy. So simply removing these dead ends in the CFG is
probably not a good idea.
?!?  By removing the empty unreachable block that's precisely what you 
enable.  The block itself goes away and the branch leading to the block 
is simplified appropriately.  That in turn will create larger basic 
blocks, enabling the const/copy propagations and similar optimizations.


Finally removing unreachable paths was insired by the desire to 
eliminate false positives from maybe-{unused,uninitialized} and similar 
warnings.


I'd be very curious to see the conditions under which removing the empty 
unreachable and appropriate cleanup of the CFG  (including the 
underlying control statement) results in less optimizations and less 
precision in our may-warnings.




I think there is one important difference: In the thread you referred
to, you're removing paths in the CFG that are implicitly not
executable (for some languages anyway), whereas a
__builtin_unreachable call is an explicit marker for this can never
happen. I think this difference is important:
I don't see this as an important distinction.  The difference is in the 
implicit can't happen case, eliminating the can't happen path may 
eliminate a user visible side effect (for example a segfault).




* The explicit marker may have been put there on purpose (e.g. to get
rid of a false-positive warning). The compiler should respect that. An
implicit unreachable path can be optimized away without regard for the
user's intent.
Right, but if done right, eliminating the unreachable path correctly 
should eliminate false positive warnings.




* The explicit marker should not inhibit optimizations. For an
implicit unreachable path the compiler should be conservative. But for
a __builtin_unreachable call that is the only statement in a basic
block, the compiler should be allowed to work as if the block really
is never reached.

Right.



The attached patch implements these ideas. During a tree-CFG cleanup,
basic blocks containing only a __builtin_unreachable call are marked
with a new flag BB_NEVER_REACHED. The flag is used in post-dominance:
A marked block is not considered in the computations of the immediate
post-dominator of the predecessor blocks. The result is a more
optimistic post-dominance tree: Without the patch all predecessors of
these BB_NEVER_REACHED blocks have the EXIT block as their
post-dominator, but with the patch this non-executable path in the CFG
is ignored and the post-dominators are those that would result if the
BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED
blocks themselves are only post-dominated by the EXIT block).

I've also added a control dependence calculation function. It's not
currently used, but it shows how the BB_NEVER_REACHED flag is used in
this case to avoid the false control dependences that I showed in
the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html.

Bootstrappedtested on powerpc64-unknown-linux-gnu.
What do you think of this approach?
Before diving into the patch I think we should figure out why we see 
such different effects of eliminating these paths from the CFG.  Your 
assertion is eliminating them will result in more false positives and 
less optimization while my assertion is the opposite.


Jeff



Re: Control dependence vs. builtin_unreachable

2013-01-07 Thread Steven Bosscher
On Mon, Jan 7, 2013 at 8:45 PM, Jeff Law l...@redhat.com wrote:
 Before diving into the patch I think we should figure out why we see such
 different effects of eliminating these paths from the CFG.  Your assertion
 is eliminating them will result in more false positives and less
 optimization while my assertion is the opposite.

Warnings:

$ cat q.c
#if 0
#define UNREACHABLE() __builtin_unreachable()
#else
#define UNREACHABLE() break
#endif

typedef enum fruits { banana = 0, apple = 1, pear = 2, orange = 3 } fruit;

extern void price_fruit_of_the_day (int);

void
discount_of_the_day (fruit f)
{
  int p, c = (int) f;
  switch (f)
  {
case banana:
  UNREACHABLE ();
case apple:
  p = 3 * c + 4;
  break;
case pear:
case orange:
  p = 7;
  break;
  }
  price_fruit_of_the_day (p);
}
$ # Compiled with UNREACHABLE defined as break:
$ ./cc1 -quiet -W -Wall -Wextra q.c -fdump-tree-all -O1
q.c: In function 'discount_of_the_day':
q.c:28:26: warning: 'p' may be used uninitialized in this function
[-Wmaybe-uninitialized]
   price_fruit_of_the_day (p);
  ^
$ # Compiled with UNREACHABLE defined as builtin_unreachable:
$ ./cc1 -quiet -W -Wall -Wextra q.c -fdump-tree-all -O1
$


For optimizations it's a bit more difficult. The situations I'm
looking at, are all sinking related, and I suspect that most missed
optimizations will be of this form:

int foo (int b, int c, int d)
{
  int res, r = 0;
  res = 5 * d + b;
  if (d)
__builtin_unreachable ();
  if (c)
r = res;
  return r;
}

The res = statement can be sunk into the conditional arm of if(c)
and tree-ssa-sink.c is able to do so. But secondary effects are missed
(and test cases for this are hard to reduce to mailing list acceptable
size ;-) because tree-ssa-sink walks the post-dominator tree and, with
the right amount of bad luck, it visits the res= statement _before_
the part of the (forward-)dominator tree below if(d) because the
basic block containing res= has EXIT as its immediate
post-dominator, because that's where __builtin_unreachable() leads to.
The optimistic post-dominator of res= is the if(c) block, of
course, because the builtin_unreachable is a kind-of assert, not
something that can really ever happen.

I already showed how the control dependence graph looks Wrong with
__builtin_unreachable calls in place. Any transformation using the CDG
will also be less effective because of this.

Ciao!
Steven


Re: Control dependence vs. builtin_unreachable

2013-01-05 Thread Steven Bosscher
On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote:
 On 01/03/2013 12:01 PM, Steven Bosscher wrote:

 Hello,

 Consider the following test case:

 void bar (void);
 int foo (int b, int c, int d)
 {
int r = 0;
if (b)
  res = b * 2 + 4;
if (c)
  {
if (d)
  r = res;
else
  __builtin_unreachable ();
  }
return r;
 }

 This is typical for code in GCC itself in places where
 gcc_unreachable() is used.



 The corresponding CFG looks like this:

  +-+
  | bb0 |
  +-+
|
|
v
  +-+
  | bb2 | -+
  +-+  |
|  |
|  |
v  |
  +-+  |
  | bb3 |  |
  +-+  |
|  |
|  |
v  |
 +-+ +-+  |
 | bb8 | -- | bb4 | +
 +-+ +-+
|   |
|   |
|   v
| +-+ +-+
| | bb5 | -- | bb7 |
| +-+ +-+
|   |
|   |
|   v
| +-+
| | bb6 |
| +-+
|   |
|   |
|   v
| +-+
+--- | bb9 |
  +-+
|
|
v
  +-+
  | bb1 |
  +-+

 Presumably BB7 was created in response to the builtin_unreachable?

Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up
at the end of the GIMPLE passes pipeline, in the fold-all-builtins
pass (most __builtin_unreachable calls are, but not all).


 One
 could argue that an empty dead-end basic block should just be removed and
 the CFG appropriately simplified.

The problem with this, is that __builtin_unreachable actually exposes
optimization opportunities: more const/copy props of implicit sets
in the predicate guarding the __builtin_unreachable call, more
optimistic value numbering, etc. It also helps improve maybe-unused
warnings accuracy. So simply removing these dead ends in the CFG is
probably not a good idea.


 You might want to look  at a discussion from Oct/Nov 2011 New pass to
 delete unexecutable paths in the CFG which touches on some of this stuff.

That's a really interesting discussion! I must have missed it at the time :-)


 It's not 100% the same, but the concept of eliminating edges from the CFG
 which we can never traverse in a conforming program applies to both your
 example and the stuff I was playing with.

I think there is one important difference: In the thread you referred
to, you're removing paths in the CFG that are implicitly not
executable (for some languages anyway), whereas a
__builtin_unreachable call is an explicit marker for this can never
happen. I think this difference is important:

* The explicit marker may have been put there on purpose (e.g. to get
rid of a false-positive warning). The compiler should respect that. An
implicit unreachable path can be optimized away without regard for the
user's intent.

* The explicit marker should not inhibit optimizations. For an
implicit unreachable path the compiler should be conservative. But for
a __builtin_unreachable call that is the only statement in a basic
block, the compiler should be allowed to work as if the block really
is never reached.

The attached patch implements these ideas. During a tree-CFG cleanup,
basic blocks containing only a __builtin_unreachable call are marked
with a new flag BB_NEVER_REACHED. The flag is used in post-dominance:
A marked block is not considered in the computations of the immediate
post-dominator of the predecessor blocks. The result is a more
optimistic post-dominance tree: Without the patch all predecessors of
these BB_NEVER_REACHED blocks have the EXIT block as their
post-dominator, but with the patch this non-executable path in the CFG
is ignored and the post-dominators are those that would result if the
BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED
blocks themselves are only post-dominated by the EXIT block).

I've also added a control dependence calculation function. It's not
currently used, but it shows how the BB_NEVER_REACHED flag is used in
this case to avoid the false control dependences that I showed in
the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html.

Bootstrappedtested on powerpc64-unknown-linux-gnu.
What do you think of this approach?

Ciao!
Steven
* cfg-flags.def (BB_NEVER_REACHED): New flag on basic_block.
(BB_SUPERBLOCK): Add FIXME, document that this flag should be removed.
* tree-cfgcleanup.c (is_only_builtin_unreachable_block): New function.
(mark_builtin_unreachable_blocks): New function.
(cleanup_tree_cfg_1): Call mark_builtin_unreachable_blocks at the end
to set BB_NEVER_REACHED on basic blocks 

Re: Control dependence vs. builtin_unreachable

2013-01-05 Thread Steven Bosscher
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher wrote:
 Bootstrappedtested on powerpc64-unknown-linux-gnu.

And to be clear, bootstrapped with this patch on top:
Index: system.h
===
--- system.h(revision 194924)
+++ system.h(working copy)
@@ -698,7 +698,7 @@

 /* Use gcc_unreachable() to mark unreachable locations (like an
unreachable default case of a switch.  Do not use gcc_assert(0).  */
-#if (GCC_VERSION = 4005)  !ENABLE_ASSERT_CHECKING
+#if (GCC_VERSION = 4005) // !ENABLE_ASSERT_CHECKING
 #define gcc_unreachable() __builtin_unreachable()
 #else
 #define gcc_unreachable() (fancy_abort (__FILE__, __LINE__, __FUNCTION__))

Otherwise, __builtin_unreachable would be almost unused and
bootstrap+test wouldn't prove much :-)

Ciao!
Steven