RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-31 Thread Jiangning Liu


 -Original Message-
 From: Kai Tietz [mailto:ktiet...@googlemail.com]
 Sent: Thursday, October 27, 2011 5:36 PM
 To: Jiangning Liu
 Cc: Michael Matz; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org;
 Richard Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs
 
 2011/10/27 Jiangning Liu jiangning@arm.com:
 
 
  -Original Message-
  From: Michael Matz [mailto:m...@suse.de]
  Sent: Wednesday, October 26, 2011 11:47 PM
  To: Kai Tietz
  Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc-
 patc...@gcc.gnu.org;
  Richard Henderson
  Subject: Re: [patch tree-optimization]: Improve handling of
  conditional-branches on targets with high branch costs
 
  Hi,
 
  On Wed, 26 Oct 2011, Kai Tietz wrote:
 
   So you would mean that memory dereferencing shouldn't be
 considered
  as
   side-effect at all?
 
  No.  I haven't said this at all.  Of course it's a side-effect, but
  we're
  allowed to remove existing ones (under some circumstances).  We're
 not
  allowed to introduce new ones, which means that this ...
 
   So we would happily cause by code 'if (i  *i != 0) an crash, as
   memory-dereference has for you no side-effect?
 
  ... is not allowed.  But in the original example the memread was on
 the
  left side, hence occured always, therefore we can move it to the
 right
  side, even though it might occur less often.
 
   In you special case it might be valid that, if first (and C-fold-
  const
   doesn't know if the side-effect condition is really the first, as
 it
   might be a sub-sequence of a condition) condition might trap or
 not,
  to
   combine it.  But branching has to cover the general cases.  If you
  find
   a way to determine that left-hand operand in fold_const's
 branching
  code
   is really the left-most condition in chain, then we can add such a
   special case, but I don't see here an easy way to determine it.
 
  Hmm?  I don't see why it's necessary to check if it's the left-most
  condition in a chain.  If the left hand of '' is a memread it can
  always
  be moved to the right side (or the operator transformed into ''
 which
  can
  have the same effect), of course only if the original rhs is free of
  side
  effects, but then independed if the  was part of a larger
 expression.
  The memread will possibly be done fewer times than originally, but
 as
  said, that's okay.
 
 
  Agree. The point is for the small case I gave RHS doesn't have side
 effect
  at all, so the optimization of changing it to AND doesn't violate C
  specification. We need to recover something for this case, although
 it did
  improve a lot for some particular benchmarks.
 
  Thanks,
  -Jiangning
 
 
  Ciao,
  Michael.
 
 Hmm, so we can allow merging to AND, if the left-hand-side might trap
 but has no-side-effects and rhs has neither trapping nor side-effects.
 As for the case that left-hand side has side-effects but right-hand
 not, we aren't allowed to do this AND/OR merge.  For example 'if ((f =
 foo ()) != 0  f  24)' we aren't allowed to make this
 transformation.
 
 This shouldn't be that hard.  We need to provide to simple_operand_p_2
 an additional argument for checking trapping or not.
 

Would it be OK if I file a tracker in bugzilla against this?

 Regards,
 Kai






RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-27 Thread Jiangning Liu


 -Original Message-
 From: Michael Matz [mailto:m...@suse.de]
 Sent: Wednesday, October 26, 2011 11:47 PM
 To: Kai Tietz
 Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org;
 Richard Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs
 
 Hi,
 
 On Wed, 26 Oct 2011, Kai Tietz wrote:
 
  So you would mean that memory dereferencing shouldn't be considered
 as
  side-effect at all?
 
 No.  I haven't said this at all.  Of course it's a side-effect, but
 we're
 allowed to remove existing ones (under some circumstances).  We're not
 allowed to introduce new ones, which means that this ...
 
  So we would happily cause by code 'if (i  *i != 0) an crash, as
  memory-dereference has for you no side-effect?
 
 ... is not allowed.  But in the original example the memread was on the
 left side, hence occured always, therefore we can move it to the right
 side, even though it might occur less often.
 
  In you special case it might be valid that, if first (and C-fold-
 const
  doesn't know if the side-effect condition is really the first, as it
  might be a sub-sequence of a condition) condition might trap or not,
 to
  combine it.  But branching has to cover the general cases.  If you
 find
  a way to determine that left-hand operand in fold_const's branching
 code
  is really the left-most condition in chain, then we can add such a
  special case, but I don't see here an easy way to determine it.
 
 Hmm?  I don't see why it's necessary to check if it's the left-most
 condition in a chain.  If the left hand of '' is a memread it can
 always
 be moved to the right side (or the operator transformed into '' which
 can
 have the same effect), of course only if the original rhs is free of
 side
 effects, but then independed if the  was part of a larger expression.
 The memread will possibly be done fewer times than originally, but as
 said, that's okay.
 

Agree. The point is for the small case I gave RHS doesn't have side effect
at all, so the optimization of changing it to AND doesn't violate C
specification. We need to recover something for this case, although it did
improve a lot for some particular benchmarks.

Thanks,
-Jiangning

 
 Ciao,
 Michael.






Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-27 Thread Kai Tietz
2011/10/27 Jiangning Liu jiangning@arm.com:


 -Original Message-
 From: Michael Matz [mailto:m...@suse.de]
 Sent: Wednesday, October 26, 2011 11:47 PM
 To: Kai Tietz
 Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org;
 Richard Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs

 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

  So you would mean that memory dereferencing shouldn't be considered
 as
  side-effect at all?

 No.  I haven't said this at all.  Of course it's a side-effect, but
 we're
 allowed to remove existing ones (under some circumstances).  We're not
 allowed to introduce new ones, which means that this ...

  So we would happily cause by code 'if (i  *i != 0) an crash, as
  memory-dereference has for you no side-effect?

 ... is not allowed.  But in the original example the memread was on the
 left side, hence occured always, therefore we can move it to the right
 side, even though it might occur less often.

  In you special case it might be valid that, if first (and C-fold-
 const
  doesn't know if the side-effect condition is really the first, as it
  might be a sub-sequence of a condition) condition might trap or not,
 to
  combine it.  But branching has to cover the general cases.  If you
 find
  a way to determine that left-hand operand in fold_const's branching
 code
  is really the left-most condition in chain, then we can add such a
  special case, but I don't see here an easy way to determine it.

 Hmm?  I don't see why it's necessary to check if it's the left-most
 condition in a chain.  If the left hand of '' is a memread it can
 always
 be moved to the right side (or the operator transformed into '' which
 can
 have the same effect), of course only if the original rhs is free of
 side
 effects, but then independed if the  was part of a larger expression.
 The memread will possibly be done fewer times than originally, but as
 said, that's okay.


 Agree. The point is for the small case I gave RHS doesn't have side effect
 at all, so the optimization of changing it to AND doesn't violate C
 specification. We need to recover something for this case, although it did
 improve a lot for some particular benchmarks.

 Thanks,
 -Jiangning


 Ciao,
 Michael.

Hmm, so we can allow merging to AND, if the left-hand-side might trap
but has no-side-effects and rhs has neither trapping nor side-effects.
As for the case that left-hand side has side-effects but right-hand
not, we aren't allowed to do this AND/OR merge.  For example 'if ((f =
foo ()) != 0  f  24)' we aren't allowed to make this
transformation.

This shouldn't be that hard.  We need to provide to simple_operand_p_2
an additional argument for checking trapping or not.

Regards,
Kai


RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Jiangning Liu


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Michael Matz
 Sent: Tuesday, October 11, 2011 10:45 PM
 To: Kai Tietz
 Cc: Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; Richard
 Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs
 
 Hi,
 
 On Tue, 11 Oct 2011, Kai Tietz wrote:
 
   Better make it a separate function the first tests your new
   conditions, and then calls simple_operand_p.
 
  Well, either I make it a new function and call it instead of
  simple_operand_p,
 
 That's what I meant, yes.
 
   @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
ll_arg, rl_arg),
build_int_cst (TREE_TYPE (ll_arg), 0));
   -
   -  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
   - {
   -   if (code != orig_code || lhs != orig_lhs || rhs !=
 orig_rhs)
   - return build2_loc (loc, code, truth_type, lhs, rhs);
   -   return NULL_TREE;
   - }
  
   Why do you remove this hunk?  Shouldn't you instead move the hunk
 you
   added to fold_truth_andor() here.  I realize this needs some TLC to
   fold_truth_andor_1, because right now it early-outs for non-
 comparisons,
   but it seems the better place.  I.e. somehow move the below code
 into the
   above branch, with the associated diddling on fold_truth_andor_1
 that it
   gets called.
 
  This hunk is removed, as it is vain to do here.
 
 There is a fallthrough now, that wasn't there before.  I don't know if
 it's harmless, I just wanted to mention it.
 

Yes, this part introduced different behavior for this small case,

int f(char *i, int j)
{
if (*i  j!=2)
return *i;
else
return j;
}

Before the fix, we have

  D.4710 = *i;
  D.4711 = D.4710 != 0;
  D.4712 = j != 2;
  D.4713 = D.4711  D.4712;
  if (D.4713 != 0) goto D.4714; else goto D.4715;
  D.4714:
  D.4710 = *i;
  D.4716 = (int) D.4710;
  return D.4716;
  D.4715:
  D.4716 = j;
  return D.4716;

After the fix, we have

  D.4711 = *i;
  if (D.4711 != 0) goto D.4712; else goto D.4710;
  D.4712:
  if (j != 2) goto D.4713; else goto D.4710;
  D.4713:
  D.4711 = *i;
  D.4714 = (int) D.4711;
  return D.4714;
  D.4710:
  D.4714 = j;
  return D.4714;

Does this meet the original expectation? 

I find the code below in function fold_truth_andor makes difference,

  /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)   into (A OR B).
 For sequence point consistancy, we need to check for trapping,
 and side-effects.  */
  else if (code == icode  simple_operand_p_2 (arg0)
simple_operand_p_2 (arg1))
 return fold_build2_loc (loc, ncode, type, arg0, arg1);

for *i != 0 simple_operand_p(*i) returns false. Originally this is not 
checked by the code. I don't see the patch originally wanted to cover this. Can 
this be explained reasonably?

I'm not arguing this patch did worse thing, but only want to understand the 
rationale behind this and justify this patch doesn't hurt anything else. 
Actually on the contrary, I measured and this change accidently made some 
benchmarks significantly improved due to some other reasons.

Thanks,
-Jiangning

  Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is
  better done at a single place in fold_truth_andor only.
 
 As fold_truthop is called twice by fold_truth_andor, the latter might
 indeed be the better place.
 
 
 Ciao,
 Michael.





Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Jiangning Liu jiangning@arm.com:


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Michael Matz
 Sent: Tuesday, October 11, 2011 10:45 PM
 To: Kai Tietz
 Cc: Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; Richard
 Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs

 Hi,

 On Tue, 11 Oct 2011, Kai Tietz wrote:

   Better make it a separate function the first tests your new
   conditions, and then calls simple_operand_p.
 
  Well, either I make it a new function and call it instead of
  simple_operand_p,

 That's what I meant, yes.

   @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
                            build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
                                    ll_arg, rl_arg),
                            build_int_cst (TREE_TYPE (ll_arg), 0));
   -
   -      if (LOGICAL_OP_NON_SHORT_CIRCUIT)
   -     {
   -       if (code != orig_code || lhs != orig_lhs || rhs !=
 orig_rhs)
   -         return build2_loc (loc, code, truth_type, lhs, rhs);
   -       return NULL_TREE;
   -     }
  
   Why do you remove this hunk?  Shouldn't you instead move the hunk
 you
   added to fold_truth_andor() here.  I realize this needs some TLC to
   fold_truth_andor_1, because right now it early-outs for non-
 comparisons,
   but it seems the better place.  I.e. somehow move the below code
 into the
   above branch, with the associated diddling on fold_truth_andor_1
 that it
   gets called.
 
  This hunk is removed, as it is vain to do here.

 There is a fallthrough now, that wasn't there before.  I don't know if
 it's harmless, I just wanted to mention it.


 Yes, this part introduced different behavior for this small case,

 int f(char *i, int j)
 {
        if (*i  j!=2)
                return *i;
        else
                return j;
 }

 Before the fix, we have

  D.4710 = *i;
  D.4711 = D.4710 != 0;
  D.4712 = j != 2;
  D.4713 = D.4711  D.4712;
  if (D.4713 != 0) goto D.4714; else goto D.4715;
  D.4714:
  D.4710 = *i;
  D.4716 = (int) D.4710;
  return D.4716;
  D.4715:
  D.4716 = j;
  return D.4716;

 After the fix, we have

  D.4711 = *i;
  if (D.4711 != 0) goto D.4712; else goto D.4710;
  D.4712:
  if (j != 2) goto D.4713; else goto D.4710;
  D.4713:
  D.4711 = *i;
  D.4714 = (int) D.4711;
  return D.4714;
  D.4710:
  D.4714 = j;
  return D.4714;

 Does this meet the original expectation?

 I find the code below in function fold_truth_andor makes difference,

      /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)   into (A OR B).
         For sequence point consistancy, we need to check for trapping,
         and side-effects.  */
      else if (code == icode  simple_operand_p_2 (arg0)
                simple_operand_p_2 (arg1))
         return fold_build2_loc (loc, ncode, type, arg0, arg1);

 for *i != 0 simple_operand_p(*i) returns false. Originally this is not 
 checked by the code. I don't see the patch originally wanted to cover this. 
 Can this be explained reasonably?

 I'm not arguing this patch did worse thing, but only want to understand the 
 rationale behind this and justify this patch doesn't hurt anything else. 
 Actually on the contrary, I measured and this change accidently made some 
 benchmarks significantly improved due to some other reasons.

 Thanks,
 -Jiangning

  Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is
  better done at a single place in fold_truth_andor only.

 As fold_truthop is called twice by fold_truth_andor, the latter might
 indeed be the better place.


 Ciao,
 Michael.

Well, as far as I understand C specification and sequence points, it
makes indeed a difference to do branch-cost optimization on
instructions might cause a signal traps. As signal could be handled in
specifc handlers. You need to consider here that short-circuit
optimization assumes associative law on operands.  So right-hand side
might be evaluaded before the left-hand-side.  And this indeed breaks
IMHO the sequences as mentioned in C specification about conditions.
 So a memory de-referencing (same as a floating-point trap) can never
be treated as simple, as far as I understood this.  So this patch -
beside improving logic for branch-cost merging - fixes this latent
issue.

Regards,
Kai


RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Jiangning Liu wrote:

-
-  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
- {
-   if (code != orig_code || lhs != orig_lhs || rhs !=
  orig_rhs)
- return build2_loc (loc, code, truth_type, lhs, rhs);
-   return NULL_TREE;
- }
   
Why do you remove this hunk?  Shouldn't you instead move the hunk
  you
added to fold_truth_andor() here.  I realize this needs some TLC to
fold_truth_andor_1, because right now it early-outs for non-
  comparisons,
but it seems the better place.  I.e. somehow move the below code
  into the
above branch, with the associated diddling on fold_truth_andor_1
  that it
gets called.
  
   This hunk is removed, as it is vain to do here.
  
  There is a fallthrough now, that wasn't there before.  I don't know if
  it's harmless, I just wanted to mention it.
  
 
 Yes, this part introduced different behavior for this small case,
 
   D.4710 = *i;
   D.4711 = D.4710 != 0;
   D.4712 = j != 2;
   D.4713 = D.4711  D.4712;
   if (D.4713 != 0) goto D.4714; else goto D.4715;
 
 After the fix, we have
 
   D.4711 = *i;
   if (D.4711 != 0) goto D.4712; else goto D.4710;
   D.4712:
   if (j != 2) goto D.4713; else goto D.4710;

So, we have one more jump than originally, when the point of the patch 
was to emit less on targets with high branch costs.  So, as speculated, 
the hunk was not useless.  (It's nice that it caused a benchmark to 
improve significantly, but that should be done via a proper analysis and 
patch, not as a side effect of a supposed non-change).


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

  Yes, this part introduced different behavior for this small case,
 
  int f(char *i, int j)
  {
         if (*i  j!=2)
                 return *i;
         else
                 return j;
  }
 
 Well, as far as I understand C specification and sequence points, it 
 makes indeed a difference to do branch-cost optimization on instructions 
 might cause a signal traps. As signal could be handled in specifc 
 handlers. You need to consider here that short-circuit optimization 
 assumes associative law on operands.  So right-hand side might be 
 evaluaded before the left-hand-side.  And this indeed breaks IMHO the 
 sequences as mentioned in C specification about conditions.

I'm not sure in this specific case.  If it had been:

if (*a  *b) ...

the you'd be right.  The side-effect of dereferencing a must happen before 
*b, and hence transforming this into (*a!=0)(*b!=0) would be wrong.  But 
in the case at hand we only have one potentially problematic (aka 
detectable) side-effect (*i), so I don't think you could construct a 
program that detects the difference.  It would entail detecting that 
j!=2 was already evaluated, when (say) the segfault happens, but you 
can't as the variable is non-volatile.  It also can't happen that the 
side-effect *i does not occur (which also would be a problem).  So, I 
think there wasn't an actual problem before, and it had fewer jumps.


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Michael Matz m...@suse.de:
 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

  Yes, this part introduced different behavior for this small case,
 
  int f(char *i, int j)
  {
         if (*i  j!=2)
                 return *i;
         else
                 return j;
  }

 Well, as far as I understand C specification and sequence points, it
 makes indeed a difference to do branch-cost optimization on instructions
 might cause a signal traps. As signal could be handled in specifc
 handlers. You need to consider here that short-circuit optimization
 assumes associative law on operands.  So right-hand side might be
 evaluaded before the left-hand-side.  And this indeed breaks IMHO the
 sequences as mentioned in C specification about conditions.

 I'm not sure in this specific case.  If it had been:

 if (*a  *b) ...

 the you'd be right.  The side-effect of dereferencing a must happen before
 *b, and hence transforming this into (*a!=0)(*b!=0) would be wrong.  But
 in the case at hand we only have one potentially problematic (aka
 detectable) side-effect (*i), so I don't think you could construct a
 program that detects the difference.  It would entail detecting that
 j!=2 was already evaluated, when (say) the segfault happens, but you
 can't as the variable is non-volatile.  It also can't happen that the
 side-effect *i does not occur (which also would be a problem).  So, I
 think there wasn't an actual problem before, and it had fewer jumps.


 Ciao,
 Michael.

the case can be produced quite easily.

Eg:

extern int global = 0;


  if (*a  global) ...
...

the issue is that by C-specification see here 5.1.2.2.3 about
program-termination.The important point is here::

Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete *** and no side
effects of subsequent evaluations shall have taken place ***

Annex C describes sequence-points as

1 The following are the sequence points described in 5.1.2.3:
— The call to a function, after the arguments have been evaluated (6.5.2.2).
— The end of the first operand of the following operators: logical AND
 (6.5.13); logical OR || (6.5.14); conditional ? (6.5.15); comma ,
(6.5.17).
...

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
I describe the sample more closely here

extern int global = 0;
extern int *a = NULL;

void catchSigSegV( int sig )
{
  a = global;
 }

int foo (int j)
{
 signal (SIGSEGV, catchSigSegV);
 if (*a  global) return 2;
 return 0;
}

I admit that in most cases such a scenario is not common.  This sample
seems to be a valid C program.  So the conditions in IF shall be
evaluted strict in order of sequence-points, as first argument might
trap.
It doesn't matter if second argument have side-effects or none.  The
point is the first and so it has to be separated from other
conditions.

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

   int f(char *i, int j)
   {
          if (*i  j!=2)
                  return *i;
          else
                  return j;
   }
 
 
 the case can be produced quite easily.
 
 extern int global = 0;
 
 
   if (*a  global) ...

See?  You had to change the program to prove the transformation to be 
invalid.  But my point was that the function we discuss about was exactly 
as above.  It didn't have globals, or two loads, or a volatile, or 
anything else you can come up with where the transformation would be 
detectable (and hence invalid).  I claim that you can't construct a 
program that can distinguish between this function:

int f(char *i, int j)
{
  if (*i  j!=2)
    return *i;
  else
    return j;
}

and this one:

int f(char *i, int j)
{
  if (*i  j!=2)
    return *i;
  else
    return j;
}

And if you can't construct such a program, then the initial transformation 
before the fold-const.c change _for this specific situation_ was correct.


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Michael Matz m...@suse.de:
 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

   int f(char *i, int j)
   {
          if (*i  j!=2)
                  return *i;
          else
                  return j;
   }
 

 the case can be produced quite easily.

 extern int global = 0;

 
   if (*a  global) ...

 See?  You had to change the program to prove the transformation to be
 invalid.  But my point was that the function we discuss about was exactly
 as above.  It didn't have globals, or two loads, or a volatile, or
 anything else you can come up with where the transformation would be
 detectable (and hence invalid).  I claim that you can't construct a
 program that can distinguish between this function:

 int f(char *i, int j)
 {
   if (*i  j!=2)
     return *i;
   else
     return j;
 }

 and this one:

 int f(char *i, int j)
 {
   if (*i  j!=2)
     return *i;
   else
     return j;
 }

 And if you can't construct such a program, then the initial transformation
 before the fold-const.c change _for this specific situation_ was correct.


 Ciao,
 Michael.

well, if such a function is used as inline and we know for it that j
has value != 2, then we have here a big difference.  For your first
example, we still have to do the memory access to *i, even if we are
not interested in result.  See here point 4 of 5.1.2.3 of C-spec.
For your second sample we don't need to do that, as the  itself is no
sequence-point and so we can eliminate the *i access without breaking
anything.

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

 well, if such a function is used as inline and we know for it that j has 
 value != 2, then we have here a big difference.  For your first example, 
 we still have to do the memory access to *i, even if we are not 
 interested in result.

Actually we don't have to preserve memory accesses.  The interesting case 
is if the pointer has an invalid value.  The behaviour of the access then 
is undefined, and it's okay to not do it at all.  In case the pointer does 
point to an object the access (if it's value isn't needed) also isn't 
necessary.  IOW: in void f(int *p) { int i = *p; } we can always remove 
the pointer read.  So, I still maintain that the transformation on the 
original example was okay.


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Kai Tietz
2011/10/26 Michael Matz m...@suse.de:
 Hi,

 On Wed, 26 Oct 2011, Kai Tietz wrote:

 well, if such a function is used as inline and we know for it that j has
 value != 2, then we have here a big difference.  For your first example,
 we still have to do the memory access to *i, even if we are not
 interested in result.

 Actually we don't have to preserve memory accesses.  The interesting case
 is if the pointer has an invalid value.  The behaviour of the access then
 is undefined, and it's okay to not do it at all.  In case the pointer does
 point to an object the access (if it's value isn't needed) also isn't
 necessary.  IOW: in void f(int *p) { int i = *p; } we can always remove
 the pointer read.  So, I still maintain that the transformation on the
 original example was okay.


 Ciao,
 Michael.

So you would mean that memory dereferencing shouldn't be considered as
side-effect at all?

So we would happily cause by code 'if (i  *i != 0) an crash, as
memory-dereference has for you no side-effect?

In you special case it might be valid that, if first (and C-fold-const
doesn't know if the side-effect condition is really the first, as it
might be a sub-sequence of a condition) condition might trap or not,
to combine it.  But branching has to cover the general cases.  If you
find a way to determine that left-hand operand in fold_const's
branching code is really the left-most condition in chain, then we can
add such a special case, but I don't see here an easy way to determine
it.

Regards,
Kai

Hmm, not sure


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-26 Thread Michael Matz
Hi,

On Wed, 26 Oct 2011, Kai Tietz wrote:

 So you would mean that memory dereferencing shouldn't be considered as 
 side-effect at all?

No.  I haven't said this at all.  Of course it's a side-effect, but we're 
allowed to remove existing ones (under some circumstances).  We're not 
allowed to introduce new ones, which means that this ...

 So we would happily cause by code 'if (i  *i != 0) an crash, as 
 memory-dereference has for you no side-effect?

... is not allowed.  But in the original example the memread was on the 
left side, hence occured always, therefore we can move it to the right 
side, even though it might occur less often.

 In you special case it might be valid that, if first (and C-fold-const 
 doesn't know if the side-effect condition is really the first, as it 
 might be a sub-sequence of a condition) condition might trap or not, to 
 combine it.  But branching has to cover the general cases.  If you find 
 a way to determine that left-hand operand in fold_const's branching code 
 is really the left-most condition in chain, then we can add such a 
 special case, but I don't see here an easy way to determine it.

Hmm?  I don't see why it's necessary to check if it's the left-most 
condition in a chain.  If the left hand of '' is a memread it can always 
be moved to the right side (or the operator transformed into '' which can 
have the same effect), of course only if the original rhs is free of side 
effects, but then independed if the  was part of a larger expression.  
The memread will possibly be done fewer times than originally, but as 
said, that's okay.


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Richard Guenther
On Fri, Oct 14, 2011 at 9:43 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 So I committed the gimplify patch separate.  And here is the remaining
 fold-const patch.
 The important tests here are in gcc.dg/tree-ssa/builtin-expect[1-4].c, which
 cover the one special-case for branching. Also tree-ssa/20040204-1.c covers
 tests for branching code (on targets having high-engough BRANCH_COST and no
 special-casing - like MIPS, S/390, and AVR.

 ChangeLog

 2011-10-14  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or here.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

Ok with ...

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));

recurse with simple_operand_p.

 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))

Move this check before the tcc_comparison check and remove the
then redundant tree_could_trap_p check there.

 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;

Do not handle here, it's handled in simple_operand_p.

 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;

Remove the BIT_NOT_EXPR handling.  Thus, simply change this switch
to

if (code == TRUTH_NOT_EXPR)
  return simple_operand_p_2 (TREE_OPERAND (exp, 0));

return simple_operand_p (exp);

 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    default:
 +      return simple_operand_p (exp);
 +    }
 +}
 +

  /* The following functions are subroutines to fold_range_test and allow it to
    try to change a logical combination of comparisons into a range test.
 @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
  }

 -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
 +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
    bit value.  Arrange things 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Kai Tietz
2011/10/17 Richard Guenther richard.guent...@gmail.com:
 On Fri, Oct 14, 2011 at 9:43 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 So I committed the gimplify patch separate.  And here is the remaining
 fold-const patch.
 The important tests here are in gcc.dg/tree-ssa/builtin-expect[1-4].c, which
 cover the one special-case for branching. Also tree-ssa/20040204-1.c covers
 tests for branching code (on targets having high-engough BRANCH_COST and no
 special-casing - like MIPS, S/390, and AVR.

 ChangeLog

 2011-10-14  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or here.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Ok with ...

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));

 recurse with simple_operand_p.

No, as this again would reject simple operations and additionally
wouldn't check for trapping.

 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))

 Move this check before the tcc_comparison check and remove the
 then redundant tree_could_trap_p check there.

Ok

 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;

 Do not handle here, it's handled in simple_operand_p.

Well, was more a short-cut here.

 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;

 Remove the BIT_NOT_EXPR handling.  Thus, simply change this switch
 to

Why should we reject simple ~X operations from gimplified code here?
I admit that from FE-code we won't see that, as always an integer-cast
is done for foo (_Bool x) { ... if (~x) ... }, but from
gimplified-code this is the general description of an boolean-typed !=
0?

 if (code == TRUTH_NOT_EXPR)
  return simple_operand_p_2 (TREE_OPERAND (exp, 0));

 return simple_operand_p (exp);

 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    default:
 +  

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Richard Guenther
On Mon, Oct 17, 2011 at 12:59 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/17 Richard Guenther richard.guent...@gmail.com:
 On Fri, Oct 14, 2011 at 9:43 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 So I committed the gimplify patch separate.  And here is the remaining
 fold-const patch.
 The important tests here are in gcc.dg/tree-ssa/builtin-expect[1-4].c, which
 cover the one special-case for branching. Also tree-ssa/20040204-1.c covers
 tests for branching code (on targets having high-engough BRANCH_COST and no
 special-casing - like MIPS, S/390, and AVR.

 ChangeLog

 2011-10-14  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or 
 here.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Ok with ...

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple 
 enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));

 recurse with simple_operand_p.

 No, as this again would reject simple operations and additionally
 wouldn't check for trapping.

?  Your code allows arbitrarily complex expressions.  Also
tree_could_trap_p obviously extents to operands.


 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))

 Move this check before the tcc_comparison check and remove the
 then redundant tree_could_trap_p check there.

 Ok

 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;

 Do not handle here, it's handled in simple_operand_p.

 Well, was more a short-cut here.

 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;

 Remove the BIT_NOT_EXPR handling.  Thus, simply change this switch
 to

 Why should we reject simple ~X operations from gimplified code here?

Because this is FE triggered code.  From gimple you won't ever see
such complex expressions (never even the TRUTH_AND*_EXPR variants).

 I admit that from FE-code we won't see that, as always an 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Kai Tietz
2011/10/17 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 17, 2011 at 12:59 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/17 Richard Guenther richard.guent...@gmail.com:
 On Fri, Oct 14, 2011 at 9:43 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 So I committed the gimplify patch separate.  And here is the remaining
 fold-const patch.
 The important tests here are in gcc.dg/tree-ssa/builtin-expect[1-4].c, 
 which
 cover the one special-case for branching. Also tree-ssa/20040204-1.c covers
 tests for branching code (on targets having high-engough BRANCH_COST and no
 special-casing - like MIPS, S/390, and AVR.

 ChangeLog

 2011-10-14  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or 
 here.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Ok with ...

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, 
 tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple 
 enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and 
 logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));

 recurse with simple_operand_p.

 No, as this again would reject simple operations and additionally
 wouldn't check for trapping.

 ?  Your code allows arbitrarily complex expressions.  Also
 tree_could_trap_p obviously extents to operands.

Ah, ok. I wasn't aware that it walks into tree.


 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))

 Move this check before the tcc_comparison check and remove the
 then redundant tree_could_trap_p check there.

 Ok

 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;

 Do not handle here, it's handled in simple_operand_p.

 Well, was more a short-cut here.

 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;

 Remove the BIT_NOT_EXPR handling.  Thus, simply change this switch
 to

 Why should we reject simple ~X operations from gimplified code here?

 Because this is FE triggered code.  From gimple you won't ever see
 such complex 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Richard Guenther
On Mon, Oct 17, 2011 at 1:31 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/17 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 17, 2011 at 12:59 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/17 Richard Guenther richard.guent...@gmail.com:
 On Fri, Oct 14, 2011 at 9:43 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 So I committed the gimplify patch separate.  And here is the remaining
 fold-const patch.
 The important tests here are in gcc.dg/tree-ssa/builtin-expect[1-4].c, 
 which
 cover the one special-case for branching. Also tree-ssa/20040204-1.c 
 covers
 tests for branching code (on targets having high-engough BRANCH_COST and 
 no
 special-casing - like MIPS, S/390, and AVR.

 ChangeLog

 2011-10-14  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or 
 here.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Ok with ...

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, 
 tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple 
 enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and 
 logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));

 recurse with simple_operand_p.

 No, as this again would reject simple operations and additionally
 wouldn't check for trapping.

 ?  Your code allows arbitrarily complex expressions.  Also
 tree_could_trap_p obviously extents to operands.

 Ah, ok. I wasn't aware that it walks into tree.


 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))

 Move this check before the tcc_comparison check and remove the
 then redundant tree_could_trap_p check there.

 Ok

 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;

 Do not handle here, it's handled in simple_operand_p.

 Well, was more a short-cut here.

 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;

 Remove the BIT_NOT_EXPR handling.  Thus, simply change this switch
 to

 Why should we reject simple ~X operations from gimplified code here?

 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Kai Tietz
Ok, I see.  This might be profitable to do that.  So fold_truth_op
hunk looks like this

@@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
   build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
   ll_arg, rl_arg),
   build_int_cst (TREE_TYPE (ll_arg), 0));
-
-  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
-   {
- if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs)
-   return build2_loc (loc, code, truth_type, lhs, rhs);
- return NULL_TREE;
-   }
 }

   /* See if the comparisons can be merged.  Then get all the parameters for
@@ -8380,13 +8400,77 @@ fold_truth_andor (location_t loc, enum t
  lhs is another similar operation, try to merge its rhs with our
  rhs.  Then try to merge our lhs and rhs.  */
   if (TREE_CODE (arg0) == code
-   0 != (tem = fold_truthop (loc, code, type,
-  TREE_OPERAND (arg0, 1), arg1)))
+   0 != (tem = fold_truth_andor_1 (loc, code, type,
+TREE_OPERAND (arg0, 1), arg1)))
 return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);

-  if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
+  if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
 return tem;

+  if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
+   false) = 2)
+   LOGICAL_OP_NON_SHORT_CIRCUIT
+   simple_operand_p_2 (arg1))
+{
+  enum tree_code ncode;
+
+  if (code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
+{
+ ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR : TRUTH_OR_EXPR);
+
+ /* Transform ((A AND-IF B) AND-IF C) into (A AND-IF (B AND C)),
+or ((A OR-IF B) OR-IF C) into (A OR-IF (B OR C))
+We don't want to pack more than two leafs to a non-IF AND/OR
+expression.
+If tree-code of left-hand operand isn't an AND/OR-IF code and not
+equal to CODE, then we don't want to add right-hand operand.
+If the inner right-hand side of left-hand operand has
+side-effects, or isn't simple, then we can't add to it,
+as otherwise we might destroy if-sequence.  */
+ if (TREE_CODE (arg0) == code
+ /* Needed for sequence points to handle trappings, and
+side-effects.  */
+  simple_operand_p_2 (TREE_OPERAND (arg0, 1)))
+   {
+ tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1),
+arg1);
+ return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
+ tem);
+   }
+ /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)
+into (A OR B).
+For sequence point consistancy, we need to check for trapping,
+and side-effects.  */
+ else if (simple_operand_p_2 (arg0))
+   return fold_build2_loc (loc, ncode, type, arg0, arg1);
+   }
+  else
+{
+ ncode = (code == TRUTH_AND_EXPR ? TRUTH_ANDIF_EXPR
+ : TRUTH_ORIF_EXPR);
+ /* Transform ((A AND-IF B) AND C) into (A AND-IF (B AND C)),
+or ((A OR-IF B) OR C) into (A OR-IF (B OR C))
+We don't want to pack more than two leafs to a non-IF AND/OR
+expression.
+If tree-code of left-hand operand isn't an AND/OR-IF code and not
+equal to NCODE, then we don't want to add right-hand operand.
+If the inner right-hand side of left-hand operand has
+side-effects, or isn't simple, then we can't add to it,
+as otherwise we might destroy if-sequence.  */
+ if (TREE_CODE (arg0) == ncode
+ /* Needed for sequence points to handle trappings, and
+side-effects.  */
+  simple_operand_p_2 (TREE_OPERAND (arg0, 1)))
+   {
+ tem = fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 1),
+arg1);
+ return fold_build2_loc (loc, ncode, type,
+ TREE_OPERAND (arg0, 0), tem);
+   }
+   }
+
+}
+
   return NULL_TREE;
 }

Ok, with other changes you mentioned?

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Richard Guenther
On Mon, Oct 17, 2011 at 2:22 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Ok, I see.  This might be profitable to do that.  So fold_truth_op
 hunk looks like this

 @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
                           build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
                                   ll_arg, rl_arg),
                           build_int_cst (TREE_TYPE (ll_arg), 0));
 -
 -      if (LOGICAL_OP_NON_SHORT_CIRCUIT)
 -       {
 -         if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs)
 -           return build2_loc (loc, code, truth_type, lhs, rhs);
 -         return NULL_TREE;
 -       }
     }

   /* See if the comparisons can be merged.  Then get all the parameters for
 @@ -8380,13 +8400,77 @@ fold_truth_andor (location_t loc, enum t
      lhs is another similar operation, try to merge its rhs with our
      rhs.  Then try to merge our lhs and rhs.  */
   if (TREE_CODE (arg0) == code
 -       0 != (tem = fold_truthop (loc, code, type,
 -                                  TREE_OPERAND (arg0, 1), arg1)))
 +       0 != (tem = fold_truth_andor_1 (loc, code, type,
 +                                        TREE_OPERAND (arg0, 1), arg1)))
     return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);

 -  if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
 +  if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
     return tem;

 +  if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
 +                   false) = 2)
 +       LOGICAL_OP_NON_SHORT_CIRCUIT
 +       simple_operand_p_2 (arg1))
 +    {
 +      enum tree_code ncode;
 +
 +      if (code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
 +        {
 +         ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR : TRUTH_OR_EXPR);
 +
 +         /* Transform ((A AND-IF B) AND-IF C) into (A AND-IF (B AND C)),
 +            or ((A OR-IF B) OR-IF C) into (A OR-IF (B OR C))
 +            We don't want to pack more than two leafs to a non-IF AND/OR
 +            expression.
 +            If tree-code of left-hand operand isn't an AND/OR-IF code and not
 +            equal to CODE, then we don't want to add right-hand operand.
 +            If the inner right-hand side of left-hand operand has
 +            side-effects, or isn't simple, then we can't add to it,
 +            as otherwise we might destroy if-sequence.  */
 +         if (TREE_CODE (arg0) == code
 +             /* Needed for sequence points to handle trappings, and
 +                side-effects.  */
 +              simple_operand_p_2 (TREE_OPERAND (arg0, 1)))
 +           {
 +             tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1),
 +                                    arg1);
 +             return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                                     tem);
 +           }
 +         /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)
 +            into (A OR B).
 +            For sequence point consistancy, we need to check for trapping,
 +            and side-effects.  */
 +         else if (simple_operand_p_2 (arg0))
 +           return fold_build2_loc (loc, ncode, type, arg0, arg1);
 +       }
 +      else
 +        {
 +         ncode = (code == TRUTH_AND_EXPR ? TRUTH_ANDIF_EXPR
 +                                         : TRUTH_ORIF_EXPR);
 +         /* Transform ((A AND-IF B) AND C) into (A AND-IF (B AND C)),
 +            or ((A OR-IF B) OR C) into (A OR-IF (B OR C))
 +            We don't want to pack more than two leafs to a non-IF AND/OR
 +            expression.
 +            If tree-code of left-hand operand isn't an AND/OR-IF code and not
 +            equal to NCODE, then we don't want to add right-hand operand.
 +            If the inner right-hand side of left-hand operand has
 +            side-effects, or isn't simple, then we can't add to it,
 +            as otherwise we might destroy if-sequence.  */
 +         if (TREE_CODE (arg0) == ncode
 +             /* Needed for sequence points to handle trappings, and
 +                side-effects.  */
 +              simple_operand_p_2 (TREE_OPERAND (arg0, 1)))
 +           {
 +             tem = fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 1),
 +                                    arg1);
 +             return fold_build2_loc (loc, ncode, type,
 +                                     TREE_OPERAND (arg0, 0), tem);
 +           }
 +       }
 +
 +    }
 +
   return NULL_TREE;
  }

 Ok, with other changes you mentioned?

This can be done without so much code duplication.

 Regards,
 Kai



Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Kai Tietz
Sure,

Is simplier and also handles (A T[-IF] (B T-IF C) - (A T B) T-IF C
case, which can happen by framing in conditions.

@@ -8380,13 +8400,65 @@ fold_truth_andor (location_t loc, enum t
  lhs is another similar operation, try to merge its rhs with our
  rhs.  Then try to merge our lhs and rhs.  */
   if (TREE_CODE (arg0) == code
-   0 != (tem = fold_truthop (loc, code, type,
-  TREE_OPERAND (arg0, 1), arg1)))
+   0 != (tem = fold_truth_andor_1 (loc, code, type,
+TREE_OPERAND (arg0, 1), arg1)))
 return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);

-  if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
+  if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
 return tem;

+  if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
+   false) = 2)
+   LOGICAL_OP_NON_SHORT_CIRCUIT)
+{
+  enum tree_code ncode, icode;
+
+  ncode = (code == TRUTH_ANDIF_EXPR || code == TRUTH_AND_EXPR)
+ ? TRUTH_AND_EXPR : TRUTH_OR_EXPR;
+  icode = ncode == TRUTH_AND_EXPR ? TRUTH_ANDIF_EXPR : TRUTH_ORIF_EXPR;
+
+  /* Transform ((A AND-IF B) AND[-IF] C) into (A AND-IF (B AND C)),
+or ((A OR-IF B) OR[-IF] C) into (A OR-IF (B OR C))
+We don't want to pack more than two leafs to a non-IF AND/OR
+expression.
+If tree-code of left-hand operand isn't an AND/OR-IF code and not
+equal to IF-CODE, then we don't want to add right-hand operand.
+If the inner right-hand side of left-hand operand has
+side-effects, or isn't simple, then we can't add to it,
+as otherwise we might destroy if-sequence.  */
+  if (TREE_CODE (arg0) == icode
+  simple_operand_p_2 (arg1)
+ /* Needed for sequence points to handle trappings, and
+side-effects.  */
+  simple_operand_p_2 (TREE_OPERAND (arg0, 1)))
+   {
+ tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1),
+arg1);
+ return fold_build2_loc (loc, icode, type, TREE_OPERAND (arg0, 0),
+ tem);
+   }
+   /* Same as abouve but for (A AND[-IF] (B AND-IF C)) - ((A AND B) 
AND-IF C),
+  or (A OR[-IF] (B OR-IF C) - ((A OR B) OR-IF C).  */
+  else if (TREE_CODE (arg1) == icode
+  simple_operand_p_2 (arg0)
+ /* Needed for sequence points to handle trappings, and
+side-effects.  */
+  simple_operand_p_2 (TREE_OPERAND (arg1, 0)))
+   {
+ tem = fold_build2_loc (loc, ncode, type,
+arg0, TREE_OPERAND (arg1, 0));
+ return fold_build2_loc (loc, icode, type, tem,
+ TREE_OPERAND (arg1, 1));
+   }
+  /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)
+into (A OR B).
+For sequence point consistancy, we need to check for trapping,
+and side-effects.  */
+  else if (code == icode  simple_operand_p_2 (arg0)
+simple_operand_p_2 (arg1))
+   return fold_build2_loc (loc, ncode, type, arg0, arg1);
+}
+
   return NULL_TREE;
 }

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-17 Thread Richard Guenther
On Mon, Oct 17, 2011 at 3:36 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Sure,

 Is simplier and also handles (A T[-IF] (B T-IF C) - (A T B) T-IF C
 case, which can happen by framing in conditions.

 @@ -8380,13 +8400,65 @@ fold_truth_andor (location_t loc, enum t
      lhs is another similar operation, try to merge its rhs with our
      rhs.  Then try to merge our lhs and rhs.  */
   if (TREE_CODE (arg0) == code
 -       0 != (tem = fold_truthop (loc, code, type,
 -                                  TREE_OPERAND (arg0, 1), arg1)))
 +       0 != (tem = fold_truth_andor_1 (loc, code, type,
 +                                        TREE_OPERAND (arg0, 1), arg1)))
     return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);

 -  if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
 +  if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
     return tem;

 +  if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
 +                   false) = 2)
 +       LOGICAL_OP_NON_SHORT_CIRCUIT)
 +    {
 +      enum tree_code ncode, icode;
 +
 +      ncode = (code == TRUTH_ANDIF_EXPR || code == TRUTH_AND_EXPR)
 +             ? TRUTH_AND_EXPR : TRUTH_OR_EXPR;
 +      icode = ncode == TRUTH_AND_EXPR ? TRUTH_ANDIF_EXPR : TRUTH_ORIF_EXPR;
 +
 +      /* Transform ((A AND-IF B) AND[-IF] C) into (A AND-IF (B AND C)),
 +        or ((A OR-IF B) OR[-IF] C) into (A OR-IF (B OR C))
 +        We don't want to pack more than two leafs to a non-IF AND/OR
 +        expression.
 +        If tree-code of left-hand operand isn't an AND/OR-IF code and not
 +        equal to IF-CODE, then we don't want to add right-hand operand.
 +        If the inner right-hand side of left-hand operand has
 +        side-effects, or isn't simple, then we can't add to it,
 +        as otherwise we might destroy if-sequence.  */
 +      if (TREE_CODE (arg0) == icode
 +          simple_operand_p_2 (arg1)
 +         /* Needed for sequence points to handle trappings, and
 +            side-effects.  */
 +          simple_operand_p_2 (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1),
 +                                arg1);
 +         return fold_build2_loc (loc, icode, type, TREE_OPERAND (arg0, 0),
 +                                 tem);
 +       }
 +       /* Same as abouve but for (A AND[-IF] (B AND-IF C)) - ((A AND B) 
 AND-IF C),
 +          or (A OR[-IF] (B OR-IF C) - ((A OR B) OR-IF C).  */
 +      else if (TREE_CODE (arg1) == icode
 +          simple_operand_p_2 (arg0)
 +         /* Needed for sequence points to handle trappings, and
 +            side-effects.  */
 +          simple_operand_p_2 (TREE_OPERAND (arg1, 0)))
 +       {
 +         tem = fold_build2_loc (loc, ncode, type,
 +                                arg0, TREE_OPERAND (arg1, 0));
 +         return fold_build2_loc (loc, icode, type, tem,
 +                                 TREE_OPERAND (arg1, 1));
 +       }
 +      /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)
 +        into (A OR B).
 +        For sequence point consistancy, we need to check for trapping,
 +        and side-effects.  */
 +      else if (code == icode  simple_operand_p_2 (arg0)
 +                simple_operand_p_2 (arg1))
 +       return fold_build2_loc (loc, ncode, type, arg0, arg1);
 +    }
 +
   return NULL_TREE;
  }

Ok with the rest of the changes I requested.

Richard.

 Regards,
 Kai



Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-14 Thread Richard Guenther
On Thu, Oct 13, 2011 at 3:25 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 this new version addresses the comments from you.
 On gimplify.c's gimplify_expr we didn't handled the case that operands
 for TRUTH-AND/OR/XOR expressions need to have same operand-size in
 case  of transformation to bitwise-binary operation.  This shows up
 for Fortran, as there are more than one boolean-kind type with
 different mode-sizes.  I added a testcase for this,

The gimplify.c bits and the new testcase is ok.  They should have been
submitted separately.

Please re-submit the fold-const.c part.

Thanks,
Richard.

 ChangeLog

 2011-10-13  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or here.
        * gimplify.c (gimplify_expr): Take care that for bitwise-binary
        transformation the operands have compatible types.

 2011-10-13  Kai Tietz  kti...@redhat.com

        * gfortran.fortran-torture/compile/logical-2.f90: New test.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));
 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))
 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;
 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    default:
 +      return simple_operand_p (exp);
 +    }
 +}
 +

  /* The following functions are subroutines to fold_range_test and allow it to
    try to change a logical combination of comparisons into a range test.
 @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
  }

 -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
 +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
    bit value.  Arrange 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
Hello,

this new version addresses the comments from Michael and additional fixes
an latent issue shown up by this rewrite in fold-const.
On gimplify.c's gimple_boolify we didn't handled the case that operands
for TRUTH-expressions need to have same operand-size for transformation to
bitwise operation.  This shows up for Fortran, as here are more then one
boolean-kind type with different mode-sizes.  I added a testcase for this,

ChangeLog

2011-10-13  Kai Tietz  kti...@redhat.com

* fold-const.c (simple_operand_p_2): New function.
(fold_truthop): Rename to
(fold_truth_andor_1): function name.
Additionally remove branching creation for logical and/or.
(fold_truth_andor): Handle branching creation for logical and/or here.
* gimplify.c (gimple_boolify): Take care that for bitwise-binary
transformation the operands have compatible types.

2011-10-13  Kai Tietz  kti...@redhat.com

* gfortran.fortran-torture/compile/logical-2.f90: New test.

Bootstrapped and regression-tested for all languages plus Ada and
Obj-C++ on x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c
+++ gcc/gcc/fold-const.c
@@ -112,13 +112,13 @@ static tree decode_field_reference (loca
 static int all_ones_mask_p (const_tree, int);
 static tree sign_bit_p (tree, const_tree);
 static int simple_operand_p (const_tree);
+static bool simple_operand_p_2 (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
 static tree range_predecessor (tree);
 static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, tree,
tree, tree);
 static tree unextend (tree, int, int, tree);
-static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
 static tree optimize_minmax_comparison (location_t, enum tree_code,
tree, tree, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a field reference.

If EXP is a comparison reference, we return the innermost reference.

@@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
 }

-/* Subroutine for fold_truthop: determine if an operand is simple enough
+/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
to be evaluated unconditionally.  */

 static int
@@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
- || TREE_CODE (exp) == SSA_NAME
+ || TREE_CODE (exp) == SSA_NAME
  || (DECL_P (exp)
   ! TREE_ADDRESSABLE (exp)
   ! TREE_THIS_VOLATILE (exp)
@@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
 registers aren't expensive.  */
   (! TREE_STATIC (exp) || DECL_REGISTER (exp;
 }
+
+/* Subroutine for fold_truth_andor: determine if an operand is simple enough
+   to be evaluated unconditionally.
+   I addition to simple_operand_p, we assume that comparisons and logic-not
+   operations are simple, if their operands are simple, too.  */
+
+static bool
+simple_operand_p_2 (tree exp)
+{
+  enum tree_code code;
+
+  /* Strip any conversions that don't change the machine mode.  */
+  STRIP_NOPS (exp);
+
+  code = TREE_CODE (exp);
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+return (!tree_could_trap_p (exp)
+simple_operand_p_2 (TREE_OPERAND (exp, 0))
+simple_operand_p_2 (TREE_OPERAND (exp, 1)));
+
+  if (TREE_SIDE_EFFECTS (exp)
+  || tree_could_trap_p (exp))
+return false;
+
+  switch (code)
+{
+case SSA_NAME:
+  return true;
+case TRUTH_NOT_EXPR:
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+case BIT_NOT_EXPR:
+  if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
+   return false;
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+default:
+  return simple_operand_p (exp);
+}
+}
+
 
 /* The following functions are subroutines to fold_range_test and allow it to
try to change a logical combination of comparisons into a range test.
@@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
 }
 
-/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
+/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
bit value.  Arrange things so the extra bits will be set to zero if and
only if C is signed-extended to its full width.  If MASK is nonzero,
it is an INTEGER_CST that should be AND'ed with the extra bits.  */
@@ -5025,8 +5065,8 @@ merge_truthop_with_opposite_arm (locatio
   

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Richard Guenther
On Thu, Oct 13, 2011 at 1:25 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 this new version addresses the comments from Michael and additional fixes
 an latent issue shown up by this rewrite in fold-const.
 On gimplify.c's gimple_boolify we didn't handled the case that operands
 for TRUTH-expressions need to have same operand-size for transformation to
 bitwise operation.  This shows up for Fortran, as here are more then one
 boolean-kind type with different mode-sizes.  I added a testcase for this,

 ChangeLog

 2011-10-13  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or here.
        * gimplify.c (gimple_boolify): Take care that for bitwise-binary
        transformation the operands have compatible types.

 2011-10-13  Kai Tietz  kti...@redhat.com

        * gfortran.fortran-torture/compile/logical-2.f90: New test.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));
 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))
 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;
 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    default:
 +      return simple_operand_p (exp);
 +    }
 +}
 +

  /* The following functions are subroutines to fold_range_test and allow it to
    try to change a logical combination of comparisons into a range test.
 @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
  }

 -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
 +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
    bit value.  Arrange things so the extra bits will be set to zero if and
    only if C is signed-extended to its full 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
2011/10/13 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 13, 2011 at 1:25 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 this new version addresses the comments from Michael and additional fixes
 an latent issue shown up by this rewrite in fold-const.
 On gimplify.c's gimple_boolify we didn't handled the case that operands
 for TRUTH-expressions need to have same operand-size for transformation to
 bitwise operation.  This shows up for Fortran, as here are more then one
 boolean-kind type with different mode-sizes.  I added a testcase for this,

 ChangeLog

 2011-10-13  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p_2): New function.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove branching creation for logical and/or.
        (fold_truth_andor): Handle branching creation for logical and/or here.
        * gimplify.c (gimple_boolify): Take care that for bitwise-binary
        transformation the operands have compatible types.

 2011-10-13  Kai Tietz  kti...@redhat.com

        * gfortran.fortran-torture/compile/logical-2.f90: New test.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
  static int simple_operand_p (const_tree);
 +static bool simple_operand_p_2 (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
 +         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
                 registers aren't expensive.  */
               (! TREE_STATIC (exp) || DECL_REGISTER (exp;
  }
 +
 +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
 +   to be evaluated unconditionally.
 +   I addition to simple_operand_p, we assume that comparisons and logic-not
 +   operations are simple, if their operands are simple, too.  */
 +
 +static bool
 +simple_operand_p_2 (tree exp)
 +{
 +  enum tree_code code;
 +
 +  /* Strip any conversions that don't change the machine mode.  */
 +  STRIP_NOPS (exp);
 +
 +  code = TREE_CODE (exp);
 +
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (!tree_could_trap_p (exp)
 +            simple_operand_p_2 (TREE_OPERAND (exp, 0))
 +            simple_operand_p_2 (TREE_OPERAND (exp, 1)));
 +
 +  if (TREE_SIDE_EFFECTS (exp)
 +      || tree_could_trap_p (exp))
 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;
 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +       return false;
 +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
 +    default:
 +      return simple_operand_p (exp);
 +    }
 +}
 +

  /* The following functions are subroutines to fold_range_test and allow it 
 to
    try to change a logical combination of comparisons into a range test.
 @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
  }

 -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
 +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
    bit value.  Arrange things so the extra bits will be set to 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
Hello,

this new version addresses the comments from you.
On gimplify.c's gimplify_expr we didn't handled the case that operands
for TRUTH-AND/OR/XOR expressions need to have same operand-size in
case  of transformation to bitwise-binary operation.  This shows up
for Fortran, as there are more than one boolean-kind type with
different mode-sizes.  I added a testcase for this,

ChangeLog

2011-10-13  Kai Tietz  kti...@redhat.com

* fold-const.c (simple_operand_p_2): New function.
(fold_truthop): Rename to
(fold_truth_andor_1): function name.
Additionally remove branching creation for logical and/or.
(fold_truth_andor): Handle branching creation for logical and/or here.
* gimplify.c (gimplify_expr): Take care that for bitwise-binary
transformation the operands have compatible types.

2011-10-13  Kai Tietz  kti...@redhat.com

* gfortran.fortran-torture/compile/logical-2.f90: New test.

Bootstrapped and regression-tested for all languages plus Ada and
Obj-C++ on x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c
+++ gcc/gcc/fold-const.c
@@ -112,13 +112,13 @@ static tree decode_field_reference (loca
 static int all_ones_mask_p (const_tree, int);
 static tree sign_bit_p (tree, const_tree);
 static int simple_operand_p (const_tree);
+static bool simple_operand_p_2 (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
 static tree range_predecessor (tree);
 static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, tree,
tree, tree);
 static tree unextend (tree, int, int, tree);
-static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
 static tree optimize_minmax_comparison (location_t, enum tree_code,
tree, tree, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a field reference.

If EXP is a comparison reference, we return the innermost reference.

@@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
 }

-/* Subroutine for fold_truthop: determine if an operand is simple enough
+/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
to be evaluated unconditionally.  */

 static int
@@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
- || TREE_CODE (exp) == SSA_NAME
+ || TREE_CODE (exp) == SSA_NAME
  || (DECL_P (exp)
   ! TREE_ADDRESSABLE (exp)
   ! TREE_THIS_VOLATILE (exp)
@@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
 registers aren't expensive.  */
   (! TREE_STATIC (exp) || DECL_REGISTER (exp;
 }
+
+/* Subroutine for fold_truth_andor: determine if an operand is simple enough
+   to be evaluated unconditionally.
+   I addition to simple_operand_p, we assume that comparisons and logic-not
+   operations are simple, if their operands are simple, too.  */
+
+static bool
+simple_operand_p_2 (tree exp)
+{
+  enum tree_code code;
+
+  /* Strip any conversions that don't change the machine mode.  */
+  STRIP_NOPS (exp);
+
+  code = TREE_CODE (exp);
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+return (!tree_could_trap_p (exp)
+simple_operand_p_2 (TREE_OPERAND (exp, 0))
+simple_operand_p_2 (TREE_OPERAND (exp, 1)));
+
+  if (TREE_SIDE_EFFECTS (exp)
+  || tree_could_trap_p (exp))
+return false;
+
+  switch (code)
+{
+case SSA_NAME:
+  return true;
+case TRUTH_NOT_EXPR:
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+case BIT_NOT_EXPR:
+  if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
+   return false;
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+default:
+  return simple_operand_p (exp);
+}
+}
+
 
 /* The following functions are subroutines to fold_range_test and allow it to
try to change a logical combination of comparisons into a range test.
@@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
 }
 
-/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
+/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
bit value.  Arrange things so the extra bits will be set to zero if and
only if C is signed-extended to its full width.  If MASK is nonzero,
it is an INTEGER_CST that should be AND'ed with the extra bits.  */
@@ -5025,8 +5065,8 @@ merge_truthop_with_opposite_arm (locatio
We return the simplified tree or 0 if no optimization 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Michael Matz
Hi,

On Thu, 13 Oct 2011, Kai Tietz wrote:

 this new version addresses the comments from Michael and additional fixes
 an latent issue shown up by this rewrite in fold-const.
 On gimplify.c's gimple_boolify we didn't handled the case that operands
 for TRUTH-expressions need to have same operand-size for transformation to
 bitwise operation.

The requirement comes from BIT_AND_EXPR, not from any of the TRUTH_* 
expressions, hence the point of generating the BIT_AND_EXPR is the point 
to fixup the types.  Similar to this (fixes your testcase):

Index: gimplify.c
===
--- gimplify.c  (revision 179855)
+++ gimplify.c  (working copy)
   /* See if any simplifications can be done based on what the RHS is.  */
@@ -7257,6 +7264,18 @@ gimplify_expr (tree *expr_p, gimple_seq
  {
tree orig_type = TREE_TYPE (*expr_p);
*expr_p = gimple_boolify (*expr_p);
+   /* We are going to transform this into BIT operations,
+  which have stricter requirements on the operand types.  */
+   if (!useless_type_conversion_p
+(orig_type, TREE_TYPE (TREE_OPERAND (*expr_p, 0
+ TREE_OPERAND (*expr_p, 0)
+   = fold_convert_loc (input_location, orig_type,
+   TREE_OPERAND (*expr_p, 0));
+   if (!useless_type_conversion_p
+(orig_type, TREE_TYPE (TREE_OPERAND (*expr_p, 1
+ TREE_OPERAND (*expr_p, 1)
+   = fold_convert_loc (input_location, orig_type,
+   TREE_OPERAND (*expr_p, 1));
if (!useless_type_conversion_p (orig_type, TREE_TYPE (*expr_p)))
  {
*expr_p = fold_convert_loc (input_location, orig_type, *expr_p);


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-12 Thread Michael Matz
Hi,

On Wed, 12 Oct 2011, Kai Tietz wrote:

  And I think it could use some overview of the transformation done like in
  the initial patch, ala:
 
  Transform ((A  B)  C) into (A  (B  C)).
 
  and
 
  Or (A  B) into (A  B). for this part:
 
  +     /* Needed for sequence points to handle trappings, and side-effects.  
  */
  +     else if (simple_operand_p_2 (arg0))
  +       return fold_build2_loc (loc, ncode, type, arg0, arg1);
 
 Well to use here binary form of operation seems to me misleading.

Hmm?  What do you mean?  Both operations are binary.  ANDIF is '', AND 
is ''.  In fold-const.c comments we usually use the C notations of the 
operations.

 It is right that the non-IF AND/OR has finally the same behavior as the 
 binary form in gimple.  Nevertheless it isn't the same on AST level. But 
 sure I can Add comments for operations like (A OR/AND-IF B) OR/AND-IF C 
 - (A OR/AND-IF (B OR/AND C and A OR/AND-IF C - (A OR/AND C)

Too much noise, leave out the || variant, and just say once Same for ||.


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-12 Thread Kai Tietz
2011/10/12 Michael Matz m...@suse.de:
 Hi,

 On Wed, 12 Oct 2011, Kai Tietz wrote:

  And I think it could use some overview of the transformation done like in
  the initial patch, ala:
 
  Transform ((A  B)  C) into (A  (B  C)).
 
  and
 
  Or (A  B) into (A  B). for this part:
 
  +     /* Needed for sequence points to handle trappings, and side-effects. 
   */
  +     else if (simple_operand_p_2 (arg0))
  +       return fold_build2_loc (loc, ncode, type, arg0, arg1);

 Well to use here binary form of operation seems to me misleading.

 Hmm?  What do you mean?  Both operations are binary.  ANDIF is '', AND
 is ''.  In fold-const.c comments we usually use the C notations of the
 operations.

See TRUTH_AND_EXPR is in C-notation  and TRUTH_ANDIF_EXPR is also
.  The transcription to binary  is done in gimplifier.  Btw I just
noticed that by this patch a latent bug in gimplifier about
boolification for TRUTH_NOT_EXPR/TRUTH_AND/OR... is present.
On Fortran there are different boolean-kinds types with different
precision.  This makes them incompatible to eachother in gimple (as
useless_type_conversion_p returns for them false).  For gimplier needs
to ensure that operands for those TRUTH_... expression met a
compatible type of final expression type.

I will sent a patch for this as soon I completed regression-testing for it.

 It is right that the non-IF AND/OR has finally the same behavior as the
 binary form in gimple.  Nevertheless it isn't the same on AST level. But
 sure I can Add comments for operations like (A OR/AND-IF B) OR/AND-IF C
 - (A OR/AND-IF (B OR/AND C and A OR/AND-IF C - (A OR/AND C)

 Too much noise, leave out the || variant, and just say once Same for ||.


 Ciao,
 Michael.

Cheers,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-12 Thread Michael Matz
Hi,

On Wed, 12 Oct 2011, Kai Tietz wrote:

  Hmm?  What do you mean?  Both operations are binary.  ANDIF is '', AND
  is ''.  In fold-const.c comments we usually use the C notations of the
  operations.
 
 See TRUTH_AND_EXPR is in C-notation  and TRUTH_ANDIF_EXPR is also
 .

Ah right, confusing but there we are.  A comment using ANDIF and AND it is 
then.


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-11 Thread Michael Matz
Hi,

On Mon, 10 Oct 2011, Kai Tietz wrote:

 To ensure that we use simple_operand_p in all cases, beside for 
 branching AND/OR chains, in same way as before, I added to this function 
 an additional argument, by which the looking into comparisons can be 
 activated.

Better make it a separate function the first tests your new conditions, 
and then calls simple_operand_p.

 +fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
 + tree lhs, tree rhs)
  {
/* If this is the or of two comparisons, we can do something if
   the comparisons are NE_EXPR.  If this is the and, we can do something
 @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
  build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
  ll_arg, rl_arg),
  build_int_cst (TREE_TYPE (ll_arg), 0));
 -
 -  if (LOGICAL_OP_NON_SHORT_CIRCUIT)
 - {
 -   if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs)
 - return build2_loc (loc, code, truth_type, lhs, rhs);
 -   return NULL_TREE;
 - }

Why do you remove this hunk?  Shouldn't you instead move the hunk you 
added to fold_truth_andor() here.  I realize this needs some TLC to 
fold_truth_andor_1, because right now it early-outs for non-comparisons, 
but it seems the better place.  I.e. somehow move the below code into the 
above branch, with the associated diddling on fold_truth_andor_1 that it 
gets called.

 +  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
 +   (BRANCH_COST (optimize_function_for_speed_p (cfun),
 +false) = 2)
 +   !TREE_SIDE_EFFECTS (arg1)
 +   LOGICAL_OP_NON_SHORT_CIRCUIT
 +   simple_operand_p (arg1, true))
 +{
 +  enum tree_code ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +: TRUTH_OR_EXPR);
 +
 +  /* We don't want to pack more then two leafs to an non-IF

Missing continuation of the sentence?

 + If tree-code of left-hand operand isn't an AND/OR-IF code and not
 + equal to CODE, then we don't want to add right-hand operand.
 + If the inner right-hand side of left-hand operand has side-effects,
 + or isn't simple, then we can't add to it, as otherwise we might
 + destroy if-sequence.  */


 +  if (TREE_CODE (arg0) == code
 +   /* Needed for sequence points to handle trappings, and
 +  side-effects.  */
 +!TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +simple_operand_p (TREE_OPERAND (arg0, 1), true))
 +   {
 + tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1),
 + arg1);
 + return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +  tem);
 +   }
 + /* Needed for sequence points to handle trappings, and side-effects.  */
 + else if (!TREE_SIDE_EFFECTS (arg0)
 +simple_operand_p (arg0, true))
 +   return fold_build2_loc (loc, ncode, type, arg0, arg1);
 +}
 +


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-11 Thread Kai Tietz
So updated version for patch.  It creates new simple_operand_p_2
function instead of modifying simple_operand_p function.

ChangeLog

2011-10-11  Kai Tietz  kti...@redhat.com

* fold-const.c (simple_operand_p_2): New function.
(fold_truthop): Rename to
(fold_truth_andor_1): function name.
Additionally remove branching creation for logical and/or.
(fold_truth_andor): Handle branching creation for logical and/or here.

Bootstrapped and regression-tested for all languages plus Ada and
Obj-C++ on x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c
+++ gcc/gcc/fold-const.c
@@ -112,13 +112,13 @@ static tree decode_field_reference (loca
 static int all_ones_mask_p (const_tree, int);
 static tree sign_bit_p (tree, const_tree);
 static int simple_operand_p (const_tree);
+static bool simple_operand_p_2 (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
 static tree range_predecessor (tree);
 static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, tree,
tree, tree);
 static tree unextend (tree, int, int, tree);
-static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
 static tree optimize_minmax_comparison (location_t, enum tree_code,
tree, tree, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a field reference.

If EXP is a comparison reference, we return the innermost reference.

@@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
 }

-/* Subroutine for fold_truthop: determine if an operand is simple enough
+/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
to be evaluated unconditionally.  */

 static int
@@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
 registers aren't expensive.  */
   (! TREE_STATIC (exp) || DECL_REGISTER (exp;
 }
+
+/* Subroutine for fold_truth_andor: determine if an operand is simple enough
+   to be evaluated unconditionally.
+   I addition to simple_operand_p, we assume that comparisons and logic-not
+   operations are simple, if their operands are simple, too.  */
+
+static bool
+simple_operand_p_2 (tree exp)
+{
+  enum tree_code code;
+
+  /* Strip any conversions that don't change the machine mode.  */
+  STRIP_NOPS (exp);
+
+  code = TREE_CODE (exp);
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+return (!tree_could_trap_p (exp)
+simple_operand_p_2 (TREE_OPERAND (exp, 0))
+simple_operand_p_2 (TREE_OPERAND (exp, 1)));
+
+  if (FLOAT_TYPE_P (TREE_TYPE (exp))
+   tree_could_trap_p (exp))
+return false;
+
+  switch (code)
+{
+case SSA_NAME:
+  return true;
+case TRUTH_NOT_EXPR:
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+case BIT_NOT_EXPR:
+  if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
+   return false;
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+default:
+  return simple_operand_p (exp);
+}
+}
+
 
 /* The following functions are subroutines to fold_range_test and allow it to
try to change a logical combination of comparisons into a range test.
@@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
 }
 
-/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
+/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
bit value.  Arrange things so the extra bits will be set to zero if and
only if C is signed-extended to its full width.  If MASK is nonzero,
it is an INTEGER_CST that should be AND'ed with the extra bits.  */
@@ -5025,8 +5065,8 @@ merge_truthop_with_opposite_arm (locatio
We return the simplified tree or 0 if no optimization is possible.  */

 static tree
-fold_truthop (location_t loc, enum tree_code code, tree truth_type,
- tree lhs, tree rhs)
+fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
+   tree lhs, tree rhs)
 {
   /* If this is the or of two comparisons, we can do something if
  the comparisons are NE_EXPR.  If this is the and, we can do something
@@ -5054,8 +5094,6 @@ fold_truthop (location_t loc, enum tree_
   tree lntype, rntype, result;
   HOST_WIDE_INT first_bit, end_bit;
   int volatilep;
-  tree orig_lhs = lhs, orig_rhs = rhs;
-  enum tree_code orig_code = code;

   /* Start by getting the comparison codes.  Fail if anything is volatile.
  If one operand is a BIT_AND_EXPR with the constant one, treat it as if
@@ -5119,8 +5157,7 @@ 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-11 Thread Michael Matz
Hi,

On Tue, 11 Oct 2011, Kai Tietz wrote:

  Better make it a separate function the first tests your new 
  conditions, and then calls simple_operand_p.
 
 Well, either I make it a new function and call it instead of 
 simple_operand_p,

That's what I meant, yes.

  @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
                           build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
                                   ll_arg, rl_arg),
                           build_int_cst (TREE_TYPE (ll_arg), 0));
  -
  -      if (LOGICAL_OP_NON_SHORT_CIRCUIT)
  -     {
  -       if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs)
  -         return build2_loc (loc, code, truth_type, lhs, rhs);
  -       return NULL_TREE;
  -     }
 
  Why do you remove this hunk?  Shouldn't you instead move the hunk you
  added to fold_truth_andor() here.  I realize this needs some TLC to
  fold_truth_andor_1, because right now it early-outs for non-comparisons,
  but it seems the better place.  I.e. somehow move the below code into the
  above branch, with the associated diddling on fold_truth_andor_1 that it
  gets called.
 
 This hunk is removed, as it is vain to do here.

There is a fallthrough now, that wasn't there before.  I don't know if 
it's harmless, I just wanted to mention it.

 Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is 
 better done at a single place in fold_truth_andor only.

As fold_truthop is called twice by fold_truth_andor, the latter might 
indeed be the better place.


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-11 Thread Kai Tietz
2011/10/11 Michael Matz m...@suse.de:
 Hi,

 On Tue, 11 Oct 2011, Kai Tietz wrote:

  Better make it a separate function the first tests your new
  conditions, and then calls simple_operand_p.

 Well, either I make it a new function and call it instead of
 simple_operand_p,

 That's what I meant, yes.

  @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
                           build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
                                   ll_arg, rl_arg),
                           build_int_cst (TREE_TYPE (ll_arg), 0));
  -
  -      if (LOGICAL_OP_NON_SHORT_CIRCUIT)
  -     {
  -       if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs)
  -         return build2_loc (loc, code, truth_type, lhs, rhs);
  -       return NULL_TREE;
  -     }
 
  Why do you remove this hunk?  Shouldn't you instead move the hunk you
  added to fold_truth_andor() here.  I realize this needs some TLC to
  fold_truth_andor_1, because right now it early-outs for non-comparisons,
  but it seems the better place.  I.e. somehow move the below code into the
  above branch, with the associated diddling on fold_truth_andor_1 that it
  gets called.

 This hunk is removed, as it is vain to do here.

 There is a fallthrough now, that wasn't there before.  I don't know if
 it's harmless, I just wanted to mention it.

It is.  Before we changed expression here and recurse here with the
non-IF AND/OR expression later.  So there is no need to do this
recursion.

 Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is
 better done at a single place in fold_truth_andor only.

 As fold_truthop is called twice by fold_truth_andor, the latter might
 indeed be the better place.


 Ciao,
 Michael.

Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-11 Thread Michael Matz
Hi,

On Tue, 11 Oct 2011, Kai Tietz wrote:

 So updated version for patch.  It creates new simple_operand_p_2
 function instead of modifying simple_operand_p function.

FWIW: I also can't think of a nice short name for that predicate function 
:)  One thing: move the test for TREE_SIDE_EFFECTS to that new function, 
then the if()s in fold_truth_andor become nicer.  I think the code then is 
okay, but I can't approve.  Just one remark about the comment:

 +  /* We don't want to pack more then two leafs to an non-IF AND/OR

s/then/than/ s/an/a/

 + expression.
 + If tree-code of left-hand operand isn't an AND/OR-IF code and not
 + equal to CODE, then we don't want to add right-hand operand.
 + If the inner right-hand side of left-hand operand has side-effects,
 + or isn't simple, then we can't add to it, as otherwise we might
 + destroy if-sequence.  */

And I think it could use some overview of the transformation done like in 
the initial patch, ala:

Transform ((A  B)  C) into (A  (B  C)).

and

Or (A  B) into (A  B). for this part:

+ /* Needed for sequence points to handle trappings, and side-effects.  */
+ else if (simple_operand_p_2 (arg0))
+   return fold_build2_loc (loc, ncode, type, arg0, arg1);


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Kai Tietz
2011/10/7 Kai Tietz ktiet...@googlemail.com:
 Hello,

 this is the updated version with the suggestion

 2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

 I reworked simple_operand_p so that it does this special-casing and 
 additionally
 also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

 Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

 It is required.  For left-hand operand, if it isn't a logical
 and/or/xor, we need to check for side-effects (and for trapping).  I
 see that calling of simple_operand_p is wrong here, as it rejects too
 much.  Nevertheless the check for side-effects is necessary for having
 valid sequence-points.  Without that checking a simple test

So said, it is even required to use for right-hand and left-hand side
of arguments, if one of them have side-effects or isn't simple.  Means
the check in my patch should use for

 +     else if (TREE_CODE (arg0) != TRUTH_AND_EXPR
 +              TREE_CODE (arg0) != TRUTH_OR_EXPR
 +              TREE_CODE (arg0) != TRUTH_ANDIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_ORIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_XOR_EXPR
 +             /* Needed for sequence points and trappings, or side-effects.  
 */
 +              !TREE_SIDE_EFFECTS (arg0)
 +              !tree_could_trap_p (arg0))
 +       return fold_build2_loc (loc, ncode, type, arg0, arg1);

instead if (!TREE_SIDE_EFFECTS (arg0)  simple_operand_p (arg0)) 
instead.

The cause for this are the consitancies of sequences in tree.  I
noticed that by tests in gcc.dg/tree-ssa about builitin_expect.

for example we have

extern int foo (void); /* foo modifies gbl1 */
int gbl1 = 0;

int foo (int ns1)
{
  if (ns1  foo ()  gbl1)
return 1;
  return 0;
}

so chain of trees has to look like this:
(ANDIF (ns1 (ANDIF foo () gbl1))

but if we don't check here for side-effects for left-hand chaining
operand, then we end up with
(AND ns1 (ANDIF foo () gbl1))

As AND and has associative property, tree says that right-hand and
left-hand are exchangable, which is obviously wrong.

Cheers,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Richard Guenther
On Mon, Oct 10, 2011 at 12:35 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/7 Kai Tietz ktiet...@googlemail.com:
 Hello,

 this is the updated version with the suggestion

 2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

 I reworked simple_operand_p so that it does this special-casing and 
 additionally
 also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

 Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

 It is required.  For left-hand operand, if it isn't a logical
 and/or/xor, we need to check for side-effects (and for trapping).  I
 see that calling of simple_operand_p is wrong here, as it rejects too
 much.  Nevertheless the check for side-effects is necessary for having
 valid sequence-points.  Without that checking a simple test

 So said, it is even required to use for right-hand and left-hand side
 of arguments, if one of them have side-effects or isn't simple.  Means
 the check in my patch should use for

 +     else if (TREE_CODE (arg0) != TRUTH_AND_EXPR
 +              TREE_CODE (arg0) != TRUTH_OR_EXPR
 +              TREE_CODE (arg0) != TRUTH_ANDIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_ORIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_XOR_EXPR
 +             /* Needed for sequence points and trappings, or side-effects.  
 */
 +              !TREE_SIDE_EFFECTS (arg0)
 +              !tree_could_trap_p (arg0))
 +       return fold_build2_loc (loc, ncode, type, arg0, arg1);

 instead if (!TREE_SIDE_EFFECTS (arg0)  simple_operand_p (arg0)) 
 instead.

 The cause for this are the consitancies of sequences in tree.  I
 noticed that by tests in gcc.dg/tree-ssa about builitin_expect.

 for example we have

 extern int foo (void); /* foo modifies gbl1 */
 int gbl1 = 0;

 int foo (int ns1)
 {
  if (ns1  foo ()  gbl1)
    return 1;
  return 0;
 }

 so chain of trees has to look like this:
 (ANDIF (ns1 (ANDIF foo () gbl1))

 but if we don't check here for side-effects for left-hand chaining
 operand, then we end up with
 (AND ns1 (ANDIF foo () gbl1))

No we don't, as the right-hand (ANDIF foo () glbl1) has side-effects.

 As AND and has associative property, tree says that right-hand and
 left-hand are exchangable, which is obviously wrong.

The poitn is that if the right-hand does not have side-effects it doesn't
matter if we execute it before the left-hand (independent on whether
that has side-effects or not).

Richard.

 Cheers,
 Kai



Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Kai Tietz
2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 12:35 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/7 Kai Tietz ktiet...@googlemail.com:
 Hello,

 this is the updated version with the suggestion

 2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

 I reworked simple_operand_p so that it does this special-casing and 
 additionally
 also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

 Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

 It is required.  For left-hand operand, if it isn't a logical
 and/or/xor, we need to check for side-effects (and for trapping).  I
 see that calling of simple_operand_p is wrong here, as it rejects too
 much.  Nevertheless the check for side-effects is necessary for having
 valid sequence-points.  Without that checking a simple test

 So said, it is even required to use for right-hand and left-hand side
 of arguments, if one of them have side-effects or isn't simple.  Means
 the check in my patch should use for

 +     else if (TREE_CODE (arg0) != TRUTH_AND_EXPR
 +              TREE_CODE (arg0) != TRUTH_OR_EXPR
 +              TREE_CODE (arg0) != TRUTH_ANDIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_ORIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_XOR_EXPR
 +             /* Needed for sequence points and trappings, or side-effects. 
  */
 +              !TREE_SIDE_EFFECTS (arg0)
 +              !tree_could_trap_p (arg0))
 +       return fold_build2_loc (loc, ncode, type, arg0, arg1);

 instead if (!TREE_SIDE_EFFECTS (arg0)  simple_operand_p (arg0)) 
 instead.

 The cause for this are the consitancies of sequences in tree.  I
 noticed that by tests in gcc.dg/tree-ssa about builitin_expect.

 for example we have

 extern int foo (void); /* foo modifies gbl1 */
 int gbl1 = 0;

 int foo (int ns1)
 {
  if (ns1  foo ()  gbl1)
    return 1;
  return 0;
 }

 so chain of trees has to look like this:
 (ANDIF (ns1 (ANDIF foo () gbl1))

 but if we don't check here for side-effects for left-hand chaining
 operand, then we end up with
 (AND ns1 (ANDIF foo () gbl1))

 No we don't, as the right-hand (ANDIF foo () glbl1) has side-effects.

 As AND and has associative property, tree says that right-hand and
 left-hand are exchangable, which is obviously wrong.

 The poitn is that if the right-hand does not have side-effects it doesn't
 matter if we execute it before the left-hand (independent on whether
 that has side-effects or not).

 Richard.

thats just true as long as we don't make use of associative law for
AND expressions.
Otherwise we would fail for much simpler cases like
entern int doo ();
int foo ()
{
  int c, r = 0;
  if ((c = foo ()) != 0  c  20)
r = 1;
  return 0;
}

as for this left-hand operand has side-effects, but as it is the first
one, we would chain it as AND.  So we get wrong sequence.

Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Kai Tietz
Sample had a typo. Correct sample has of course to return r.

 int foo ()
 {
  int c, r = 0;
  if ((c = foo ()) != 0  c  20)
    r = 1;
  return r;
}


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Richard Guenther
On Mon, Oct 10, 2011 at 12:47 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 12:35 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/7 Kai Tietz ktiet...@googlemail.com:
 Hello,

 this is the updated version with the suggestion

 2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

 I reworked simple_operand_p so that it does this special-casing and 
 additionally
 also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

 Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

 It is required.  For left-hand operand, if it isn't a logical
 and/or/xor, we need to check for side-effects (and for trapping).  I
 see that calling of simple_operand_p is wrong here, as it rejects too
 much.  Nevertheless the check for side-effects is necessary for having
 valid sequence-points.  Without that checking a simple test

 So said, it is even required to use for right-hand and left-hand side
 of arguments, if one of them have side-effects or isn't simple.  Means
 the check in my patch should use for

 +     else if (TREE_CODE (arg0) != TRUTH_AND_EXPR
 +              TREE_CODE (arg0) != TRUTH_OR_EXPR
 +              TREE_CODE (arg0) != TRUTH_ANDIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_ORIF_EXPR
 +              TREE_CODE (arg0) != TRUTH_XOR_EXPR
 +             /* Needed for sequence points and trappings, or 
 side-effects.  */
 +              !TREE_SIDE_EFFECTS (arg0)
 +              !tree_could_trap_p (arg0))
 +       return fold_build2_loc (loc, ncode, type, arg0, arg1);

 instead if (!TREE_SIDE_EFFECTS (arg0)  simple_operand_p (arg0)) 
 instead.

 The cause for this are the consitancies of sequences in tree.  I
 noticed that by tests in gcc.dg/tree-ssa about builitin_expect.

 for example we have

 extern int foo (void); /* foo modifies gbl1 */
 int gbl1 = 0;

 int foo (int ns1)
 {
  if (ns1  foo ()  gbl1)
    return 1;
  return 0;
 }

 so chain of trees has to look like this:
 (ANDIF (ns1 (ANDIF foo () gbl1))

 but if we don't check here for side-effects for left-hand chaining
 operand, then we end up with
 (AND ns1 (ANDIF foo () gbl1))

 No we don't, as the right-hand (ANDIF foo () glbl1) has side-effects.

 As AND and has associative property, tree says that right-hand and
 left-hand are exchangable, which is obviously wrong.

 The poitn is that if the right-hand does not have side-effects it doesn't
 matter if we execute it before the left-hand (independent on whether
 that has side-effects or not).

 Richard.

 thats just true as long as we don't make use of associative law for
 AND expressions.
 Otherwise we would fail for much simpler cases like
 entern int doo ();
 int foo ()
 {
  int c, r = 0;
  if ((c = foo ()) != 0  c  20)
    r = 1;
  return 0;
 }

 as for this left-hand operand has side-effects, but as it is the first
 one, we would chain it as AND.  So we get wrong sequence.

Well, then the predicates checking for simplicity and side-effects
should better match.

Richard.

 Kai



Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Richard Guenther
On Fri, Oct 7, 2011 at 11:36 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 this is the updated version with the suggestion

 2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

 I reworked simple_operand_p so that it does this special-casing and 
 additionally
 also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

 Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

 It is required.  For left-hand operand, if it isn't a logical
 and/or/xor, we need to check for side-effects (and for trapping).  I
 see that calling of simple_operand_p is wrong here, as it rejects too
 much.  Nevertheless the check for side-effects is necessary for having
 valid sequence-points.  Without that checking a simple test

 int getter (void);

 int foo (void)
 {
  int c, r = 0;
  while ((c = getter ()) != ''  c = 0)
    r +=c;
  return r;
 }

 would give a warning about sequence-points.  As left-hand operand has
 side-effects, but right-hand not.  If we would combine it as AND, the
 operands are exchange-able.  So right-hand operand needs to be another
 ANDIF expression instead.
 Same apply on trapping.

 In fold_truthop we still have the same (albeit more restricted transform),
 but guarded with

  if (BRANCH_COST (optimize_function_for_speed_p (cfun),
                   false) = 2

 too.  Why not here?  Please delete redundant code in fold_truthop.
 Well, in general this is the default definition of
 LOGICAL_OP_NON_SHORT_CIRCUIT, so I missed that.  As for some targets
 the macro LOGICAL_OP_NON_SHORT_CIRCUIT might be defined differently,
 it might make sense to check for BRANCH_COST again.

 It's also odd that this is only called from fold_truth_andor but has
 a more generic name, so maybe rename it to fold_truth_andor_1 on the way.

 I renamed it.

 Richard.

 ChangeLog

 2011-10-07  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p): Make argument non-const
        and add floating-point trapping check, and special cases for
        comparisons, and logical-not's.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove here TRUTH-AND|OR_EXPR generation.
        (fold_truth_andor): Handle branching at one place.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Kai Tietz
2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Fri, Oct 7, 2011 at 11:36 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 this is the updated version with the suggestion

 2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

 I reworked simple_operand_p so that it does this special-casing and 
 additionally
 also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

 Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

 It is required.  For left-hand operand, if it isn't a logical
 and/or/xor, we need to check for side-effects (and for trapping).  I
 see that calling of simple_operand_p is wrong here, as it rejects too
 much.  Nevertheless the check for side-effects is necessary for having
 valid sequence-points.  Without that checking a simple test

 int getter (void);

 int foo (void)
 {
  int c, r = 0;
  while ((c = getter ()) != ''  c = 0)
    r +=c;
  return r;
 }

 would give a warning about sequence-points.  As left-hand operand has
 side-effects, but right-hand not.  If we would combine it as AND, the
 operands are exchange-able.  So right-hand operand needs to be another
 ANDIF expression instead.
 Same apply on trapping.

 In fold_truthop we still have the same (albeit more restricted transform),
 but guarded with

  if (BRANCH_COST (optimize_function_for_speed_p (cfun),
                   false) = 2

 too.  Why not here?  Please delete redundant code in fold_truthop.
 Well, in general this is the default definition of
 LOGICAL_OP_NON_SHORT_CIRCUIT, so I missed that.  As for some targets
 the macro LOGICAL_OP_NON_SHORT_CIRCUIT might be defined differently,
 it might make sense to check for BRANCH_COST again.

 It's also odd that this is only called from fold_truth_andor but has
 a more generic name, so maybe rename it to fold_truth_andor_1 on the way.

 I renamed it.

 Richard.

 ChangeLog

 2011-10-07  Kai Tietz  kti...@redhat.com

        * fold-const.c (simple_operand_p): Make argument non-const
        and add floating-point trapping check, and special cases for
        comparisons, and logical-not's.
        (fold_truthop): Rename to
        (fold_truth_andor_1): function name.
        Additionally remove here TRUTH-AND|OR_EXPR generation.
        (fold_truth_andor): Handle branching at one place.

 Bootstrapped and regression-tested for all languages plus Ada and
 Obj-C++ on x86_64-pc-linux-gnu.
 Ok for apply?

 Regards,
 Kai

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Richard Guenther
On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF.  It
 has to be checked that the LHS code is same as outer CODE, as
 otherwise we wouldn't apply different TRUTH-IF only on inner RHS of
 LHS, which is of course wrong.

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 -simple_operand_p (const_tree exp)
 +simple_operand_p (tree exp)
  {
 +  enum tree_code code;
   /* Strip any conversions that don't change the machine mode.  */
   STRIP_NOPS (exp);

 +  code = TREE_CODE (exp);
 +
 +  /* Handle some trivials   */
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (tree_could_trap_p (exp)
 +            simple_operand_p (TREE_OPERAND (exp, 0))
 +            simple_operand_p (TREE_OPERAND (exp, 1)));

And that's still wrong.

Stopped reading here.

Richard.

 +  if (FLOAT_TYPE_P (TREE_TYPE (exp))
 +       tree_could_trap_p (exp))
 +    return false;
 +
 +  switch (code)
 +    {
 +    case SSA_NAME:
 +      return true;
 +    case TRUTH_NOT_EXPR:
 +      return simple_operand_p (TREE_OPERAND (exp, 0));
 +    case BIT_NOT_EXPR:
 +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
 +        return false;
 +      return simple_operand_p (TREE_OPERAND (exp, 0));
 +    default:
 +      break;
 +    }
 +
   return (CONSTANT_CLASS_P (exp)
 -         || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
               ! TREE_ADDRESSABLE (exp)
               ! TREE_THIS_VOLATILE (exp)
 @@ -4888,7 +4913,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
  }

 -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
 +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
    bit value.  Arrange things so the extra bits will be set to zero if and
    only if C is signed-extended to its full width.  If MASK is nonzero,
    it is an INTEGER_CST that should be AND'ed with the extra bits.  */
 @@ -5025,8 +5050,8 @@ merge_truthop_with_opposite_arm (locatio
    We return the simplified tree or 0 if no optimization is possible.  */

  static tree
 -fold_truthop (location_t loc, enum tree_code code, tree truth_type,
 -             tree lhs, tree rhs)
 +fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
 +                   tree lhs, tree rhs)
  {
   /* If this is the or of two comparisons, we can do something if
      the comparisons are NE_EXPR.  If this is the and, we can do something
 @@ -5054,8 +5079,6 @@ fold_truthop (location_t loc, enum tree_
   tree lntype, rntype, result;
   HOST_WIDE_INT first_bit, end_bit;
   int volatilep;
 -  tree orig_lhs = lhs, orig_rhs = rhs;
 -  enum tree_code orig_code = code;

   /* Start by getting the comparison codes.  Fail if anything is volatile.
      If one operand is a BIT_AND_EXPR with the constant one, treat it as if
 @@ -5119,8 +5142,7 @@ fold_truthop (location_t loc, enum tree_
   /* If the RHS can be evaluated unconditionally and its operands are
      simple, it wins to evaluate the RHS unconditionally on machines
      with expensive branches.  In this case, this isn't a comparison
 -     that can be merged.  Avoid doing this if the RHS is a floating-point
 -     comparison since those can trap.  */
 +     that can be merged.  */

   if (BRANCH_COST (optimize_function_for_speed_p (cfun),
                   false) = 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Kai Tietz
2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF.  It
 has to be checked that the LHS code is same as outer CODE, as
 otherwise we wouldn't apply different TRUTH-IF only on inner RHS of
 LHS, which is of course wrong.

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 -simple_operand_p (const_tree exp)
 +simple_operand_p (tree exp)
  {
 +  enum tree_code code;
   /* Strip any conversions that don't change the machine mode.  */
   STRIP_NOPS (exp);

 +  code = TREE_CODE (exp);
 +
 +  /* Handle some trivials   */
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (tree_could_trap_p (exp)
 +            simple_operand_p (TREE_OPERAND (exp, 0))
 +            simple_operand_p (TREE_OPERAND (exp, 1)));

 And that's still wrong.

 Stopped reading here.

 Richard.

Oh, there is a not missing.  I didn't spot that, sorry.

To the point why we need to handle comparisons within simple_operand_p.

If we reject comparisons and logical not here, we won't have any
branching optimization anymore, as this the patch moves into
fold_truthandor.

The result with rejecting in simple_operand_p compares and logic-not
provides for the following example:

extern int doo (void);
extern int doo2 (void);

int foo (int a, int b, int c, int d, int e, int f)
{
  if (a  b  c  d  e  f)
return doo ();
  return doo2 ();
}

we get the following gimple dump (-fdump-tree-gimple):

foo (int a, int b, int c, int d, int e, int f)
{
  int D.2752;

  if (a != 0) goto D.2740; else goto D.2741;
  D.2740:
  if (b != 0) goto D.2742; else goto D.2743;
  D.2742:
  if (c != 0) goto D.2744; else goto D.2745;
  D.2744:
  if (d != 0) goto D.2746; else goto D.2747;
  D.2746:
  if (e != 0) goto D.2748; else goto D.2749;
  D.2748:
  if (f != 0) goto D.2750; else goto D.2751;
  D.2750:
  D.2752 = doo ();
  return D.2752;
  D.2751:
  D.2749:
  D.2747:
  D.2745:
  D.2743:
  D.2741:
  D.2752 = doo2 ();
  return D.2752;
}

So this result is caused by the fact that a logical-and/or is always a
comparison.  As we are rejecting comparisons/logical-not,  we end up
with a sequence of TRUTH_(AND|OR)_IFs.  So the hole loop in
fold_truth_andor for trying to pack simple and side-effect-less
operands in tupels of two is useless.

For the new patch (attached with proper ! fix for compares) we get for
the same sample gimple output:

foo (int a, int b, int c, int d, int e, int f)
{
  _Bool D.2740;
  _Bool D.2741;
  _Bool D.2742;
  _Bool D.2745;
  _Bool D.2746;
  _Bool D.2747;
  _Bool D.2750;
  _Bool D.2751;
  _Bool D.2752;
  int D.2755;

  D.2740 = a != 0;
  D.2741 = b != 0;
  D.2742 = D.2740  D.2741;
  if (D.2742 != 0) goto D.2743; else goto D.2744;
  D.2743:
  D.2745 = c != 0;
  D.2746 = d != 0;
  D.2747 = D.2745  D.2746;
  if (D.2747 != 0) goto D.2748; else goto D.2749;
  D.2748:
  D.2750 = e != 0;
  D.2751 = f != 0;
  D.2752 = D.2750  D.2751;
  if (D.2752 != 0) goto D.2753; else goto D.2754;
  D.2753:
  D.2755 = doo ();
  return D.2755;
  D.2754:
  D.2749:
  D.2744:
  D.2755 = doo2 ();
  return D.2755;
}

which is the proper output for high branching cost, as it tries to
avoid branches and not to tend them.

--
Kai

Index: gcc/gcc/fold-const.c
===
--- 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Richard Guenther
On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF.  It
 has to be checked that the LHS code is same as outer CODE, as
 otherwise we wouldn't apply different TRUTH-IF only on inner RHS of
 LHS, which is of course wrong.

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 -simple_operand_p (const_tree exp)
 +simple_operand_p (tree exp)
  {
 +  enum tree_code code;
   /* Strip any conversions that don't change the machine mode.  */
   STRIP_NOPS (exp);

 +  code = TREE_CODE (exp);
 +
 +  /* Handle some trivials   */
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (tree_could_trap_p (exp)
 +            simple_operand_p (TREE_OPERAND (exp, 0))
 +            simple_operand_p (TREE_OPERAND (exp, 1)));

 And that's still wrong.

 Stopped reading here.

 Richard.

 Oh, there is a not missing.  I didn't spot that, sorry.

 To the point why we need to handle comparisons within simple_operand_p.

 If we reject comparisons and logical not here, we won't have any
 branching optimization anymore, as this the patch moves into
 fold_truthandor.

 The result with rejecting in simple_operand_p compares and logic-not
 provides for the following example:

But you change what simple_operand_p accepts and thus change what
fold_truthop accepts as operands to its simplifications.

Richard.

 extern int doo (void);
 extern int doo2 (void);

 int foo (int a, int b, int c, int d, int e, int f)
 {
  if (a  b  c  d  e  f)
    return doo ();
  return doo2 ();
 }

 we get the following gimple dump (-fdump-tree-gimple):

 foo (int a, int b, int c, int d, int e, int f)
 {
  int D.2752;

  if (a != 0) goto D.2740; else goto D.2741;
  D.2740:
  if (b != 0) goto D.2742; else goto D.2743;
  D.2742:
  if (c != 0) goto D.2744; else goto D.2745;
  D.2744:
  if (d != 0) goto D.2746; else goto D.2747;
  D.2746:
  if (e != 0) goto D.2748; else goto D.2749;
  D.2748:
  if (f != 0) goto D.2750; else goto D.2751;
  D.2750:
  D.2752 = doo ();
  return D.2752;
  D.2751:
  D.2749:
  D.2747:
  D.2745:
  D.2743:
  D.2741:
  D.2752 = doo2 ();
  return D.2752;
 }

 So this result is caused by the fact that a logical-and/or is always a
 comparison.  As we are rejecting comparisons/logical-not,  we end up
 with a sequence of TRUTH_(AND|OR)_IFs.  So the hole loop in
 fold_truth_andor for trying to pack simple and side-effect-less
 operands in tupels of two is useless.

 For the new patch (attached with proper ! fix for compares) we get for
 the same sample gimple output:

 foo (int a, int b, int c, int d, int e, int f)
 {
  _Bool D.2740;
  _Bool D.2741;
  _Bool D.2742;
  _Bool D.2745;
  _Bool D.2746;
  _Bool D.2747;
  _Bool D.2750;
  _Bool D.2751;
  _Bool D.2752;
  int D.2755;

  D.2740 = a != 0;
  D.2741 = b != 0;
  D.2742 = D.2740  D.2741;
  if (D.2742 != 0) goto D.2743; else goto D.2744;
  D.2743:
  D.2745 = c != 0;
  D.2746 = d != 0;
  D.2747 = D.2745  D.2746;
  if (D.2747 != 0) goto D.2748; else goto D.2749;
  D.2748:
  D.2750 = e != 0;
  D.2751 = f != 0;
  D.2752 = D.2750  D.2751;
  if (D.2752 != 0) goto D.2753; else goto D.2754;
  D.2753:
  D.2755 = doo ();
  return D.2755;
  D.2754:
  D.2749:
  D.2744:
  D.2755 = doo2 ();
  return 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Kai Tietz
2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF.  It
 has to be checked that the LHS code is same as outer CODE, as
 otherwise we wouldn't apply different TRUTH-IF only on inner RHS of
 LHS, which is of course wrong.

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, 
 tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 -simple_operand_p (const_tree exp)
 +simple_operand_p (tree exp)
  {
 +  enum tree_code code;
   /* Strip any conversions that don't change the machine mode.  */
   STRIP_NOPS (exp);

 +  code = TREE_CODE (exp);
 +
 +  /* Handle some trivials   */
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (tree_could_trap_p (exp)
 +            simple_operand_p (TREE_OPERAND (exp, 0))
 +            simple_operand_p (TREE_OPERAND (exp, 1)));

 And that's still wrong.

 Stopped reading here.

 Richard.

 Oh, there is a not missing.  I didn't spot that, sorry.

 To the point why we need to handle comparisons within simple_operand_p.

 If we reject comparisons and logical not here, we won't have any
 branching optimization anymore, as this the patch moves into
 fold_truthandor.

 The result with rejecting in simple_operand_p compares and logic-not
 provides for the following example:

 But you change what simple_operand_p accepts and thus change what
 fold_truthop accepts as operands to its simplifications.

 Richard.

Well, not really.  I assume you mean fold_truth_andor_1 (aka fold_truthop).

It checks for
...
  if (TREE_CODE_CLASS (lcode) != tcc_comparison
  || TREE_CODE_CLASS (rcode) != tcc_comparison)
return 0;
...
before checking for simple_operand_p.  So there is actual no change.
It might be of some interest here to add in a different patch support
for logic-not, but well, this is would be material for a different
patch.
So, it won't operate on anything else then comparisons as before.

Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Michael Matz
Hi,

On Mon, 10 Oct 2011, Kai Tietz wrote:

 extern int foo (void); /* foo modifies gbl1 */
 int gbl1 = 0;
 
 int foo (int ns1)
 {
   if (ns1  foo ()  gbl1)
 return 1;
   return 0;
 }
 
 so chain of trees has to look like this:
 (ANDIF (ns1 (ANDIF foo () gbl1))

Okay, indeed.  I was wrong when I claimed that only the RHS needs to be 
side-effect-free for ANDIF-AND to be valid.  It's true when the RHS can't 
depend on the LHS, but I only thought about the fact that the side-effect 
must occur, not that it possibly influences RHS.


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-10 Thread Richard Guenther
On Mon, Oct 10, 2011 at 5:07 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz ktiet...@googlemail.com wrote:
 2011/10/10 Richard Guenther richard.guent...@gmail.com:
 On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF.  It
 has to be checked that the LHS code is same as outer CODE, as
 otherwise we wouldn't apply different TRUTH-IF only on inner RHS of
 LHS, which is of course wrong.

 Index: gcc/gcc/fold-const.c
 ===
 --- gcc.orig/gcc/fold-const.c
 +++ gcc/gcc/fold-const.c
 @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
                                    tree *, tree *);
  static int all_ones_mask_p (const_tree, int);
  static tree sign_bit_p (tree, const_tree);
 -static int simple_operand_p (const_tree);
 +static int simple_operand_p (tree);
  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
  static tree range_predecessor (tree);
  static tree range_successor (tree);
  static tree fold_range_test (location_t, enum tree_code, tree, tree, 
 tree);
  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
 tree, tree);
  static tree unextend (tree, int, int, tree);
 -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
  static tree optimize_minmax_comparison (location_t, enum tree_code,
                                        tree, tree, tree);
  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
  }

 -/* Subroutine for fold_truthop: decode a field reference.
 +/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

 @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
  }

 -/* Subroutine for fold_truthop: determine if an operand is simple enough
 +/* Subroutine for fold_truth_andor_1: determine if an operand is simple 
 enough
    to be evaluated unconditionally.  */

  static int
 -simple_operand_p (const_tree exp)
 +simple_operand_p (tree exp)
  {
 +  enum tree_code code;
   /* Strip any conversions that don't change the machine mode.  */
   STRIP_NOPS (exp);

 +  code = TREE_CODE (exp);
 +
 +  /* Handle some trivials   */
 +  if (TREE_CODE_CLASS (code) == tcc_comparison)
 +    return (tree_could_trap_p (exp)
 +            simple_operand_p (TREE_OPERAND (exp, 0))
 +            simple_operand_p (TREE_OPERAND (exp, 1)));

 And that's still wrong.

 Stopped reading here.

 Richard.

 Oh, there is a not missing.  I didn't spot that, sorry.

 To the point why we need to handle comparisons within simple_operand_p.

 If we reject comparisons and logical not here, we won't have any
 branching optimization anymore, as this the patch moves into
 fold_truthandor.

 The result with rejecting in simple_operand_p compares and logic-not
 provides for the following example:

 But you change what simple_operand_p accepts and thus change what
 fold_truthop accepts as operands to its simplifications.

 Richard.

 Well, not really.  I assume you mean fold_truth_andor_1 (aka fold_truthop).

 It checks for
 ...
  if (TREE_CODE_CLASS (lcode) != tcc_comparison
      || TREE_CODE_CLASS (rcode) != tcc_comparison)
    return 0;
 ...
 before checking for simple_operand_p.  So there is actual no change.
 It might be of some interest here to add in a different patch support
 for logic-not, but well, this is would be material for a different
 patch.
 So, it won't operate on anything else then comparisons as before.

Sure, because simple_operand_p is checked on the comparison
operands, not the comparison itself.

Richard.

 Kai



Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-07 Thread Richard Guenther
On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 Hi,

 I modified the patch so, that it always just converts two leafs of a 
 TRUTH(AND|OR)IF chain into a TRUTH_(AND|OR) expression, if branch costs are 
 high and leafs are simple without side-effects.

 Additionally I added some testcases for it.

 2011-10-06  Kai Tietz  kti...@redhat.com

        * fold-const.c (fold_truth_andor): Convert TRUTH_(AND|OR)IF_EXPR
        to TRUTH_OR_EXPR, if suitable.

 2011-10-06  Kai Tietz  kti...@redhat.com

        * gcc.dg/tree-ssa/ssa-ifbranch-1.c: New test.
        * gcc.dg/tree-ssa/ssa-ifbranch-2.c: New test.
        * gcc.dg/tree-ssa/ssa-ifbranch-3.c: New test.
        * gcc.dg/tree-ssa/ssa-ifbranch-4.c: New test.

 Bootstrapped and tested for all languages (including Ada and Obj-C++) on host 
 x86_64-unknown-linux-gnu.  Ok for apply?

 Regards,
 Kai

 Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-1.c
 ===
 --- /dev/null
 +++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-1.c
 @@ -0,0 +1,18 @@
 +/* Skip on MIPS, S/390, and AVR due LOGICAL_OP_NON_SHORT_CIRCUIT, and
 +   lower values in BRANCH_COST.  */
 +/* { dg-do compile { target { ! mips*-*-* s390*-*-*  avr-*-* mn10300-*-* } 
 } } */
 +/* { dg-options -O2 -fdump-tree-gimple } */
 +/* { dg-options -O2 -fdump-tree-gimple -march=i586 { target { i?86-*-*  
 ilp32 } } } */
 +
 +extern int doo1 (void);
 +extern int doo2 (void);
 +
 +int bar (int a, int b, int c)
 +{
 +  if (a  b  c)
 +    return doo1 ();
 +  return doo2 ();
 +}
 +
 +/* { dg-final { scan-tree-dump-times if  2 gimple } } */
 +/* { dg-final { cleanup-tree-dump gimple } } */
 Index: gcc-head/gcc/fold-const.c
 ===
 --- gcc-head.orig/gcc/fold-const.c
 +++ gcc-head/gcc/fold-const.c
 @@ -8387,6 +8387,45 @@ fold_truth_andor (location_t loc, enum t
   if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
     return tem;

 +  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
 +       !TREE_SIDE_EFFECTS (arg1)
 +       LOGICAL_OP_NON_SHORT_CIRCUIT
 +      /* floats might trap.  */
 +       !FLOAT_TYPE_P (TREE_TYPE (arg1))

Floats don't trap on their own.  If possibly trapping trees don't have
TREE_SIDE_EFFECTS set then you want

 !tree_could_trap_p (arg1)

here.

 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

As I said previously simple_operand_p already rejects covers
comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
TREE_SIDE_EFFECTS set if the comparison might trap, as
it might just be hidden in something more complicated - so
the simple check isn't enough anyway (and if simple_operand_p
would cover it, the check would be better placed there).

 +          || ((TREE_CODE_CLASS (TREE_CODE (arg1)) == tcc_comparison
 +               || TREE_CODE (arg1) == TRUTH_NOT_EXPR)
 +             /* Float comparison might trap.  */
 +               !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)))
 +               simple_operand_p (TREE_OPERAND (arg1, 0)
 +    {
 +      /* We want to combine truth-comparison for
 +        ((W TRUTH-ANDOR X) TRUTH-ANDORIF Y) TRUTH-ANDORIF Z,
 +        if Y and Z are simple operands and have no side-effect to
 +        ((W TRUTH-ANDOR X) TRUTH-IF (Y TRUTH-ANDOR Z).  */

No we don't.  See down-thread.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

All trees should be folded, don't use plain build without a good reason.

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

Again, the checks you do for arg0 do not match those for arg1.  OTOH
it doesn't matter whether arg0 is simple or not or has side-effects or
not for this transformation, so why check it at all?

In fold_truthop we still have the same (albeit more restricted transform),
but guarded with

 if (BRANCH_COST (optimize_function_for_speed_p (cfun),
   false) = 2

too.  Why not here?  Please delete redundant code in fold_truthop.

It's also odd that this is only called from fold_truth_andor but has
a more generic name, so maybe rename it to fold_truth_andor_1 on the way.

Richard.

 +       return build2_loc (loc,
 +                          (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                          

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-07 Thread Kai Tietz
Hello,

this is the updated version with the suggestion

2011/10/7 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 4:25 PM, Kai Tietz kti...@redhat.com wrote:
 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +            TREE_CODE (arg1) != TRUTH_NOT_EXPR
 +            simple_operand_p (arg1))

 As I said previously simple_operand_p already rejects covers
 comparisons and TRUTH_NOT_EXPR.  Also arg1 had better
 TREE_SIDE_EFFECTS set if the comparison might trap, as
 it might just be hidden in something more complicated - so
 the simple check isn't enough anyway (and if simple_operand_p
 would cover it, the check would be better placed there).

I reworked simple_operand_p so that it does this special-casing and additionally
also checks for trapping.

 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))
 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +         return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
 +                            tem);

 All trees should be folded, don't use plain build without a good reason.

Ok, done

 +       }
 +      /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
 +        are simple operands and have no side-effects.  */
 +      if (simple_operand_p (arg0)
 +           !TREE_SIDE_EFFECTS (arg0))

 Again, the checks you do for arg0 do not match those for arg1.  OTOH
 it doesn't matter whether arg0 is simple or not or has side-effects or
 not for this transformation, so why check it at all?

It is required.  For left-hand operand, if it isn't a logical
and/or/xor, we need to check for side-effects (and for trapping).  I
see that calling of simple_operand_p is wrong here, as it rejects too
much.  Nevertheless the check for side-effects is necessary for having
valid sequence-points.  Without that checking a simple test

int getter (void);

int foo (void)
{
  int c, r = 0;
  while ((c = getter ()) != ''  c = 0)
r +=c;
  return r;
}

would give a warning about sequence-points.  As left-hand operand has
side-effects, but right-hand not.  If we would combine it as AND, the
operands are exchange-able.  So right-hand operand needs to be another
ANDIF expression instead.
Same apply on trapping.

 In fold_truthop we still have the same (albeit more restricted transform),
 but guarded with

  if (BRANCH_COST (optimize_function_for_speed_p (cfun),
                   false) = 2

 too.  Why not here?  Please delete redundant code in fold_truthop.
Well, in general this is the default definition of
LOGICAL_OP_NON_SHORT_CIRCUIT, so I missed that.  As for some targets
the macro LOGICAL_OP_NON_SHORT_CIRCUIT might be defined differently,
it might make sense to check for BRANCH_COST again.

 It's also odd that this is only called from fold_truth_andor but has
 a more generic name, so maybe rename it to fold_truth_andor_1 on the way.

I renamed it.

 Richard.

ChangeLog

2011-10-07  Kai Tietz  kti...@redhat.com

* fold-const.c (simple_operand_p): Make argument non-const
and add floating-point trapping check, and special cases for
comparisons, and logical-not's.
(fold_truthop): Rename to
(fold_truth_andor_1): function name.
Additionally remove here TRUTH-AND|OR_EXPR generation.
(fold_truth_andor): Handle branching at one place.

Bootstrapped and regression-tested for all languages plus Ada and
Obj-C++ on x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c
+++ gcc/gcc/fold-const.c
@@ -111,14 +111,13 @@ static tree decode_field_reference (loca
tree *, tree *);
 static int all_ones_mask_p (const_tree, int);
 static tree sign_bit_p (tree, const_tree);
-static int simple_operand_p (const_tree);
+static int simple_operand_p (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
 static tree range_predecessor (tree);
 static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, tree,
tree, tree);
 static tree unextend (tree, int, int, tree);
-static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
 static tree optimize_minmax_comparison (location_t, enum tree_code,
tree, tree, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Kai Tietz
Hello,

Sorry attached non-updated change.  Here with proper attached patch.
This patch improves in fold_truth_andor the generation of branch-conditions for 
targets having LOGICAL_OP_NON_SHORT_CIRCUIT set.  If right-hand side operation 
of a TRUTH_(OR|AND)IF_EXPR is simple operand, has no side-effects, and doesn't 
trap, then try to convert expression to a TRUTH_(AND|OR)_EXPR, if left-hand 
operand is a simple operand, and has no side-effects.

ChangeLog

2011-10-06  Kai Tietz  kti...@redhat.com

* fold-const.c (fold_truth_andor): Convert TRUTH_(AND|OR)IF_EXPR
to TRUTH_OR_EXPR, if suitable.

Bootstrapped and tested for all languages (including Ada and Obj-C++) on host 
x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai


ndex: fold-const.c
===
--- fold-const.c(revision 179592)
+++ fold-const.c(working copy)
@@ -8387,6 +8387,33 @@
   if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
 return tem;

+  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
+   !TREE_SIDE_EFFECTS (arg1)
+   simple_operand_p (arg1)
+   LOGICAL_OP_NON_SHORT_CIRCUIT
+   !FLOAT_TYPE_P (TREE_TYPE (arg1))
+   ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
+   TREE_CODE (arg1) != TRUTH_NOT_EXPR)
+ || !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)
+{
+  if (TREE_CODE (arg0) == code
+   !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
+   simple_operand_p (TREE_OPERAND (arg0, 1)))
+   {
+ tem = build2_loc (loc,
+   (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
+ : TRUTH_OR_EXPR),
+   type, TREE_OPERAND (arg0, 1), arg1);
+   return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);
+  }
+  if (!TREE_SIDE_EFFECTS (arg0)
+   simple_operand_p (arg0))
+   return build2_loc (loc,
+  (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
+: TRUTH_OR_EXPR),
+  type, arg0, arg1);
+}
+
   return NULL_TREE;
 }



Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Richard Guenther
On Thu, Oct 6, 2011 at 11:28 AM, Kai Tietz kti...@redhat.com wrote:
 Hello,

 Sorry attached non-updated change.  Here with proper attached patch.
 This patch improves in fold_truth_andor the generation of branch-conditions 
 for targets having LOGICAL_OP_NON_SHORT_CIRCUIT set.  If right-hand side 
 operation of a TRUTH_(OR|AND)IF_EXPR is simple operand, has no side-effects, 
 and doesn't trap, then try to convert expression to a TRUTH_(AND|OR)_EXPR, if 
 left-hand operand is a simple operand, and has no side-effects.

 ChangeLog

 2011-10-06  Kai Tietz  kti...@redhat.com

        * fold-const.c (fold_truth_andor): Convert TRUTH_(AND|OR)IF_EXPR
        to TRUTH_OR_EXPR, if suitable.

 Bootstrapped and tested for all languages (including Ada and Obj-C++) on host 
 x86_64-unknown-linux-gnu.  Ok for apply?

 Regards,
 Kai


 ndex: fold-const.c
 ===
 --- fold-const.c        (revision 179592)
 +++ fold-const.c        (working copy)
 @@ -8387,6 +8387,33 @@
   if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
     return tem;

 +  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
 +       !TREE_SIDE_EFFECTS (arg1)
 +       simple_operand_p (arg1)
 +       LOGICAL_OP_NON_SHORT_CIRCUIT

Why only for LOGICAL_OP_NON_SHORT_CIRCUIT?  It doesn't make
a difference for !LOGICAL_OP_NON_SHORT_CIRCUIT targets, but ...

 +       !FLOAT_TYPE_P (TREE_TYPE (arg1))

?  I hope we don't have || float.

 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +           TREE_CODE (arg1) != TRUTH_NOT_EXPR)
 +         || !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)

?  simple_operand_p would have rejected both ! and comparisons.

I miss a test for side-effects on arg0 (and probably simple_operand_p there,
as well).

 +    {
 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))

Err ... so why do you recurse here (and associate)?  Even with different
predicates than above ...

And similar transforms seem to happen in fold_truthop - did you
investigate why it didn't trigger there.

And I'm missing a testcase.

Richard.

 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +       return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);
 +      }
 +      if (!TREE_SIDE_EFFECTS (arg0)
 +           simple_operand_p (arg0))
 +       return build2_loc (loc,
 +                          (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                    : TRUTH_OR_EXPR),
 +                          type, arg0, arg1);
 +    }
 +
   return NULL_TREE;
  }




Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Kai Tietz
2011/10/6 Richard Guenther richard.guent...@gmail.com:
 On Thu, Oct 6, 2011 at 11:28 AM, Kai Tietz kti...@redhat.com wrote:
 Hello,

 Sorry attached non-updated change.  Here with proper attached patch.
 This patch improves in fold_truth_andor the generation of branch-conditions 
 for targets having LOGICAL_OP_NON_SHORT_CIRCUIT set.  If right-hand side 
 operation of a TRUTH_(OR|AND)IF_EXPR is simple operand, has no side-effects, 
 and doesn't trap, then try to convert expression to a TRUTH_(AND|OR)_EXPR, 
 if left-hand operand is a simple operand, and has no side-effects.

 ChangeLog

 2011-10-06  Kai Tietz  kti...@redhat.com

        * fold-const.c (fold_truth_andor): Convert TRUTH_(AND|OR)IF_EXPR
        to TRUTH_OR_EXPR, if suitable.

 Bootstrapped and tested for all languages (including Ada and Obj-C++) on 
 host x86_64-unknown-linux-gnu.  Ok for apply?

 Regards,
 Kai


 ndex: fold-const.c
 ===
 --- fold-const.c        (revision 179592)
 +++ fold-const.c        (working copy)
 @@ -8387,6 +8387,33 @@
   if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
     return tem;

 +  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
 +       !TREE_SIDE_EFFECTS (arg1)
 +       simple_operand_p (arg1)
 +       LOGICAL_OP_NON_SHORT_CIRCUIT

 Why only for LOGICAL_OP_NON_SHORT_CIRCUIT?  It doesn't make
 a difference for !LOGICAL_OP_NON_SHORT_CIRCUIT targets, but ...
Well, I used this check only for not doing this transformation for
targets, which have low-cost branches.  This is the same thing as in
fold_truthop.  It does this transformation only if
LOGICAL_OP_NON_SHORT_CIRCUIT is true.
 +       !FLOAT_TYPE_P (TREE_TYPE (arg1))

 ?  I hope we don't have || float.
This can happen.  Operands of TRUTH_AND|OR(IF)_EXPR aren't necessarily
of integral type.  After expansion in gimplifier, we have for sure
comparisons, but not in c-tree.

 +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
 +           TREE_CODE (arg1) != TRUTH_NOT_EXPR)
 +         || !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)

 ?  simple_operand_p would have rejected both ! and comparisons.
This check is the same as in fold_truthop.  I used this check.  The
point here is that floats might trap.

 I miss a test for side-effects on arg0 (and probably simple_operand_p there,
 as well).
See inner of if condition for those checks.  I moved those checks for
arg1 out of the inner conditions to avoid double-checking.

 +    {
 +      if (TREE_CODE (arg0) == code
 +           !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
 +           simple_operand_p (TREE_OPERAND (arg0, 1)))

 Err ... so why do you recurse here (and associate)?  Even with different
 predicates than above ...

See, here is the missing check. Point is that even if arg0 has
side-effects and is a (AND|OR)IF expression, we might be able to
associate with right-hand argument of arg0, if for it no side-effects
are existing.  Otherwise we wouldn't catch this case.
We have here in maximum a recursion level of one.

 And similar transforms seem to happen in fold_truthop - did you
 investigate why it didn't trigger there.

This is pretty simple.  The point is that only for comparisons this
transformation is done.  But in c-tree we don't have here necessarily
for TRUTH_(AND|OR)[IF]_EXPR comparison arguments, not necessarily
integral ones (see above).

 And I'm missing a testcase.

Ok, I'll add one.  Effect can be seen best after gimplification.

 Richard.

 +       {
 +         tem = build2_loc (loc,
 +                           (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                     : TRUTH_OR_EXPR),
 +                           type, TREE_OPERAND (arg0, 1), arg1);
 +       return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), 
 tem);
 +      }
 +      if (!TREE_SIDE_EFFECTS (arg0)
 +           simple_operand_p (arg0))
 +       return build2_loc (loc,
 +                          (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
 +                                                    : TRUTH_OR_EXPR),
 +                          type, arg0, arg1);
 +    }
 +
   return NULL_TREE;
  }

Regards.
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Michael Matz
Hi,

On Thu, 6 Oct 2011, Richard Guenther wrote:

  +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
  +           TREE_CODE (arg1) != TRUTH_NOT_EXPR)
  +         || !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)
 
 ?  simple_operand_p would have rejected both ! and comparisons.
 
 I miss a test for side-effects on arg0 (and probably simple_operand_p there,
 as well).

He has it in the if() body.  But why?  The point of ANDIF/ORIF is to not 
evaluate the second argument for side-effects when the first argument is 
false/true already, and further to establish an order between both 
evaluations.  The sideeffect on the first arg is always evaluated.  
AND/OR always evaluate both arguments (in unspecified order), but as he 
checks the second one for being free of side effects already that alone is 
already equivalent to ANDIF/ORIF.  No need to check something on the first 
argument.


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Richard Guenther
On Thu, Oct 6, 2011 at 3:49 PM, Michael Matz m...@suse.de wrote:
 Hi,

 On Thu, 6 Oct 2011, Richard Guenther wrote:

  +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
  +           TREE_CODE (arg1) != TRUTH_NOT_EXPR)
  +         || !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)

 ?  simple_operand_p would have rejected both ! and comparisons.

 I miss a test for side-effects on arg0 (and probably simple_operand_p there,
 as well).

 He has it in the if() body.  But why?  The point of ANDIF/ORIF is to not
 evaluate the second argument for side-effects when the first argument is
 false/true already, and further to establish an order between both
 evaluations.  The sideeffect on the first arg is always evaluated.
 AND/OR always evaluate both arguments (in unspecified order), but as he
 checks the second one for being free of side effects already that alone is
 already equivalent to ANDIF/ORIF.  No need to check something on the first
 argument.

It seems to me it should then simply be

  if (!TREE_SIDE_EFFECTS (arg1)
  simple_operand_p (arg1))
   return fold-the-not-and-variant ();

Richard.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Kai Tietz
Hi,

I modified the patch so, that it always just converts two leafs of a 
TRUTH(AND|OR)IF chain into a TRUTH_(AND|OR) expression, if branch costs are 
high and leafs are simple without side-effects.

Additionally I added some testcases for it.

2011-10-06  Kai Tietz  kti...@redhat.com

* fold-const.c (fold_truth_andor): Convert TRUTH_(AND|OR)IF_EXPR
to TRUTH_OR_EXPR, if suitable.

2011-10-06  Kai Tietz  kti...@redhat.com

* gcc.dg/tree-ssa/ssa-ifbranch-1.c: New test.
* gcc.dg/tree-ssa/ssa-ifbranch-2.c: New test.
* gcc.dg/tree-ssa/ssa-ifbranch-3.c: New test.
* gcc.dg/tree-ssa/ssa-ifbranch-4.c: New test.

Bootstrapped and tested for all languages (including Ada and Obj-C++) on host 
x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-1.c
===
--- /dev/null
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-1.c
@@ -0,0 +1,18 @@
+/* Skip on MIPS, S/390, and AVR due LOGICAL_OP_NON_SHORT_CIRCUIT, and
+   lower values in BRANCH_COST.  */
+/* { dg-do compile { target { ! mips*-*-* s390*-*-*  avr-*-* mn10300-*-* } } 
} */
+/* { dg-options -O2 -fdump-tree-gimple } */
+/* { dg-options -O2 -fdump-tree-gimple -march=i586 { target { i?86-*-*  
ilp32 } } } */
+
+extern int doo1 (void);
+extern int doo2 (void);
+
+int bar (int a, int b, int c)
+{
+  if (a  b  c)
+return doo1 ();
+  return doo2 ();
+}
+
+/* { dg-final { scan-tree-dump-times if  2 gimple } } */
+/* { dg-final { cleanup-tree-dump gimple } } */
Index: gcc-head/gcc/fold-const.c
===
--- gcc-head.orig/gcc/fold-const.c
+++ gcc-head/gcc/fold-const.c
@@ -8387,6 +8387,45 @@ fold_truth_andor (location_t loc, enum t
   if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
 return tem;
 
+  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
+   !TREE_SIDE_EFFECTS (arg1)
+   LOGICAL_OP_NON_SHORT_CIRCUIT
+  /* floats might trap.  */
+   !FLOAT_TYPE_P (TREE_TYPE (arg1))
+   ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
+TREE_CODE (arg1) != TRUTH_NOT_EXPR
+simple_operand_p (arg1))
+  || ((TREE_CODE_CLASS (TREE_CODE (arg1)) == tcc_comparison
+   || TREE_CODE (arg1) == TRUTH_NOT_EXPR)
+ /* Float comparison might trap.  */
+   !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)))
+   simple_operand_p (TREE_OPERAND (arg1, 0)
+{
+  /* We want to combine truth-comparison for
+((W TRUTH-ANDOR X) TRUTH-ANDORIF Y) TRUTH-ANDORIF Z,
+if Y and Z are simple operands and have no side-effect to
+((W TRUTH-ANDOR X) TRUTH-IF (Y TRUTH-ANDOR Z).  */
+  if (TREE_CODE (arg0) == code
+   !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
+   simple_operand_p (TREE_OPERAND (arg0, 1)))
+   {
+ tem = build2_loc (loc,
+   (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
+ : TRUTH_OR_EXPR),
+   type, TREE_OPERAND (arg0, 1), arg1);
+ return build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
+tem);
+   }
+  /* Convert X TRUTH-ANDORIF Y to X TRUTH-ANDOR Y, if X and Y
+are simple operands and have no side-effects.  */
+  if (simple_operand_p (arg0)
+   !TREE_SIDE_EFFECTS (arg0))
+   return build2_loc (loc,
+  (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
+: TRUTH_OR_EXPR),
+  type, arg0, arg1);
+}
+
   return NULL_TREE;
 }
 
Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-2.c
===
--- /dev/null
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-2.c
@@ -0,0 +1,18 @@
+/* Skip on MIPS, S/390, and AVR due LOGICAL_OP_NON_SHORT_CIRCUIT, and
+   lower values in BRANCH_COST.  */
+/* { dg-do compile { target { ! mips*-*-* s390*-*-*  avr-*-* mn10300-*-* } } 
} */
+/* { dg-options -O2 -fdump-tree-gimple } */
+/* { dg-options -O2 -fdump-tree-gimple -march=i586 { target { i?86-*-*  
ilp32 } } } */
+
+extern int doo1 (void);
+extern int doo2 (void);
+
+int bar (int a, int b, int c, int d)
+{
+  if (a  b  c  d)
+return doo1 ();
+  return doo2 ();
+}
+
+/* { dg-final { scan-tree-dump-times if  2 gimple } } */
+/* { dg-final { cleanup-tree-dump gimple } } */
Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-3.c
===
--- /dev/null
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifbranch-3.c
@@ -0,0 +1,18 @@
+/* Skip on MIPS, S/390, and AVR due LOGICAL_OP_NON_SHORT_CIRCUIT, and
+   lower values in BRANCH_COST.  */
+/* { dg-do compile { target { ! mips*-*-* s390*-*-*  avr-*-* 

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Kai Tietz
2011/10/6 Michael Matz m...@suse.de:
 Hi,

 On Thu, 6 Oct 2011, Richard Guenther wrote:

  +       ((TREE_CODE_CLASS (TREE_CODE (arg1)) != tcc_comparison
  +           TREE_CODE (arg1) != TRUTH_NOT_EXPR)
  +         || !FLOAT_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)

 ?  simple_operand_p would have rejected both ! and comparisons.

 I miss a test for side-effects on arg0 (and probably simple_operand_p there,
 as well).

 He has it in the if() body.  But why?  The point of ANDIF/ORIF is to not
 evaluate the second argument for side-effects when the first argument is
 false/true already, and further to establish an order between both
 evaluations.  The sideeffect on the first arg is always evaluated.
 AND/OR always evaluate both arguments (in unspecified order), but as he
 checks the second one for being free of side effects already that alone is
 already equivalent to ANDIF/ORIF.  No need to check something on the first
 argument.


 Ciao,
 Michael.


That's not the hole story.  The difference between
TRUTH_(AND|OR)IF_EXPR and TRUTH_(AND|OR)_EXPR are, that for
TRUTH_(AND|OR)IF_EXPR gimplifier creates a COND expression, but for
TRUTH_(AND|OR)_EXPR it doesn't.

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Michael Matz
Hi,

On Thu, 6 Oct 2011, Kai Tietz wrote:

 That's not the hole story.  The difference between TRUTH_(AND|OR)IF_EXPR 
 and TRUTH_(AND|OR)_EXPR are, that for TRUTH_(AND|OR)IF_EXPR gimplifier 
 creates a COND expression, but for TRUTH_(AND|OR)_EXPR it doesn't.

Yes, of course.  That is what implements the short-circuit semantics.  But 
as Richard already mentioned I also don't understand why you do the 
reassociation at that point.  Why not simply rewrite ANDIF - AND (when 
possible, i.e. no sideeffects on arg1, and desirable, i.e. when 
LOGICAL_OP_NON_SHORT_CIRCUIT, and simple_operand(arg1)) and let other 
folders do reassociation?  I ask because your comments states to 
transform:

  ((W AND X) ANDIF Y) ANDIF Z
into
  (W AND X) ANDIF (Y AND Z)

(under condition that Y and Z are simple operands).

In fact you don't check the form of arg0,0, i.e. the W AND X here.  
Independend of that it doesn't make sense, because if Y and Z are easy 
(simple and no side-effects), then Y AND Z is too, and therefore you 
should transform this (if at all) into:

  (W AND X) AND (Y AND Z)

at which point this association doesn't make sense anymore, as 

  ((W AND X) AND Y) AND Z

is just as fine.  So, the reassociation looks fishy at best, better get 
rid of it?  (which of the testcases breaks without it?)


Ciao,
Michael.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Kai Tietz
2011/10/6 Michael Matz m...@suse.de:
 Hi,

 On Thu, 6 Oct 2011, Kai Tietz wrote:

 That's not the hole story.  The difference between TRUTH_(AND|OR)IF_EXPR
 and TRUTH_(AND|OR)_EXPR are, that for TRUTH_(AND|OR)IF_EXPR gimplifier
 creates a COND expression, but for TRUTH_(AND|OR)_EXPR it doesn't.

 Yes, of course.  That is what implements the short-circuit semantics.  But
 as Richard already mentioned I also don't understand why you do the
 reassociation at that point.  Why not simply rewrite ANDIF - AND (when
 possible, i.e. no sideeffects on arg1, and desirable, i.e. when
 LOGICAL_OP_NON_SHORT_CIRCUIT, and simple_operand(arg1)) and let other
 folders do reassociation?  I ask because your comments states to
 transform:

  ((W AND X) ANDIF Y) ANDIF Z
 into
  (W AND X) ANDIF (Y AND Z)

 (under condition that Y and Z are simple operands).

 In fact you don't check the form of arg0,0, i.e. the W AND X here.
 Independend of that it doesn't make sense, because if Y and Z are easy
 (simple and no side-effects), then Y AND Z is too, and therefore you
 should transform this (if at all) into:

  (W AND X) AND (Y AND Z)

 at which point this association doesn't make sense anymore, as

Sorry, exactly this doesn't happen, due an ANDIF isn't simple, and
therefore it isn't transformed into and AND.


  ((W AND X) AND Y) AND Z

 is just as fine.  So, the reassociation looks fishy at best, better get
 rid of it?  (which of the testcases breaks without it?)

None.  I had this implemented first.  But Richard was concerned about
making non-IF conditions too long.I understand that point that it
might not that good to always modify unconditionally to AND/OR chain.
For example

if (a1  a2  a3    a100) return 1;

would be packed by this patch into 50 branches.   If we would modify
all of them into AND, then we would calculate for all 100 values the
result, even if the first a1 is zero.  This doesn't improve speed
pretty well.

But you are right, that from the point of reassociation optimization
it could be in some cases more profitable to have packed all elements
into on AND-chain.

Regards,
Kai


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Jakub Jelinek
On Thu, Oct 06, 2011 at 05:28:36PM +0200, Kai Tietz wrote:
 None.  I had this implemented first.  But Richard was concerned about
 making non-IF conditions too long.I understand that point that it
 might not that good to always modify unconditionally to AND/OR chain.
 For example
 
 if (a1  a2  a3    a100) return 1;
 
 would be packed by this patch into 50 branches.   If we would modify
 all of them into AND, then we would calculate for all 100 values the
 result, even if the first a1 is zero.  This doesn't improve speed
 pretty well.
 
 But you are right, that from the point of reassociation optimization
 it could be in some cases more profitable to have packed all elements
 into on AND-chain.

Yeah.  Perhaps we should break them up after reassoc2, or on the other side
teach reassoc (or some other pass) to be able to do the optimizations
on a series of GIMPLE_COND with no side-effects in between.
See e.g. PR46309, return a == 3 || a == 1 || a == 2 || a == 4;
isn't optimized into (a - 1U)  4U, although it could, if branch cost
cause it to be broken up into several GIMPLE_COND stmts.
Or if user writes:
  if (a == 3)
return 1;
  if (a == 1)
return 1;
  if (a == 2)
return 1;
  if (a == 4)
return 1;
  return 0;
(more probably using enums).

Jakub


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Michael Matz
Hi,

On Thu, 6 Oct 2011, Kai Tietz wrote:

  at which point this association doesn't make sense anymore, as
 
 Sorry, exactly this doesn't happen, due an ANDIF isn't simple, and 
 therefore it isn't transformed into and AND.

Right ...

   ((W AND X) AND Y) AND Z
 
  is just as fine.  So, the reassociation looks fishy at best, better get
  rid of it?  (which of the testcases breaks without it?)
 
 None.  I had this implemented first.  But Richard was concerned about
 making non-IF conditions too long.I understand that point that it
 might not that good to always modify unconditionally to AND/OR chain.

... and I see that (that's why the transformation should be desirable for 
some definition of desirable, which probably includes and RHS not too 
long chain).  As it stands right now your transformation seems to be a 
fairly ad-hoc try at avoiding this problem.  That's why I wonder why to do 
the reassoc at all?  Which testcases break _without_ the reassociation, 
i.e. with only rewriting ANDIF - AND at the outermost level?


Ciao,
Michael.

Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/06/11 09:37, Jakub Jelinek wrote:
 On Thu, Oct 06, 2011 at 05:28:36PM +0200, Kai Tietz wrote:
 None.  I had this implemented first.  But Richard was concerned
 about making non-IF conditions too long.I understand that
 point that it might not that good to always modify
 unconditionally to AND/OR chain. For example
 
 if (a1  a2  a3    a100) return 1;
 
 would be packed by this patch into 50 branches.   If we would
 modify all of them into AND, then we would calculate for all 100
 values the result, even if the first a1 is zero.  This doesn't
 improve speed pretty well.
 
 But you are right, that from the point of reassociation
 optimization it could be in some cases more profitable to have
 packed all elements into on AND-chain.
 
 Yeah.  Perhaps we should break them up after reassoc2, or on the
 other side teach reassoc (or some other pass) to be able to do the
 optimizations on a series of GIMPLE_COND with no side-effects in
 between. See e.g. PR46309, return a == 3 || a == 1 || a == 2 || a
 == 4; isn't optimized into (a - 1U)  4U, although it could, if
 branch cost cause it to be broken up into several GIMPLE_COND
 stmts. Or if user writes: if (a == 3) return 1; if (a == 1) return
 1; if (a == 2) return 1; if (a == 4) return 1; return 0; (more
 probably using enums).
I haven't followed this thread as closely as perhaps I should; what
I'm seeing discussed now looks a lot like condition merging and I'm
pretty sure there's some research in this area that might guide us.
multi-variable condition merging is the term the authors used.

jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOjeYFAAoJEBRtltQi2kC7eFMIALjFM/GIg1DDryU59EoFQe5A
x7pvx3FSlcjLWeyIlzYJvWF4wGybRNNXp5qziIedO6qp0Z/06VvCU07A10VoWSig
/EFufo5l+Jef5s1d0mA6mBJ9A52HDL2ipOK8YDQbVzJWqHdaXLrrzUni3wGwcUVs
v3UIi5OevjRhJ55fRVxBcReJKF6YAzxFDxqOnVGAbf9R3BEJ2T9JW2CLhIcd/T1L
D9K+6YymHaN9eYh7B7gPKG88q+5JjcStHuMQODKSAegt3T4iP9CH/G5dV8u95Y+q
6mxo8gOGAwYR7N/U6fuXRaGJEzWSdrqRy2EBF5B7+Rt6lSWXdfzUEBusT24i67A=
=HIrU
-END PGP SIGNATURE-


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-06 Thread Kai Tietz
2011/10/6 Michael Matz m...@suse.de:
 Hi,

 On Thu, 6 Oct 2011, Kai Tietz wrote:

  at which point this association doesn't make sense anymore, as

 Sorry, exactly this doesn't happen, due an ANDIF isn't simple, and
 therefore it isn't transformed into and AND.

 Right ...

   ((W AND X) AND Y) AND Z
 
  is just as fine.  So, the reassociation looks fishy at best, better get
  rid of it?  (which of the testcases breaks without it?)

 None.  I had this implemented first.  But Richard was concerned about
 making non-IF conditions too long.    I understand that point that it
 might not that good to always modify unconditionally to AND/OR chain.

 ... and I see that (that's why the transformation should be desirable for
 some definition of desirable, which probably includes and RHS not too
 long chain).  As it stands right now your transformation seems to be a
 fairly ad-hoc try at avoiding this problem.  That's why I wonder why to do
 the reassoc at all?  Which testcases break _without_ the reassociation,
 i.e. with only rewriting ANDIF - AND at the outermost level?

I don't do here reassociation in inner.  See that patch calls
build2_loc, and not fold_build2_loc anymore.  So it doesn't retries to
associate in inner anymore (which might be something of interest for
the issue Jakub mentioned).

There is no test actual failing AFAICS.  I just noticed
size-differences by this.  Nevertheless it might be better to enhance
reassociation pass to break-up (and repropagate) GIMPLE_CONDs with
non-side-effect, as Jakub suggested.

The other chance might be here to allow deeper chains then two
elements within one AND/OR element, but this would be architecture
dependent.  For x86 -as example- used instruction cycles for a common
for branching would suggest that it might be profitable to have here 3
or 4 leafs within one AND|OR chain.  But for sure on other
architectures the amount of leafs varies.

Regards,
Kai