On Fri, May 18, 2012 at 10:27 PM, Ulrich Weigand <uweig...@de.ibm.com> wrote: > Richard Guenther wrote: >> On Thu, Mar 8, 2012 at 3:29 PM, Ulrich Weigand <uweig...@de.ibm.com> wrote: >> > - Should I try to improve forwprop to handle casts and additional re- >> > association cases until it handles the above expression? >> >> Yes, ideally by trying to sub-divide this task into separate profitable >> transforms. Maybe Andrew can chime in here? > > I finally got some time to look into this in detail. The various special- > case transforms in associate_plusminus all transform a plus/minus expression > tree into either a single operand, a negated operand, or a single plus or > minus of two operands. This is valid as long as we can prove that the > newly introduced expression can never overflow (if we're doing signed > arithmetic). This is true in those special cases due to either: > > - If all sub-expressions of the contracted plus/minus tree are themselves > performed in signed arithmetic and thus are guaranteed to never overflow, > their result equals the "true" arithmetic result if we were to perform > the computation in the actual integers with unlimited precision. > > Now, "true" arithmetic is associative. Thus, if we re-associate the > expression tree such that everything cancels out and just a single > operation A +/- B (or -A) remains, the "true" result of that operation > must equal the true result of the original expression -- which means > in particular that it lies within the range of the underlying data > type, and hence that final operation itself cannot overflow. > > - In the case of introducing a negation, we only need to be sure that > its operand can never be the minimum value of its type range. This > can on occasion be proven via a form of value range tracking. > > For example, when performing the transformation ~A + 1 --> -A, we note > that ~A cannot be INT_MAX, since we can add 1 to it without overflow; > and therefore A cannot be INT_MIN. 
> > > Now, using those two rules, we can actually prove many more cases where > reassociation is possible. For example, we can transform (A + (B + C)) - C > to A + B (due to the first rule). We can also transform the original > expression resulting from end-of-loop value computation > (A + 1) + (int) ((unsigned) ~A + (unsigned) B) > into just "B" -- this can never overflow since we aren't even introducing > any new expression. > > The following patch rewrites associate_plusminus to remove all the > explicitly coded special cases, and instead performs a scan of the > plus/minus tree similar to what is done in tree-ssa-reassoc (and also > in simplify-rtx for that matter). If this results in an expression > tree that collapses to just a single operand, or just a single newly > introduced operation, and -in the latter case- one of the two rules > above ensures the new operation is safe, the transformation is performed. > > This still passes all reassoc tests, and in fact allows us to remove XFAILs > from two of them. It also catches the end-of-loop value computation case. > > Tested on i386-linux with no regressions. > > OK for mainline?
The point of the special-cases in forwprop was to make them fast to detect - forwprop should be a pattern-matching thing, much like combine on RTL. So, instead of changing forwprop this way can you adjust tree-ssa-reassoc.c to (again) associate !TYPE_OVERFLOW_WRAPS operations but make sure we throw away results that would possibly introduce undefined overflow for !TYPE_OVERFLOW_WRAPS types? There is a reassoc pass after loop optimizations, so that should fix it as well, no? Thanks, Richard. > Bye, > Ulrich > > > ChangeLog: > > gcc/ > * tree-ssa-forwprop.c (enum plus_minus_op_range, > struct plus_minus_op_data, struct plus_minus_data): New types. > (next_plus_minus_op, remove_plus_minus_op, add_plus_minus_op, > add_plus_minus_ops, setup_plus_minus_ops): New functions. > (associate_plusminus): Reimplement. > > gcc/testsuite/ > * gcc.dg/tree-ssa/reassoc-2.c: Remove XFAIL. > * gcc.dg/tree-ssa/reassoc-9.c: Update test, remove XFAIL. > * gcc.dg/tree-ssa/reassoc-23.c: Update test. > * gcc.dg/tree-ssa/reassoc-26.c: New test. > > > Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-23.c > =================================================================== > *** gcc/testsuite/gcc.dg/tree-ssa/reassoc-23.c (revision 187653) > --- gcc/testsuite/gcc.dg/tree-ssa/reassoc-23.c (working copy) > *************** > *** 1,5 **** > /* { dg-do compile } */ > ! /* { dg-options "-O2 -fdump-tree-reassoc1" } */ > > unsigned int > foo(unsigned int a, unsigned int b, unsigned int c, unsigned int d, > --- 1,5 ---- > /* { dg-do compile } */ > ! /* { dg-options "-O2 -fdump-tree-optimized" } */ > > unsigned int > foo(unsigned int a, unsigned int b, unsigned int c, unsigned int d, > *************** foo(unsigned int a, unsigned int b, unsi > *** 13,17 **** > return e; > } > > ! /* { dg-final { scan-tree-dump-times "= 20" 1 "reassoc1"} } */ > ! /* { dg-final { cleanup-tree-dump "reassoc1" } } */ > --- 13,17 ---- > return e; > } > > ! /* { dg-final { scan-tree-dump-times "return 20" 1 "optimized"} } */ > ! 
/* { dg-final { cleanup-tree-dump "optimized" } } */ > Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c > =================================================================== > *** gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c (revision 0) > --- gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c (revision 0) > *************** > *** 0 **** > --- 1,17 ---- > + /* { dg-do compile } */ > + /* { dg-options "-O2 -fdump-tree-optimized" } */ > + > + int > + foo (int start, int end) > + { > + int i; > + > + for (i = start; i < end; i++) > + ; > + > + return i; > + } > + > + /* Verify that the end-of-loop value is simplified to just "end". */ > + /* { dg-final { scan-tree-dump-times "i_. = end_." 1 "optimized"} } */ > + /* { dg-final { cleanup-tree-dump "optimized" } } */ > Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-2.c > =================================================================== > *** gcc/testsuite/gcc.dg/tree-ssa/reassoc-2.c (revision 187653) > --- gcc/testsuite/gcc.dg/tree-ssa/reassoc-2.c (working copy) > *************** int b0, b1, b2, b3, b4,e; > *** 14,20 **** > return e; > } > > ! /* We can't reassociate the expressions due to undefined signed overflow. > */ > ! > ! /* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" { xfail *-*-* > } } } */ > /* { dg-final { cleanup-tree-dump "optimized" } } */ > --- 14,18 ---- > return e; > } > > ! /* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */ > /* { dg-final { cleanup-tree-dump "optimized" } } */ > Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-9.c > =================================================================== > *** gcc/testsuite/gcc.dg/tree-ssa/reassoc-9.c (revision 187653) > --- gcc/testsuite/gcc.dg/tree-ssa/reassoc-9.c (working copy) > *************** > *** 1,5 **** > /* { dg-do compile } */ > ! /* { dg-options "-O2 -fdump-tree-reassoc1" } */ > > int main(int a, int b, int c, int d, int e, int f, int g, int h) > { > --- 1,5 ---- > /* { dg-do compile } */ > ! 
/* { dg-options "-O2 -fdump-tree-optimized" } */ > > int main(int a, int b, int c, int d, int e, int f, int g, int h) > { > *************** int main(int a, int b, int c, int d, int > *** 11,18 **** > return e; > } > > ! /* We can always re-associate to a final constant but the current > ! implementation does not allow easy roll-back without IL changes. */ > ! > ! /* { dg-final { scan-tree-dump-times "= 20" 1 "reassoc1" { xfail *-*-* } } > } */ > ! /* { dg-final { cleanup-tree-dump "reassoc1" } } */ > --- 11,15 ---- > return e; > } > > ! /* { dg-final { scan-tree-dump-times "return 20" 1 "optimized" } } */ > ! /* { dg-final { cleanup-tree-dump "optimized" } } */ > Index: gcc/tree-ssa-forwprop.c > =================================================================== > *** gcc/tree-ssa-forwprop.c (revision 187653) > --- gcc/tree-ssa-forwprop.c (working copy) > *************** simplify_bitwise_binary (gimple_stmt_ite > *** 2188,2462 **** > } > > > ! /* Perform re-associations of the plus or minus statement STMT that are > ! always permitted. Returns true if the CFG was changed. */ > > static bool > ! associate_plusminus (gimple_stmt_iterator *gsi) > { > ! gimple stmt = gsi_stmt (*gsi); > ! tree rhs1 = gimple_assign_rhs1 (stmt); > ! tree rhs2 = gimple_assign_rhs2 (stmt); > ! enum tree_code code = gimple_assign_rhs_code (stmt); > ! bool changed; > > ! /* We can't reassociate at all for saturating types. */ > ! if (TYPE_SATURATING (TREE_TYPE (rhs1))) > ! return false; > > ! /* First contract negates. */ > ! do > { > ! changed = false; > > ! /* A +- (-B) -> A -+ B. */ > ! if (TREE_CODE (rhs2) == SSA_NAME) > { > ! gimple def_stmt = SSA_NAME_DEF_STMT (rhs2); > ! if (is_gimple_assign (def_stmt) > ! && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR > ! && can_propagate_from (def_stmt)) > ! { > ! code = (code == MINUS_EXPR) ? PLUS_EXPR : MINUS_EXPR; > ! gimple_assign_set_rhs_code (stmt, code); > ! rhs2 = gimple_assign_rhs1 (def_stmt); > ! 
gimple_assign_set_rhs2 (stmt, rhs2); > ! gimple_set_modified (stmt, true); > ! changed = true; > ! } > } > > ! /* (-A) + B -> B - A. */ > ! if (TREE_CODE (rhs1) == SSA_NAME > ! && code == PLUS_EXPR) > { > ! gimple def_stmt = SSA_NAME_DEF_STMT (rhs1); > ! if (is_gimple_assign (def_stmt) > ! && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR > ! && can_propagate_from (def_stmt)) > { > ! code = MINUS_EXPR; > ! gimple_assign_set_rhs_code (stmt, code); > ! rhs1 = rhs2; > ! gimple_assign_set_rhs1 (stmt, rhs1); > ! rhs2 = gimple_assign_rhs1 (def_stmt); > ! gimple_assign_set_rhs2 (stmt, rhs2); > ! gimple_set_modified (stmt, true); > ! changed = true; > } > } > } > - while (changed); > > ! /* We can't reassociate floating-point or fixed-point plus or minus > ! because of saturation to +-Inf. */ > ! if (FLOAT_TYPE_P (TREE_TYPE (rhs1)) > ! || FIXED_POINT_TYPE_P (TREE_TYPE (rhs1))) > ! goto out; > ! > ! /* Second match patterns that allow contracting a plus-minus pair > ! irrespective of overflow issues. > ! > ! (A +- B) - A -> +- B > ! (A +- B) -+ B -> A > ! (CST +- A) +- CST -> CST +- A > ! (A + CST) +- CST -> A + CST > ! ~A + A -> -1 > ! ~A + 1 -> -A > ! A - (A +- B) -> -+ B > ! A +- (B +- A) -> +- B > ! CST +- (CST +- A) -> CST +- A > ! CST +- (A +- CST) -> CST +- A > ! A + ~A -> -1 > > ! via commutating the addition and contracting operations to zero > ! by reassociation. */ > > ! if (TREE_CODE (rhs1) == SSA_NAME) > { > ! gimple def_stmt = SSA_NAME_DEF_STMT (rhs1); > ! if (is_gimple_assign (def_stmt) && can_propagate_from (def_stmt)) > { > ! enum tree_code def_code = gimple_assign_rhs_code (def_stmt); > ! if (def_code == PLUS_EXPR > ! || def_code == MINUS_EXPR) > ! { > ! tree def_rhs1 = gimple_assign_rhs1 (def_stmt); > ! tree def_rhs2 = gimple_assign_rhs2 (def_stmt); > ! if (operand_equal_p (def_rhs1, rhs2, 0) > ! && code == MINUS_EXPR) > ! { > ! /* (A +- B) - A -> +- B. */ > ! code = ((def_code == PLUS_EXPR) > ! ? TREE_CODE (def_rhs2) : NEGATE_EXPR); > ! 
rhs1 = def_rhs2; > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! } > ! else if (operand_equal_p (def_rhs2, rhs2, 0) > ! && code != def_code) > ! { > ! /* (A +- B) -+ B -> A. */ > ! code = TREE_CODE (def_rhs1); > ! rhs1 = def_rhs1; > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! } > ! else if (TREE_CODE (rhs2) == INTEGER_CST > ! && TREE_CODE (def_rhs1) == INTEGER_CST) > ! { > ! /* (CST +- A) +- CST -> CST +- A. */ > ! tree cst = fold_binary (code, TREE_TYPE (rhs1), > ! def_rhs1, rhs2); > ! if (cst && !TREE_OVERFLOW (cst)) > ! { > ! code = def_code; > ! gimple_assign_set_rhs_code (stmt, code); > ! rhs1 = cst; > ! gimple_assign_set_rhs1 (stmt, rhs1); > ! rhs2 = def_rhs2; > ! gimple_assign_set_rhs2 (stmt, rhs2); > ! gimple_set_modified (stmt, true); > ! } > ! } > ! else if (TREE_CODE (rhs2) == INTEGER_CST > ! && TREE_CODE (def_rhs2) == INTEGER_CST > ! && def_code == PLUS_EXPR) > ! { > ! /* (A + CST) +- CST -> A + CST. */ > ! tree cst = fold_binary (code, TREE_TYPE (rhs1), > ! def_rhs2, rhs2); > ! if (cst && !TREE_OVERFLOW (cst)) > ! { > ! code = PLUS_EXPR; > ! gimple_assign_set_rhs_code (stmt, code); > ! rhs1 = def_rhs1; > ! gimple_assign_set_rhs1 (stmt, rhs1); > ! rhs2 = cst; > ! gimple_assign_set_rhs2 (stmt, rhs2); > ! gimple_set_modified (stmt, true); > ! } > ! } > ! } > ! else if (def_code == BIT_NOT_EXPR > ! && INTEGRAL_TYPE_P (TREE_TYPE (rhs1))) > { > ! tree def_rhs1 = gimple_assign_rhs1 (def_stmt); > ! if (code == PLUS_EXPR > ! && operand_equal_p (def_rhs1, rhs2, 0)) > ! { > ! /* ~A + A -> -1. */ > ! code = INTEGER_CST; > ! rhs1 = build_int_cst_type (TREE_TYPE (rhs2), -1); > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! 
gimple_set_modified (stmt, true); > ! } > ! else if (code == PLUS_EXPR > ! && integer_onep (rhs1)) > ! { > ! /* ~A + 1 -> -A. */ > ! code = NEGATE_EXPR; > ! rhs1 = def_rhs1; > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! } > } > } > } > > ! if (rhs2 && TREE_CODE (rhs2) == SSA_NAME) > { > ! gimple def_stmt = SSA_NAME_DEF_STMT (rhs2); > ! if (is_gimple_assign (def_stmt) && can_propagate_from (def_stmt)) > ! { > ! enum tree_code def_code = gimple_assign_rhs_code (def_stmt); > ! if (def_code == PLUS_EXPR > ! || def_code == MINUS_EXPR) > ! { > ! tree def_rhs1 = gimple_assign_rhs1 (def_stmt); > ! tree def_rhs2 = gimple_assign_rhs2 (def_stmt); > ! if (operand_equal_p (def_rhs1, rhs1, 0) > ! && code == MINUS_EXPR) > ! { > ! /* A - (A +- B) -> -+ B. */ > ! code = ((def_code == PLUS_EXPR) > ! ? NEGATE_EXPR : TREE_CODE (def_rhs2)); > ! rhs1 = def_rhs2; > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! } > ! else if (operand_equal_p (def_rhs2, rhs1, 0) > ! && code != def_code) > ! { > ! /* A +- (B +- A) -> +- B. */ > ! code = ((code == PLUS_EXPR) > ! ? TREE_CODE (def_rhs1) : NEGATE_EXPR); > ! rhs1 = def_rhs1; > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! } > ! else if (TREE_CODE (rhs1) == INTEGER_CST > ! && TREE_CODE (def_rhs1) == INTEGER_CST) > ! { > ! /* CST +- (CST +- A) -> CST +- A. */ > ! tree cst = fold_binary (code, TREE_TYPE (rhs2), > ! rhs1, def_rhs1); > ! if (cst && !TREE_OVERFLOW (cst)) > { > ! code = (code == def_code ? PLUS_EXPR : MINUS_EXPR); > ! gimple_assign_set_rhs_code (stmt, code); > ! rhs1 = cst; > ! gimple_assign_set_rhs1 (stmt, rhs1); > ! rhs2 = def_rhs2; > ! 
gimple_assign_set_rhs2 (stmt, rhs2); > ! gimple_set_modified (stmt, true); > ! } > ! } > ! else if (TREE_CODE (rhs1) == INTEGER_CST > ! && TREE_CODE (def_rhs2) == INTEGER_CST) > ! { > ! /* CST +- (A +- CST) -> CST +- A. */ > ! tree cst = fold_binary (def_code == code > ! ? PLUS_EXPR : MINUS_EXPR, > ! TREE_TYPE (rhs2), > ! rhs1, def_rhs2); > ! if (cst && !TREE_OVERFLOW (cst)) > ! { > ! rhs1 = cst; > ! gimple_assign_set_rhs1 (stmt, rhs1); > ! rhs2 = def_rhs1; > ! gimple_assign_set_rhs2 (stmt, rhs2); > ! gimple_set_modified (stmt, true); > } > } > } > ! else if (def_code == BIT_NOT_EXPR > ! && INTEGRAL_TYPE_P (TREE_TYPE (rhs2))) > ! { > ! tree def_rhs1 = gimple_assign_rhs1 (def_stmt); > ! if (code == PLUS_EXPR > ! && operand_equal_p (def_rhs1, rhs1, 0)) > ! { > ! /* A + ~A -> -1. */ > ! code = INTEGER_CST; > ! rhs1 = build_int_cst_type (TREE_TYPE (rhs1), -1); > ! rhs2 = NULL_TREE; > ! gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! } > } > } > } > --- 2188,2661 ---- > } > > > ! /* Data structures to hold plus/minus operand information during > ! re-association. */ > ! > ! /* Range information. We only track whether we are sure that an operand > ! cannot equal the minimum value of its type (RANGE_NOT_MIN), or that it > ! cannot equal the maximum value of its type (RANGE_NOT_MAX). */ > ! > ! enum plus_minus_op_range > ! { > ! RANGE_UNKNOWN = 0, > ! RANGE_NOT_MIN, > ! RANGE_NOT_MAX > ! }; > ! > ! /* A single operand. OP is the operand, NEG is true if the operand needs > ! to be negated, and RANGE holds operand range information we were able > ! to track. */ > ! > ! struct plus_minus_op_data > ! { > ! tree op; > ! bool neg; > ! enum plus_minus_op_range range; > ! }; > ! > ! /* All operands of the expression. The value of the expression is > considered > ! the sum of all operands, negated if indicated. There is no implicit > order > ! 
of the operands; this data structure is only used in contexts where we > ! know addition is associative. */ > ! > ! struct plus_minus_data > ! { > ! /* Operand array and number of current operands. We do not bother to > ! dynamically resize the array; 8 operands ought to be enough for the > ! vast majority of cases. */ > ! struct plus_minus_op_data ops[8]; > ! unsigned int n_ops; > ! > ! /* Used for iterating through all operands. */ > ! unsigned int curr_op; > ! > ! /* True if we have successfully applied some simplification operation. */ > ! bool simplified; > ! > ! /* True if we ran into a failure (e.g. the ops array overflowed). The > ! rest of the data structure is no longer reliable in that case. */ > ! bool failed; > ! }; > ! > ! /* Iterate through the OPS operand array. Return false if we are already > ! at the end of the array. Otherwise, return true and set OP to the index > ! of the next operand to be considered. */ > > static bool > ! next_plus_minus_op (struct plus_minus_data *ops, unsigned int *op) > { > ! if (ops->curr_op < ops->n_ops) > ! { > ! *op = ops->curr_op++; > ! return true; > ! } > ! > ! return false; > ! } > ! > ! /* Remove operand with index OP from the OPS operand array. */ > ! > ! static void > ! remove_plus_minus_op (struct plus_minus_data *ops, unsigned int op) > ! { > ! if (op + 1 < ops->n_ops) > ! memmove (&ops->ops[op], &ops->ops[op + 1], > ! sizeof (struct plus_minus_op_data) * (ops->n_ops - op - 1)); > ! > ! ops->n_ops--; > ! > ! if (op < ops->curr_op) > ! ops->curr_op--; > ! > ! ops->simplified = true; > ! } > > ! /* Add NEW_OP at the end of the OPS operand array. If possible, perform > ! simplifications that are guaranteed to leave the total value of the > ! expression encoded by OPS unchanged: > ! - cancel NEW_OP against an equivalent but negated operand > ! - simplify constant integer operands (without overflow). > ! Assumes operands can be freely re-associated. */ > ! > ! static void > ! 
add_plus_minus_op (struct plus_minus_data *ops, > ! struct plus_minus_op_data *new_op) > ! { > ! struct plus_minus_op_data cst_op; > ! unsigned int i; > ! > ! if (ops->failed) > ! return; > > ! if (TREE_CODE (new_op->op) == INTEGER_CST && new_op->neg) > { > ! tree cst = fold_unary_to_constant (NEGATE_EXPR, > ! TREE_TYPE (new_op->op), new_op->op); > ! if (cst && !TREE_OVERFLOW (cst)) > ! { > ! new_op = &cst_op; > ! cst_op.op = cst; > ! cst_op.neg = false; > ! cst_op.range = RANGE_UNKNOWN; > ! ops->simplified = true; > ! } > ! } > > ! for (i = 0; i < ops->n_ops; i++) > ! { > ! if (operand_equal_p (new_op->op, ops->ops[i].op, 0) > ! && new_op->neg != ops->ops[i].neg) > { > ! remove_plus_minus_op (ops, i); > ! ops->simplified = true; > ! return; > } > > ! if (TREE_CODE (new_op->op) == INTEGER_CST && !new_op->neg > ! && TREE_CODE (ops->ops[i].op) == INTEGER_CST && !ops->ops[i].neg) > { > ! tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (new_op->op), > ! new_op->op, ops->ops[i].op); > ! if (cst && !TREE_OVERFLOW (cst)) > { > ! if (integer_zerop (cst)) > ! remove_plus_minus_op (ops, i); > ! else > ! ops->ops[i].op = cst; > ! > ! ops->simplified = true; > ! return; > } > } > } > > ! if (ops->n_ops < ARRAY_SIZE (ops->ops)) > ! { > ! ops->ops[ops->n_ops++] = *new_op; > ! return; > ! } > ! > ! /* If we run out of room in the array, give up. This should > ! almost never happen. */ > ! ops->n_ops = 0; > ! ops->failed = true; > ! return; > ! } > ! > ! /* Add up to two operands LHS/RHS (as indicated by N_OPS) to the > ! OPS operand array. If OUTER_NEG is true, operands are to be > ! negated before adding them to the array. */ > ! > ! static void > ! add_plus_minus_ops (struct plus_minus_data *ops, > ! unsigned int n_ops, > ! struct plus_minus_op_data *lhs, > ! struct plus_minus_op_data *rhs, > ! bool outer_neg) > ! { > ! lhs->neg ^= outer_neg; > ! rhs->neg ^= outer_neg; > ! > ! if (n_ops >= 1) > ! add_plus_minus_op (ops, lhs); > ! if (n_ops >= 2) > ! 
add_plus_minus_op (ops, rhs); > ! } > ! > ! /* Extract plus/minus operands from STMT. Stores up to two operands in > ! LHS and RHS and returns the number of operands stored. If no operands > ! could be extracted, returns 0. > ! > ! The routine guarantees that a plus/minus expression formed from LHS > ! and RHS will evaluate to the same value as STMT, using modulo arithmetic. > ! If STMT is of integral type and TRUE_ARITHMETIC is true, the routine > ! guarantees in addition that this property holds true when performing > ! operations in "true" integer arithmetic. > > ! OUTER_RANGE encodes range information known about the result of STMT; > ! this may be used to infer range data on LHS and/or RHS. */ > > ! static unsigned int > ! setup_plus_minus_ops (gimple stmt, struct plus_minus_op_data *lhs, > ! struct plus_minus_op_data *rhs, > ! enum plus_minus_op_range outer_range, > ! bool true_arithmetic) > ! { > ! enum tree_code this_code = gimple_assign_rhs_code (stmt); > ! tree type = TREE_TYPE (gimple_assign_lhs (stmt)); > ! > ! switch (this_code) > { > ! case PLUS_EXPR: > ! case MINUS_EXPR: > ! /* If we have an integral type where overflow wraps, we can only > ! guarantee that LHS plus RHS equal STMT in modulo arithmetic, > ! so fail if "true" arithmetic is requested. For integral types > ! where overflow does *not* wrap, we can assume that STMT does > ! not overflow, and thus the equality holds in "true" arithmetic > ! as well. > ! > ! For floating point types, we -strictly speaking- could never > ! extract operands due to rounding effects and saturation at > ! +/-Inf. However, the -fassociative-math setting directs > ! the compiler to ignore such effects; respect it. > ! > ! Fixed point types can never be re-associated. */ > ! if ((INTEGRAL_TYPE_P (type) > ! && (!TYPE_OVERFLOW_WRAPS (type) || !true_arithmetic)) > ! || (FLOAT_TYPE_P (type) > ! && flag_associative_math)) > ! { > ! lhs->op = gimple_assign_rhs1 (stmt); > ! lhs->neg = false; > !
lhs->range = RANGE_UNKNOWN; > ! > ! rhs->op = gimple_assign_rhs2 (stmt); > ! rhs->neg = (this_code == MINUS_EXPR); > ! rhs->range = RANGE_UNKNOWN; > ! > ! /* For integer types that cannot overflow, the fact that a constant > ! is added to / subtracted from an operand allows us to deduce > ! range information about that operand. */ > ! if (INTEGRAL_TYPE_P (type) && !TYPE_OVERFLOW_WRAPS (type) > ! && TREE_CODE (rhs->op) == INTEGER_CST) > ! { > ! int sgn = tree_int_cst_sgn (rhs->op); > ! if (sgn != 0) > ! { > ! bool neg_p = (sgn < 0) ^ (this_code == MINUS_EXPR); > ! lhs->range = neg_p? RANGE_NOT_MIN : RANGE_NOT_MAX; > ! } > ! } > ! > ! return 2; > ! } > ! break; > ! > ! case NEGATE_EXPR: > ! /* For integral types, we can extract a negation in the same set > ! of circumstances we could extract a PLUS or MINUS, see above. > ! > ! We can always extract a negation for types that use a separate > ! sign bit: floating-point types and signed fixed-point types. */ > ! if ((INTEGRAL_TYPE_P (type) > ! && (!TYPE_OVERFLOW_WRAPS (type) || !true_arithmetic)) > ! || FLOAT_TYPE_P (type) > ! || (FIXED_POINT_TYPE_P (type) && !TYPE_UNSIGNED (type))) > ! { > ! lhs->op = gimple_assign_rhs1 (stmt); > ! lhs->neg = true; > ! lhs->range = RANGE_UNKNOWN; > ! > ! /* For integer types that cannot overflow, the fact that a > ! negation is performed allows us to deduce range information > ! about its operand. */ > ! if (INTEGRAL_TYPE_P (type) && !TYPE_OVERFLOW_WRAPS (type)) > ! lhs->range = RANGE_NOT_MIN; > ! > ! return 1; > ! } > ! break; > ! > ! case BIT_NOT_EXPR: > ! /* We transform ~A into -A - 1. This is allowed only for integral > ! types, and only in modulo arithmetic. */ > ! if (INTEGRAL_TYPE_P (type) && !true_arithmetic) > { > ! rhs->op = build_int_cst_type (type, -1); > ! rhs->neg = false; > ! rhs->range = RANGE_UNKNOWN; > ! > ! lhs->op = gimple_assign_rhs1 (stmt); > ! lhs->neg = true; > ! lhs->range = RANGE_UNKNOWN; > ! > ! 
/* For all integer types, BIT_NOT_EXPR transforms the maximum > ! value of its range to the minimum value and vice versa. */ > ! if (outer_range == RANGE_NOT_MIN) > ! lhs->range = RANGE_NOT_MAX; > ! else if (outer_range == RANGE_NOT_MAX) > ! lhs->range = RANGE_NOT_MIN; > ! > ! return 2; > ! } > ! break; > ! > ! CASE_CONVERT: > ! /* We look through conversions between integral types of the same > ! precision. This is in general allowed only in modulo arithmetic. > ! This case is used in particular to handle constructs like > ! (int) ((unsigned) A + (unsigned) B) > ! which are on occasion produced by other optimization passes that > ! want to avoid introducing undefined overflows. */ > ! if (INTEGRAL_TYPE_P (type) && !true_arithmetic) > ! { > ! tree op = gimple_assign_rhs1 (stmt); > ! > ! if (TREE_TYPE (op) && INTEGRAL_TYPE_P (TREE_TYPE (op)) > ! && TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (op))) > { > ! lhs->op = op; > ! lhs->neg = false; > ! lhs->range = RANGE_UNKNOWN; > ! > ! return 1; > } > } > + break; > + > + default: > + break; > } > > ! return 0; > ! } > ! > ! /* Perform re-associations of the plus or minus statement STMT that are > ! always permitted. Returns true if the CFG was changed. */ > ! > ! static bool > ! associate_plusminus (gimple_stmt_iterator *gsi) > ! { > ! gimple stmt = gsi_stmt (*gsi); > ! tree type = TREE_TYPE (gimple_assign_lhs (stmt)); > ! struct plus_minus_data ops; > ! struct plus_minus_op_data lhs, rhs; > ! unsigned int n_ops; > ! bool no_overflow; > ! bool true_arithmetic; > ! unsigned int i; > ! int pass; > ! > ! /* The type of STMT determines whether we are allowed to create new > ! operations in that type that may introduce overflows. */ > ! no_overflow = (INTEGRAL_TYPE_P (type) && !TYPE_OVERFLOW_WRAPS (type)); > ! > ! /* Initialize OPS array from STMT. */ > ! memset (&ops, 0, sizeof ops); > ! n_ops = setup_plus_minus_ops (stmt, &lhs, &rhs, RANGE_UNKNOWN, > no_overflow); > !
add_plus_minus_ops (&ops, n_ops, &lhs, &rhs, false); > ! > ! /* Iterate over OPS array and perform simplifications. If we cannot > ! introduce new overflows, we first perform only simplifications that > ! are valid in "true" arithmetic, and in a second pass then perform > ! all simplifications valid in modulo arithmetic. If we *are* allowed > ! to introduce new overflows, we skip the first pass. */ > ! for (pass = 0; pass < 2; pass++) > { > ! if (pass == 0) > ! true_arithmetic = no_overflow; > ! else if (true_arithmetic) > ! true_arithmetic = false; > ! else > ! break; > ! > ! ops.curr_op = 0; > ! while (next_plus_minus_op (&ops, &i)) > ! { > ! tree this_op = ops.ops[i].op; > ! bool this_neg = ops.ops[i].neg; > ! enum plus_minus_op_range this_range = ops.ops[i].range; > ! > ! /* For each operand in the array that is an SSA_NAME, see whether > ! we can extract additional plus/minus operands from its def. */ > ! if (TREE_CODE (this_op) == SSA_NAME) > ! { > ! gimple def_stmt = SSA_NAME_DEF_STMT (this_op); > ! if (is_gimple_assign (def_stmt) > ! && can_propagate_from (def_stmt)) > ! { > ! n_ops = setup_plus_minus_ops (def_stmt, &lhs, &rhs, > ! this_range, true_arithmetic); > ! if (n_ops > 0) > { > ! remove_plus_minus_op (&ops, i); > ! add_plus_minus_ops (&ops, n_ops, &lhs, &rhs, this_neg); > } > } > } > ! > ! /* After every simplification, check whether OPS array can now be > ! represented by a new statement. */ > ! > ! /* If we have just a single, non-negated operand left, we replace > ! the whole expression by that operand. */ > ! if (ops.n_ops == 1 && !ops.ops[0].neg > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[0].op), type)) > ! { > ! gimple_assign_set_rhs_with_ops (gsi, TREE_CODE (ops.ops[0].op), > ! ops.ops[0].op, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! goto out; > ! } > ! > ! /* If we have a single *negated* operand left, check whether we are > ! allowed to introduce a new NEGATE_EXPR. 
We can do that if: > ! - we may introduce new overflows, > ! - all simplifications were performed in "true" arithmetic, > ! because then the true value of the negated operand is in > ! the original range of the type, or > ! - we were able to otherwise prove the operand cannot be the > ! minimum value of its type. */ > ! if (ops.n_ops == 1 && ops.ops[0].neg > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[0].op), type) > ! && (!no_overflow || true_arithmetic > ! || ops.ops[0].range == RANGE_NOT_MIN)) > ! { > ! gimple_assign_set_rhs_with_ops (gsi, NEGATE_EXPR, > ! ops.ops[0].op, NULL_TREE); > ! gcc_assert (gsi_stmt (*gsi) == stmt); > ! gimple_set_modified (stmt, true); > ! goto out; > ! } > ! > ! /* If we have two operands left (and some simplification took place, > ! so we're not simply looking at the original two operands), and > ! at most one of them is negated, check whether we are allowed to > ! introduce a new PLUS_EXPR or MINUS_EXPR. This follows the same > ! rules as above, except that range information doesn't help. > ! > ! Note that even after we've replaced the original expression with > ! a new PLUS or MINUS, we continue further simplifications -- maybe > ! we are still able to find an even simpler equivalence. */ > ! if (ops.n_ops == 2 && ops.simplified > ! && !ops.ops[0].neg && !ops.ops[1].neg > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[0].op), type) > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[1].op), type) > ! && (!no_overflow || true_arithmetic)) > ! { > ! gimple_assign_set_rhs_code (stmt, PLUS_EXPR); > ! gimple_assign_set_rhs1 (stmt, ops.ops[0].op); > ! gimple_assign_set_rhs2 (stmt, ops.ops[1].op); > ! gimple_set_modified (stmt, true); > ! ops.simplified = false; > ! } > ! > ! if (ops.n_ops == 2 && ops.simplified > ! && !ops.ops[0].neg && ops.ops[1].neg > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[0].op), type) > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[1].op), type) > ! && (!no_overflow || true_arithmetic)) > !
{ > ! gimple_assign_set_rhs_code (stmt, MINUS_EXPR); > ! gimple_assign_set_rhs1 (stmt, ops.ops[0].op); > ! gimple_assign_set_rhs2 (stmt, ops.ops[1].op); > ! gimple_set_modified (stmt, true); > ! ops.simplified = false; > ! } > ! > ! if (ops.n_ops == 2 && ops.simplified > ! && ops.ops[0].neg && !ops.ops[1].neg > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[0].op), type) > ! && useless_type_conversion_p (TREE_TYPE (ops.ops[1].op), type) > ! && (!no_overflow || true_arithmetic)) > ! { > ! gimple_assign_set_rhs_code (stmt, MINUS_EXPR); > ! gimple_assign_set_rhs1 (stmt, ops.ops[1].op); > ! gimple_assign_set_rhs2 (stmt, ops.ops[0].op); > ! gimple_set_modified (stmt, true); > ! ops.simplified = false; > } > } > } > > -- > Dr. Ulrich Weigand > GNU Toolchain for Linux on System z and Cell BE > ulrich.weig...@de.ibm.com >