Hi, I made a mistake when improving loop niter analysis for NE_EXPR exit condition. It can results in wrong code as reported in PR72817/PR73450. In previous patch, it simplifies below condition: base <= FINAL ; step > 0 base >= FINAL ; step < 0 into: base - step < FINAL ; step > 0 && base - step doesn't underflow base - step > FINAL ; step < 0 && base - step doesn't overflow But this is not enough. Since we adjusted IV.base to "new_base = IV.base - IV.step" in the condition, we need to check if |FINAL - new_base| is multiple of |step| for the adjusted base.
This patch fixes the issue as well as revises the comment. Bootstrap and test on x86_64. Is it OK? Thanks, bin 2016-08-11 Bin Cheng <bin.ch...@arm.com> PR tree-optimization/72817 PR tree-optimization/73450 * tree-ssa-loop-niter.c (number_of_iterations_ne): Check multiple_of_p for adjusted IV.base. gcc/testsuite/ChangeLog 2016-08-11 Bin Cheng <bin.ch...@arm.com> PR tree-optimization/72817 PR tree-optimization/73450 * gcc.dg/tree-ssa/pr72817.c: New test. * gcc.dg/tree-ssa/pr73450.c: New test.
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index c740ffa..8851862 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -999,33 +999,36 @@ number_of_iterations_ne (struct loop *loop, tree type, affine_iv *iv, mpz_clear (max); /* Compute no-overflow information for the control iv. This can be - proven when below two conditions hold. - - 1) |FINAL - base| is an exact multiple of step. - 2) IV evaluates toward FINAL at beginning, i.e: + proven when below two conditions are satisfied: + 1) IV evaluates toward FINAL at beginning, i.e: base <= FINAL ; step > 0 base >= FINAL ; step < 0 - Note the first condition holds, the second can be then relaxed - to below condition. + 2) |FINAL - base| is an exact multiple of step. + + Unfortunately, it's hard to prove above conditions after pass loop-ch + because loop with exit condition (IV != FINAL) usually will be guarded + by initial-condition (IV.base - IV.step != FINAL). In this case, we + can alternatively try to prove below conditions: + + 1') IV evaluates toward FINAL at beginning, i.e: + new_base = base - step < FINAL ; step > 0 + && base - step doesn't underflow + new_base = base - step > FINAL ; step < 0 + && base - step doesn't overflow - base - step < FINAL ; step > 0 - && base - step doesn't underflow - base - step > FINAL ; step < 0 - && base - step doesn't overflow + 2') |FINAL - new_base| is an exact multiple of step. - The relaxation is important because after pass loop-ch, loop - with exit condition (IV != FINAL) will usually be guarded by - pre-condition (IV.base - IV.step != FINAL). Please refer to - PR34114 as an example. + Please refer to PR34114 as an example of loop-ch's impact, also refer + to PR72817 as an example why condition 2') is necessary. - Also note, for NE_EXPR, base equals to FINAL is a special case, in + Note, for NE_EXPR, base equals to FINAL is a special case, in which the loop exits immediately, and the iv does not overflow. */ if (!niter->control.no_overflow && (integer_onep (s) || multiple_of_p (type, c, s))) { - tree t, cond, relaxed_cond = boolean_false_node; + tree t, cond, new_c, relaxed_cond = boolean_false_node; if (tree_int_cst_sign_bit (iv->step)) { @@ -1039,8 +1042,12 @@ number_of_iterations_ne (struct loop *loop, tree type, affine_iv *iv, if (integer_nonzerop (t)) { t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step); - relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node, - t, final); + new_c = fold_build2 (MINUS_EXPR, niter_type, + fold_convert (niter_type, t), + fold_convert (niter_type, final)); + if (multiple_of_p (type, new_c, s)) + relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node, + t, final); } } } @@ -1056,8 +1063,12 @@ number_of_iterations_ne (struct loop *loop, tree type, affine_iv *iv, if (integer_nonzerop (t)) { t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step); - relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node, - t, final); + new_c = fold_build2 (MINUS_EXPR, niter_type, + fold_convert (niter_type, final), + fold_convert (niter_type, t)); + if (multiple_of_p (type, new_c, s)) + relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node, + t, final); } } } diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr72817.c b/gcc/testsuite/gcc.dg/tree-ssa/pr72817.c new file mode 100644 index 0000000..290e096 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr72817.c @@ -0,0 +1,13 @@ +/* { dg-do run } */ +/* { dg-options "-O3" } */ + +char a; +short b; + +int main () +{ + for (a = 3; a != -1; a -= 5) + while (b) + ; + return 0; +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr73450.c b/gcc/testsuite/gcc.dg/tree-ssa/pr73450.c new file mode 100644 index 0000000..7dd44db --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr73450.c @@ -0,0 +1,14 @@ +/* { dg-do run } */ +/* { dg-options "-O3" } */ + +int a; +char b; +int main() { + char c = 0; + for (; c != 3; c = c + 7) { + a = b & a; + if (a) + break; + } + return 0; +}