[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu

2022-10-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.5
  Component|c   |tree-optimization
   Keywords||wrong-code
  Known to fail||7.1.0
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
Summary|Wrong code at -O0/-Os on|[10/11/12/13 Regression]
   |x86_64-pc-linux-gnu |Wrong code at -Os on
   ||x86_64-pc-linux-gnu
   Last reconfirmed||2022-10-06
  Known to work||6.3.0

--- Comment #1 from Andrew Pinski  ---
Confirmed.
Here is a better testcase:
#include 

int a;
long b;
static inline long c(unsigned d) { 
return d; 
}
static inline void e(int d) { 
a = d; 
}
int main() {
  b = 0;
  for (; b < 1; b = c(b - 90) + 90 + 1)
;
  e(b >> 2);
  printf("%d\n", a);
  if (a != 1073741824)
__builtin_abort();
}

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu

2022-10-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

--- Comment #2 from Richard Biener  ---
final value replacement:
  b_lsm.8_26 = PHI 
 with expr: 1
 final stmt:
  b_lsm.8_26 = 1;

where

(get_scalar_evolution
  (scalar = b_lsm.8_15)
  (scalar_evolution = {0, +, 1}_1))
(chrec_apply
  (varying_loop = 1)
  (chrec = {0, +, 1}_1)
  (x = 1)
  (res = 1))

and

   [local count: 955630225]:
  _1 = (unsigned int) b_lsm.8_15;
  _2 = _1 + 4294967206;
  # RANGE [irange] long int [0, 4294967295] NONZERO 0x
  _12 = (long int) _2;
  # RANGE [irange] long int [91, 4294967386] NONZERO 0x1
  _3 = _12 + 91;

   [local count: 1073741824]:
  # b_lsm.8_15 = PHI <0(2), _3(3)>

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-07 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

Martin Liška  changed:

   What|Removed |Added

Summary|[10/11/12/13 Regression]|[10/11/12/13 Regression]
   |Wrong code at -Os on|Wrong code at -Os on
   |x86_64-pc-linux-gnu |x86_64-pc-linux-gnu since
   ||r7-2012-g43aabfcfd4139e4c
   Keywords|needs-bisection |
 CC||amker at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org

--- Comment #3 from Martin Liška  ---
Started with r7-2012-g43aabfcfd4139e4c.

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot 
gnu.org

--- Comment #4 from Richard Biener  ---
I will try to have a closer look.

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-21 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

--- Comment #5 from Richard Biener  ---
 [local count: 1073741824]:
# b_lsm.8_15 = PHI <0(2), _3(3)>
# b_lsm_flag.9_17 = PHI <0(2), 1(3)>
if (b_lsm.8_15 <= 0)
  goto ; [89.00%]
else
  goto ; [11.00%]

 [local count: 955630225]:
_1 = (unsigned int) b_lsm.8_15;
_2 = _1 + 4294967206;
_12 = (long int) _2;
_3 = _12 + 91;
goto ;

for this loop we get

(number_of_iterations_in_loop =
(analyze_scalar_evolution
  (loop_nb = 1)
  (scalar = b_lsm.8_15)
(get_scalar_evolution
  (scalar = b_lsm.8_15)
  (scalar_evolution = ))
(analyze_initial_condition
  (loop_phi_node =
b_lsm.8_15 = PHI <0(2), _3(3)>
)
  (init_cond = 0))
(analyze_evolution_in_loop
  (loop_phi_node = b_lsm.8_15 = PHI <0(2), _3(3)>
)
(add_to_evolution
  (loop_nb = 1)
  (chrec_before = 0)
  (to_add = 91)
  (res = {0, +, 91}_1))
(add_to_evolution
  (loop_nb = 1)
  (chrec_before = {0, +, 91}_1)
  (to_add = 4294967206)
  (res = {0, +, 1}_1))
Estimating # of iterations of loop 1
(number_of_iterations_in_loop =
(analyze_scalar_evolution
  (loop_nb = 1)
  (scalar = b_lsm.8_15)
(get_scalar_evolution
  (scalar = b_lsm.8_15)
  (scalar_evolution = ))
(analyze_initial_condition
  (loop_phi_node =
b_lsm.8_15 = PHI <0(2), _3(3)>
)
  (init_cond = 0))
(analyze_evolution_in_loop
  (loop_phi_node = b_lsm.8_15 = PHI <0(2), _3(3)>
)
(add_to_evolution
  (loop_nb = 1)
  (chrec_before = 0)
  (to_add = 91)
  (res = {0, +, 91}_1))
(add_to_evolution
  (loop_nb = 1)
  (chrec_before = {0, +, 91}_1)
  (to_add = 4294967206)
  (res = {0, +, 1}_1))
  (evolution_function = (long int) {0, +, 1}_1))
(set_scalar_evolution
  instantiated_below = 2
  (scalar = b_lsm.8_15)
  (scalar_evolution = (long int) {0, +, 1}_1))
)

and that's wrong I think.  (long int)((unsigned int)b + -90u) + 91 isn't
the same as (long int) {0, +, 1} with the evolution in unsigned int.
It's the evolution of the value in _unsigned int_.

The key somehow happens in follow_ssa_edge_expr where we do

  /* Match an assignment under the form:
 "a = b +- ...". 
 Use tail-recursion for the simple case.  */
  *evolution_of_loop = add_to_evolution
  (loop->num, chrec_convert (type, *evolution_of_loop,
 at_stmt),
   code, rhs1, at_stmt);
  expr = rhs0;
  goto tail_recurse; 

which converts the long int { 0, +, 91 }_1 to unsigned int and will
then add -90u to that.

We are building the CHREC "backwards", we start with '(long)0', the initial
condition but then build the CHREC from the backedge def, first
{(long)0, +, 91}, then upon _12 = (long int)_2 we're feeding the
{(long)0, +, 91} to the _2 = _1 + -90u handler above which simply
truncates the "current" CHREC and the outer recursion will simply
convert that back.  But that's not how the CHREC expression build and
simplification should work?

The most significant hint is probably that when reaching the initial
truncation '_1 = (unsigned int) b_lsm.8_15;' we get there with a
CHREC that is unsigned int already!

So it's possibly wrong to "post-recurse" to rhs0 here.  Doing

diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 9f30f78cb5d..b481c488ca8 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -1265,6 +1265,9 @@ tail_recurse:
   if (TREE_CODE (rhs0) == SSA_NAME
  && (TREE_CODE (rhs1) != SSA_NAME || code == MINUS_EXPR))
{
+
+   t_bool res = follow_ssa_edge_expr (loop, at_stmt, rhs0, halting_phi,
+  evolution_of_loop, limit);
  /* Match an assignment under the form:
 "a = b +- ...".
 Use tail-recursion for the simple case.  */
@@ -1272,8 +1275,9 @@ tail_recurse:
  (loop->num, chrec_convert (type, *evolution_of_loop,
 at_stmt),
   code, rhs1, at_stmt);
- expr = rhs0;
- goto tail_recurse;
+ //expr = rhs0;
+ //goto tail_recurse;
+ return res;
}
   /* Else search for the SCC in both rhs0 and rhs1.  */
   return follow_ssa_edge_binary (loop, at_stmt, type, rhs0, code, rhs1,

fixes the testcase.

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

--- Comment #6 from Richard Biener  ---
Hmm, and interestingly this fix causes PR66375 to regress again:

FAIL: gcc.dg/torture/pr66375.c   -O2  execution test
...

OK, so that's because my suggested fix is exactly an (incomplete) reversal
of the fix for PR66375 (r6-1346-gb9b79ba4264cf6) :/

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

--- Comment #7 from Richard Biener  ---
(In reply to Richard Biener from comment #2)
> final value replacement:
>   b_lsm.8_26 = PHI 
>  with expr: 1
>  final stmt:
>   b_lsm.8_26 = 1;
> 
> where
> 
> (get_scalar_evolution
>   (scalar = b_lsm.8_15)
>   (scalar_evolution = {0, +, 1}_1))
> (chrec_apply
>   (varying_loop = 1)
>   (chrec = {0, +, 1}_1)
>   (x = 1)
>   (res = 1))
> 
> and
> 
>[local count: 955630225]:
>   _1 = (unsigned int) b_lsm.8_15;
>   _2 = _1 + 4294967206;
>   # RANGE [irange] long int [0, 4294967295] NONZERO 0x
>   _12 = (long int) _2;
>   # RANGE [irange] long int [91, 4294967386] NONZERO 0x1
>   _3 = _12 + 91;
> 
>[local count: 1073741824]:
>   # b_lsm.8_15 = PHI <0(2), _3(3)>

For comparison in PR66375 we have

 :
# c_6 = PHI <0(2), c_12(3)>
a.1_5 = a;
if (a.1_5 <= 12)
  goto ; [INV]
else
  goto ; [INV]

 :
_1 = (signed char) c_6;
_2 = (int) _1;
c_12 = _2 + -11;
_4 = a.1_5 + 1;
a = _4;
goto ;

so the backedge definition is

  (long)((unsigned)IV + -90u) + 91

vs.

  (int)(signed char)IV + -11


I think the issue is the CONVERT_EXPR handling in follow_ssa_edge_expr where
it isn't all that clear in which case we can analyze the evolution in the
narrower or the wider type.  In the case in this PR we mishandled the
middle conversion while in the older case we mishandle the "initial"
conversion.  I suspect that trying to optimize things on-the-fly is
difficult (and reasoning about the relevant cases there).

I've tried (again) to more correctly have the current evolution tentative
until we hit the loop PHI again when following the use-def chain from the
latch definition, but then we don't even know whether we will have an
evolution in the end (tried { initial, +, scev_not_known }).  Going fully
symbolic will lead to the issue pointed out in comment 6 of PR66375,
we'd get { (int)(signed char)0, +, -11 } which isn't what we want.

If, as in this bug, we have two evolutions in different types, we probably
have to give up.  Maybe we need to think of a PLUS as { unknown, +, val }
and instead fend off "wrongly" typed evolutions when we reach the
loop PHI node again.  But then the PR66375 case _does_ have an expression
correctly describing the evolution of the IV.  Just in this PRs case there
is none I think.  Or rather, I guess it would be
{ (long){ 0u, +, -90u }_1, +, 90 }_1 but that wouldn't be affine at least.

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

Richard Biener  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=42512

--- Comment #8 from Richard Biener  ---
PR42512 is also related and its "fix" is even worse...  it's worse in that
it hides wrong cases but also throws away correct ones.  But it also
shows that we still mishandle that testcase with or without the fix(es).

It seems that the intent of the SCC analysis in interpret_loop_phi is to
look at the linear operation (or the chain of linear operations) associated
in a way that it matches

 (T2)(init + ev)

from which then (T2) { init, +, ev } follows.

So when

 PHI' = (int)(signed char)PHI + 11

we directly match (int)(signed char){ init, +, 11 }

and for

 PHI' = (long)((unsigned)PHI + -90u) + 91

we'd have to associate the outer + 91 which we can't do (but effectively
do due to the present bug).

[Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c

2022-10-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:4c5b1160776382772fc0a33130dfaf621699fdbf

commit r13-3486-g4c5b1160776382772fc0a33130dfaf621699fdbf
Author: Richard Biener 
Date:   Mon Oct 24 08:52:12 2022 +0200

tree-optimization/107176 - SCEV analysis association issue

The following fixes a wrong-code issue caused by SCEV analysis
associating an addition due trying to use tail-recursion in
follow_ssa_edge_expr.  That causes us to apply a conversion at
the wrong point and thus miscompute the scalar evolution of
an induction variable.  This reverts the PR66375 fix and
revisits the PR42512 fix by keeping the evolution symbolic
up to the point we process the first linear function when
we then can check for the supported cases and substitute the
whole symbolic expression with the built chrec substituting
the proper initial value.

To simplify passing around things and to clarify scoping of
the involved functions this change wraps the SCEV DFS walking
code into a class.

PR tree-optimization/107176
PR tree-optimization/66375
PR tree-optimization/42512
* tree-scalar-evolution.cc (follow_ssa_edge_expr): Revert
the PR66375 fix, do not not associate PLUS_EXPR to be able
to use tail-recursion.
(follow_ssa_edge_binary): Likewise.
(interpret_loop_phi): Revert PR42512 fix, do not throw
away analyze_evolution_in_loop result after the fact.
(follow_ssa_edge_expr): When reaching halting_phi initalize
the evolution to the symbolic value of the PHI result.
(add_to_evolution_1): When adding the first evolution verify
we can handle the expression wrapping the symbolic evolution
and replace that in full using the initial condition.
(class scev_dfs): New, contains ...
(follow_ssa_edge_expr, follow_ssa_edge_binary,
follow_ssa_edge_in_condition_phi_branch,
follow_ssa_edge_in_condition_phi,
follow_ssa_edge_inner_loop_phi,
add_to_evolution, add_to_evolution_1): ... these with
loop and halting_phi arguments in class data.
(scev_dfs::get_ev): New toplevel DFS entry, start with
a chrec_dont_know evolution.
(analyze_evolution_in_loop): Use scev_dfs.

* gcc.dg/torture/pr107176.c: New testcase.