[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Bug 5 depends on bug 57511, which changed state.

Bug 57511 Summary: [4.8 Regression] Missing SCEV final value replacement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57511

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #13 from Richard Biener 2012-12-18 13:29:52 UTC ---
Fixed.
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #12 from Richard Biener 2012-12-18 13:12:39 UTC ---
Author: rguenth
Date: Tue Dec 18 13:12:34 2012
New Revision: 194578

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=194578
Log:
2012-12-18  Richard Biener

        PR tree-optimization/5
        * tree-ssa-loop-niter.c (idx_infer_loop_bounds): Properly analyze
        evolution of the index for the loop it is used in.
        * tree-scalar-evolution.c (instantiate_scev_name): Take inner loop
        we will be creating a chrec for.  Generalize fix for PR40281 and
        prune invalid SCEVs.
        (instantiate_scev_poly): Likewise - pass down inner loop we will
        be creating a chrec for.
        (instantiate_scev_binary): Take and pass through inner loop.
        (instantiate_array_ref): Likewise.
        (instantiate_scev_convert): Likewise.
        (instantiate_scev_not): Likewise.
        (instantiate_scev_3): Likewise.
        (instantiate_scev_2): Likewise.
        (instantiate_scev_1): Likewise.
        (instantiate_scev_r): Likewise.
        (resolve_mixers): Adjust.
        (instantiate_scev): Likewise.

        * gcc.dg/torture/pr5.c: New testcase.
        * gcc.dg/vect/vect-iv-11.c: Adjust.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/vect-iv-11.c
    trunk/gcc/tree-scalar-evolution.c
    trunk/gcc/tree-ssa-loop-niter.c
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #11 from Richard Biener 2012-12-17 12:14:18 UTC ---
Ok, I'm confused by the following:

  : (loop 1 header)
  # lxp_1 = PHI <0(2), lxp_24(12)>
  t_9 = pol_x[lxp_1];
  _10 = (long int) lxp_1;
  _11 = _10 * 4;
  l_12 = _11 + -1;
  goto ;

  :
  _15 = S_2 + l_12;
  _16 = coef_x[_15];
  _17 = S_2 + -1;
  _18 = s[_17];
  _19 = _18 * t_9;
  _20 = _16 + _19;
  coef_x[_15] = _20;
  S_22 = S_2 + 1;

  : (loop 2 header)
  # S_2 = PHI <1(3), S_22(4)>
  if (S_2 <= 4)
    goto ;
  else
    goto ;

  :
  lxp_24 = lxp_1 + 1;
  if (lxp_1 != 1)
    goto ;
  else
    goto ;

  :
  goto ;

and SCEV says _15 is {l_12 + 1, +, 1}_2, which looks correct. But then it
does

(chrec_apply
  (varying_loop = 2)
  (chrec = {l_12 + 1, +, 1}_2)
  (x = 4)
  (res = l_12 + 5))

and magically, via l_12 being {-1, +, 4}_1 (also correct), arrives at

(instantiate_scev
  (instantiate_below = 2)
  (evolution_loop = 1)
  (chrec = {4, +, 4}_1)
  (res = {4, +, 4}_1))

huh? So to it _15 is {4, +, 4}_1 (not sure what is considered "initial"
in terms of scalar evolution with a value varying in an inner loop).
This is what infer_loop_bounds_from_undefined derives the bogus bound
for loop 1 from. To my eyes _15 should be {0, +, 4}_1!

Maybe it doesn't really make sense to ask for the evolution of something
defined in loop N with respect to an outer loop M?

If we change idx_infer_loop_bounds with

Index: tree-ssa-loop-niter.c
===================================================================
--- tree-ssa-loop-niter.c       (revision 194552)
+++ tree-ssa-loop-niter.c       (working copy)
@@ -2671,7 +2671,12 @@ idx_infer_loop_bounds (tree base, tree *
       upper = false;
     }

-  ev = instantiate_parameters (loop, analyze_scalar_evolution (loop, *idx));
+  struct loop *dloop = loop_containing_stmt (data->stmt);
+  if (!dloop)
+    return true;
+
+  ev = analyze_scalar_evolution (dloop, *idx);
+  ev = instantiate_parameters (loop, ev);
   init = initial_condition (ev);
   step = evolution_part_in_loop_num (ev, loop->num);

then we obtain, via {l_12 + 1, +, 1}_2, {{0, +, 4}_1, +, 1}_2, the correct
solution (init == 0, step == 4).
I am going to bootstrap and regtest that.
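
To make the chrec arithmetic above easier to check by hand, here is a minimal C sketch. It is not GCC code: the helper names chrec_at, index_direct, index_chrec and chrecs_agree are invented for illustration, and an affine chrec {base, +, step} is assumed to take the value base + step * n at iteration n, counting from 0. It compares the index S + l from the testcase against {{0, +, 4}_1, +, 1}_2.

```c
#include <assert.h>

/* Value of the affine chrec {base, +, step} at iteration n (n counts from 0).
   Illustrative helper, not a GCC function.  */
long chrec_at(long base, long step, long n)
{
    return base + step * n;
}

/* Index S + l exactly as the testcase computes it, for outer induction
   variable lxp and inner induction variable S.  */
long index_direct(long lxp, long S)
{
    long l = lxp * 4L - 1;
    return S + l;
}

/* Index according to {{0, +, 4}_1, +, 1}_2.  Inner iteration j corresponds
   to S == j + 1, because S starts at 1.  */
long index_chrec(long lxp, long j)
{
    return chrec_at(chrec_at(0, 4, lxp), 1, j);
}

/* Returns 1 if both forms agree on every access made by the testcase.  */
int chrecs_agree(void)
{
    for (long lxp = 0; lxp <= 1; lxp++)
        for (long S = 1; S <= 4; S++)
            if (index_direct(lxp, S) != index_chrec(lxp, S - 1))
                return 0;
    return 1;
}
```

Under these assumptions, {4, +, 4}_1 is the value of S + l once the inner loop has finished (S == 5), i.e. chrec_at(4, 4, lxp) == index_direct(lxp, 5). That is the result of chrec_apply with x = 4, not the index used by any access inside loop 2.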
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #10 from Richard Biener 2012-12-17 10:14:37 UTC ---
I'll have a more detailed look.
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #9 from Richard Biener 2012-12-14 14:11:59 UTC ---
The unrolling puts __builtin_unreachable ()s into the inner duplicated loops:

First iteration, good:

  :
  # lxp_30 = PHI <0(2)>
  t_32 = pol_x[lxp_30];
  _33 = (long int) lxp_30;
  _34 = _33 * 4;
  l_35 = _34 + -1;

  :
  # S_36 = PHI <1(16), S_45(18)>
  if (S_36 <= 4)
    goto ;
  else
    goto ;

  :
  _38 = S_36 + l_35;
  _39 = coef_x[_38];
  _40 = S_36 + -1;
  _41 = s[_40];
  _42 = _41 * t_32;
  _43 = _39 + _42;
  coef_x[_38] = _43;
  S_45 = S_36 + 1;
  goto ;

  :
  lxp_46 = lxp_30 + 1;

  :

Peeled iteration, bad:

  :
  # lxp_1 = PHI
  t_9 = pol_x[lxp_1];
  _10 = (long int) lxp_1;
  _11 = _10 * 4;
  l_12 = _11 + -1;
  goto ;

  :
  _13 = S_3 + l_12;
  __builtin_unreachable ();
  _14 = coef_x[_13];
  _15 = S_3 + -1;
  _16 = s[_15];
  _17 = _16 * t_9;
  _18 = _14 + _17;
  __builtin_unreachable ();
  coef_x[_13] = _18;
  S_20 = S_3 + 1;

  :
  # S_3 = PHI <1(3), S_20(4)>
  if (S_3 <= 4)
    goto ;
  else
    goto ;

  :
  lxp_21 = lxp_1 + 1;
  if (1 == 0)
    goto ;
  else
    goto ;

  :

  :
  __builtin_unreachable ();

  :
  __asm__ __volatile__("" : : "r" &coef_x : "memory");
  goto ;

Note that the outer loop body looks ok - it is the inner body that gets
mangled in a bogus way. That's because the loop bounds derived from the
inner loop accesses are bogus:

remove_exits_and_undefined_stmts (loop=0x767ec770, npeeled=1)
    at /space/rguenther/src/svn/trunk/gcc/tree-ssa-loop-ivcanon.c:481
481       bool changed = false;
(gdb) n
483       for (elt = loop->bounds; elt; elt = elt->next)
(gdb)
488           if (!elt->is_exit
(gdb) p elt
$8 = (nb_iter_bound *) 0x76925e38
(gdb) p *elt
$9 = {stmt = 0x76905c80, bound = {low = 0, high = 0}, is_exit = false, next = 0x76925dc0}
(gdb) p elt->stmt
$10 = (gimple) 0x76905c80
(gdb) call debug_gimple_stmt ($10)
# .MEM_19 = VDEF <.MEM_6>
coef_x[_13] = _18;

As far as I can see, even with lxp == 1 we have at most an index of
4 + 1*4 - 1.

Statement _14 = coef_x[_13];
 is executed at most 0 (bounded by 0) + 1 times in loop 1.

_that's_ bogus.
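
As a sanity check on the "at most an index of 4 + 1*4 - 1" claim, the loop nest of the C testcase can be replayed with a few counters. This is only a sketch: struct access_stats and count_accesses are names made up here, and nothing below is taken from GCC itself.

```c
#include <assert.h>

/* Hypothetical bookkeeping for the coef_x[S + l] accesses of the testcase.  */
struct access_stats {
    long executions;   /* how often the store coef_x[S + l] = ... runs */
    long min_index;    /* smallest index S + l seen */
    long max_index;    /* largest index S + l seen */
};

/* Replays the loop nest of the testcase and records the indices used.  */
struct access_stats count_accesses(void)
{
    struct access_stats st = { 0, 0, 0 };
    int lxp = 0;
    do {
        long l = lxp * 4L - 1;
        for (long S = 1; S <= 4; S++) {
            long idx = S + l;
            if (st.executions == 0 || idx < st.min_index)
                st.min_index = idx;
            if (st.executions == 0 || idx > st.max_index)
                st.max_index = idx;
            st.executions++;
        }
    } while (lxp++ != 1);
    return st;
}
```

With the testcase's bounds, the indices stay inside coef_x[8] and the store runs in both iterations of the outer loop, so a bound of "at most 0 + 1 times in loop 1" cannot be derived from an out-of-bounds access.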
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #8 from Richard Biener 2012-12-14 13:49:02 UTC ---
(In reply to comment #7)
> I've tried to rewrite this as C, but managed to turn it into something that is
> miscompiled at a different spot. The fortran testcase starts having
> __builtin_unreachable () in it in the *.cunroll pass, this one already in
> *.cunrolli pass. Still, I believe it doesn't do any out of bounds access
> anywhere. -O2 on x86_64-linux.
>
> double s[4] = { 1.0, 2.0, 3.0, 4.0 }, pol_x[2] = { 5.0, 6.0 };
>
> __attribute__((noinline)) int
> foo (void)
> {
>   double coef_x[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
>   int lxp = 0;
>   if (lxp <= 1)
>     do
>       {
>         double t = pol_x[lxp];
>         long S;
>         long l = lxp * 4L - 1;
>         for (S = 1; S <= 4; S++)
>           coef_x[S + l] = coef_x[S + l] + s[S - 1] * t;
>       }
>     while (lxp++ != 1);
>   asm volatile ("" : : "r" (coef_x) : "memory");
>   for (lxp = 0; lxp < 8; lxp++)
>     if (coef_x[lxp] != ((lxp & 3) + 1) * (5.0 + (lxp >= 4)))
>       __builtin_abort ();
>   return 1;
> }
>
> int
> main ()
> {
>   asm volatile ("" : : : "memory");
>   if (!foo ())
>     __builtin_abort ();
>   return 0;
> }
>
> Works with r193067, fails with r193100, haven't tried to bisect exactly, but
> would guess this is r193098 again. For the outer loop it prints:
>
> Analyzing # of iterations of loop 1
>   exit condition [0, + , 1](no_overflow) != 1
>   bounds on difference of bases: 1 ... 1
>   result:
>     # of iterations 1, bounded by 1
> Loop 1 iterates 1 times.
> Loop 1 iterates at most 1 times.
>
> but that is wrong, the outer loop iterates exactly 2 times.

That's latch block executions, so one latch block execution is correct.
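
The difference between the two counts can be seen by unrolling the do/while of the testcase by hand. The following is a small sketch with made-up names (struct loop_counts, count_outer_loop); it only mirrors the control flow of the outer loop and is not how GCC computes the number of iterations.

```c
/* Hypothetical counters for the outer do/while loop of the testcase.  */
struct loop_counts {
    int header;   /* times the loop body is entered */
    int latch;    /* times the back edge (latch block) is taken */
};

/* Mirrors: do { ... } while (lxp++ != 1);  starting from lxp == 0.  */
struct loop_counts count_outer_loop(void)
{
    struct loop_counts c = { 0, 0 };
    int lxp = 0;
    for (;;) {
        c.header++;          /* body runs */
        if (lxp++ == 1)      /* exit test fails to continue: leave the loop */
            break;
        c.latch++;           /* back edge taken, body will run again */
    }
    return c;
}
```

The body is entered twice while the back edge is taken once, which matches both statements: the loop "iterates exactly 2 times" when counting body executions, and "# of iterations 1" when counting latch executions.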
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (number_of_iterations)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
            Summary|[4.8 Regression]            |[4.8 Regression]
                   |miscompilation at -O2       |miscompilation at -O2
                   |(tree-pre?)                 |(number_of_iterations)