[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-12-07 09:44:40 UTC ---
I've tried to rewrite this as C, but managed to turn it into something that is
miscompiled at a different spot. The fortran testcase starts having
__builtin_unreachable () in it in the *.cunroll pass, this one already in
*.cunrolli pass. Still, I believe it doesn't do any out of bounds access
anywhere. -O2 on x86_64-linux.

double s[4] = { 1.0, 2.0, 3.0, 4.0 }, pol_x[2] = { 5.0, 6.0 };

__attribute__((noinline)) int
foo (void)
{
  double coef_x[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
  int lxp = 0;
  if (lxp <= 1)
    do
      {
        double t = pol_x[lxp];
        long S;
        long l = lxp * 4L - 1;
        for (S = 1; S <= 4; S++)
          coef_x[S + l] = coef_x[S + l] + s[S - 1] * t;
      }
    while (lxp++ != 1);
  asm volatile ("" : : "r" (coef_x) : "memory");
  for (lxp = 0; lxp < 8; lxp++)
    if (coef_x[lxp] != ((lxp & 3) + 1) * (5.0 + (lxp >= 4)))
      __builtin_abort ();
  return 1;
}

int
main ()
{
  asm volatile ("" : : : "memory");
  if (!foo ())
    __builtin_abort ();
  return 0;
}

Works with r193067, fails with r193100, haven't tried to bisect exactly, but
would guess this is r193098 again.

For the outer loop it prints:
Analyzing # of iterations of loop 1
  exit condition [0, + , 1](no_overflow) != 1
  bounds on difference of bases: 1 ... 1
  result:
    # of iterations 1, bounded by 1
Loop 1 iterates 1 times.
Loop 1 iterates at most 1 times.

but that is wrong, the outer loop iterates exactly 2 times.

<bb 3>:
  # lxp_1 = PHI <0(2), lxp_21(15)>
...
<bb 6>:
  lxp_21 = lxp_1 + 1;
  if (lxp_1 != 1)
    goto <bb 15>;
  else
    goto <bb 7>;

<bb 15>:
  goto <bb 3>;

If it used lxp_21 in the condition, that would be correct, but it uses the
previous value of the IV.
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> 2012-12-02 09:59:35 UTC ---
Hmm, this seems to be caused by
Forced statement unreachable: pretmp_516 = coef_x[pretmp_515];
Forced statement unreachable: pretmp_513 = coef_x[pretmp_512];
Forced statement unreachable: pretmp_479 = coef_x[pretmp_478];

I am not exactly fortran guru. Can someone double check that there are no out
of bounds accesses into these?

Honza
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #5 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-12-02 10:11:34 UTC ---
(In reply to comment #4)
> Hmm, this seems to be caused by
> Forced statement unreachable: pretmp_516 = coef_x[pretmp_515];
> Forced statement unreachable: pretmp_513 = coef_x[pretmp_512];
> Forced statement unreachable: pretmp_479 = coef_x[pretmp_478];
>
> I am not exactly fortran guru. Can someone double check that there are no out
> of bounds accesses into these?
>
> Honza

I'm pretty sure there are no out-of-bounds. In particular coef_x is easy to
check, it is only used as coef_x(:,lxp) where lxp is the loop bound 0..lp
consistent with its def. Of course maybe the FE does something inconsistent ?

Also this runs fine:
fortran -O0 -fsanitize=address PR5.f90 ; ./a.out
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

--- Comment #6 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-02 11:03:53 UTC ---
> I'm pretty sure there are no out-of-bounds. In particular coef_x is easy to
> check, it is only used as coef_x(:,lxp) where lxp is the loop bound 0..lp
> consistent with its def. Of course maybe the FE does something inconsistent ?
>
> Also this runs fine:
> fortran -O0 -fsanitize=address PR5.f90 ; ./a.out

Hmm, I saw similar weird cases generated by the frontend.
coef_x is array of 8 elements
  real(kind=8) coef_x[8];
in loop analysis we do:
Induction variable (integer(kind=8)) 4 + 4 * iteration does not wrap in
statement pretmp_516 = coef_x[pretmp_515];
 in loop 4.
Statement pretmp_516 = coef_x[pretmp_515];
 is executed at most 0 (bounded by 0) + 1 times in loop 4.

This is true, pretmp_512 would be 8 at the second iteration of the loop.
later we conclude
Loop 4 iterates 1 times.
Loop 4 iterates at most 1 times.
 BB: 9, after_exit: 0
  size:   0 # DEBUG lxp => lxp_4
  size:   0 _137 = (integer(kind=8)) lxp_4;
  size:   1 _140 = _137 + pretmp_508;
  size:   1 _142 = *pol_x_141(D)[_140];
  size:   1 _143 = _137 * 4;
  size:   1 _144 = _143 + -1;
 BB: 10, after_exit: 0
  size:   1 _146 = S.25_279 + _144;
  size:   1 _150 = _142 * prephitmp_520;
  size:   1 _151 = _150 + prephitmp_517;
  size:   1 coef_x[_146] = _151;
  size:   1 S.25_153 = S.25_279 + 1;
  size:   1 ivtmp_162 = ivtmp_91 - 1;
  size:   2 if (ivtmp_162 == 0) goto bb12 or bb11
 BB: 11, after_exit: 0
  size:   1 pretmp_515 = _144 + S.25_153;
  size:   1 pretmp_516 = coef_x[pretmp_515];
  size:   1 pretmp_518 = S.25_153 + -1;
  size:   1 pretmp_519 = s[pretmp_518];
 BB: 12, after_exit: 0
  size:   0 # DEBUG lxp => lxp_4 + 1
  size:   1 ivtmp_109 = ivtmp_163 - 1;
  size:   2 if (ivtmp_109 == 0) goto bb 13 or exit
 BB: 13, after_exit: 1
  size:   1 lxp_154 = lxp_4 + 1;
  size:   0 pretmp_506 = (integer(kind=8)) lxp_154;
  size:   1 pretmp_509 = pretmp_506 * 4;
  size:   1 pretmp_510 = pretmp_509 + -1;
  size:   1 pretmp_512 = pretmp_510 + 1;
  size:   1 pretmp_513 = coef_x[pretmp_512];
Unrolled loop 4 completely (duplicated 1 times).
Exit condition of peeled iterations was eliminated.
Last iteration exit edge was proved true.

So the curious statements are in bb11. Adding unreachable calls makes CSE
eventually turn the condition in the second copy of BB 10 into an
unconditional jump to BB 12, which seems all right to me. Perhaps cascaded
unrolling confuses some of the exits...

Honza
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch
            Summary|[4.8 Regression]            |[4.8 Regression]
                   |miscompilation at -O2       |miscompilation at -O2
                   |                            |(tree-pre?)

--- Comment #1 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-12-01 15:53:17 UTC ---
Using -O2 -fno-tree-pre fixes the testcase.

Using -O1 -ftree-pre leads to an infinite loop at runtime.
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-12-01
     Ever Confirmed|0                           |1

--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-12-01 17:03:37 UTC ---
Revision 192891 (2012-10-28) is OK. Revision 193261 (2012-11-06) is not.
[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
   Target Milestone|---                         |4.8.0

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> 2012-12-01 19:17:36 UTC ---
It is caused by revision 193098:

http://gcc.gnu.org/ml/gcc-cvs/2012-11/msg00045.html