[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #29 from jakub at gcc dot gnu dot org 2008-10-20 23:14 ---
Closing as INVALID, as using -ffast-math is wrong for calculix, at least for the distilled testcase from it. In the testcase +0 vs. -0 makes a very big difference (atan2 acts very similarly to copysign), so compiling with an option that tells the compiler that +0 vs. -0 is insignificant may give very unexpected results. Most probably compiling calculix with -O -ffast-math -fsigned-zeros would work.

--
jakub at gcc dot gnu dot org changed:

What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution||INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #28 from jakub at gcc dot gnu dot org 2008-10-20 17:27 ---
-fno-math-errno was needed too to get it optimized out; with that it works.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #27 from dberlin at gcc dot gnu dot org 2008-10-20 16:22 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

Err, works for me with -O2 -ffast-math

Replaced D.1587_48 - D.1591_50 with prephitmp.17_60 in D.1600_23 = D.1587_48 - D.1591_50;
Replaced ABS_EXPR with prephitmp.17_62 in D.1601_24 = ABS_EXPR ;
Replaced __builtin_pow (cn_35, 2.0e+0) with 1.0e+0 in D.1583_36 = __builtin_pow (cn_35, 2.0e+0);
Replaced 1.0e+0 - D.1583_36 with 0.0 in D.1584_37 = 1.0e+0 - D.1583_36;
Replaced __builtin_sqrt (D.1584_37) with 0.0 in D.1585_38 = __builtin_sqrt (D.1584_37);
Replaced __builtin_atan2 (D.1585_38, cn_35) with prephitmp.17_1 in D.1586_39 = __builtin_atan2 (D.1585_38, cn_35);

and -O2 -funsafe-math-optimizations

Replaced D.1587_48 - D.1591_50 with prephitmp.17_60 in D.1600_23 = D.1587_48 - D.1591_50;
Replaced ABS_EXPR with prephitmp.17_62 in D.1601_24 = ABS_EXPR ;
Replaced __builtin_pow (cn_35, 2.0e+0) with 1.0e+0 in D.1583_36 = __builtin_pow (cn_35, 2.0e+0);
Replaced 1.0e+0 - D.1583_36 with 0.0 in D.1584_37 = 1.0e+0 - D.1583_36;
Replaced __builtin_sqrt (D.1584_37) with 0.0 in D.1585_38 = __builtin_sqrt (D.1584_37);
Replaced __builtin_atan2 (D.1585_38, cn_35) with prephitmp.17_1 in D.1586_39 = __builtin_atan2 (D.1585_38, cn_35);

Are you sure you updated?

On Mon, Oct 20, 2008 at 11:10 AM, jakub at gcc dot gnu dot org <[EMAIL PROTECTED]> wrote:
> --- Comment #26 from jakub at gcc dot gnu dot org 2008-10-20 15:10 ---
> On the #c11 testcase with -O2 -funsafe-math-optimizations I still see
> # cn_38 = PHI <-1.0e+0(4), 1.0e+0(11)>
> D.1262_39 = __builtin_pow (cn_38, 2.0e+0);
> D.1263_41 = 1.0e+0 - D.1262_39;
> D.1264_42 = __builtin_sqrt (D.1263_41);
> before and after PRE, so the recent pre change doesn't handle it.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #26 from jakub at gcc dot gnu dot org 2008-10-20 15:10 ---
On the #c11 testcase with -O2 -funsafe-math-optimizations I still see

  # cn_38 = PHI <-1.0e+0(4), 1.0e+0(11)>
  D.1262_39 = __builtin_pow (cn_38, 2.0e+0);
  D.1263_41 = 1.0e+0 - D.1262_39;
  D.1264_42 = __builtin_sqrt (D.1263_41);

before and after PRE, so the recent pre change doesn't handle it.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #25 from dberlin at gcc dot gnu dot org 2008-10-16 23:30 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

I fixed the PRE issue with builtin_pow here. :)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #24 from dberlin at gcc dot gnu dot org 2008-10-15 18:50 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

On Wed, Oct 15, 2008 at 2:43 PM, rguenther at suse dot de <[EMAIL PROTECTED]> wrote:
> --- Comment #23 from rguenther at suse dot de 2008-10-15 18:43 ---
>
> [ ... ]
>
> Ok, for return __builtin_pow (i, i) it doesn't work because we
> do not register phi-translations that result in constants (translating
> i does) and then we run into the if (seen) guard and fail to
> phi-translate.
>
> Either we should register phi-translations for them

We should be.

Remember we used to always create names for constants, and then we removed that because the constants were valid arguments for GIMPLE expressions anyway.

Now that we don't always produce NAME, we should be allowing registration of translations that result in CONSTANT. Otherwise we will also miss partial redundancies where one phi argument evaluates to a constant, because when it comes time to look it up, it will come up with no translation, and we will assume it's not partially redundant.

--Dan

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #23 from rguenther at suse dot de 2008-10-15 18:43 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

On Wed, 15 Oct 2008, rguenther at suse dot de wrote:
> --- Comment #22 from rguenther at suse dot de 2008-10-15 18:33 ---
>
> [ ... ]
>
> Well, we already do for
>
> int foo (int b)
> {
>   double i;
>   if (b)
>     i = 4;
>   else
>     i = 9;
>   return __builtin_sqrt(i);
> }
>
> :
> # i_1 = PHI <4.0e+0(5), 9.0e+0(3)>
> # prephitmp.11_7 = PHI <2.0e+0(5), 3.0e+0(3)>
> # prephitmp.12_8 = PHI <2(5), 3(3)>
> D.1238_5 = prephitmp.11_7;
> D.1237_6 = prephitmp.12_8;
> return D.1237_6;

Ok, for return __builtin_pow (i, i) it doesn't work because we do not register phi-translations that result in constants (translating i does) and then we run into the if (seen) guard and fail to phi-translate.

Either we should register phi-translations for them or not do that seen test for expr->kind == NAME (it shouldn't recurse for that, no?).

Richard.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #22 from rguenther at suse dot de 2008-10-15 18:33 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

On Wed, 15 Oct 2008, dberlin at dberlin dot org wrote:
> --- Comment #21 from dberlin at gcc dot gnu dot org 2008-10-15 17:55 ---
> Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
>
> > It already does (I fixed that recently), but we only phi-translate during
> > insertion and we
> > don't insert for that case, as obviously there is no partial redundancy.
>
> True, but if it discovered all the new phi arguments would be constant
> it used to create a new phi node with the new constant values and let
> eliminate replace the old calculation with the new phi node.
>
> Maybe it only did this if all the constants ended up the same value,
> but it would be trivial to do it if all the arguments are constant,
> regardless of whether they are the same value.
> :)

Well, we already do for

int foo (int b)
{
  double i;
  if (b)
    i = 4;
  else
    i = 9;
  return __builtin_sqrt(i);
}

:
  # i_1 = PHI <4.0e+0(5), 9.0e+0(3)>
  # prephitmp.11_7 = PHI <2.0e+0(5), 3.0e+0(3)>
  # prephitmp.12_8 = PHI <2(5), 3(3)>
  D.1238_5 = prephitmp.11_7;
  D.1237_6 = prephitmp.12_8;
  return D.1237_6;

at least. Somebody needs to look why it doesn't happen for the testcase posted.

Richard.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #21 from dberlin at gcc dot gnu dot org 2008-10-15 17:55 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

> It already does (I fixed that recently), but we only phi-translate during
> insertion and we
> don't insert for that case, as obviously there is no partial redundancy.

True, but if it discovered all the new phi arguments would be constant it used to create a new phi node with the new constant values and let eliminate replace the old calculation with the new phi node.

Maybe it only did this if all the constants ended up the same value, but it would be trivial to do it if all the arguments are constant, regardless of whether they are the same value.
:)

--Dan

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #20 from law at redhat dot com 2008-10-15 17:36 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

jakub at gcc dot gnu dot org wrote:
> --- Comment #14 from jakub at gcc dot gnu dot org 2008-10-15 09:08 ---
> The problem is that thread_across_edge figures out that the fabs (al[0] -
> al[1])
> < 1.e-5 test is unnecessary, always yields false for +-1.0, by substituting
> the
> values in record_temporary_equivalences_from_stmts_at_dest, but doesn't
> actually optimize all the computations to constants.

It's not safe to actually optimize the computations because the equivalences we use may be specific to a path through the CFG.

One could easily argue that when these situations arise we've actually identified a missed optimization in PRE. Or one could argue that the block in question ought to be a candidate for duplication and tacking onto the end of its predecessor blocks (super-block formation) which would expose the partial redundancy at the cost of duplicating statements.

I've generally not been a fan of super-block formation as, IMHO, PRE catches the vast majority of things super-block formation would and PRE doesn't have the code expansion problems that the super-block approach does.

[ ... ]

> is quite clear and this testcase definitely relies on +0 vs. -0 difference
> heavily.
>
> So I'd say this should be closed as INVALID.
>

Or kept open as an enhancement request for PRE.

jeff

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #19 from rguenther at suse dot de 2008-10-15 13:18 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

On Wed, 15 Oct 2008, dberlin at dberlin dot org wrote:
> --- Comment #18 from dberlin at gcc dot gnu dot org 2008-10-15 13:06 ---
> Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
>
> Making PRE do this is somewhat trivial.
> Just extend fully_constant_expression to fold builtins, like it used
> to, and it should just DTRT.

It already does (I fixed that recently), but we only phi-translate during insertion and we don't insert for that case, as obviously there is no partial redundancy.

Richard.

> On Wed, Oct 15, 2008 at 5:38 AM, jakub at gcc dot gnu dot org
> <[EMAIL PROTECTED]> wrote:
> > --- Comment #16 from jakub at gcc dot gnu dot org 2008-10-15 09:38 ---
> > After discussion with richi, I'd like to turn this into an enhancement
> > request
> > for 4.5 to extend PRE/SCCVN to be able to optimize that:
> > # cn_43 = PHI <-1.0e+0(4), 1.0e+0(3)>
> > D.1267_44 = __builtin_pow (cn_43, 2.0e+0);
> > into a constant.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #18 from dberlin at gcc dot gnu dot org 2008-10-15 13:06 ---
Subject: Re: [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math

Making PRE do this is somewhat trivial.
Just extend fully_constant_expression to fold builtins, like it used to, and it should just DTRT.

On Wed, Oct 15, 2008 at 5:38 AM, jakub at gcc dot gnu dot org <[EMAIL PROTECTED]> wrote:
> --- Comment #16 from jakub at gcc dot gnu dot org 2008-10-15 09:38 ---
> After discussion with richi, I'd like to turn this into an enhancement request
> for 4.5 to extend PRE/SCCVN to be able to optimize that:
> # cn_43 = PHI <-1.0e+0(4), 1.0e+0(3)>
> D.1267_44 = __builtin_pow (cn_43, 2.0e+0);
> into a constant.
>
> --
>
> jakub at gcc dot gnu dot org changed:
>
> What|Removed |Added
> CC||dberlin at gcc dot gnu dot
> ||org, rguenth at gcc dot gnu
> ||dot org
> Target Milestone|4.4.0 |4.5.0
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #17 from jakub at gcc dot gnu dot org 2008-10-15 09:47 ---
Testcase where only one constant, not two alternative constants, is on the entry of the threaded bb.

extern int printf (const char *, ...);

void __attribute__((noinline))
test (double cn, int *neig)
{
  double tt, al[3];

  *neig = 3;
  if (__builtin_fabs (cn) > 1.)
    cn = -1.;
  tt = __builtin_atan2 (__builtin_sqrt (1. - __builtin_pow (cn, 2.)), cn) * 3.333e-1;
  al[0] = __builtin_cos (tt);
  al[1] = __builtin_cos (2.0943951023931944 + tt);
  al[2] = __builtin_cos (4.1887902047863879 + tt);
  if ((__builtin_fabs (al[0] - al[1]) < 1.e-5)
      || (__builtin_fabs (al[0] - al[2]) < 1.e-5)
      || (__builtin_fabs (al[1] - al[2]) < 1.e-5))
    *neig = 2;
}

int
main ()
{
  int neig;

  test (-1.0, &neig);
  printf ("neig = %d\n", neig);
  if (neig != 2)
    __builtin_abort ();
  test (1.0, &neig);
  printf ("neig = %d\n", neig);
  if (neig != 2)
    __builtin_abort ();
  test (-2.0, &neig);
  printf ("neig = %d\n", neig);
  if (neig != 2)
    __builtin_abort ();
  return 0;
}

At -O2 -funsafe-math-optimizations, before ccp3 we have:

  D.1262_38 = __builtin_pow (-1.0e+0, 2.0e+0);
  D.1263_40 = 1.0e+0 - D.1262_38;
  D.1264_41 = __builtin_sqrt (D.1263_40);
  D.1265_43 = __builtin_atan2 (D.1264_41, -1.0e+0);
  tt_45 = D.1265_43 * 3.33314829616256247390992939472198486328125e-1;
  D.1266_46 = __builtin_cos (tt_45);
  D.1267_47 = tt_45 + 2.094395102393194374457152662216685712337493896484375e+0;
  D.1268_48 = __builtin_cos (D.1267_47);
  D.1269_49 = tt_45 + 4.1887902047863878607358856243081390857696533203125e+0;
  D.1270_50 = __builtin_cos (D.1269_49);

and neither ccp3 nor pre is able to optimize it out; only fab optimizes the first call (pow) and then dom2 handles the rest.

--
jakub at gcc dot gnu dot org changed:

What|Removed |Added
Target Milestone|4.5.0 |4.4.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #16 from jakub at gcc dot gnu dot org 2008-10-15 09:38 ---
After discussion with richi, I'd like to turn this into an enhancement request for 4.5 to extend PRE/SCCVN to be able to optimize that:

  # cn_43 = PHI <-1.0e+0(4), 1.0e+0(3)>
  D.1267_44 = __builtin_pow (cn_43, 2.0e+0);

into a constant.

--
jakub at gcc dot gnu dot org changed:

What|Removed |Added
CC||dberlin at gcc dot gnu dot
||org, rguenth at gcc dot gnu
||dot org
Target Milestone|4.4.0 |4.5.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #15 from jakub at gcc dot gnu dot org 2008-10-15 09:17 --- FYI, if you compile with -O1 -ffast-math -fsigned-zeros, then it works correctly. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #14 from jakub at gcc dot gnu dot org 2008-10-15 09:08 ---
The problem is that thread_across_edge figures out that the fabs (al[0] - al[1]) < 1.e-5 test is unnecessary, always yields false for +-1.0, by substituting the values in record_temporary_equivalences_from_stmts_at_dest, but doesn't actually optimize all the computations to constants. If the threading duplicated block is used just for one constant, not two, then fab optimizes the __builtin_pow call into a constant and dom2 optimizes the rest into a constant. But as we have two different constants leading to the same block, nothing in GCC optimizes it out, and given -fno-signed-zeros and the testcase very much depending on the sign of zeros, the outcome is different from what the compiler expected.

While GCC perhaps could optimize:

  # cn_43 = PHI <-1.0e+0(3), 1.0e+0(9)>
  D.1267_44 = __builtin_pow (cn_43, 2.0e+0);
  D.1268_46 = 1.0e+0 - D.1267_44;
  D.1269_47 = __builtin_sqrt (D.1268_46);

into a constant with some smarter fab or dom hack for multiple constants, it can't already optimize the following:

  D.1270_49 = __builtin_atan2 (D.1269_47, cn_43);

or

  D.1270_49 = __builtin_atan2 (0.0, cn_43);

because that yields different values for -1 and 1. While even the partial optimization would cure this testcase and is perhaps a useful enhancement, I believe it is just wrong to compile this part of calculix with -ffast-math and you get what you deserve.

`-fno-signed-zeros'
     Allow optimizations for floating point arithmetic that ignore the
     signedness of zero. IEEE arithmetic specifies the behavior of
     distinct +0.0 and -0.0 values, which then prohibits simplification
     of expressions such as x+0.0 or 0.0*x (even with
     `-ffinite-math-only'). This option implies that the sign of a
     zero result isn't significant.

is quite clear and this testcase definitely relies on +0 vs. -0 difference heavily.

So I'd say this should be closed as INVALID.

--
jakub at gcc dot gnu dot org changed:

What|Removed |Added
CC||law at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #13 from jakub at gcc dot gnu dot org 2008-10-14 17:35 ---
The different results between -O1 -f{,no}unsafe-math-optimizations are because this testcase relies heavily on signed zeros, and with fast math 0 and -0 aren't considered to make a difference, so you get at least a different value of tt depending on it (for cn -1.0 it can be either pi/3 or -pi/3, depending on whether sqrt returned -0 or 0).

But no matter whether tt is pi/3 or -pi/3, two of the results should be 0.5 and so the test should succeed. But dom1, when it decides to duplicate the bb's (once for the passed in cn, once for +-1.0), unexpectedly omits the first conditional (i.e. set *neig = 2 if fabs (al[0] - al[1]) < 1.e-5). It does that both with -funsafe-math-optimizations and with -fno-unsafe-math-optimizations, but for -funsafe-math-optimizations tt is -pi/3, al[0] is 0.5 and al[1] is 0.5, so this is fatal, while for non-fast math al[0] is 0.5 and al[2] is 0.5, so it doesn't care.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #12 from pinskia at gcc dot gnu dot org 2008-09-24 00:21 --- The one thing which we should do after sra_early is another pass_rename_ssa_copies so we get more correct variable/debug names. (but that is not the issue). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #11 from janis at gcc dot gnu dot org 2008-09-24 00:16 ---
Created an attachment (id=16398)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16398&action=view)
yet another C testcase

I still don't understand what's going on, but have a new testcase that demonstrates a few things. By looking at tree dumps and the generated code I see that sometimes one of the comparisons of al[0], al[1], and al[2] is skipped, which I don't understand, but apparently which one is skipped is affected by -ffast-math, or is perhaps a Heisenbug that just looks as if it's affected by -ffast-math.

Some interesting output from the test:

elm3b149% $GCC -v
Using built-in specs.
Target: powerpc64-linux
Configured with: /home/janis/gcc_trunk_anonsvn/gcc/configure --prefix=/home/janis/tools/gcc-trunk-anonsvn --build=powerpc64-linux --host=powerpc64-linux --target=powerpc64-linux --with-cpu=default32 --with-as=/home/janis/tools/binutils-20080903/bin/as --with-ld=/home/janis/tools/binutils-20080903/bin/ld --enable-threads=posix --enable-shared --enable-__cxa_atexit --enable-languages=c,c++,fortran --with-gmp=/home/janis/tools/gmp-mpfr-32 --with-mpfr=/home/janis/tools/gmp-mpfr-32 --with-long-double-128 --enable-secureplt --disable-libstdcxx-pch
Thread model: posix
gcc version 4.4.0 20080923 (experimental) [trunk revision 140601] (GCC)
elm3b149% $GCC -O1 -ffast-math 37449-3.c -lm && a.out
cn = -2
neig = 3
Aborted
elm3b149% $GCC -O1 37449-3.c -lm && a.out
cn = -2
neig = 2
elm3b149% $GCC -DDBG -O1 -ffast-math 37449-3.c -lm && a.out
cn = -2
tt= -1.0472
al[0] = 0.5
al[1] = 0.5
al[2] = -1
neig = 2
elm3b149% $GCC -DDBG -O1 37449-3.c -lm && a.out
cn = -2
tt= 1.0472
al[0] = 0.5
al[1] = -1
al[2] = 0.5
neig = 2
elm3b149% $GCC -DCN=-1. -O1 -ffast-math 37449-3.c -lm && a.out
cn = -1
neig = 2

This won't make sense without looking at the testcase, but the test should get the same result whenever cn = -1., but it gets different results for -1. and -2., going through different paths through the generated code. Notice also that the values of the array al[] are switched around depending on the options used.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #10 from janis at gcc dot gnu dot org 2008-09-22 23:08 --- The new testcase passes with "-O1 -funsafe-math-optimizations -fno-tree-dominator-opts". The dom1 dump for "-O1 -funsafe-math-optimizations" twice reports "Invalid sum of incoming frequencies". -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #9 from pinskia at gcc dot gnu dot org 2008-09-22 22:45 --- The one thing I noticed is that fsel is used in the -ffast-math case and it does a subtraction. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #8 from janis at gcc dot gnu dot org 2008-09-22 22:12 ---
Created an attachment (id=16382)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16382&action=view)
small C testcase that fails with current trunk

This version of the small C testcase fails with current mainline with "-O1 -ffast-math" and its behavior changed between r134831 and r134933.

--
janis at gcc dot gnu dot org changed:

What|Removed |Added
Attachment #16365|0 |1
is obsolete||

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #7 from janis at gcc dot gnu dot org 2008-09-20 00:26 --- Sigh. My nifty small C testcase doesn't fail with current mainline. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #6 from janis at gcc dot gnu dot org 2008-09-19 22:19 ---
Created an attachment (id=16365)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16365&action=view)
minimized C testcase

I don't yet understand what's going on but was able to come up with a relatively small executable test case in C. It fails with "-O1 -funsafe-math-optimizations" for r134833 but passes with those options for r134831. With r134833 it passes without -funsafe-math-optimizations. This testcase doesn't care about -ftree-fre.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #5 from janis at gcc dot gnu dot org 2008-09-10 17:52 ---
I made some mistakes in my previous comments; -ftree-fre is part of -O1, and what I meant to say is that calculix gets wrong results for "-O1 -ffast-math" but correct results for "-O1 -fno-tree-fre -ffast-math".

The reghunt result for the -O1 failure made no sense, but it turned out that the test had failed for another reason and the failure actually starts with this patch:

    http://gcc.gnu.org/viewcvs?view=rev&rev=134832

    r134833 | espindola | 2008-04-30 17:21:55

-funsafe-math-optimizations causes lots of operations to be turned into builtins, so I'll play with the testcase more with that in mind.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #4 from bergner at gcc dot gnu dot org 2008-09-10 15:14 --- Sorry, ignore my Comment #3. It should have been posted to a different bugzilla entry. :( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #3 from bergner at gcc dot gnu dot org 2008-09-10 01:33 ---
With a mainline from today, it fails for me at -O2. Looking into it, it's foo() that is miscompiled (I broke the 3 functions into their own files and recompiled them). It's also the last element of results (i.e., results[19]) that miscompares (141 versus expected value of 190).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449
[Bug tree-optimization/37449] [4.4 Regression] calculix gets wrong answer for -O1 -ffast-math
--- Comment #2 from janis at gcc dot gnu dot org 2008-09-09 23:40 ---
There's a Heisenbug involved here. A reghunt for failures with "-O1 -ftree-fre -funsafe-math-optimizations" came up with a nonsensical result, and I can sometimes get it to fail with only "-O1 -funsafe-math-optimizations".

--
janis at gcc dot gnu dot org changed:

What|Removed |Added
Summary|[4.4 Regression] calculix |[4.4 Regression] calculix
|gets wrong answer for -O1 - |gets wrong answer for -O1 -
|ftree-pre -ffast-math |ffast-math

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37449