------- Comment #11 from meissner at linux dot vnet dot ibm dot com  2010-01-05 18:40 -------
Subject: Re: October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math

On Tue, Jan 05, 2010 at 04:03:17PM -0000, rguenther at suse dot de wrote:
>
>
> ------- Comment #9 from rguenther at suse dot de  2010-01-05 16:03 -------
> Subject: Re: October 23rd change to tree-ssa-pre.c
>  breaks calculix on powerpc with -ffast-math
>
> On Tue, 5 Jan 2010, segher at kernel dot crashing dot org wrote:
>
> > ------- Comment #7 from segher at kernel dot crashing dot org  2010-01-05 15:57 -------
> > With -fno-signed-zeroes, a-b*c is transformed to -(b*c-a), which is a machine
> > instruction.  If the result would have been +0 before, it now is -0.
> > The code then takes the sqrt() of that; sqrt(-0) is -0.  This then is
> > passed to atan2(), which has a discontinuity at 0, and we get wildly
> > diverging results.
> >
> > The compiler does nothing wrong here; the transformation is perfectly
> > valid.
> >
> > A solution might be to transform e.g. atan2(x,y) into atan2(+0.+x,+0.+y)
> > when -fno-signed-zeroes is in effect (and of course somehow make sure
> > those additions aren't optimised away).  Similar for other math library
> > functions with discontinuities at +/- 0.
>
> Right.  Just it might be simpler with -fno-signed-zeros to
> transform a-b*c to 0 + -(b*c-a).  Of course if the result was -0
> before then it will be +0 after either variant (and the atan2
> discontinuity would still happen even with your fix).
>
> Thus whatever "fix" the underlying problem is surely that calculix
> is not really -fno-signed-zeros safe.  Can't we get lucky again
> as before by trying to recover the PRE code change?

I've come to the conclusion that the compiler is doing the right thing with
respect to the FMA, and that we should not remove the current optimizations.
I suspect the AMD/Intel folks will see this as well when the AVX/FMA4 hardware
shows up, since they also have FMA instructions that generate -0.0 in this
case.  I don't recall whether, back when I was doing the old SSE5 support, we
ran the tests through the simulator with -ffast-math.

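As a rough illustration of the effect described in comment #7, here is a minimal C sketch. It is not code from calculix or from GCC, and the function names are made up for the example. It assumes the default round-to-nearest mode and a build without -ffast-math, so the compiler keeps each expression in the form written.

```c
#include <math.h>

/* a - b*c evaluated as written: 1 - 1*1 gives +0.0.  */
double
direct_form (double a, double b, double c)
{
  return a - b * c;
}

/* The rewritten form -(b*c - a): the inner difference is +0.0,
   and negating it yields -0.0.  */
double
negated_form (double a, double b, double c)
{
  return -(b * c - a);
}

/* Adding +0.0 turns -0.0 into +0.0 and leaves other values unchanged.
   The volatile load is only there so that the addition is not folded
   away if this file is built with -fno-signed-zeros.  */
double
add_plus_zero (double v)
{
  volatile double zero = 0.0;
  return v + zero;
}
```

With a = b = c = 1.0, signbit (direct_form (a, b, c)) is zero while signbit (negated_form (a, b, c)) is nonzero.  That sign is what a function with a branch cut at zero picks up: atan2 (+0.0, -1.0) is pi, while atan2 (-0.0, -1.0) is -pi.  Passing the value through add_plus_zero first gives +0.0 in both cases.
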
However, I dunno whether there should be a -fno-signed-zeros version of atan2
that gives the same result for atan2 (-0.0, 1.0) as for atan2 (0.0, 1.0).  I
know in the past we've floated ideas about having a fast math library that
doesn't worry about NaNs/negative 0/etc.  Or whether Fortran needs such a
wrapper, since I don't believe signed zeroes are a Fortran concept.

For my SPEC runs, I now use the GNU ld --wrap option for atan2, and 'fix' the
negative zeroes by adding 0.0:

    /* Link with -Wl,--wrap=atan2 so that calls to atan2 resolve to
       __wrap_atan2 and __real_atan2 resolves to the library atan2.
       Build this file without -ffast-math, or the additions below
       may be folded away.  */
    static double zero = 0.0;

    extern double __real_atan2 (double, double);

    double
    __wrap_atan2 (double x, double y)
    {
      /* -0.0 + 0.0 is +0.0, so atan2 never sees a negative zero.  */
      return __real_atan2 (x + zero, y + zero);
    }

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42286