[Bug rtl-optimization/323] optimized code gives strange floating point results

pepalogik at seznam dot cz Tue, 20 May 2008 10:15:24 -0700


------- Comment #109 from pepalogik at seznam dot cz  2008-05-20 16:59 -------
I also ran into these problems and was about to report them as a bug in GCC...
But the GCC bug (not) reporting guide fortunately links to this page, and
comment #96 here links to David Monniaux's paper about floating-point
computations. The paper explains the issue in detail but is perhaps too long.
I have read most of it and hope I have understood it properly. So I'll give a
brief explanation (for those who don't know it yet) of the reasons for this
strange behaviour. Then I'll assess where the bug actually is (in GCC or in the
CPU), present the solution (!) and finally give a few recommendations to the
GCC team.


EXPLANATION
The x87 FPU was originally designed in (or before) 1980. I think that's why it
is quite simple: it has only one unit for all FP data types. Of course, that
unit must work at the precision of the widest type, which is the 80-bit long
double.
Suppose you have a program in which all the FP variables are of type double.
You perform some FP operations and one of them is e.g. 1e-300/1e300, which
yields 1e-600. Although this value cannot be held by a "double" (the smallest
positive double is about 4.9e-324), it is stored in an 80-bit FPU register as
the result. Suppose the variable "x" holds that result. If the program has been
compiled with optimization, the value need not be stored in RAM. So, say, it is
still in the register.
Suppose you need x to be nonzero, so you perform the test x != 0. Since 1e-600
is not zero, the test yields true. While you perform some other computations,
the value is moved to RAM and converted to 0 because x is of type "double". Now
you want to use your certainly nonzero x... Hard luck :-(
Note that if the result has no corresponding variable and you perform the test
directly on an expression, the problem can come to light even without
optimization.
It might seem that performing all FP operations in extended precision can only
bring benefits. But it introduces a serious pitfall: moving a value may change
the value!!!
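The scenario above can be sketched in C roughly as follows. This is only an
illustration, not code from the bug report: the names changes_when_stored and
stored_quotient are made up here, and whether the first test really sees the
80-bit value depends on the compiler, the options and the register allocation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the bug report.
 * Returns 1 when a/b tests as nonzero while it may still sit in a register,
 * but is zero once it has been written to a 64-bit double in memory. */
int changes_when_stored(double a, double b)
{
    double x = a / b;                  /* may stay in an 80-bit x87 register */
    int nonzero_before = (x != 0.0);

    volatile double stored = x;        /* volatile forces a store to memory */
    int nonzero_after = (stored != 0.0);

    return nonzero_before && !nonzero_after;
}

/* The quotient after a forced trip through memory: always a plain double. */
double stored_quotient(double a, double b)
{
    volatile double stored = a / b;
    return stored;
}

#ifdef DEMO_MAIN  /* build with -DDEMO_MAIN to get a standalone program */
int main(void)
{
    /* volatile operands keep the compiler from folding the division */
    volatile double tiny = 1e-300, huge = 1e300;

    assert(stored_quotient(tiny, huge) == 0.0);  /* 1e-600 does not fit a double */
    printf("value changed when stored: %s\n",
           changes_when_stored(tiny, huge) ? "yes" : "no");
    return 0;
}
#endif
```

With something like "gcc -m32 -O2 -mfpmath=387 -DDEMO_MAIN" the program would
be expected to answer "yes", and "no" when the arithmetic is done in SSE2, but
this is not guaranteed for every GCC version.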

WHERE'S THE BUG
This is really not a GCC bug. The bug is actually in the x87 FPU because it
doesn't obey the IEEE standard.

SOLUTION
The x87 FPU is still present in contemporary processors (including AMD's) for
compatibility. I think most PC software still uses it. But newer processors
also have another FP unit, called SSE, and this one does obey IEEE. GCC in
32-bit mode compiles for x87 by default but it is able to compile for SSE, too.
So the solution is to add these options to the compilation command:
-march=* -msse -mfpmath=sse
Yes, this definitely resolves the problem - but not for all processors. The *
can be one of the following: pentium3, pentium3m, pentium-m, pentium4,
pentium4m, prescott, nocona, athlon-4, athlon-xp, athlon-mp, k8, opteron,
athlon64, athlon-fx and c3-2 (I'm unsure about athlon and athlon-tbird).
Besides -msse, you can also add some of -mmmx, -msse2, -msse3 and -m3dnow, if
the CPU supports them (see the GCC or CPU documentation). Note that, according
to the GCC documentation of -mfpmath, the original SSE handles only single
precision, so arithmetic on double leaves the x87 only when -msse2 is enabled
as well. Of the processors listed above, pentium3, pentium3m, athlon-4,
athlon-xp, athlon-mp and c3-2 have SSE but not SSE2.
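To check which kind of arithmetic a given set of options actually produces,
one possibility is the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch
below only prints what the compiler reports; the helper eval_method_name is a
made-up name, and the values expected for particular GCC options (2 for plain
x87, 0 for SSE2 math) are an assumption to verify on your own system.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: turns a C99 FLT_EVAL_METHOD value into a description. */
const char *eval_method_name(int method)
{
    switch (method) {
    case 0:  return "each operation is evaluated in its own type";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "everything is evaluated as long double";
    default: return "indeterminable or implementation-defined";
    }
}

#ifdef DEMO_MAIN  /* build with -DDEMO_MAIN to get a standalone program */
int main(void)
{
    assert(eval_method_name(FLT_EVAL_METHOD) != NULL);
    printf("FLT_EVAL_METHOD = %d: %s\n",
           (int)FLT_EVAL_METHOD, eval_method_name(FLT_EVAL_METHOD));
    /* Number of mantissa bits: 53 for an IEEE double, 64 for the x87 format. */
    printf("DBL_MANT_DIG = %d, LDBL_MANT_DIG = %d\n",
           DBL_MANT_DIG, LDBL_MANT_DIG);
    return 0;
}
#endif
```

If the program reports that everything is evaluated as long double, the
pitfall described in the EXPLANATION section applies to that build.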
If you wish to compile for processors which don't have SSE, you have a few
possibilities:
(1) A very simple solution: Use long double everywhere. (But be careful when
transferring binary data in long double format between computers because this
format is not standardized and so the concrete bit representations vary between
different CPU architectures.)
(2) A partial but simple solution: Do comparisons on volatile variables only.
(3) A similar solution: Try to implement a "discard_extended_precision"
function suggested by Egon in comment #88.
(4) A complex solution: Before doing any mathematical operation or comparison,
put the operands into variables and also put the result into a variable (i.e.
don't use complex expressions). For example, instead of { c = 2*(a+b); } ,
write { double s = a+b; c = 2*s; } . I'm unsure about arrays but I think they
should be OK. When you have modified your code in this manner, then compile it
either without optimization or, when using optimization, with -ffloat-store. In
order to avoid double rounding (i.e. rounding twice), it is also good to
decrease the FPU precision by changing its control word at the beginning of
your program (see comment #60). Then you should also apply -frounding-math.
(5) A radical solution: Find a job/hobby where computers are not used at all.
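Possibilities (2) to (4) can be sketched as below. Comment #88 is not
reproduced here, so this discard_extended_precision is only a guess at the
idea (a round trip through a volatile double) and the real suggestion may
differ; twice_sum and is_nonzero are made-up names for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed implementation, not necessarily the one from comment #88:
 * the volatile store rounds the value to a 64-bit double in memory,
 * and the reload returns exactly that rounded value. */
double discard_extended_precision(double v)
{
    volatile double tmp = v;
    return tmp;
}

/* Possibilities (2)/(3): compare only a value that has been through memory. */
int is_nonzero(double v)
{
    return discard_extended_precision(v) != 0.0;
}

/* Possibility (4): c = 2*(a+b) with every intermediate result in a variable.
 * This is meant to be combined with -ffloat-store when optimizing. */
double twice_sum(double a, double b)
{
    double s = a + b;
    double c = 2 * s;
    return c;
}

#ifdef DEMO_MAIN  /* build with -DDEMO_MAIN to get a standalone program */
int main(void)
{
    /* volatile operands keep the compiler from folding the division */
    volatile double tiny = 1e-300, huge = 1e300;

    assert(!is_nonzero(tiny / huge));   /* 1e-600 becomes 0 as a double */
    printf("twice_sum(1.5, 2.5) = %g\n", twice_sum(1.5, 2.5));
    return 0;
}
#endif
```

The volatile round trip costs a store and a load on every call, so it is
probably best kept to the comparisons that really matter.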

RECOMMENDATIONS
I think this problem is really serious and general. Therefore, programmers
should be warned early enough.
This recommendation is addressed especially to authors of programming
textbooks. But I think there could also be a paragraph about it in the GCC
documentation (I haven't read all of it but there doesn't seem to be any
warning about x87). And, of course, there should be a warning in the bug
reporting guide (http://gcc.gnu.org/bugs.html). It's good that there's a link
to this page (Bug 323) but the example with (int)(a/b) is insufficient. It only
demonstrates that real numbers are often not represented exactly in the
computer. It doesn't demonstrate the x87 pitfall. Hence there should be an
example such as the initial code of this "GCC bug 323 report". Because when
people see the example with (int)(a/b), they may say "It's trivial" and not
click the link (as I did the first time).

EPILOGUE
I hope my effort of writing this "comment #109" will be helpful for many
people.
If you want more info, read David Monniaux's work or something else about
FPUs.
Thanks to David Monniaux.


-- 

pepalogik at seznam dot cz changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pepalogik at seznam dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
