------- Comment #109 from pepalogik at seznam dot cz 2008-05-20 16:59 ------- I also encountered such problems and was going to report it as a bug in GCC... But in the GCC bug (not) reporting guide, there is fortunately a link to this page and here (comment #96) is a link to David Monniaux's paper about floating-point computations. This explains it closely but it is maybe too long. I have almost read it and hope I have understood it properly. So I'll give a brief explanation (for those who don't know it yet) of the reasons of such a strange behaviour. Then I'll assess where the bug actually is (in GCC or CPU). Then I'll write the solution (!) and finally a few recommendations to the GCC team.
EXPLANATION The x87 FPU was originally designed in (or before) 1980. I think that's why it is quite simple: it has only one unit for all FP data types. Of course, the precision must be of the widest type, which is the 80-bit long double. Consider you have a program, where all the FP variables are of the type double. You perform some FP operations and one of them is e.g. 1e-300/1e300, which results in 1e-600. Despite this value cannot be held by a "double", it is stored in an 80-bit FPU register as the result. Consider you use the variable "x" to hold that result. If the program has been compiled with optimization, the value need not be stored in RAM. So, say, it is still in the register. Consider you need x to be nonzero, so you perform the test x != 0. Since 1e-600 is not zero, the test yields true. While you perform some other computations, the value is moved to RAM and converted to 0 because x is of type "double". Now you want to use your certainly nonzero x... Hard luck :-( Note that if the result doesn't have its corresponding variable and you perform the test directly on an expression, the problem can come to light even without optimization. It could seem that performing all FP operations in extended precision can bring benefits only. But it introduces a serious pitfall: moving a value may change the value!!! WHERE'S THE BUG This is really not a GCC bug. The bug is actually in the x87 FPU because it doesn't obey the IEEE standard. SOLUTION The x87 FPU is still present in contemporary processors (including AMD) due to compatibility. I think most of PC software still uses it. But new processors have also another FPU, called SSE, and this do obey the IEEE. GCC in 32-bit mode compiles for x87 by default but it is able to compile for the SSE, too. So the solution is to add these options to the compilation command: -march=* -msse -mfpmath=sse Yes, this definitely resolves the problem - but not for all processors. The * can be one of the following: pentium3, pentium3m, pentium-m, pentium4, pentium4m, prescott, nocona, athlon-4, athlon-xp, athlon-mp, k8, opteron, athlon64, athlon-fx and c3-2 (I'm unsure about athlon and athlon-tbird). Beside -msse, you can also add some of -mmmx, -msse2, -msse3 and -m3dnow, if the CPU supports them (see GCC doc or CPU doc). If you wish to compile for processors which don't have SSE, you have a few possibilities: (1) A very simple solution: Use long double everywhere. (But be careful when transfering binary data in long double format between computers because this format is not standardized and so the concrete bit representations vary between different CPU architectures.) (2) A partial but simple solution: Do comparisons on volatile variables only. (3) A similar solution: Try to implement a "discard_extended_precision" function suggested by Egon in comment #88. (4) A complex solution: Before doing any mathematical operation or comparison, put the operands into variables and put also the result to a variable (i.e. don't use complex expressions). For example, instead of { c = 2*(a+b); } , write { double s = a+b; c = 2*s; } . I'm unsure about arrays but I think they should be OK. When you have modified your code in this manner, then compile it either without optimization or, when using optimization, use -ffloat-store. In order to avoid double rounding (i.e. rounding twice), it is also good to decrease the FPU precision by changing its control word in the beginning of your program (see comment #60). Then you should also apply -frounding-math. (5) A radical solution: Find a job/hobby where computers are not used at all. RECOMMENDATIONS I think this problem is really serious and general. Therefore, programmers should be warned soon enough. This recommendation should be addressed especially to authors of programming coursebooks. But I think there could also be a paragraph about it in the GCC documentation (I haven't read it wholly but it doesn't seem there's any warning against x87). And, of course, there should be a warning in the bug reporting guide (http://gcc.gnu.org/bugs.html). It's fine there's a link to this page (Bug 323) but the example with (int)(a/b) is insufficient. It only demonstrates that real numbers are often not represented exactly in the computer. It doesn't demonstrate the x87 pitfall. Hence there should be an example such as the initial code of this "GCC bug 323 report". Because when one sees the example with (int)(a/b), he can say "It's trivial" and not click the link (as I did the first time). EPILOGUE I hope my effort of writing this "comment #109" will be helpful for many people. If you want more info, read the David Monniaux's work or something else about FPUs. Thanks to David Monniaux. -- pepalogik at seznam dot cz changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |pepalogik at seznam dot cz http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]