Package: gcc-4.1 Version: 4.1.1-21 Severity: normal
I am profiling an application that performs a lot of log2(.) operations. I wanted to use -funsafe-math-optimizations to have log2 executed by the FPU directly, instead of using a function call to the c library. Strangely enough, although wall-clock time and CPU time go down, the number of seconds that gprof reports now roughly quadruples, exceeding the total CPU time by a factor of two. The attached c++ file reproduces the problem. I get ** run without -funsafe-math-optimizations ** g++ -pg test.cpp && time ./a.out && gprof ./a.out 8.01531e+09 real 0m42.432s user 0m42.143s sys 0m0.148s Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls Ts/call Ts/call name 100.00 13.11 13.11 main 0.00 13.11 0.00 1 0.00 0.00 global constructors keyed to main 0.00 13.11 0.00 1 0.00 0.00 __static_initialization_and_destruction_0(int, int) ** run with -funsafe-math-optimizations ** g++ -funsafe-math-optimizations -pg test.cpp && time ./a.out && gprof ./a.out 8.01531e+09 real 0m30.165s user 0m29.898s sys 0m0.060s Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls Ts/call Ts/call name 100.00 74.87 74.87 main 0.00 74.87 0.00 1 0.00 0.00 global constructors keyed to main 0.00 74.87 0.00 1 0.00 0.00 __static_initialization_and_destruction_0(int, int) Notice that when enabling -funsafe-math-optimizations - wall clock time goes down from 43 to 31 sec. - gprof time goes up from 14 to 75 sec. - it is strange that * without -funsafe-math-optimizations the gprof time (14 sec) is much lower than wall clock time (43 sec) * with -funsafe-math-optimizations the gprof time (75 sec) is much higher than wall clock time (31 sec) the first (*) can be explained by samples that occur in the log2 system call. the second (*) can not be correct. I suspect that the instrumentation code that gcc inserts into the compiled program expects the FPU to be in a specific IEEE/ansi state, which it is not when the -funsafe-math-optimizations flag is used. With kind regards, Wouter Koolen-Wijkstra -- System Information: Debian Release: lenny/sid APT prefers testing APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable') Architecture: i386 (i686) Kernel: Linux 2.6.18v10 (SMP w/2 CPU cores; PREEMPT) Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1) Shell: /bin/sh linked to /bin/bash Versions of packages gcc-4.1 depends on: ii binutils 2.17-3 The GNU assembler, linker and bina ii cpp-4.1 4.1.1-21 The GNU C preprocessor ii gcc-4.1-base 4.1.1-21 The GNU Compiler Collection (base ii libc6 2.3.6.ds1-13 GNU C Library: Shared libraries ii libgcc1 1:4.1.1-21 GCC support library ii libssp0 4.1.1-21 GCC stack smashing protection libr Versions of packages gcc-4.1 recommends: ii libc6-dev 2.3.6.ds1-13 GNU C Library: Development Librari pn libmudflap0-dev <none> (no description available) -- no debconf information
#include <cmath> #include <iostream> using namespace std; int main(int argc, const char ** argv) { // do something that takes noticeable time (~30 sec) // and that does many floating point operations double d = 0; for (int i = 1; i < 300000000; i++) { d += log2(i); } cout << d << endl; return 0; }