http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59486

            Bug ID: 59486
           Summary: math functions take more cycles after running any Intel AVX function
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kayan4096 at gmail dot com

Created attachment 31427
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31427&action=edit
the .i file

When any 256-bit AVX instruction is executed, the "AVX upper state" becomes
"dirty", which results in a performance hit to all subsequent legacy (SSE-encoded)
library calls. This is documented in the Intel Optimization Manual.

gcc should clean the upper halves of the YMM registers (for example with
vzeroupper) after using AVX.

For the attached foo.i, the results we get are:

round res 31999997 total cycles 224725952 CPI 22
round res 31999997 total cycles 1900864520 CPI 190

while the expected results are:

round res 31999997 total cycles 224725952 CPI 22
round res 31999997 total cycles 224725952 CPI 22

This is also described here:
http://stackoverflow.com/questions/20545539/math-functions-takes-more-cycles-after-running-any-intel-avx-function

"gcc -v -save-temps -Wall -mavx -lm foo.c" output:

Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 /usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -E -quiet -v foo.c -mavx -mtune=generic -Wall -fpch-preprocess -o foo.i
ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.4.6/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../x86_64-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/x86_64-redhat-linux/4.4.6/include
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 /usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -fpreprocessed foo.i -quiet -dumpbase foo.c -mavx -mtune=generic -auxbase foo -Wall -version -o foo.s
GNU C (GCC) version 4.4.6 20110731 (Red Hat 4.4.6-3) (x86_64-redhat-linux)
 compiled by GNU C version 4.4.6 20110731 (Red Hat 4.4.6-3), GMP version 4.3.1, MPFR version 2.4.1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 11bca756726d0c8e79657fd5bb53575a
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 as -V -Qy -msse2avx -o foo.o foo.s
GNU assembler version 2.20.51.0.2 (x86_64-redhat-linux) using BFD version version 2.20.51.0.2-5.28.el6 20091009
COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/:/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 /usr/libexec/gcc/x86_64-redhat-linux/4.4.6/collect2 --eh-frame-hdr --build-id -m elf_x86_64 --hash-style=gnu -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.4.6/crtbegin.o -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../.. -lm foo.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.4.6/crtend.o /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crtn.o
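
For readers who hit the dirty "AVX upper state" described in this report, a possible source-level workaround is sketched below. It is not taken from the attached foo.i: the function names avx_sum and sum_floats are hypothetical, and the sketch assumes a GCC new enough to support the target("avx") function attribute and __builtin_cpu_supports. The idea is only to call _mm256_zeroupper() once the 256-bit work is finished, before control returns to code that may call SSE-encoded library routines.

```c
#include <assert.h>
#include <stddef.h>
#include <immintrin.h>

/* Hypothetical helper (not from foo.i): sum n floats with 256-bit AVX.
   n must be a multiple of 8. */
__attribute__((target("avx")))
float avx_sum(const float *v, size_t n)
{
    assert(n % 8 == 0);

    __m256 acc = _mm256_setzero_ps();
    for (size_t i = 0; i < n; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(v + i));

    /* Spill the eight partial sums to memory before clearing the registers. */
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);

    /* Zero the upper 128 bits of all YMM registers (the vzeroupper
       instruction) so that SSE-encoded code called later starts from a
       clean upper state. */
    _mm256_zeroupper();

    float total = 0.0f;
    for (int i = 0; i < 8; i++)
        total += lanes[i];
    return total;
}

/* Dispatch wrapper: take the AVX path only if the CPU reports AVX support
   and the length fits; otherwise fall back to a plain scalar loop. */
float sum_floats(const float *v, size_t n)
{
    if (__builtin_cpu_supports("avx") && n % 8 == 0)
        return avx_sum(v, n);

    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

A caller would then do something like round((double)sum_floats(data, n)) and link with -lm, as in the command line above. Whether this removes the slowdown on a given CPU has to be confirmed by measuring cycles the way the report does.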