http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59486

            Bug ID: 59486
           Summary: math functions take more cycles after running any
                    Intel AVX function
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kayan4096 at gmail dot com

Created attachment 31427
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31427&action=edit
the .i file

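When any 256-bit AVX instruction is executed, the upper halves of the YMM
registers (the "AVX upper state") become "dirty", which results in a
performance penalty for all subsequent legacy (non-VEX) SSE code, including
math library calls.
This is documented in the Intel Optimization Manual.

GCC should clear the upper halves of the YMM registers (for example, with
vzeroupper) after using AVX.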
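
As a manual workaround until the compiler does this itself, the upper state
can be cleared by hand before returning to legacy code. The following is only
a minimal sketch under stated assumptions: sum8_avx and sum8 are hypothetical
helpers (they are not taken from the attached foo.i), and the
target("avx") attribute and __builtin_cpu_supports assume a compiler newer
than the 4.4.6 used in this report.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: sums eight floats with one 256-bit AVX load, then
   clears the upper YMM halves so later legacy SSE code (for example libm)
   does not pay the AVX-to-SSE transition penalty. */
__attribute__((target("avx")))
static float sum8_avx(const float *v)
{
    __m256 x  = _mm256_loadu_ps(v);              /* dirties the upper state  */
    __m128 lo = _mm256_castps256_ps128(x);       /* elements 0..3            */
    __m128 hi = _mm256_extractf128_ps(x, 1);     /* elements 4..7            */
    __m128 s  = _mm_add_ps(lo, hi);              /* four partial sums        */
    s = _mm_add_ps(s, _mm_movehl_ps(s, s));      /* fold upper pair onto low */
    s = _mm_add_ss(s, _mm_shuffle_ps(s, s, 1));  /* add element 1 to 0       */
    float r = _mm_cvtss_f32(s);
    _mm256_zeroupper();                          /* emits vzeroupper         */
    return r;
}
#endif

/* Hypothetical wrapper: uses the AVX path only when the CPU reports AVX
   support, otherwise falls back to a plain scalar loop. */
float sum8(const float *v)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx"))
        return sum8_avx(v);
#endif
    float r = 0.0f;
    for (int i = 0; i < 8; i++)
        r += v[i];
    return r;
}
```

Calling sum8() before a loop of libm calls should then leave the upper state
clean on entry to that loop. Newer GCC releases also document a -mvzeroupper
option that is meant to insert the instruction automatically; whether it is
available depends on the compiler version.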

For the attached foo.i, the results we get are:
round res 31999997 total cycles 224725952 CPI 22
round res 31999997 total cycles 1900864520 CPI 190

while the expected results are:
round res 31999997 total cycles 224725952 CPI 22
round res 31999997 total cycles 224725952 CPI 22

This is also described here:
http://stackoverflow.com/questions/20545539/math-functions-takes-more-cycles-after-running-any-intel-avx-function

"gcc -v -save-temps -Wall -mavx -lm foo.c" output:
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 /usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -E -quiet -v foo.c -mavx
-mtune=generic -Wall -fpch-preprocess -o foo.i
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-redhat-linux/4.4.6/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../x86_64-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/x86_64-redhat-linux/4.4.6/include
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 /usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -fpreprocessed foo.i -quiet
-dumpbase foo.c -mavx -mtune=generic -auxbase foo -Wall -version -o foo.s
GNU C (GCC) version 4.4.6 20110731 (Red Hat 4.4.6-3) (x86_64-redhat-linux)
    compiled by GNU C version 4.4.6 20110731 (Red Hat 4.4.6-3), GMP version
4.3.1, MPFR version 2.4.1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 11bca756726d0c8e79657fd5bb53575a
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 as -V -Qy -msse2avx -o foo.o foo.s
GNU assembler version 2.20.51.0.2 (x86_64-redhat-linux) using BFD version
version 2.20.51.0.2-5.28.el6 20091009
COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/:/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
 /usr/libexec/gcc/x86_64-redhat-linux/4.4.6/collect2 --eh-frame-hdr --build-id
-m elf_x86_64 --hash-style=gnu -dynamic-linker /lib64/ld-linux-x86-64.so.2
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crt1.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crti.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/crtbegin.o
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64 -L/lib/../lib64
-L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../.. -lm foo.o
-lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s
--no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.4.6/crtend.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crtn.o
