------- Comment #22 from uros at kss-loka dot si  2006-06-27 05:49 -------
(In reply to comment #21)

> Note that you are running the opposite of my test case: SSE vs SSE rather than
> x87 vs x87.  This whole bug report is about x87 performance.  You can get more
> detail on why I want x87 in my messages above, particularly comment #11, but
> single precision is indeed the place where SSE cannot compete with the x87
> unit.  To see it, put the flags back the way I had them in the attachment, and
> you'll see that gcc 3 is much faster.  Also, you should find in single

Hm, these are x87 results:

/usr/local.uros/gcc34/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -DTYPE=float -c mmbench.c
/usr/local.uros/gcc34/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -c sgemm_atlas.c
/usr/local.uros/gcc34/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -o xsmm_gcc mmbench.o sgemm_atlas.o
rm -f *.o
/usr/local.uros/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -DTYPE=float -c mmbench.c
/usr/local.uros/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -c sgemm_atlas.c
/usr/local.uros/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -o xsmm_gc4 mmbench.o sgemm_atlas.o
rm -f *.o
echo "GCC 3.x     single performance:"
GCC 3.x     single performance:
./xsmm_gcc
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       60   1000       0.141     3072.00

echo "GCC 4.x     single performance:"
GCC 4.x     single performance:
./xsmm_gc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       60   1000       0.143     3029.92


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827