Do you guys have any sort of empirical evidence that scalar SSE2 math is faster than plain old x87?
I ask because every time I have tried compiling FFTW with gcc -m32 -mfpmath=sse, the result has invariably been slower than the vanilla x87 compilation. (I am talking about scalar arithmetic here. FFTW also supports SSE2 2-way vector arithmetic, which is of course faster.) I also remember trying similar experiments with other numerical code in the Pentium 4 dark ages, with similar results.

I don't see any reason why this should be the case, and maybe this is just a problem with gcc, but I don't think you should automatically assume that SSE2 math is faster without running a few experiments first.

Regards,
Matteo Frigo
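
For anyone who wants to run such an experiment, here is a minimal sketch of a scalar timing harness in C. It is not taken from FFTW: the names scalar_kernel and time_kernel are hypothetical, and the multiply-add loop is only a stand-in for real numerical code, so the numbers it produces say little about any particular application.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical scalar kernel: a dependent chain of multiply-adds,
   standing in for real numerical code. Not FFTW code. */
double scalar_kernel(const double *x, size_t n, int reps)
{
    double acc = 0.0;
    for (int r = 0; r < reps; ++r)
        for (size_t i = 0; i < n; ++i)
            acc = acc * 0.5 + x[i] * 1.25;
    return acc;
}

/* Returns the CPU seconds spent in scalar_kernel. The result is stored
   in *out so the compiler cannot discard the computation. */
double time_kernel(const double *x, size_t n, int reps, double *out)
{
    clock_t t0 = clock();
    *out = scalar_kernel(x, n, reps);
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Call time_kernel from a small main on a reasonably large array, then build the same file twice, once with gcc -O2 -m32 -mfpmath=387 and once with gcc -O2 -m32 -msse2 -mfpmath=sse, and compare the reported times over several runs.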