On Fri, Jun 22, 2007 at 01:16:54PM +0100, Simon Marlow wrote:
> Philip Armstrong wrote:
>> IIRC, it is possible to issue an instruction to the x86 FP unit which
>> makes all operations work on 64-bit Doubles, even though there are
>> 80-bits available internally. Which then means there's no requirement
>> to spill intermediate results to memory in order to get the rounding
>> correct.
>
> For some background on why GHC doesn't do this, see the comment
> "MORE FLOATING POINT MUSINGS..." in
>
>   http://darcs.haskell.org/ghc/compiler/nativeGen/MachInstrs.hs

Twisty. I guess 'slow, but correct, with switches to go faster at the
price of correctness' is about the best option.

> You probably want SSE2. If I ever get around to finishing it, the GHC
> native code generator will be able to generate SSE2 code on x86
> someday, like it currently does for x86-64. For now, to get good FP
> performance on x86, you probably want
>
>   -fvia-C -fexcess-precision -optc-mfpmath=sse2

Reading the gcc manpage, I think you mean -optc-msse2
-optc-mfpmath=sse. -mfpmath=sse2 doesn't appear to be an option.
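
Put together with the corrected flags, the full command line would
presumably look something like the following. This is only a sketch:
ray.hs is the test program mentioned below, and any other optimisation
flags are left out because none are stated in this thread.

  ghc -fvia-C -fexcess-precision -optc-msse2 -optc-mfpmath=sse ray.hs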

(I note in passing that the ghc darcs head produces binaries from
ray.hs which are about 15% slower than the ghc 6.6.1 ones. The same
optimisation options were used both times.)

cheers, Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
