[Haskell-cafe] Re: Haskell version of ray tracer code is much slower than the original ML

Philip Armstrong Fri, 22 Jun 2007 06:13:25 -0700

On Fri, Jun 22, 2007 at 01:16:54PM +0100, Simon Marlow wrote:

Philip Armstrong wrote:

IIRC, it is possible to issue an instruction to the x86 FP unit which
makes all operations work on 64-bit Doubles, even though there are
80-bits available internally. Which then means there's no requirement
to spill intermediate results to memory in order to get the rounding
correct.

For some background on why GHC doesn't do this, see the comment "MORE
FLOATING POINT MUSINGS..." in


  http://darcs.haskell.org/ghc/compiler/nativeGen/MachInstrs.hs


Twisty. I guess 'slow, but correct, with switches to go faster at the
price of correctness' is about the best option.
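
To make the rounding issue concrete, here is a minimal sketch (my own
illustration, not anything taken from GHC) that mimics excess precision
one level down: Float values whose intermediates are kept in Double,
standing in for Doubles held in 80-bit x87 registers. The names
stepwise and widened are made up for the example, and it assumes a
target where Float arithmetic really is rounded to single precision at
every step (e.g. SSE on x86-64).

```haskell
module Main where

-- Three Float inputs chosen so that rounding after every step matters:
-- 2^24 + 1 is not representable as a Float.
a, b, c :: Float
a = 16777216  -- 2^24
b = 1
c = 1

-- Every intermediate result is rounded to Float.
stepwise :: Float
stepwise = (a + b) + c

-- Intermediates are kept in Double and rounded to Float once at the end
-- (assumption: this plays the role of the wider x87 register).
widened :: Float
widened = realToFrac ((realToFrac a + realToFrac b + realToFrac c) :: Double)

main :: IO ()
main = do
  putStrLn ("stepwise: " ++ show stepwise)
  putStrLn ("widened:  " ++ show widened)
  putStrLn ("equal?    " ++ show (stepwise == widened))
```

With true single-precision arithmetic the first sum loses both
additions and the second keeps them, so the two disagree. The x87 case
is the same effect with Double and the 80-bit registers, which is why
getting the "correct" answer means spilling to memory.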

You probably want SSE2. If I ever get around to finishing it, the GHC
native code generator will be able to generate SSE2 code on x86 someday,
like it currently does for x86-64. For now, to get good FP performance on
x86, you probably want
  -fvia-C -fexcess-precision -optc-mfpmath=sse2


Reading the gcc manpage, I think you mean -optc-msse2
-optc-mfpmath=sse. -mfpmath=sse2 doesn't appear to be an option.

(I note in passing that the ghc darcs head produces binaries from
ray.hs which are about 15% slower than ghc 6.6.1 ones btw. Same
optimisation options used both times.)

cheers, Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt
