On 25/07/2008, at 12:42 PM, Duncan Coutts wrote:

> Of course then it means we need to have enough work to do. Indeed we
> need quite a bit just to break even because each core is relatively
> stripped down without all the out-of-order execution etc.

I don't think that will hurt too much. The code that GHC emits is very regular and the basic blocks tend to be small; a good proportion of it is just copying data between the stack and the heap. On the upside, it's all very clean and amenable to simple peephole optimization / compile-time reordering.
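To make the peephole idea concrete, here is a minimal sketch over a toy instruction set (the `Instr` type and the store/load pattern are my invention, not GHC's actual Cmm or native-code representation): a store to a stack slot immediately followed by a load from the same slot can drop the load, since the value is still at hand.

```haskell
-- Toy instruction set standing in for GHC-emitted code
-- (hypothetical, for illustration only).
data Instr
  = Load String   -- load from a named stack slot
  | Store String  -- store to a named stack slot
  | Add           -- some arithmetic op
  deriving (Eq, Show)

-- A minimal peephole pass: Store s followed by Load s is
-- redundant, so the Load is dropped. Everything else is
-- passed through unchanged.
peephole :: [Instr] -> [Instr]
peephole (Store s : Load s' : rest)
  | s == s'   = Store s : peephole rest
peephole (i : rest) = i : peephole rest
peephole []         = []

main :: IO ()
main = print (peephole [Store "x", Load "x", Add])
```

A real pass would of course work over GHC's internal representation and consider liveness, but the window-matching structure is the same.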

I remember someone telling me that one of the outcomes of the Itanium project was that the (low-level) compile-time optimizations didn't perform as well as they had hoped. The reasoning was that a highly speculative, out-of-order processor with all the trimmings has far more dynamic information about the state of the program, and can make decisions on the fly that are better than anything you could derive statically at compile time. -- does anyone have a reference for this?

Anyway, this problem is moot with GHC code: there's barely any instruction-level parallelism to exploit anyway, but adding an extra hardware thread is just a `par` away.
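For anyone following along, here is the usual small sketch of what "just a `par` away" means (the `fib` example is the stock demo, not anything specific to this discussion; `par` and `pseq` live in GHC.Conc in base, re-exported by the parallel package's Control.Parallel):

```haskell
import GHC.Conc (par, pseq)

-- Deliberately naive Fibonacci, just to have some work to do.
fib :: Int -> Integer
fib n | n < 2     = fromIntegral n
      | otherwise = fib (n - 1) + fib (n - 2)

-- `x `par` e` sparks x for possible evaluation on another core,
-- then evaluates e. `pseq` forces y on the current thread first,
-- so both computations can proceed in parallel before the sum.
parSum :: Int -> Int -> Integer
parSum a b =
  let x = fib a
      y = fib b
  in x `par` (y `pseq` (x + y))

main :: IO ()
main = print (parSum 20 21)  -- prints 17711
```

Compile with `-threaded` and run with `+RTS -N` to actually use the extra cores; without that, the spark is simply evaluated on the one thread and the answer is the same.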

To quote a talk on the paper mentioned earlier: "GHC programs turn an Athlon into a 486 with a high clock speed!"

Ben.


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe