On 01/03/2009, at 04:49, Don Stewart wrote:

So now, since we've gone to such effort to produce a tiny loop like this, can't we unroll it just a little? Sadly, my attempts to get GCC to trigger
its loop unroller on this guy haven't succeeded. -funroll-loops and
-funroll-all-loops don't touch it,

That's because the C produced by GHC doesn't look like a loop to GCC. This can be fixed, but given that we are moving away from -fvia-C anyway, it probably isn't worth doing.
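
To illustrate why such code does not look like a loop to a C compiler, here is a simplified sketch. It is not actual GHC output: the names State, Block, sum_step and sum_trampoline are hypothetical and only imitate the general shape, where each block of code finishes by handing control to the next block instead of branching backwards inside one function. The plain sum_loop is included for comparison, since that is the form a loop unroller recognises.

```c
#include <stddef.h>

/* Hypothetical machine state, standing in for the registers and stack
   that generated code would use. */
typedef struct { long i, n, acc; } State;

/* A code block returns the next block to run; run == NULL means stop. */
typedef struct Block Block;
struct Block { Block (*run)(State *); };

/* One iteration of "sum 0..n-1". The back edge of the loop is not a
   branch inside this function: it is a function pointer returned to
   the caller, so no loop is visible in this function's body. */
static Block sum_step(State *s) {
    Block next = { NULL };
    if (s->i < s->n) {
        s->acc += s->i;
        s->i++;
        next.run = sum_step;   /* "jump" back to ourselves */
    }
    return next;
}

/* Driver that keeps running whatever block it is handed. The only
   loop the compiler sees is this generic dispatch loop. */
static long sum_trampoline(long n) {
    State s = { 0, n, 0 };
    Block b = { sum_step };
    while (b.run != NULL) {
        b = b.run(&s);
    }
    return s.acc;
}

/* The same computation written as an ordinary counted loop, which is
   the shape options such as -funroll-loops are designed for. */
static long sum_loop(long n) {
    long acc = 0;
    for (long i = 0; i < n; i++) {
        acc += i;
    }
    return acc;
}
```

Both functions compute the same result; the difference is only in how much loop structure is left for the C compiler to find.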

Anyone think of a way to apply Claus' TH unroller, or somehow convince GCC it is worth unrolling this guy, so we get the win of both aggressive high-level
fusion and aggressive low-level loop optimisations?

The problem with low-level loop optimisations is that in general, they should be done at a low level. Core is much too early for this. To find out whether and how much to unroll a particular loop, you must take things like register pressure and instruction scheduling into account. IMO, the backend is the only reasonable place to do these optimisations. Using an existing backend like LLVM would really help here.
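
As a rough illustration of the trade-off, here is a hand-unrolled array sum in C. It is only a sketch: the function names and the unroll factor of 4 are arbitrary choices made for this example. Each extra accumulator needs its own register, and whether four independent additions per iteration actually help depends on the target's register file and scheduling, which is information only the backend has.

```c
#include <stddef.h>

/* Straightforward loop: one accumulator, one addition per iteration. */
static long sum_array(const long *xs, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += xs[i];
    }
    return acc;
}

/* The same sum unrolled by a factor of 4 (an arbitrary choice here).
   Four accumulators make the additions independent of each other, at
   the cost of three more live values competing for registers. */
static long sum_array_unrolled4(const long *xs, size_t n) {
    long a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;

    /* Main body: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        a0 += xs[i];
        a1 += xs[i + 1];
        a2 += xs[i + 2];
        a3 += xs[i + 3];
    }

    /* Remainder: handle the last 0..3 elements one at a time. */
    for (; i < n; i++) {
        a0 += xs[i];
    }

    return a0 + a1 + a2 + a3;
}
```

On a machine with few registers, or with a loop body that already keeps many values live, a larger factor could force spills and make the unrolled version slower than the simple one.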

Roman


_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
