On 01/03/2009, at 04:49, Don Stewart wrote:
> So now, since we've gone to such effort to produce a tiny loop like
> this, can't we unroll it just a little? Sadly, my attempts to get GCC
> to trigger its loop unroller on this guy haven't succeeded.
> -funroll-loops and -funroll-all-loops don't touch it,
That's because the C produced by GHC doesn't look like a loop to GCC.
This can be fixed but given that we are moving away from -fvia-C
anyway, it probably isn't worth doing.
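
To illustrate the point, here is a simplified sketch in hand-written C.
It is not actual GHC output, and every name in it is hypothetical. The
first function is a loop GCC recognises. The second computes the same
sum, but control moves between small blocks by returning code pointers
to a driver, so the "loop" has no visible back edge for the unroller to
work with.

```c
#include <stddef.h>

/* A conventional counted loop: the back edge is explicit, so GCC's
   loop optimiser can see it and consider unrolling it. */
long sum_loop(long n) {
    long acc = 0;
    for (long i = 1; i <= n; i++)
        acc += i;
    return acc;
}

/* Hypothetical block-per-function style. */
typedef void (*code_ptr)(void);      /* generic code pointer          */
typedef code_ptr (*block_fn)(void);  /* a block returns the next block */

/* Hypothetical "virtual registers" shared by all blocks. */
static long reg_i, reg_n, reg_acc;

/* Final block: no successor, so the driver stops. */
static code_ptr loop_done(void) {
    return NULL;
}

/* Loop body: "jumps" back to itself by returning its own address. */
static code_ptr loop_body(void) {
    if (reg_i > reg_n)
        return (code_ptr)loop_done;
    reg_acc += reg_i;
    reg_i++;
    return (code_ptr)loop_body;
}

/* Driver: repeatedly calls whatever block was returned last. The only
   loop GCC sees is this one, and its body is an indirect call. */
long sum_trampoline(long n) {
    reg_i = 1;
    reg_n = n;
    reg_acc = 0;
    code_ptr next = (code_ptr)loop_body;
    while (next != NULL)
        next = ((block_fn)next)();
    return reg_acc;
}
```

Both functions return the same result, but only sum_loop has a shape
that -funroll-loops can act on.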
> Anyone think of a way to apply Claus' TH unroller, or somehow convince
> GCC it is worth unrolling this guy, so we get the win of both
> aggressive high level fusion, and aggressive low level loop
> optimisations?
The problem with low-level loop optimisations is that in general, they
should be done at a low level. Core is much too early for this. To
find out whether and how much to unroll a particular loop, you must
take things like register pressure and instruction scheduling into
account. IMO, the backend is the only reasonable place to do these
optimisations. Using an existing backend like LLVM would really help
here.
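
As a rough sketch of the trade-off (hand-written C, with hypothetical
function names and an arbitrary unroll factor of 4), compare a rolled
sum with a version unrolled four ways. The unrolled version has fewer
branches per element and independent additions that can be scheduled in
parallel, but it keeps four accumulators live at once. Whether that
pays off depends on how many registers the target has and on what else
the loop needs, which is only known late in compilation.

```c
#include <stddef.h>

/* Rolled version: one accumulator, one branch per element. */
long sum_rolled(const long *xs, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += xs[i];
    return acc;
}

/* Unrolled by 4 with separate accumulators. The four additions in the
   main loop do not depend on each other, but a0..a3 all stay live
   across iterations, which raises register pressure. */
long sum_unrolled4(const long *xs, size_t n) {
    long a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;

    /* Main loop: handles elements in groups of four. */
    for (; i + 4 <= n; i += 4) {
        a0 += xs[i];
        a1 += xs[i + 1];
        a2 += xs[i + 2];
        a3 += xs[i + 3];
    }

    long acc = a0 + a1 + a2 + a3;

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        acc += xs[i];

    return acc;
}
```

The right unroll factor here is not something the source-level shape of
the loop can tell you, which is the argument for leaving the decision
to the backend.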
Roman