Hey everyone,

Just out of curiosity, what work is being done in the Data Parallel
Haskell / Repa projects regarding cache locality?  I ask because, as I
understand it, the biggest bottleneck on today's processors is cache
misses, and optimized platform-specific linear algebra libraries
perform well largely because they divide the data into chunks sized to
fit the cache, maximizing the number of operations performed per
memory access.  I wouldn't expect Data Parallel Haskell / Repa to
automatically know the perfect chunking strategy for each platform,
but are there any plans to do something along these lines?
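(In case the blocking trick isn't familiar: here's a toy sketch of it in
plain Haskell lists, nothing to do with Repa's actual internals.  All
the names here, matMulNaive, matMulBlocked, bSize, are made up for
illustration; real BLAS implementations tile over all three loop
dimensions with block sizes tuned to the cache hierarchy.)

```haskell
import Data.List (transpose)

type Matrix = [[Int]]

chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)

-- Naive multiply: every column of b is re-traversed for every row of a,
-- so for large matrices b's data keeps falling out of cache.
matMulNaive :: Matrix -> Matrix -> Matrix
matMulNaive a b =
  [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

addM :: Matrix -> Matrix -> Matrix
addM = zipWith (zipWith (+))

-- Blocked multiply: split the shared dimension into chunks of bSize and
-- accumulate partial products, so each pair of sub-blocks can stay
-- cache-resident while it is being reused.
matMulBlocked :: Int -> Matrix -> Matrix -> Matrix
matMulBlocked bSize a b =
  foldl1 addM (zipWith matMulNaive aBlocks bBlocks)
  where
    aBlocks = transpose (map (chunksOf bSize) a)  -- column blocks of a
    bBlocks = chunksOf bSize b                    -- row blocks of b
```

Both functions compute the same product; the blocked one just reorders
the work so each chunk fits in cache, which is the kind of reordering I'm
wondering whether DPH/Repa could ever do automatically.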

(To be explicit, this isn't meant as a criticism; I'm just curious and
interested in seeing discussion of this topic by those more
knowledgeable than I am.  :-) )

Thanks!
Greg


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe