patperry:
> Last month Anatoly Yakovenko published some disturbing numbers about
> the Haskell BLAS bindings I wrote being significantly slower than
> using plain C. I wanted to let everyone know that I've closed the
> performance gap, and now for doing ten million dot products, the
> overhead for using Haskell instead of C is about 0.6 seconds on my
> machine, regardless of the size of the vectors. The next version will
> incorporate the changes. If you can't wait for a formal release, the
> darcs repository is at http://www-stat.stanford.edu/~patperry/code/blas/
>
> Anyone interested in more details can check out my blog:
> http://quantile95.com/2008/07/24/addressing-haskell-blas-performance-issues/
>
> Thanks everyone for the input on this (especially Anatoly). If anyone
> else finds any performance discrepancies, please let me know and I
> will do whatever I can to fix them.
>
Great work, Patrick!

So if I read correctly, the main change was to flatten the representation
(and thus in loops the vector's structure will be unpacked and kept in
registers, which isn't possible for sum types).

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
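
For readers following along, the point about flattening the representation can be sketched roughly as below. This is only an illustration, not the actual types from the BLAS bindings: the names VecSum, VecFlat, fromList and dotFlat are all made up for this example, and the idea that the extra constructor was a conjugated view is an assumption. The sketch contrasts a sum type with a single-constructor record whose strict fields GHC can unpack, and runs a dot-product loop over the flat form.

```haskell
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, withForeignPtr)
import Foreign.Marshal.Array (pokeArray)
import Foreign.Storable (peekElemOff)

-- Hypothetical "before" shape: a sum type. Code that consumes it has to
-- inspect the constructor first, so the fields cannot simply be unpacked.
data VecSum
  = Plain (ForeignPtr Double) Int Int  -- buffer, length, stride
  | Conj VecSum                        -- assumed: a conjugated view

-- Hypothetical "after" shape: one constructor with strict fields. The view
-- flag becomes an ordinary field, and the rest can be unpacked by GHC.
data VecFlat = VecFlat
  { buf    :: {-# UNPACK #-} !(ForeignPtr Double)
  , len    :: {-# UNPACK #-} !Int
  , stride :: {-# UNPACK #-} !Int
  , isConj :: !Bool  -- unused here; conjugation is a no-op for real elements
  }

-- Build a contiguous (stride 1) vector from a list.
fromList :: [Double] -> IO VecFlat
fromList xs = do
  let n = length xs
  fp <- mallocForeignPtrArray n
  withForeignPtr fp $ \p -> pokeArray p xs
  return (VecFlat fp n 1 False)

-- Dot product over the flat representation, with a strict accumulator.
dotFlat :: VecFlat -> VecFlat -> IO Double
dotFlat x y =
  withForeignPtr (buf x) $ \px ->
  withForeignPtr (buf y) $ \py ->
    let n = min (len x) (len y)
        go i acc
          | i >= n    = return acc
          | otherwise = do
              a <- peekElemOff px (i * stride x)
              b <- peekElemOff py (i * stride y)
              let acc' = acc + a * b
              acc' `seq` go (i + 1) acc'
    in go 0 0
```

With these definitions, building two vectors with fromList [1, 2, 3] and fromList [4, 5, 6] and passing them to dotFlat yields 32. Whether GHC really keeps the fields in registers depends on optimisation flags and inlining, so treat this as a sketch of the idea rather than a benchmark.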