On 2 August 2010 19:49, David Simcha <[email protected]> wrote:
> Oh, also, I don't think that cache effects are the main bottleneck because
> switching to single-precision floats for both input and output has a
> negligible effect on performance even though it cuts the size of the working
> set in half.

Interesting. Still, I think that because of the way FFT works, once you're bigger than the cache, nearly every memory access will be a cache miss. It could be that although the memory footprint halves, the number of cache misses remains constant.

Anyway, the reason I posted the link was not so much to help with implementation, but more because it gives a great feel for what's involved in a "state of the art" FFT library. I suspect there's a sweet spot with high convenience, small code size, and good-enough performance.

_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos
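
To make the cache-miss argument concrete, here is a rough sketch of a toy cache model in Python. It is not a measurement of David's code, and every name and parameter in it is an assumption chosen for illustration: a direct-mapped 32 KiB cache with 64-byte lines, 2^16 elements, and only the bit-reversal access pattern of a radix-2 FFT rather than the whole transform. It counts misses for 4-byte and 8-byte elements, once in sequential order and once in bit-reversed order.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (radix-2 FFT input permutation)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def count_misses(indices, elem_size, cache_bytes=32 * 1024, line_bytes=64):
    """Count misses in a toy direct-mapped cache (assumed sizes, not a real CPU)."""
    num_slots = cache_bytes // line_bytes
    resident = [None] * num_slots      # which memory line each slot holds
    misses = 0
    for idx in indices:
        line = (idx * elem_size) // line_bytes
        slot = line % num_slots
        if resident[slot] != line:     # line not cached: miss, then load it
            resident[slot] = line
            misses += 1
    return misses


BITS = 16
N = 1 << BITS                          # working set well beyond 32 KiB
orders = {
    "sequential": range(N),
    "bit-reversed": [bit_reverse(i, BITS) for i in range(N)],
}

results = {}
for name, order in orders.items():
    for label, size in (("float", 4), ("double", 8)):
        results[(name, label)] = count_misses(order, size)
        print(f"{name:>12} {label:>6}: {results[(name, label)]} misses")
```

In this model the sequential pass takes half as many misses with floats as with doubles, while the bit-reversed pass misses on every access for both element sizes. That matches the guess above for this one access pattern only; a real FFT and a real, set-associative cache with prefetching may behave quite differently.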
