On Tuesday, 25 September 2012 at 18:33:45 UTC, jerro wrote:
I think there is one thing in this code that will hurt
performance much, much, more than allocations. This code will
compute elements of the result lazily. So each time you want
to read an element from the resulting range, O(log(n))
functions passed to map() will need to be computed. The
problem is that each of those functions computes sine and
cosine, so sine and cosine need to be computed O(log(n)) times
for each element. To get all n elements, you will need to
compute them O(n log(n)). Because computing sine and cosine is
about two orders of magnitude slower than multiplication and
division, this will be very slow.
I was wrong about the complexity. Because each element of the
result depends on all the elements of the argument range, you
actually need O(n) function calls to compute each element of
the result and O(n*n) function calls(and sine and cosine
computations) to compute all of them. You would need to use
memoization to get reasonable complexity.
Great point, I hadn't really thought about the laziness before --
I was doing this in Python and just thought it might be fun to
translate it to D. :P
I'd like to add that I don't think you can make this work the
way you meant it to. The problem is that you return a chain
when the length is 2, a chain of chains when the length is 4,
and so on. What fft of length n would actually need to return
is a binary tree of ranges with n leaves. The size of memory
needed for the value returned from fft therefore depends on the
length of the range it was given as an argument. So you need to
either use heap allocated memory for the return type, or the
return type needs to depend on the parameter range's length,
which would mean that the parameter range's length needs to be
a template parameter.
Ohh huh... sounds like you're right, lemme think about it a bit
more though.
Thanks! :)