On Tuesday, September 22, 2015 at 6:51:50 AM UTC-4, Sheehan Olver wrote:
>
> I get the following timings with the various FFTW routines, where I ran 
> each line multiple times to make sure it was accurate.  Why are REDFT00 and 
> RODFT00 almost 10x slower?
>

This is because a REDFT00 (DCT-I) of n points is exactly equivalent to a 
DFT of N=2(n-1) points with real-even data.  Although FFTW uses a different 
algorithm, not a size-N complex FFT, the fast algorithms are essentially 
equivalent to pruned versions of the FFT algorithms for N, and depend on 
the factorization of N, not of n.  Similarly for RODFT00 (DST-I), except 
N=2(n+1) in that case.  In contrast, the other DCT/DST types correspond to 
DFTs of length N=2n or N=4n or N=8n, so their speed depends on the 
factorization of n.

If n=100000 as in your example, then n-1 has large prime factors (41, 271), 
so the FFT algorithms are much slower, although they are still O(n log n). 
  If you try n=100001, you will notice that REDFT00 becomes much faster 
(but other DCT types become slower).

This is mentioned in the FFTW manual:

http://www.fftw.org/doc/Real-even_002fodd-DFTs-_0028cosine_002fsine-transforms_0029.html

"FFTW is most efficient when N is a product of small factors; note that 
this *differs* from the factorization of the physical size n for REDFT00 
and RODFT00!"

Reply via email to