pfft 0.1

2012-07-20 Thread jerro

I'm announcing the release of pfft, a fast FFT written in D.

Code: https://github.com/jerro/pfft/

Downloads: https://github.com/jerro/pfft/downloads

Documentation: http://jerro.github.com/pfft/doc/pfft.pfft.html

Benchmarks: http://jerro.github.com/pfft/benchmarks/


Re: pfft 0.1

2012-07-20 Thread bearophile

jerro:


I'm announcing the release of pfft, a fast FFT written in D.


Everything looks nice. Are you using "in", 
pure/nothrow/immutable/const enough?


Is it worth changing the Phobos API to something similar to this 
API?


Bye,
bearophile


Re: pfft 0.1

2012-07-20 Thread jerro

On Friday, 20 July 2012 at 23:36:37 UTC, bearophile wrote:

jerro:


I'm announcing the release of pfft, a fast FFT written in D.


Everything looks nice. Are you using "in", 
pure/nothrow/immutable/const enough?


Not yet, but I plan to add those.

Is it worth changing the Phobos API to something similar to 
this API?


Pfft already includes an API that is compatible with the
Phobos one (pfft.stdapi). But I don't think it makes any sense
to change the Phobos API to something like pfft.pfft. The main
difference between the two is that std.numeric.Fft uses
interleaved format for sequences of complex numbers and
pfft.pfft uses a split format. Interleaved format is more
convenient in most cases and I think it's more intuitive too.
The FFT in Phobos also works on ranges, while pfft.pfft only
works on aligned arrays.

The only reason the pfft.pfft works the way it does is that it
is supposed to be an interface to the underlying implementation
with no loss of performance and that is how the implementation
works. Even if the implementation in Phobos did use split
format, I think it would still be better to use interleaved
format for the API. It's just about 30% slower and I think that
a nice API is more important for the standard library than a 30%
difference in speed.

There is one more important difference in the API between
std.numeric.Fft and pfft.pfft. With Phobos you can use one
instance of Fft for all sizes and types. With pfft.pfft, the Fft
class is parametrized with a floating point type and you need a
different instance for each size. I think what Phobos does in
this case is wrong. The fact that one class works on all
floating point types results in very poor precision when
the data consists of doubles or reals. I guess that precomputed
tables are stored as floats. You could only fix that by saving
them as reals, but that would probably hurt performance quite
a lot. The fact that one instance can be used for multiple sizes
would be a problem if we wanted to change an implementation since
not all FFT implementation can use one precomputed table for
different data sizes.



Bye,
bearophile





Re: pfft 0.1

2012-07-20 Thread bearophile

jerro:

Are all your benchmarks done on a 64 bit system?


I think what Phobos does in
this case is wrong. The fact that one class works on all
floating point types results in very poor precision when
the data consists of doubles or reals. I guess that precomputed
tables are stored as floats. You could only fix that by saving
them as reals, but that would probably hurt performance quite
a lot. The fact that one instance can be used for multiple sizes
would be a problem if we wanted to change an implementation 
since

not all FFT implementation can use one precomputed table for
different data sizes.


If you are right and this is a problem, are Andrei and others 
accepting to change this little part of Phobos? If the answer is 
positive, are you interested in creating a GIT patch that changes 
that?


Bye,
bearophile


Re: pfft 0.1

2012-07-20 Thread jerro

Are all your benchmarks done on a 64 bit system?


They are. Here's one comparison with 32 bit (for complex single
precision transform using sse)if you are interested in that:
http://imgur.com/CTuCD . The 32 bit executable is slower, probably
because there are less general purpose and SSE registers on x86
than on x86_64.

If you are right and this is a problem, are Andrei and others 
accepting to change this little part of Phobos? If the answer 
is positive, are you interested in creating a GIT patch that 
changes that?


I think that poor precision is a problem. I have checked now and
std.numeric.Fft does indeed use floats for the lookup table.
The precision problem could be solved by either changing that to
real or by changing Fft to a template and using whatever the type
parameter is. Just changing float to real doesn't require changing
the API, but would probably result in worse performance.

I haven't talked to Andrei or others about changing it, but I am
willing to write a patch that changes the API, if it would be
decided that would be the best thing to do. If it would be decided
that it's best to just change the type of the lookup table that's
trivial anyway, since it's just one typedef.


Re: pfft 0.1

2012-07-21 Thread deadalnix

On 21/07/2012 01:26, jerro wrote:

I'm announcing the release of pfft, a fast FFT written in D.

Code: https://github.com/jerro/pfft/

Downloads: https://github.com/jerro/pfft/downloads

Documentation: http://jerro.github.com/pfft/doc/pfft.pfft.html

Benchmarks: http://jerro.github.com/pfft/benchmarks/


A signal processing module in phobos sound like a lot of fun.


Re: pfft 0.1

2012-07-21 Thread jerro

I think that poor precision is a problem. I have checked now and
std.numeric.Fft does indeed use floats for the lookup table.
The precision problem could be solved by either changing that to
real or by changing Fft to a template and using whatever the 
type
parameter is. Just changing float to real doesn't require 
changing

the API, but would probably result in worse performance.


I forgot to mention one more solution (the one that pfft.stdapi 
currently uses). The class could also use lazy initialization. 
When the fft is called, check if the lookup table for that type 
is already created, create it if it isn't, store it for later and 
then compute the fft. This would  require casting away const in 
the fft() method, though.


Re: pfft 0.1

2012-07-24 Thread bearophile

jerro:


I haven't talked to Andrei or others about changing it, but I am
willing to write a patch that changes the API, if it would be
decided that would be the best thing to do.


Given your willingness to work, and the days since you wrote 
that, maybe we have to write this little proposal again in the 
main D newsgroup.


Bye,
bearophile


Re: pfft 0.1

2012-07-24 Thread Paulo Pinto

On Friday, 20 July 2012 at 23:26:46 UTC, jerro wrote:

I'm announcing the release of pfft, a fast FFT written in D.

Code: https://github.com/jerro/pfft/

Downloads: https://github.com/jerro/pfft/downloads

Documentation: http://jerro.github.com/pfft/doc/pfft.pfft.html

Benchmarks: http://jerro.github.com/pfft/benchmarks/


Nice work.

Just did the usual advertising in HN and Reddit,

http://news.ycombinator.com/item?id=4286257

http://www.reddit.com/r/programming/comments/x2tt8/pfft_a_fast_fourier_transform_written_in_d/

--
Paulo


Re: pfft 0.1

2012-07-24 Thread Andrei Alexandrescu

On 7/20/12 8:02 PM, jerro wrote:

On Friday, 20 July 2012 at 23:36:37 UTC, bearophile wrote:

Is it worth changing the Phobos API to something similar to this API?


Pfft already includes an API that is compatible with the
Phobos one (pfft.stdapi). But I don't think it makes any sense
to change the Phobos API to something like pfft.pfft. The main
difference between the two is that std.numeric.Fft uses
interleaved format for sequences of complex numbers and
pfft.pfft uses a split format. Interleaved format is more
convenient in most cases and I think it's more intuitive too.
The FFT in Phobos also works on ranges, while pfft.pfft only
works on aligned arrays.

The only reason the pfft.pfft works the way it does is that it
is supposed to be an interface to the underlying implementation
with no loss of performance and that is how the implementation
works. Even if the implementation in Phobos did use split
format, I think it would still be better to use interleaved
format for the API. It's just about 30% slower and I think that
a nice API is more important for the standard library than a 30%
difference in speed.


This all is very interesting. Even considering all potential issues and 
incompatibilities, it might be great to look into integrating Fft into 
Phobos. Do you think it would work to add it as a "high-speed 
alternative with different format and alignment requirements"? Offhand 
I'd think someone using the FFT is generally interested in speed, and a 
30% margin is sensible since the algorithm might be most of some 
applications' run time.



There is one more important difference in the API between
std.numeric.Fft and pfft.pfft. With Phobos you can use one
instance of Fft for all sizes and types. With pfft.pfft, the Fft
class is parametrized with a floating point type and you need a
different instance for each size. I think what Phobos does in
this case is wrong. The fact that one class works on all
floating point types results in very poor precision when
the data consists of doubles or reals. I guess that precomputed
tables are stored as floats. You could only fix that by saving
them as reals, but that would probably hurt performance quite
a lot. The fact that one instance can be used for multiple sizes
would be a problem if we wanted to change an implementation since
not all FFT implementation can use one precomputed table for
different data sizes.


Do you think Phobos can be fixed in a backwards-compatible way (e.g. 
provide a template and alias it to the old name)?



Andrei


Re: pfft 0.1

2012-07-24 Thread jerro
This all is very interesting. Even considering all potential 
issues and incompatibilities, it might be great to look into 
integrating Fft into Phobos. Do you think it would work to add 
it as a "high-speed alternative with different format and 
alignment requirements"? Offhand I'd think someone using the 
FFT is generally interested in speed, and a 30% margin is 
sensible since the algorithm might be most of some 
applications' run time.


If we were integrating pfft into Phobos, it would be best to
replace the current implementation completely. Pfft already
includes an API that is Phobos compatible (it deinterleaves and
copies the input to an internal buffer and then interleaves the
result and copies it to the output buffer when it's done). This
interface is slower than the in place, split format interface,
but is still a lot faster than the current Phobos implementation
if SIMD operations are available. I guess it wouldn't hurt to
also add split format, in place functions which would require
aligned arrays for those who need every bit of performance.



Do you think Phobos can be fixed in a backwards-compatible way 
(e.g. provide a template and alias it to the old name)?


After thinking about it a bit I think that it would be best to
solve this using lazy initialization. Users probably don't care
that much whether the internal data is computed in the
or just before computing the first transform. That way the API
doesn't need to be changed.




Re: pfft 0.1

2012-07-24 Thread jerro



Nice work.

Just did the usual advertising in HN and Reddit,

http://news.ycombinator.com/item?id=4286257

http://www.reddit.com/r/programming/comments/x2tt8/pfft_a_fast_fourier_transform_written_in_d/

--
Paulo


Thanks.