Re: [mpir-devel] A new FFT for the New Year

Bill Hart Fri, 01 Jan 2010 06:23:06 -0800

1. Formal release of code.

I have included two new files in the top source directory of mpir in
svn. The first is new_fft_with_flint.c, the second new_fft.c. They are
isomorphic except that the second does not need to be linked against
FLINT to run. There will be no difference in timings or results, I
just removed all the timing and test code which relied on FLINT.

Here is the command I use to build the code once you have built MPIR:

gcc -O2 new_fft.c -o fft -I/home/wbhart/mpir-trunk
-L/home/wbhart/mpir-trunk/.libs -lmpir

Modify as desired for your own system.

Please note the usual disclaimer:

This code is INCOMPLETE, INEFFICIENT, INCONSISTENT often INCORRECT,
ILL-TESTED and the original author is ILL-TEMPERED.

It is LGPL v2+ licensed and I have taken extreme care to ensure that
all the code was written from scratch by me without following someone
else's work. The four functions FFT_split_* and FFT_combine_* were
taken from FLINT, but to my knowledge were written entirely by me. I
hereby release them under the LGPL v2+.

The timing code does not print times, it merely completes a certain
number of iterations and exits. Use the linux time command to get
times.

I have run very few tests on the multiplication routine, perhaps on
the order of a few thousand integer multiplications, at very few
sizes. This code is extremely new and should be treated as quite
likely broken.

Note that the negacyclic convolution is currently disabled. I don't
even know if it passes its test code when switched on at present. I
modified a number of the FFT's since writing it, so probably not. But
it did pass it's test code. However note it does not deal with a
couple of (trivial) special cases, so is incomplete and formally
incorrect because of that.

There is an implementation of the split radix FFT contained in the
file too. I found the times were *identical* to those I got from the
radix 2 code, so I abandoned it. There is also an old implementation
of the radix 2 FFT/IFFT which I also abandoned. The basic radix 2 FFT
function is called FFT_radix2_new.

All the truncate, mfa and twiddle functions are new in that they don't
rely on the old radix 2 implementation, I believe.

Many of the code comments have been cut and pasted from earlier code
they were derived from, so will be incorrect. This is especially true
of the comments for the FFT and IFFT functions (including the truncate
and mfa functions), except for the radix2_new and split_radix
functions which I hope are about right.

Bill.

2010/1/1 Bill Hart <goodwillh...@googlemail.com>:
> As the code for this FFT is quite involved, I plan to incrementally
> release a set of notes about it, starting tomorrow morning (today
> already), which may take hours to days for me to complete. I got the
> code working for the first time this afternoon, so please be patient
> with me.
>
> Here is the rough list of topics I hope to cover in the notes:
>
> Table of Contents
> =============
>
> 1) The formal code release. I need to check all the test code still
> passes and make it easy for people to build and run. That will take an
> hour or so for me to do.
>
> 2) FFT's for nitwits (like me). How does an FFT help in multiplying
> numbers, basically, and what are the important algorithms and
> techniques.
>
> 3) Why a new FFT? This is an important question which needs answering.
> I will discuss what led me to work on a new FFT implementation, rather
> than improve or port another implementation.
>
> 4) The strategy. What does this FFT actually do? Which algorithms does
> it use?  I tried many things, most of which completely failed. In
> section 3 I will have already discussed other strategies I tried which
> did not work. Here I will discuss what did work. Whilst many of the
> details came from my head, I'm pretty sure the general outline will
> contain few surprises.
>
> 5) References. Few. I purposely tried to be creative.
>
> 8) Implementors notes. A long list of things that can be improved. In
> releasing the code publicly it is my hope that contributors will help
> make this the best FFT implementation it can be. The list will include
> many trivial items which anyone could work on, all the way through to
> fairly advanced topics which would probably require original research
> to do really well. I'll rank the items according to how hard I think
> they'd be to do.
>
> The code will be under revision control, so it is my hope that people
> will be willing to contribute to it. I believe I spent between 150 and
> 200 hours writing it, all told.
>
> Till tomorrow.
>
> Bill.
>
> 2010/1/1 Bill Hart <goodwillh...@googlemail.com>:
>> Ha! I forgot to mention, those numbers represent the size of the
>> integers in bits. I multiply two different integers with the same
>> number of bits, though that is not a limitation of the code. It will
>> accept integers of different lengths and adjust accordingly.
>>
>> Bill.
>>
>> 2010/1/1 Bill Hart <goodwillh...@googlemail.com>:
>>> I re-timed that point and I believe I may have just mistranscribed it.
>>> Here it is with the corrected time. Thanks for pointing it out!
>>>
>>> I've also included some times from the FLINT FFT, though only
>>> non-truncated ones (i.e. where FLINT uses a convolution which is an
>>> exact power of 2 length - not a limitation of FLINT, I just didn't get
>>> around to timing it at the other points yet).
>>>
>>> 8364032:  6.8s / 8.0s
>>> 9409536:  9.8s / 8.7s
>>> 10455040:  10.8s / 9.6s
>>> 11500544: 12.3s / 12.6s
>>> 12546048: 12.7s / 15.3s
>>> 13591552: 14.0s / 14.2s
>>> 14637056: 14.9s / 15.3s
>>> 15682560: 16.0s / 14.5s
>>> 16728064: 16.8s / 19.5s
>>>
>>> 20910080: 24.3s / 23.2s
>>> 25092096: 28.5s / 33.1s
>>> 29274112: 33.1s / 31.8s
>>> 33456128: 37.4s / 47.7s
>>>
>>> 519168: iters 10000: 30.7s / 32.2s / 32.3s
>>> 2084864: iters 1000: 13.9s / 15.2s / 13.0s
>>> 8364032: iters 100: 6.8s / 8.0s / 7.6s
>>> 33497088: iters 100: 37.4s / 47.7s / 37.5s
>>> 134103040: iters 10: 17.0s / 24.8s / 19.6s
>>> 536608768: iters 1: 8.6s / 9.4s / 8.6s
>>> 2146959360: iters 1: 42.3s / 40.3s / 38.9s
>>>
>>> Bill.
>>>
>>>
>>> 2009/12/31 Bill Hart <goodwillh...@googlemail.com>:
>>>> It's either a tuning issue or timing instability on the machine.
>>>>
>>>> Bill.
>>>>
>>>> On 31/12/2009, Robert Gerbicz <robert.gerb...@gmail.com> wrote:
>>>>> Hi!
>>>>>
>>>>> How is it possible that to multiple longer numbers takes less time, for
>>>>> example:
>>>>>
>>>>> 12546048: 14.7s / 15.3s
>>>>> 13591552: 14.0s / 14.2s
>>>>>
>>>>> --
>>>>>
>>>>> You received this message because you are subscribed to the Google Groups
>>>>> "mpir-devel" group.
>>>>> To post to this group, send email to mpir-de...@googlegroups.com.
>>>>> To unsubscribe from this group, send email to
>>>>> mpir-devel+unsubscr...@googlegroups.com.
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/mpir-devel?hl=en.
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

--

You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-de...@googlegroups.com.
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en.

Re: [mpir-devel] A new FFT for the New Year

Reply via email to