Hi folks

> > > In the book _Primes and Programming_ Head's method of multiplying
> > > two numbers mod n is mentioned.  Is this actually more effiecient
> > > than simply multiplying the two numbers and taking the modulus?
> > Look at it this way. Head's method is essentially binary long
> > multiplication, with the "wrinkle" that you take the remainder modulo
> > n after each shift & add operation.
> That is going to be a *lot* slower than FFT convolution, for numbers the
size
> of the Mersenne numbers we're testing! FFT is O(n*log(n)) where n is the
number
> of limbs in the numbers being multiplied.

Limbs? It is good to know that the world has many different literal meanings
in many languages for "bits" - variety is good for us all. (The French word
for "buffer" is also, I seem to remember, rather amusing).

But enough already. One thing I'm surprised nobody has mentioned yet - is it
even in Lucas' FAQ? - is that modular reduction in the Mersenne case is
sinfully inexpensive. Even with a pure integer, no FFT implementation,
reduction mod 2^p-1 is very quick - take the lower p bits, take the rest of
the number shifted down by p bits, add them together. The size of the
original number is usually around 2p bits, so one iteration of the
shift-and-add will get the result correct to within 1 extra subtraction at
most.

As for the FFT implementation, modular reduction is less than inexpensive -
it is virtually free - in fact, it makes use of a fact that means the
algorithm can be a large factor faster than "arbitrary" arithmetic in the
Mersenne case. In a standard FFT multiplication, the numbers have to be
buffered with a large "dead zone" of zeroes - this is due to the fact that
the FFT algorithm works with a periodic signal, so data bits "wrap around"
from either end unless you have a large enough zero-buffer. The Mersenne
method uses it to it's advantage - the bits 2^p and above *should* wrap onto
the least-significant bits as part of the modular reduction. With a little
adjustment (using a non-rational arithmetic base so 2^p is in fact a power
of 2 power of the base), the modular reduction then becomes a desirable, and
free, side-effect of the multiply algorithm - not to mention, the FFT buffer
is half as large, and so calculations are implicitly faster.

Chris Nash
Lexington KY
UNITED STATES
=================================================
Still a co-discoverer of the largest known *non*-Mersenne prime.


________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Reply via email to