Re: Mersenne: Head's algorithm for multiplying mod n
At 06:51 PM 7/9/99 +0100, Brian J. Beesley wrote:

>For reasonably small multi-precision numbers, Head's method is
>actually very good, if you're working on a true RISC processor with
>no integer multiply instruction.

I started using Head's algorithm in 1981 on my Apple II. It was better
than triple- and quad-precision routines. It has propagated through my
software, and I've assumed that it is still preferable. I probably need
to reevaluate it on a Pentium.

+--+
| Jud "program first and think later" McCranie |
+--+

Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Re: Mersenne: Head's algorithm for multiplying mod n
On 8 Jul 99, at 23:33, Lucas Wiman wrote:

> >That is going to be a *lot* slower than FFT convolution, for numbers
> >the size of the Mersenne numbers we're testing! FFT is O(n*log(n))
> >where n is the number of limbs in the numbers being multiplied.
> >Head's method is O(n^2), with O being slightly smaller than for
> >straight long multiplication. A recursive method splitting the
> >number in the middle, then doing a subtraction trick to make three
> >instead of four submultiplies, is O(n^y), where y is log(3)/log(2),
> >which is about 19/12.

This is all true for numbers with many times the number of bits that
can be stored in a computer register. However, find a recursive method
for triple-precision arithmetic that beats the "classic" long
multiplication method, and you'll definitely have earned whatever
carrot may be dangled!

> Well, I wasn't talking about the massive scale of numbers used in LL
> tests. I was speaking of the < 95 bit integers used in factoring, and
> using Head's algorithm in the calculation of 2^p mod f. I doubt that
> FFT would be worthwhile here, and would probably slow it down
> considerably.

Too right!

Chris Nash writes:

> >But enough already. One thing I'm surprised nobody has mentioned yet
> >- is it even in Lucas' FAQ? - is that modular reduction in the
> >Mersenne case is sinfully inexpensive. Even with a pure integer, no
> >FFT implementation, reduction mod 2^p-1 is very quick - take the
> >lower p bits, take the rest of the number shifted down by p bits,
> >add them together. The size of the original number is usually around
> >2p bits, so one iteration of the shift-and-add will get the result
> >correct to within 1 extra subtraction at most.

This is also true, but we don't (often) want to check if 2^p-1 is a
factor of anything ... in this case, we're trying to find factors of
2^p-1.
For reasonably small multi-precision numbers, Head's method is
actually very good, if you're working on a true RISC processor with
no integer multiply instruction.

Regards
Brian Beesley
Re: Mersenne: Head's algorithm for multiplying mod n
Pierre Abbat writes:

>>> In the book _Primes and Programming_ Head's method of multiplying
>>> two numbers mod n is mentioned. Is this actually more efficient
>>> than simply multiplying the two numbers and taking the modulus?
>>
>> Look at it this way. Head's method is essentially binary long
>> multiplication, with the "wrinkle" that you take the remainder
>> modulo n after each shift & add operation.
>
>That is going to be a *lot* slower than FFT convolution, for numbers
>the size of the Mersenne numbers we're testing! FFT is O(n*log(n))
>where n is the number of limbs in the numbers being multiplied. Head's
>method is O(n^2), with O being slightly smaller than for straight long
>multiplication. A recursive method splitting the number in the middle,
>then doing a subtraction trick to make three instead of four
>submultiplies, is O(n^y), where y is log(3)/log(2), which is about
>19/12.

Well, I wasn't talking about the massive scale of numbers used in LL
tests. I was speaking of the < 95 bit integers used in factoring, and
using Head's algorithm in the calculation of 2^p mod f. I doubt that
FFT would be worthwhile here, and would probably slow it down
considerably.

Chris Nash writes:

>But enough already. One thing I'm surprised nobody has mentioned yet -
>is it even in Lucas' FAQ? - is that modular reduction in the Mersenne
>case is sinfully inexpensive. Even with a pure integer, no FFT
>implementation, reduction mod 2^p-1 is very quick - take the lower p
>bits, take the rest of the number shifted down by p bits, add them
>together. The size of the original number is usually around 2p bits,
>so one iteration of the shift-and-add will get the result correct to
>within 1 extra subtraction at most.

I think I might split off the parts about modular arithmetic into
their own section, or put them in the beginning stuff section. I'll
probably add this question sometime tomorrow.

-Lucas Wiman
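[The 2^p mod f computation Lucas mentions can be sketched as left-to-right binary exponentiation, where each step is a modular squaring followed (when the exponent bit is set) by a cheap doubling. Python big integers stand in for the word-size routines the thread is actually about; the function name is illustrative, and the squaring line is where a Head-style modular multiply would be plugged in.]

```python
def pow2_mod(p, f):
    """Compute 2^p mod f by left-to-right binary exponentiation:
    one modular squaring per bit of p, plus a doubling when the
    bit is set.  The squaring is where a modmul routine such as
    Head's would go in a fixed-width implementation."""
    result = 1
    for bit in bin(p)[2:]:                  # bits of p, most significant first
        result = (result * result) % f      # stand-in for the modmul routine
        if bit == '1':
            result = (result + result) % f  # multiply by the base, 2
    return result
```

For trial factoring a Mersenne number 2^p - 1, a candidate f divides it exactly when pow2_mod(p, f) == 1.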
Re: Mersenne: Head's algorithm for multiplying mod n
> Limbs? It is good to know that the world has many different literal
> meanings in many languages for "bits" - variety is good for us all.
> (The French word for "buffer" is also, I seem to remember, rather
> amusing).

"Limb" is the term used in gmp for a digit in a large base (such as
2147483648) in which a number is represented in a multiple-precision
package.

phma
Re: Mersenne: Head's algorithm for multiplying mod n
Hi folks

> > > In the book _Primes and Programming_ Head's method of multiplying
> > > two numbers mod n is mentioned. Is this actually more efficient
> > > than simply multiplying the two numbers and taking the modulus?
>
> > Look at it this way. Head's method is essentially binary long
> > multiplication, with the "wrinkle" that you take the remainder
> > modulo n after each shift & add operation.
>
> That is going to be a *lot* slower than FFT convolution, for numbers
> the size of the Mersenne numbers we're testing! FFT is O(n*log(n))
> where n is the number of limbs in the numbers being multiplied.

Limbs? It is good to know that the world has many different literal
meanings in many languages for "bits" - variety is good for us all.
(The French word for "buffer" is also, I seem to remember, rather
amusing).

But enough already. One thing I'm surprised nobody has mentioned yet -
is it even in Lucas' FAQ? - is that modular reduction in the Mersenne
case is sinfully inexpensive. Even with a pure integer, no FFT
implementation, reduction mod 2^p-1 is very quick - take the lower p
bits, take the rest of the number shifted down by p bits, add them
together. The size of the original number is usually around 2p bits,
so one iteration of the shift-and-add will get the result correct to
within 1 extra subtraction at most.

As for the FFT implementation, modular reduction is less than
inexpensive - it is virtually free - in fact, it makes use of a fact
that means the algorithm can be a large factor faster than "arbitrary"
arithmetic in the Mersenne case. In a standard FFT multiplication, the
numbers have to be buffered with a large "dead zone" of zeroes - this
is due to the fact that the FFT algorithm works with a periodic
signal, so data bits "wrap around" from either end unless you have a
large enough zero-buffer. The Mersenne method uses this to its
advantage - the bits 2^p and above *should* wrap onto the
least-significant bits as part of the modular reduction.
With a little adjustment (using a non-rational arithmetic base so that
2^p is in fact a power of the base), the modular reduction then
becomes a desirable, and free, side-effect of the multiply algorithm -
not to mention, the FFT buffer is half as large, and so calculations
are implicitly faster.

Chris Nash
Lexington KY UNITED STATES
Still a co-discoverer of the largest known *non*-Mersenne prime.
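[The shift-and-add reduction Chris describes can be sketched in a few lines of Python, with big integers standing in for the multi-limb arithmetic a real implementation would use; the function name is illustrative.]

```python
def mod_mersenne(x, p):
    """Reduce x mod 2^p - 1 by the shift-and-add trick: add the low
    p bits to the rest of the number shifted down by p.  Works
    because 2^p ≡ 1 (mod 2^p - 1).  For x around 2p bits, one pass
    leaves the result correct to within one extra subtraction."""
    m = (1 << p) - 1
    while x > m:
        x = (x & m) + (x >> p)
    if x == m:          # 2^p - 1 itself reduces to 0
        x = 0
    return x
```

No division is involved anywhere, which is why the reduction is so cheap compared with a general modulus.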
Re: Mersenne: Head's algorithm for multiplying mod n
At 08:11 PM 7/8/99 -0400, Pierre Abbat wrote:

>That is going to be a *lot* slower than FFT convolution, for numbers
>the size of the Mersenne numbers we're testing!

Head's algorithm is for getting x*y mod n when 0 <= x, y < n <= M,
where M is the largest integer you can store in a format native to the
computer. I don't think it applies for Mersenne testing, but it helps
a lot with Fermat-type tests (pseudoprimes, etc).

+--+
| Jud "program first and think later" McCranie |
+--+
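[A Fermat-type test of the kind Jud mentions is just a modular exponentiation, so every multiply in it is a place where a word-size modmul like Head's earns its keep. A minimal Python sketch (illustrative name; Python's big integers hide the overflow problem that Head's method solves on real hardware):]

```python
def is_fermat_prp(n, a=2):
    """Fermat probable-prime test: check a^(n-1) ≡ 1 (mod n) by
    right-to-left binary exponentiation.  Each '% n' multiply is a
    candidate for Head's routine when n is near the word size."""
    if n < 3:
        return n == 2
    result, base, e = 1, a % n, n - 1
    while e:
        if e & 1:
            result = (result * base) % n   # stand-in for Head's modmul
        base = (base * base) % n           # ditto
        e >>= 1
    return result == 1
```

Note that a composite can pass (341 = 11 * 31 is the smallest base-2 pseudoprime), which is why these are only "probable prime" tests.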
Re: Mersenne: Head's algorithm for multiplying mod n
On Thu, 08 Jul 1999, Brian J. Beesley wrote:

> On 8 Jul 99, at 6:19, Lucas Wiman wrote:
>
> > In the book _Primes and Programming_ Head's method of multiplying
> > two numbers mod n is mentioned. Is this actually more efficient
> > than simply multiplying the two numbers and taking the modulus?
>
> Look at it this way. Head's method is essentially binary long
> multiplication, with the "wrinkle" that you take the remainder modulo
> n after each shift & add operation.

That is going to be a *lot* slower than FFT convolution, for numbers
the size of the Mersenne numbers we're testing! FFT is O(n*log(n))
where n is the number of limbs in the numbers being multiplied. Head's
method is O(n^2), with O being slightly smaller than for straight long
multiplication. A recursive method splitting the number in the middle,
then doing a subtraction trick to make three instead of four
submultiplies, is O(n^y), where y is log(3)/log(2), which is about
19/12.

phma
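[The recursive split with three submultiplies that Pierre describes is Karatsuba's method. A minimal Python sketch (illustrative names; the 2^32 cutoff stands in for whatever threshold profiling would pick):]

```python
def karatsuba(x, y):
    """Multiply x*y recursively with three half-size products
    instead of four, via the subtraction trick
    (xh+xl)(yh+yl) - xh*yh - xl*yl = xh*yl + xl*yh,
    giving O(n^(log 3 / log 2)) ≈ O(n^1.585) instead of O(n^2)."""
    if x < 2**32 or y < 2**32:             # small enough: machine multiply
        return x * y
    k = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> k, x & ((1 << k) - 1)    # split at bit k
    yh, yl = y >> k, y & ((1 << k) - 1)
    hh = karatsuba(xh, yh)
    ll = karatsuba(xl, yl)
    mid = karatsuba(xh + xl, yh + yl) - hh - ll   # the third product
    return (hh << (2 * k)) + (mid << k) + ll
```

This supports Pierre's exponent: log(3)/log(2) ≈ 1.585, which is indeed "about 19/12" (≈ 1.583).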
Re: Mersenne: Head's algorithm for multiplying mod n
On 8 Jul 99, at 6:19, Lucas Wiman wrote:

> In the book _Primes and Programming_ Head's method of multiplying
> two numbers mod n is mentioned. Is this actually more efficient
> than simply multiplying the two numbers and taking the modulus?

Look at it this way. Head's method is essentially binary long
multiplication, with the "wrinkle" that you take the remainder modulo
n after each shift & add operation. The remaindering operation here is
very simple, as all you have to do is to subtract n if the result
>= n.

Nevertheless you're into a _lot_ of operations, involving
badly-aligned data (which means doing lots of multi-precision shifts -
which are slow because of dependency chains - also the "subtract n if
>= n" is slow because the branch is hard to predict). Conversely,
multiplying the whole number out & then taking the modulus once _may_
be inefficient, because you will probably be forced out of registers &
have to do a lot of operations in memory.

My factoring code compromises. It does long multiplication 32 bits at
a time, creating a 128-bit product from a 96-bit by 32-bit
multiplication operation, then reduces this modulo n leaving a result
with at most 95 significant bits. The shifts involved are 32-bit
shifts i.e. an address offset costing no CPU time. I _do_ have to
reduce modulo n 3 times, but the multiplications actually dominate the
time, and I can get away with using only the limited registers in the
x86 processor plus four cache lines worth of memory (including storage
for factor, remainder, etc.).

I suppose you could call this a variant of Head's method ... note that
I get only 95 bits from a 96 bit field in the same way that Head gets
only (n-1) bits from an n bit field; you lose one bit of precision
because you have to add together the partial results of the
multiplication after taking the modulus.

Nevertheless Head's method is useful, especially if you're working in
a language like Java that has limited support for multiprecision
arithmetic, since it allows you to do arithmetic modulo n for n up to
2^30 even if you have only 32-bit signed arithmetic available.

There are more time-saving tricks employed, like "early exit" from the
whole stage when a partial multiplier is zero, and saving the partial
products, thereby reducing the number of 32-bit by 32-bit multiplies
required from 9 to 6.

> If so, is it implemented in the various mersenne factoring programs
> in use?

I caught onto this idea when I was trying to do 2^p mod f as fast as
possible for 2^64 < f < 2^80 - and got the extension to 2^95 "for
free". I found it to be 10% - 20% faster than multiplying out the
whole product (i.e. calculating x^2) then taking the modulus, and at
least 10 times as fast as a straightforward implementation of Head's
method.

It will probably not be optimal for other size factors or for other
processor architectures. I did, however, find that I got the best
performance doing the job this way on Intel Pentium / PPro type CPUs
for factors a little larger than 2^64 - which was the task briefed.

Regards
Brian Beesley
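[The shift-and-add scheme Brian describes ("take the remainder modulo n after each shift & add") can be sketched as follows. This is the binary-long-multiplication variant discussed in the thread, not necessarily Head's exact 1980 formulation; Python big integers are used for clarity, whereas the whole point on real hardware is that no intermediate ever exceeds 2n, so everything fits in one word when n does.]

```python
def head_mulmod(x, y, n):
    """Compute x*y mod n by binary shift-and-add, reducing mod n
    after every step so intermediates stay below 2n - the property
    that makes the method usable without a wide multiply."""
    assert 0 <= x < n and 0 <= y < n
    result = 0
    while y:
        if y & 1:               # this bit of y contributes x
            result += x
            if result >= n:     # "subtract n if >= n"
                result -= n
        x <<= 1                 # next step: x := 2*x mod n
        if x >= n:
            x -= n
        y >>= 1
    return result
```

The two hard-to-predict `>= n` branches are exactly the per-step cost Brian's 32-bit-limb compromise is designed to avoid.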
Re: Mersenne: Head's algorithm for multiplying mod n
At 06:19 AM 7/8/99 -0400, you wrote:

>All,
>In the book _Primes and Programming_ Head's method of multiplying
>two numbers mod n is mentioned. Is this actually more efficient
>than simply multiplying the two numbers and taking the modulus?

Yes, because it keeps the numbers smaller. The method is from:
Multiplication Modulo N, by A. K. Head, BIT 20 (1980), 115-116.

+--+
| Jud "program first and think later" McCranie |
+--+
Re: Mersenne: Head's algorithm for multiplying mod n
At 06:19 AM 7/8/99 -0400, Lucas Wiman wrote:

>In the book _Primes and Programming_ Head's method of multiplying
>two numbers mod n is mentioned. Is this actually more efficient
>than simply multiplying the two numbers and taking the modulus?

No, it is actually much less efficient, but you are missing the point.
In his book Peter Giblin explains why on p. 93.

Essentially, if a and b are say 62-bit integers, then to compute
a * b (mod m) you generate intermediate values that are up to 124 bits
long, which are not generally supported in Basic/Pascal/C/C++. One
would have to resort to UBASIC or a large integer package.

The beauty of Head's algorithm is that it allows you to compute the
modular product using only one or two extra bits. The example in the
book would allow you to compute the modular product of two 62-bit
integers without intermediate values exceeding 64 bits.

As far as I know most Mersenne factoring programs use some form of
large integer package (LIP) to handle the size of the numbers
involved.

Regards
Alan Powell
Mersenne: Head's algorithm for multiplying mod n
All,

In the book _Primes and Programming_ Head's method of multiplying two
numbers mod n is mentioned. Is this actually more efficient than
simply multiplying the two numbers and taking the modulus?

If so, is it implemented in the various Mersenne factoring programs in
use?

Thank you,
Lucas Wiman