Daran wrote:
>
> The math page http://www.mersenne.org/math.htm explains how to do a stage 1
> P-1 test with given bound B1, but it omits the explanation of the
> enhansement that uses B2 in stage 2. Anyone care to explain
> how this is done?
>
> Cheers
>
> Daran G.
P-1 stage 1 computes x = a^E (mod N), where E is the product of all
primes and prime powers <= B1.
If stage 1 does not find a factor, gcd(x-1, N) = 1, we go to stage 2.
The idea behind a second stage for P-1, P+1 and ECM is that a random
number will usually have its biggest prime factor quite a bit larger
than the second biggest, so it is worthwhile to figure something out to
include a single big prime more quickly.
We can go from x^p_i to x^p_{i+1}, where p_i is the i-th prime, quickly
by multiplying by x^(p_{i-1} - p_i). The difference between consecutive
primes is relatively small, apparantly no bigger than (log(B2)^2),so we
have no problem precomputing x^n for even n to cover all the possible
differences of consecutive primes we're having in our stage 2.
For each x^p_i we compute this way, we multiply this x^p_i-1 to an
accumulator.
We find the factor f of N if there is a prime p within our stage 2
interval so that f-1 | E*p (more specifically, ord(a) mod f | E*p),
since then a^(E*p) - 1 == 0 (mod f), thus the accumulator == 0 (mod f)
and a final gcd will reveal f.
We get something like: (x is assumed intact from stage 1)
xn[j] = x^(2*j), 1 < 2*j < max(p_{i+1} - p_i), p_{i+1} <= B2
last_p = last prime done in stage 1
for B1 < p <= B2, p prime
x = x * xn[(p - last_p)/2]
accu *= x-1
last_p = p
print gcd(accu, N)
This needs two multiplies per prime in the stage 2 interval, as opposed
to about 20 squarings (for a prime around 2^20) if it had been included
in stage 1.
We can speed that up further. x^n - x^m = x^m (x^(n-m) - 1), thus we
can get the (x^p - 1) we want by a single subtraction if we can write p
as n-m. (The extra x^m factor on the right side does not hurt.)
For example, for D ~= sqrt(B2-B1) store values
xm[i] = x^i, 1 <= i < D
and
xD[i] = x^(i*D) , B1/D < i < B2/D+1
Since we have only odd p, we can make D even and store xm only for
odd i.
Now we can do
for B1 < p <= B2, p prime
accu *= (xD[i] - xm[j) where Di-j = p
print gcd(accu, N)
Thus we need only one multiply per prime in the stage 2 interval, aside
from the about 2*sqrt(B2-B1) multiplies in setting up the tables.
However, we need more storage for precomputed values.
It can be done even faster, by pairing up primes, which can reduce the
number of multiplies again by almost half.
The definitive source for info on these algorithms is Montgomery,
Speeding the Pollard and elliptic curve methods of factorization, Math.
Comp. Vol. 48, 1987, pp. 243-265.
There are other enhancements that are even faster but require much more
memory (and which I don't quite understand), try P. Montgomery & R.
Silverman, An FFT extension to the P-1 factoring algorithm, Math. Comp.
Vol. 54, 1990, pp. 839-854, also available online at
ftp://ftp.cwi.nl/pub/pmontgom/ucladissertation.psl.gz .
BTW, Prime95 uses the one-multiply variant with prime-pairing, plus
something stange called "Suyama's powers" (don't ask me), for both P-1
and ECM.
Alex
_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers