Bruce Leenstra wrote:

> What this list needs right now is a nice juicy math debate, so here goes:
> I was reading the faq about P-1 factoring, and it talks about constructing a
> 'q' that is the product of all primes less than B1 (with some multiples?)
> ...
>
> Right now Prime95 constructs a list of possible factors, filters it (before
> hand, or on-the-fly) and then uses the powering algorithm to calculate 2^p -
> 1 (mod q) for each factor.

Sandy Harris <[EMAIL PROTECTED]> replied:

>Off-the wall question dep't.:
>
>How does the overhead for that compare to just calculating q, the product of
>all the primes of interest, and then gcd(q, target)? If you get 1, then q
>and target have no common factors. If you get anything else, your result is
>the product of the common factors.

Actually, that's not off the wall (or out of left field, as baseball
fans would say) at all. In fact, Richard Crandall mentions such a
scheme in at least one of his books (either Pomerance & Crandall or
Crandall's "Projects in Scientific Computation", or both). Here's how
it compares to the sieve-based factoring: assume we have many candidate
factors of the number 2^p-1, each of size <= q bits. (Precisely speaking,
we require an O(1) fraction of these to be of size O(q), but that's a
technicality.) Assuming q is on the order of a computer wordsize,
testing each one of these separately requires O(log2(p)) CPU operations
(i.e. the O(log2(p)) modular squarings of a binary powering algorithm).
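
For concreteness, here's a minimal sketch of that per-candidate test
in Python (Prime95 of course does this in hand-tuned assembly; the
function name and example values below are just mine for illustration):

    # q divides 2^p - 1 exactly when 2^p == 1 (mod q).  Python's
    # built-in pow() performs the O(log2(p)) modular squarings of
    # the binary powering algorithm.
    def divides_mersenne(q, p):
        return pow(2, p, q) == 1

    # Example: p = 11, q = 23; indeed 2^11 - 1 = 2047 = 23 * 89.
    assert divides_mersenne(23, 11)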

Using the gcd-based method, we multiply roughly p/q of the factor
candidates together to get a product of <= p bits, then gcd with 2^p-1.
The fastest known gcd algorithms need O(p * (log2(p))^2) operations,
i.e. are O(log2(p)) times as expensive as an FFT-based big-integer
multiply of numbers this size. (That doesn't seem too bad at first
glance, but the O() hides a much larger proportionality constant than
the FFT, as well.) Dividing by the roughly p/q candidates that get
processed per gcd, we see that this method needs O((log2(p))^2)
operations per candidate, where we've absorbed the factor of q (assumed
to be of order unity, since it's on the order of a computer wordsize)
into the implied proportionality constant. That makes this method
asymptotically slower than one-factor-candidate-at-a-time sieving by a
factor of log2(p), and the large constant hidden in the fast gcd
imposes a further speed penalty on top of that.
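
Again purely as a sketch (and using Python's classical math.gcd rather
than a fast subquadratic gcd, so only the arithmetic idea carries over),
the batch version looks something like this; the candidate list and the
function name are made up for illustration:

    from math import gcd

    # Multiply a batch of roughly p/q candidate factors together,
    # then take a single gcd with 2^p - 1.  A result of 1 means no
    # candidate in the batch divides 2^p - 1; anything else is the
    # product of the common factors.
    def batch_test(p, candidates):
        m = (1 << p) - 1          # 2^p - 1
        prod = 1
        for q in candidates:
            prod *= q
        return gcd(prod, m)

    # Example: p = 11, batch {23, 29, 89}.  23 and 89 divide 2047,
    # so the gcd comes back as 23 * 89 = 2047.
    assert batch_test(11, [23, 29, 89]) == 2047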

Thus, this method (which does have the advantage that it can use
floating-point FFT for the large-integer multiplies of the fast gcd)
would only become attractive if integer multiply were grindingly slow
or if one were dealing with some kind of vector hardware that could
do floating-point FFTs screamingly fast but were somehow constrained
to just O(1) integer multiplies per cycle. Not a likely combination,
but it's an interesting theme nonetheless.

Note that one advantage of the gcd-based method is that factoring cost
doesn't depend overmuch on the hardware wordsize, since we can break
the factor product into whatever-size chunks are convenient for the
FFT. For the sieving method, as soon as q exceeds the hardware wordsize
by even a single bit, we get hit with a 3-4x speed penalty. But given
that the gcd method is probably around 100x slower than sieving for
q no larger than a wordsize, it's going to take some very large factor
candidates indeed to make the gcd method attractive.

But keep those off-the-wallers coming - it's that kind of thinking
that often leads to real algorithmic breakthroughs!

Cheers,
-Ernst
