Re: Mersenne: Re: P-1 limits

2002-07-17 Thread Richard Woods

Recently I wrote:
> Well, if a Mnumber has been trial-factored to 2^63 (with no success,
> or else we wouldn't be considering P-1 now), then P-1 will not 
> find any factors of 63 bits or less.

Tom Cage has reminded me that he and others have found factors with
P-1 that are smaller than the posted bit limit in the current
nofactor.cmp at http://www.mersenne.org/gimps.

AFAIK, the number of such findings is too small to make any significant 
difference in the probabilities calculated by the P-1 limit-choosing 
algorithm, but it would be well to remember that it can still happen.

Richard Woods

_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Re: P-1 limits

2002-07-17 Thread Richard Woods

Recently I've been translating Prime95's P-1 limit-choosing algorithm 
into another language for a P-1 utility of mine.  Let me offer some 
further explanation.  (George, please point out any errors!)

Inputs to the algorithm: exponent of the Mersenne number to be factored, 
how far (as a power of 2) trial factoring of that Mnumber has reached, 
whether the P-1 effort precedes a first-time LL test or a double-check, 
how much memory is to be available during P-1 factoring, plus 
empirically-derived constants and fudge factors in various formulas.
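
Just to fix those inputs in one place, here is an illustrative grouping 
(the struct and its field names are mine -- I have not mirrored 
Prime95's actual declarations):

/* Illustrative grouping of the inputs -- names are mine, not Prime95's. */
struct pminus1_inputs {
    unsigned long exponent;       /* p of the Mersenne number 2^p - 1          */
    unsigned int  tf_bits;        /* trial factoring already done to 2^tf_bits */
    int           double_check;   /* 0 = precedes a first-time LL test,
                                     1 = precedes a double-check               */
    unsigned long stage2_mem_mb;  /* memory available during P-1, in MB        */
    /* ...plus the empirically-derived constants and fudge factors, which
       live inside the cost and probability formulas themselves */
};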

The algorithm is independent of specific systems or CPUs.  That is, it 
does not consider or estimate actual time needed to perform an 
operation, or consider a given CPU's relative performance on integer vs. 
floating-point instructions.  Its basic unit of measure for the cost of 
a procedure is the FFT "squaring" (= a transform, a convolution, then 
another transform).  For GCDs, which do FFT multiplications that are not 
squarings, the algorithm estimates the number of FFT transforms needed 
by the GCD, then divides by two to get the equivalent number of 
squarings.

If a P-1 factoring run with particular B1 and B2 bounds will require 
175,000 FFT squarings in stage 1, a GCD that will take as much time as 
3,000 FFT squarings, and 122,000 FFT squarings in stage 2, the algorithm 
considers that P-1 run's total cost to be 300,000 -- it is assumed that 
the time required for other parts of the run can be neglected in 
comparison.
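
In code form the bookkeeping is nothing more than the following (numbers 
as in the example; the 6,000-transform GCD figure is simply whatever 
corresponds to 3,000 squarings under the divide-transforms-by-two rule):

#include <stdio.h>

int main(void)
{
    double stage1_squarings = 175000.0;
    double stage2_squarings = 122000.0;
    double gcd_transforms   = 6000.0;
    double gcd_squarings    = gcd_transforms / 2.0;   /* = 3,000 */

    /* 175,000 + 3,000 + 122,000 = 300,000 */
    printf("total P-1 cost: %.0f squaring-equivalents\n",
           stage1_squarings + gcd_squarings + stage2_squarings);
    return 0;
}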

(George -- Doesn't a two-stage P-1 require two GCDs, not just one?  And 
it looks like some of the cost comparisons don't include even one GCD 
... was that a deliberate omission, or am I overlooking something?)

Dickman's function (see Knuth Vol. 2, or previous discussion on this 
list) is used to calculate the probability of the P-1 procedure's 
finding a factor.
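
For anyone who wants to play with the numbers, below is a small, 
self-contained tabulation of Dickman's function.  It is only an 
illustration of the smoothness estimate involved -- it is not the code 
Prime95 uses.  rho(u) is the asymptotic probability that a random 
integer <= x has no prime factor larger than x^(1/u); the table is 
built from the identity u*rho(u) = integral from u-1 to u of rho(t) dt, 
evaluated with the trapezoid rule.

#include <stdio.h>

#define ONE   1000                 /* grid points per unit interval of u */
#define UMAX  20                   /* tabulate rho on [0, UMAX]          */
#define N     (UMAX * ONE + 1)
#define STEP  (1.0 / ONE)

static double rho_tab[N];
static int    rho_built = 0;

static void build_rho(void)
{
    int i, j;
    for (i = 0; i <= ONE; i++)             /* rho(u) = 1 for 0 <= u <= 1 */
        rho_tab[i] = 1.0;
    for (i = ONE + 1; i < N; i++) {
        double u = i * STEP;
        /* trapezoid over [u-1, u]; the unknown endpoint rho_tab[i] is
           moved to the left-hand side and solved for */
        double integral = 0.5 * rho_tab[i - ONE];
        for (j = i - ONE + 1; j < i; j++)
            integral += rho_tab[j];
        integral *= STEP;
        rho_tab[i] = integral / (u - 0.5 * STEP);
    }
    rho_built = 1;
}

double dickman_rho(double u)
{
    if (!rho_built) build_rho();
    if (u <= 1.0)  return 1.0;
    if (u >= UMAX) return 0.0;             /* negligible this far out */
    return rho_tab[(int)(u * ONE + 0.5)];  /* nearest tabulated value */
}

int main(void)
{
    printf("rho(2) = %.5f (exact: 1 - ln 2 = 0.30685)\n", dickman_rho(2.0));
    printf("rho(3) = %.5f\n", dickman_rho(3.0));   /* ~0.04861 */
    printf("rho(4) = %.5f\n", dickman_rho(4.0));   /* ~0.00491 */
    return 0;
}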

Perhaps you've wondered what role the how-far-trial-factored input 
plays.  Well, if a Mnumber has been trial-factored to 2^63 (with no 
success, or else we wouldn't be considering P-1 now), then P-1 will not 
find any factors of 63 bits or less.  But if the Mnumber had been 
trial-factored to only 2^57, then there would still be some chance of 
finding a factor between 2^57 and 2^63 during a P-1 run, in addition to 
the chance of finding a factor above 2^63.  The algorithm takes this 
into account.
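
To show roughly where the trial-factoring depth plugs in, here is a 
very crude sketch -- my own simplification, not Prime95's formula.  It 
assumes the usual heuristic that a Mersenne number has a factor between 
2^k and 2^(k+1) with probability about 1/k, approximates the chance 
that stage 1 with bound B1 finds such a factor q by Dickman's rho of 
ln(q)/ln(B1), ignores stage 2, and ignores the fact that q has the form 
2kp+1 (so q-1 already contains the factor 2p, which the real formula 
exploits).  dickman_rho() is the function from the previous sketch.

#include <math.h>

double dickman_rho(double u);      /* from the previous sketch */

/* Rough stage-1-only probability that P-1 with bound B1 finds a factor,
   given that trial factoring has already ruled out everything below
   2^tf_bits.  Sums over factor sizes up to some cutoff. */
double rough_stage1_prob(double B1, int tf_bits, int max_bits)
{
    double prob = 0.0;
    int k;
    for (k = tf_bits; k < max_bits; k++) {
        double q_bits = k + 0.5;                 /* "typical" factor in (2^k, 2^(k+1)] */
        double u = q_bits * log(2.0) / log(B1);  /* smoothness parameter for q-1       */
        prob += (1.0 / k) * dickman_rho(u);      /* P(factor that size) * P(smooth)    */
    }
    return prob;
}

Even with this crude model the point is visible: calling it with 
tf_bits = 57 returns a larger value than with tf_bits = 63, because the 
ranges from 2^57 to 2^63 contribute extra candidate factors that are 
comparatively easy for P-1 to reach.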

Continuing the example from above -- If that 300,000-squaring P-1 run 
has a 1.25% probability of finding a factor, then it would balance an 
L-L test costing 300,000/.0125 = 24,000,000.  L-L tests are considered 
to cost 1.015 times their number of FFT squarings, to balance the 
historically observed 1.5% rate of erroneous L-L results, so an L-L test 
on an exponent of, say, 22,000,000 would be considered to cost 
22,000,000 * 1.015 = 22,330,000 squarings.  In this case, the P-1 
limit-choosing algorithm would decide that a P-1 cost of 300,000 with 
probability of 0.0125 would not be worthwhile on an exponent of 
22,000,000, but would be worth a try on an exponent of, say, 25,000,000 
because 300,000/.0125 is less than 25,000,000 * 1.015.
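
Re-doing that arithmetic in code form, with the same numbers:

#include <stdio.h>

int main(void)
{
    double pm1_cost  = 300000.0;     /* squarings: stage 1 + GCD + stage 2 */
    double pm1_prob  = 0.0125;       /* chance the run finds a factor      */
    double breakeven = pm1_cost / pm1_prob;        /* 24,000,000           */

    double ll_22m = 22000000.0 * 1.015;            /* 22,330,000           */
    double ll_25m = 25000000.0 * 1.015;            /* 25,375,000           */

    printf("break-even L-L cost: %.0f squarings\n", breakeven);
    printf("22M exponent: %s\n", breakeven < ll_22m ? "worth it" : "not worth it");
    printf("25M exponent: %s\n", breakeven < ll_25m ? "worth it" : "not worth it");
    return 0;
}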

The above example is somewhat backwards, though, because the algorithm 
actually considers only one given exponent's L-L cost versus the costs 
and probabilities for several combinations of B1 and B2 bounds, and 
picks out the best combination of P-1 cost and probability for that 
exponent.

Prime95 considers B1 values starting at 10,000 then going up in 
increments of 5,000.  For each B1, Prime95 considers B2 values starting 
with B2 = B1, then going up in increments of B1*0.25.  So, for B1 = 
10,000, Prime95 calculates costs and probabilities for B2 = 10,000, 
12,500, 15,000, 17,500, and so on.  Then for B1 = 15,000, it considers 
B2 = 15,000, 18,750, 22,500, 26,250, 30,000, and so on.
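
As a loop (illustrative only -- the real code costs each pair as it 
goes), that enumeration looks like this:

#include <stdio.h>

int main(void)
{
    unsigned long B1, B2, step;

    for (B1 = 10000; B1 <= 20000; B1 += 5000) {
        step = B1 / 4;                     /* B2 increments of B1*0.25 */
        printf("B1 = %lu: B2 =", B1);
        for (B2 = B1; B2 <= B1 + 4 * step; B2 += step)
            printf(" %lu", B2);
        printf(" ...\n");
    }
    return 0;
}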

When does the algorithm stop incrementing the B1 and B2 values it 
examines?  Either (1) when the B2 stage would require too much memory 
(one of the
algorithm inputs was how much memory is to be available during P-1 
factoring), or (2) when the relative worth of the P-1 run falls below 
90% of the most worthy P-1 bounds combination so far considered.  By 
"relative worth" and "worthy", I mean the amount by which [L-L cost 
multiplied by probability of P-1 success] exceeds [P-1 cost].  The shape 
of Dickman's function guarantees that criterion (2) is eventually 
satisfied at some increment of both B1 and B2.
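
Putting the pieces together, here is a toy version of the whole 
selection loop.  The cost and probability functions are crude 
placeholders -- in Prime95 they come from the squaring-count formulas 
and Dickman's function -- and the way the 90% rule is applied below 
(once per B1 for the B2 loop, once overall for the B1 loop) is my 
reading of the behaviour, not a transcription of guess_pminus1_bounds().  
Only the overall shape of the search is the point.

#include <math.h>
#include <stdio.h>

static double pm1_cost(double B1, double B2)      /* placeholder "squarings"  */
{
    return 1.5 * B1 + 0.05 * (B2 - B1);
}

static double pm1_prob(double B1, double B2)      /* placeholder probability  */
{
    return 0.02  * (1.0 - exp(-B1 / 200000.0))
         + 0.015 * (1.0 - exp(-(B2 - B1) / 2000000.0));
}

int main(void)
{
    double ll_cost = 25000000.0 * 1.015;   /* double-check L-L, 25M exponent */
    double best_worth = 0.0, best_B1 = 0.0, best_B2 = 0.0;
    double B1, B2, worth, b1_best;

    for (B1 = 10000.0; ; B1 += 5000.0) {
        b1_best = -1.0;
        for (B2 = B1; ; B2 += 0.25 * B1) {
            /* "worth" = expected L-L work saved minus P-1 work spent */
            worth = ll_cost * pm1_prob(B1, B2) - pm1_cost(B1, B2);
            if (worth > b1_best)    b1_best = worth;
            if (worth > best_worth) {
                best_worth = worth;  best_B1 = B1;  best_B2 = B2;
            }
            if (worth < 0.9 * b1_best)
                break;      /* past the peak for this B1; a real run would
                               also stop here if stage 2 needed too much
                               memory */
        }
        if (b1_best < 0.9 * best_worth)
            break;          /* past the peak over B1 as well */
    }
    printf("best B1 = %.0f, B2 = %.0f, worth = %.0f squarings\n",
           best_B1, best_B2, best_worth);
    return 0;
}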

Where the first-time/double-check matters:  For a double-check, the L-L 
cost is as illustrated in the example above.  For a first-time L-L, the 
L-L cost against which P-1 cost is compared is twice that value -- i.e., 
the algorithm multiplies the exponent by 1.015 * 2 = 2.03 to get the 
number of L-L squarings.  That's because a successful P-1 factoring 
before a first-time test will eliminate the need for both the first-time 
and double-check L-L tests rather than only the double-check L-L test.
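
In code form (sketch only; the name is mine):

/* L-L cost against which a P-1 plan is weighed, per the rules above. */
double ll_cost_for_comparison(double exponent, int first_time)
{
    double cost = exponent * 1.015;   /* adjust for the ~1.5% bad-result rate */
    if (first_time)
        cost *= 2.0;                  /* a factor found now saves both the
                                         first-time test and the eventual
                                         double-check */
    return cost;
}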

Formulas for calculating the numbers of FFT squarings req

Re: Mersenne: Re: P-1 limits

2002-07-14 Thread Richard Woods

Brian J. Beesley wrote:
> Shouldn't make much difference

Of course.

By Daran's example (quoted here in reverse order),
  7714127  B1=4, B2=65
  7786567  B1=4, B2=64
one can easily see that the effect of differing FFT sizes (384K for 
exponent 7714127, 448K for 7786567) on choice of P-1 limits was small in 
this case.

But Daran's question was about why the _smaller_ exponent should get 
the _higher_ B2 limit at all.  My point was that this was a result of 
the effect of the FFT-size difference on the algorithm's output.

> P-1 computation rate is directly linked to LL testing computation
> rate, since the same crossover points are used and the main cost
> of both is FFT multiplication.

True, but the question here was about the choice of B1 and B2 limits.

> I think the most likely cause is that the P-1 limits depend on the
> relative speed of P-1/LL computation speed and trial factoring speed.

No.  Neither trial factoring speed nor its ratio to P-1/LL computation 
speed enters into the P-1 limit-choosing algorithm.

> If RollingAverage was slightly different, the relative computation
> rate might be "guessed" to be different enough for the P-1 limits
> to vary slightly.

No.  The P-1 limit-choosing algorithm does not consider RollingAverage 
either.


Richard Woods

P.S.  When replying to this message, check that your reply's To: field 
contains "[EMAIL PROTECTED]".  My ISP's mail handler has been mangling 
the Reply-To: field of messages I send.

_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Re: P-1 limits

2002-07-14 Thread Daran

- Original Message -
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
To: "Richard Woods" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, July 13, 2002 10:15 PM
Subject: Re: Mersenne: Re: P-1 limits

> On Saturday 13 July 2002 10:04, Richard Woods wrote:

> > V22 non-P4 FFTs are 384K in length for exponents 6545000 through
> > 7779000, then 448K for exponents 7779000-9071000.  The two exponents you
> > list are on opposite sides of that division, so the higher exponent will
> > require substantially longer for each FFT operation.  That cost
> > difference affects the outcome of the algorithm that selects P-1 limits.
>
> Shouldn't make much difference - P-1 computation rate is directly linked
> to LL testing computation rate, since the same crossover points are used
> and the main cost of both is FFT multiplication.

I agree.  I did wonder about the following (from the guess_pminus1_bounds
function in file commonc.c):-

/* Pass 2 FFT multiplications seem to be at least 20% slower than */
/* the squarings in pass 1.  This is probably due to several factors. */
/* These include: better L2 cache usage and no calls to the faster */
/* gwsquare routine. */

 pass2_squarings *= 1.2;

However, the adjustment is linear, so should have no effect upon the
relative costs across the FFT boundary.

> I think the most likely cause is that the P-1 limits depend on the
> relative speed of P-1/LL computation speed and trial factoring speed. If
> RollingAverage was slightly different, the relative computation rate
> might be "guessed" to be different enough for the P-1 limits to vary
> slightly.

I don't understand this.  Why should the relative speeds of P-1/LL vs TF
vary?  And what difference would it make if they did?  The depth - hence the
cost - of TF is fixed (at 63 bits for these exponents).  The only trade-off
is between the cost of the P-1 vs the expected cost of the LL computation.

> Regards
> Brian Beesley

Daran G.


_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers