> b1) Using "properly sized" FFT: Yes, I believe using properly sized FFT's
> saves some time. How much depends on the programmer, the data, the
> language/compiler, and the hardware. Non-power-of-2 FFT's can even be
> useful, but it seems that coding up all (or most) of the possible sizes may
> be more trouble than it's worth.
Well, George went to the trouble of adding 3*2^n, 5*2^n and 7*2^n
run length code into Prime95 - even though these are less efficient than
plain powers of two, they're still well worthwhile for the exponents
that can benefit from them.
I'd be _delighted_ if someone took it upon themselves to add code for
intermediate run lengths to MacLucas/MacLucasUNIX - even just
3*2^n would be very worthwhile. If I had time, I'd do it myself!
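To see why the intermediate run lengths pay off, here's a toy sketch of
choosing the smallest usable FFT length for a given exponent. The bits-per-
element figure is purely illustrative (in real code it depends on the
exponent, the FFT length and the rounding-error budget), and the candidate
set {1,3,5,7}*2^n is just the Prime95 family mentioned above:

```python
# Illustrative only: assume each FFT element carries a fixed number of
# bits of the number being squared. Real implementations derive this
# limit from rounding-error analysis; 16 is a made-up round figure.
BITS_PER_ELEMENT = 16

def candidate_lengths(max_n=25):
    """Run lengths of the form m * 2^n for m in {1, 3, 5, 7}."""
    lengths = {m << n for m in (1, 3, 5, 7) for n in range(max_n)}
    return sorted(lengths)

def best_run_length(p, multipliers=(1, 3, 5, 7)):
    """Smallest candidate length whose capacity covers exponent p."""
    lengths = sorted({m << n for m in multipliers for n in range(25)})
    for length in lengths:
        if length * BITS_PER_ELEMENT >= p:
            return length
    raise ValueError("exponent too large for candidate table")
```

For example, under these toy assumptions an exponent of 800000 fits in a
7*2^13 = 57344-element transform, whereas a powers-of-two-only code would
have to jump to 65536 - the kind of saving being described.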
>
> b2) Recurring pattern in "N=N^2-2 % (2^P-1)": I don't recall anyone here
> mentioning any interesting repeats, even for composite P. I'm sure many have
> looked, and were disappointed.
If you get to residual 0, then you do get a most interesting repeat:
the next iteration gives -2 (i.e. 2^p-3) and all subsequent ones give +2.
This proves that 0 cannot occur as an "interim" residual (at any
iteration before the last of the p-2) if 2^p-1 is going to turn out
to be prime.
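The claim is easy to check numerically. A minimal sketch that runs the
Lucas-Lehmer recurrence a few iterations past the usual stopping point,
shown here for p = 5 (M5 = 31 is prime, so the residual hits 0 on the
final scheduled iteration):

```python
def ll_sequence(p, extra=3):
    """Residuals of s -> s^2 - 2 (mod 2^p - 1), starting from s = 4,
    continued `extra` iterations past the usual p-2 stopping point
    to show what follows a zero residual."""
    m = (1 << p) - 1
    s = 4
    out = []
    for _ in range(p - 2 + extra):
        s = (s * s - 2) % m
        out.append(s)
    return out
```

For p = 5 this gives [14, 8, 0, 29, 2, 2]: the 0 at the final scheduled
iteration, then -2 mod 31 = 29, then +2 forever after.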
Although testing for "early zero" would be very quick, it's
nevertheless probably not worth doing: an early zero would force the
final residual to be 2, and we see no res64=2 values in the
completed results for composite exponents.
Since there are "only" 2^p-1 available values that the residual can
take, if you iterate long enough, the residual is _certain_ to go into
a repeating loop eventually. However, the chance of this
happening in the first p-2 iterations would seem to be
vanishingly small. If it _does_ start to repeat "early" then 2^p-1
_must_ be composite - because the sequence can't reach zero without
sticking at 2, see above - but is it really worth adding the extra
code to check?
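For what it's worth, both checks are only a couple of lines. This is a
hypothetical sketch (the set-based repeat check costs memory and time that,
as argued above, real code rightly doesn't spend):

```python
def ll_test_with_early_checks(p):
    """Lucas-Lehmer test for 2^p - 1 that also watches for an early
    zero or an early repeat. Both extra checks are hypothetical: in
    practice neither is ever seen to fire, which is why production
    code omits them."""
    m = (1 << p) - 1
    s = 4
    seen = {s}
    for i in range(p - 2):
        s = (s * s - 2) % m
        if s == 0 and i < p - 3:
            # Zero before the final iteration: forces res = 2 at the end.
            return "composite (early zero)"
        if s in seen:
            # An early repeat means the final residual can never be 0.
            return "composite (early repeat)"
        seen.add(s)
    return "prime" if s == 0 else "composite"
```

E.g. p = 7 (M7 = 127) comes back "prime" and p = 11 (M11 = 2047 = 23*89)
comes back "composite", with neither extra check ever triggering.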
>
> c2) The LL test does not adapt well to parallel efforts. I think it will
> always be the case that interprocess communication will cause the
> calculations to run slower than if just one processor crunched on the whole
> thing. [Array or vector processor machines are another matter!] But if the
> number of significant bits required were orders of magnitude larger than the
> hardware, then it might make sense to parallel the squarings. We're not
> there yet.
This seems self-evident; however, it is certainly true that an FFT
can be coded efficiently for a vector processor.
>
> d) NTT's (Number Theoretic Transforms): There has been some talk of that
> here before. However, until the ratio of integer to floating point
> throughput improves, NTT's will not be very popular. [Note that the FFT
> "automagically" computes the %2^P-1, using the Crandall-Fagin procedure, so
> NTT's have a lot of ground to make up first.]
>
If you have a processor like StrongArm, which has efficient integer
hardware but no floating-point hardware at all - forcing you to resort
to software emulation for floating-point - then NTT starts to make a
lot of sense. You get _some_ ground made up automatically
because you have no rounding errors, so you can use a smaller
transform size than experience with real/complex FFT would
suggest. I'm still "rather hazy" (read: totally baffled) on exactly how
the Crandall-Fagin procedure works, so I don't know if you could
adapt NTT to do the mod 2^p-1 automatically. But my gut feeling is
that it ought to be possible.
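One point in favour of that gut feeling: even without any Crandall-Fagin
cleverness, reducing mod 2^p-1 is cheap in pure integer arithmetic, because
2^p = 1 (mod 2^p-1) lets you fold the high bits onto the low bits with
shifts and adds. A sketch of that elementary identity (not of Crandall-Fagin
itself):

```python
def mod_mersenne(x, p):
    """Reduce a non-negative x mod 2^p - 1 without division, using
    x = (x >> p)*2^p + (x & mask)  and  2^p = 1 (mod 2^p - 1),
    so x is congruent to (x >> p) + (x & mask). Iterate until x fits
    in p bits, then map the one leftover case x == 2^p - 1 to 0."""
    mask = (1 << p) - 1
    while x > mask:
        x = (x >> p) + (x & mask)
    return 0 if x == mask else x
```

So an integer NTT-based squaring would at least get its final modular
reduction almost for free; the open question is whether the reduction can
be woven into the transform itself, as Crandall-Fagin does for the
floating-point FFT.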
Regards
Brian Beesley
________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm