Quoting John R Pierce <[EMAIL PROTECTED]>:

> but the 64 bit FPU multiply has only 52(?) bits of significance (the rest
> goes to exponent), and it generates a 52(?) bit result, so doing a high 
> precision multiply requires MORE operations.

The largest difference this can cause is a floating point FFT needing
a run length twice that of an integer FFT. Prime95 includes all sorts
of non-power-of-two FFTs to reduce this worst-case gap; even if it
did not, doubling the FFT size increases the runtime by ~2.1-2.5x, much
less than the typical slowdown from switching to an integer FFT.
Error-free squarings are nice though... you can take shortcuts that
would be impossible for an FPU.
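
To put rough numbers on the two points above, here is a small Python sketch.
It is only an illustration, not how Prime95 or Mlucas measure anything: the
helper names are made up, and the n*log2(n) cost model is an assumption that
ignores cache effects, which is probably why measured slowdowns land in the
~2.1-2.5x range rather than at the model's lower figure.

```python
import math
import sys

# An IEEE double carries a 53-bit significand (52 stored bits plus one
# implicit bit), so integer products stay exact only while they fit in that.
SIGNIFICAND_BITS = sys.float_info.mant_dig

def exact_in_double(a, b):
    """True if the integer product a*b survives a double-precision multiply."""
    return int(float(a) * float(b)) == a * b

def doubling_cost_ratio(n):
    """Ratio of modeled costs for FFT lengths 2n and n (assumed n*log2(n))."""
    return (2 * n * math.log2(2 * n)) / (n * math.log2(n))

if __name__ == "__main__":
    print(SIGNIFICAND_BITS)
    print(exact_in_double(2**26 + 1, 2**26 + 1))  # product fits in 53 bits
    print(exact_in_double(2**27 + 1, 2**27 + 1))  # product needs 55 bits
    print(doubling_cost_ratio(2**20))
```

Under this model a length-2^20 transform costs about 2.1x as much when
doubled, which matches the low end of the range quoted above.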

Remember also that an integer FFT requires all results to be reduced modulo
some prime number. For general primes this modular multiplication requires
at least four 64x64->64 multiplies, and these typically cannot pipeline
at all (if you have a pipelined multiplier, you've got to overlap several
independent modmul operations; if you don't have a pipelined multiplier,
you'll get dismal performance no matter what you do). There are special 
primes for which modular reductions are cheap; using one of these
primes and a Fast Galois Transform, the fastest integer FFT squaring 
I've managed is ~15% slower than Mlucas on an Opteron.
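
As an illustration of why special primes make the reduction cheap, here is a
minimal Python sketch. It assumes the Mersenne prime 2^61 - 1; the post does
not say which prime was actually used, and the function names are
hypothetical. Because 2^61 is congruent to 1 modulo this prime, a double-width
product can be reduced with shifts and adds instead of further multiplies.
The last function shows the Gaussian-integer multiply that a Fast Galois
Transform works with, again only as a sketch.

```python
# Assumed modulus: the Mersenne prime 2^61 - 1 (not stated in the post).
P = (1 << 61) - 1

def reduce_mersenne61(x):
    """Reduce 0 <= x < 2^122 modulo P using only shifts, masks and adds."""
    x = (x & P) + (x >> 61)  # fold high bits down, since 2^61 == 1 (mod P)
    x = (x & P) + (x >> 61)  # second fold leaves a value in 0..P
    return 0 if x == P else x

def mulmod(a, b):
    """Multiply two residues in [0, P) and reduce the double-width product."""
    return reduce_mersenne61(a * b)

def gf_mul(u, v):
    """Multiply u = (a, b) and v = (c, d), read as a + b*i and c + d*i,
    in GF(P^2) with i*i == -1 (valid because P is 3 modulo 4)."""
    a, b = u
    c, d = v
    real = (mulmod(a, c) - mulmod(b, d)) % P
    imag = (mulmod(a, d) + mulmod(b, c)) % P
    return (real, imag)

if __name__ == "__main__":
    a, b = 1234567890123456789, 987654321987654321
    print(mulmod(a, b) == (a * b) % P)
    print(gf_mul((0, 1), (0, 1)) == (P - 1, 0))
```

In a real implementation the product a * b would come from a 64x64->128
multiply (or two 64-bit halves), and several independent modmuls would be
interleaved to keep a pipelined multiplier busy, as described above.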

jasonp

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime