Hi Terry,
At 01:58 AM 9/9/2001 -0700, Terry S. Arnold wrote:
>What is the correct algorithm for calculating the time credit for a LL
>test of an exponent?
Take the timing on the status page, multiply it by the exponent. That gets
your the PII-400 timing in seconds. Divide by (86400 * 365). That gets
you the timing in CPU-years. Multiply by 5.5, the conversion factor from
PII-400 CPU-years to P90 CPU-years.
>I did a linear interpolation from the times in the status page and this
>does not match reality. When I think about the growth characteristics of a
>FFT it appears that my approximation was naive.
I can interpret your question two ways:
1) You were expecting timings to increase linearly within a FFT range. This
is not the case, all exponents from 10.33M to 12.83M take the same time
per iteration.
2) You expected a 512K FFT timing to be twice the 256K FFT timing. This
is not so for a variety of reasons.
a) The FFT algorithm is O(N log N). So the 512K FFT should be
512K*log(512K)/(256K*log(256K)) = 2 * 19/18 slower.
b) More importantly, the CPU caches work less effectively as the
FFT gets larger. A 64K FFT can fit within some CPU's L2 caches.
Jumping to a 128K FFT means half the data is in the L2 cache and
half is in main memory (when switching from pass 1 to pass 2 of
the FFT).
c) At these large FFT sizes, we are now putting pressure on the
TLB caches. The TLB maps a virtual address into a physical address.
Intel chips keep track of 64 TLB entries, each entry maps to a 4KB
page.
IIRC, Athlon CPUs have an even smaller TLB cache. A TLB cache miss
does add some time to a memory access.
Regards,
George
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers