On 24 Mar 00, at 15:44, [EMAIL PROTECTED] wrote:

> I suspect for LL tests in the ~10M range, this happy medium may be as
> 'small' as 1-2MB. Are PC systems with L2 caches in this size range
> available? If so, how much of a premium does one pay for the extra cache?

Ernst, your Mlucas program has a similar "memory footprint" to 
Prime95. My Alpha 21164-533 has 2MB L3 cache, yet cache effects start 
to become noticeable around exponent 5 million, and are dominant well 
before exponent 10 million. I think you'd need 4 MB cache in order to 
be able to run a LL test on an exponent in the region of 10 million 
whilst containing most of the memory accesses to the cache.
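
As a rough check on the 4 MB figure, here is a minimal sketch of the working-set arithmetic. It assumes the transform stores about 19.5 bits of the number per double-precision word and that the FFT length is rounded up to a power of two. Both are illustrative assumptions, not values taken from Mlucas or Prime95, and the function name is hypothetical. Only the main data array is counted; sin/cos tables and weights would add to it.

```python
import math

BYTES_PER_DOUBLE = 8

def fft_footprint_bytes(exponent, bits_per_word=19.5):
    """Estimate the size of the main FFT data array for an LL test.

    bits_per_word is an assumed packing density, not a measured value.
    """
    # Smallest number of words that can hold an 'exponent'-bit residue.
    min_len = math.ceil(exponent / bits_per_word)
    # Round up to the next power of two (assumed FFT length policy).
    fft_len = 1 << (min_len - 1).bit_length()
    return fft_len * BYTES_PER_DOUBLE

for p in (5_000_000, 10_000_000):
    print(p, fft_footprint_bytes(p) / 2**20, "MB")
```

Under these assumptions the data array alone is about 2 MB at exponent 5 million and 4 MB at 10 million, which fits the observation that a 2 MB cache starts to struggle around 5 million.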

The AMD K6-3 supported L3 cache, and Socket 7 motherboards with 2MB 
cache were available. But the K6 never did support SMP and the K6-3 is 
now discontinued - AMD now sell only the K6-2 as the "budget" line and 
the Athlon as the "performance" line.

Of the Intel processors, only PPro and Xeon have ever been available 
with caches bigger than 512K - PPro to 1MB and Xeon in 1MB and 2MB 
variants. You pay a frightening premium for larger caches - about 
double the price for the processor for each doubling of cache size. 
Given the relatively small increase in performance, this premium is 
very hard to justify. You can build four complete Pentium III systems 
for the price of just one Xeon CPU with 2MB cache running at the same 
speed.

More realistic options would appear to be systems using Rambus 
memory technology (which has several times the throughput of SDRAM, 
though at a significant cost penalty) and/or the Intel 840 chipset 
(which I understand has dual memory busses, giving each CPU full-speed 
access to memory at the same time, provided the two processors aren't 
accessing the same locations). Compaq have been building 
multiprocessor server systems using similar technology for some time; 
these do not appear to suffer from throughput throttling to any 
significant extent when multiple copies of Prime95/NTPrime/mprime are 
run on them. But they aren't cheap!

Incidentally, with core/bus speed ratios as large as they are with the 
fastest processors available today, memory bandwidth bottlenecks are 
possible with uniprocessor systems too. In particular note that 
systems based on the Intel 820 chipset using a memory translation hub 
(MTH) to enable affordable SDRAM to be used instead of Rambus memory 
will probably run the memory at only 100 MHz even if the CPU is 
running at 133 MHz FSB. Such systems also appear to be _very_ 
particular about the memory being used - the parameters in the SPD 
chip must be totally correct, since the MTH prevents the BIOS from 
determining the correct parameters for accessing the RAM directly. 
The upshot is that the system believes that it has no RAM fitted, 
even though the memory may work perfectly well in another (non-i820) 
board. My advice would be not to buy an i820-based motherboard unless 
you're prepared to buy Rambus memory to fit in it - currently Rambus 
is about five times the price of SDRAM.
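
To put a number on the 100 MHz vs 133 MHz mismatch, here is a small sketch of the peak-bandwidth arithmetic. It assumes a 64-bit (8-byte) data path on both the front-side bus and the SDRAM side and one transfer per clock; the function name is hypothetical, and real sustained throughput will be lower than these peak figures.

```python
BUS_WIDTH_BYTES = 8  # assumed 64-bit data path

def peak_bandwidth_mb_s(clock_mhz, bus_width_bytes=BUS_WIDTH_BYTES):
    """Peak transfer rate in MB/s (1 MB = 10**6 bytes), one transfer per clock."""
    return clock_mhz * bus_width_bytes

fsb = peak_bandwidth_mb_s(133)    # nominal 133 MHz front-side bus
sdram = peak_bandwidth_mb_s(100)  # SDRAM behind the MTH at 100 MHz
print(f"FSB peak:   {fsb} MB/s")
print(f"SDRAM peak: {sdram} MB/s ({sdram / fsb:.0%} of FSB peak)")
```

On these assumptions the memory side can supply only about three quarters of what the 133 MHz bus could carry.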

BTW I have two dual-processor systems, both using Supermicro P6DBx 
motherboards (single memory bus, BX chipset). One has PII-350s 
fitted, the other has a pair of PIII-450s. When running LL tests, the 
350 MHz system slows down about 7% when both processors are running 
compared with the speed with one processor idle; the 450 MHz system 
slows down about 15%. You get full performance on one LL test if the 
other processor is doing something which requires CPU time but little 
or no memory access, e.g. trial factoring, or Stage 1 of an ECM run on 
a small exponent. This is true for both Win NT and Linux. Win 9x does 
not support SMP, so the second processor doesn't cause memory bus 
congestion, but you don't get benefit from its CPU cycles either!
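
The slowdown figures above translate into combined throughput as follows. This is a minimal sketch that assumes "slows down about N%" means each processor's iteration rate drops by that fraction when both are running LL tests; the function name is hypothetical.

```python
def combined_throughput(slowdown, n_cpus=2):
    """Total LL throughput relative to one CPU running alone.

    slowdown is the assumed fractional drop in per-CPU iteration rate
    when all CPUs run LL tests at once.
    """
    return n_cpus * (1.0 - slowdown)

for label, slowdown in (("dual PII-350", 0.07), ("dual PIII-450", 0.15)):
    print(f"{label}: {combined_throughput(slowdown):.2f}x one CPU")
```

Read that way, the 350 MHz pair delivers about 1.86 times the work of a single processor and the 450 MHz pair about 1.70 times, so the faster processors lose more to the shared memory bus.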


Regards
Brian Beesley
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
