On Tue, Mar 19, 2002 at 10:03:50PM +0100, Steinar H. Gunderson wrote:
> Paste from gwnum.c, Prime95 v19:
> 
> /* Well.... I implemented the above only to discover I had dreadful */
> /* performance in pass 1.  How can that be?  The problem is that each  */
> /* cache line in pass 1 comes from a different 4KB page.  Therefore, */
> /* pass 1 accessed 128 different pages.  This is a problem because the */
> /* Pentium chip has only 64 TLBs (translation lookaside buffers) to map */
> /* logical page addresses into physical addresses.  So we need to shuffle */
> /* the data further so that pass 1 data is on fewer pages while */
> /* pass 2 data is spread over more pages. */
> 
> So, it might be that, because of TLB thrashing, George had to
> choose a less efficient memory layout to avoid it, and thus gets
> lower speed overall.

I think 2 MB pages will be a win for any memory layout using < 128 MB
of RAM, since 64 TLB entries * 2 MB pages cover 128 MB in total.
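
To make the arithmetic concrete, here is a rough sketch in C. The
function names are made up for illustration and are not from gwnum.c,
and the 4096-byte stride between cache lines is only an assumption
chosen so that each line lands on its own 4 KB page, as the quoted
comment describes. It computes how much memory a TLB can map at once,
and how many pages a strided pass touches.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of address space the TLB can map at once ("TLB reach"). */
static uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_bytes)
{
    return entries * page_bytes;
}

/*
 * Distinct pages touched when reading `lines` cache lines that start
 * `stride_bytes` apart, beginning at a page-aligned address.
 * (Hypothetical helper; the real layout in gwnum.c is more involved.)
 */
static uint64_t pages_touched(uint64_t lines, uint64_t stride_bytes,
                              uint64_t page_bytes)
{
    assert(page_bytes > 0);
    if (lines == 0)
        return 0;
    if (stride_bytes >= page_bytes)
        return lines;   /* every access lands on a new page */
    /* Stride smaller than a page: every page up to the last is hit. */
    return ((lines - 1) * stride_bytes) / page_bytes + 1;
}
```

With 4 KB pages, tlb_reach_bytes(64, 4096) is 256 KB and
pages_touched(128, 4096, 4096) is 128, twice the 64 entries available.
With 2 MB pages, tlb_reach_bytes(64, 2097152) is 128 MB and the same
128 lines fall on a single page.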

-- 
Nick Craig-Wood
[EMAIL PROTECTED]
_________________________________________________________________________
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
