Re: [PATCH] pte prefetching

Nick Piggin Thu, 24 Mar 2005 21:22:22 -0800

David Mosberger wrote:

On Thu, 24 Mar 2005 18:18:17 +1100, Nick Piggin <[EMAIL PROTECTED]> said:

  Nick> After applying the recent freepgt patchset from Hugh (on
  Nick> lkml), the time to fork+exit a process mapping 64GB of address
  Nick> (32MB of page tables) is 0.471s. With the prefetch patch, this
  Nick> drops to 0.357s.


Sorry, above numbers were wrong:
0.118s versus 0.089s. Improvement ratio is the same, I just used the
wrong divisor.
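
The corrected figures can be checked against the ones first posted with a few lines of arithmetic. This is only a sketch using the four timings quoted in this thread; the dictionary and function names are made up for illustration, and the thread does not say which divisor was actually used.

```python
# Timings quoted in the thread, in seconds (fork+exit of a 64GB mapping).
posted = {"baseline": 0.471, "prefetch": 0.357}     # as first posted
corrected = {"baseline": 0.118, "prefetch": 0.089}  # after the correction


def speedup(t):
    """Baseline time divided by the time with the prefetch patch."""
    return t["baseline"] / t["prefetch"]


def scale(key):
    """How many times larger the posted figure is than the corrected one."""
    return posted[key] / corrected[key]


if __name__ == "__main__":
    print(f"speedup, posted:    {speedup(posted):.3f}")
    print(f"speedup, corrected: {speedup(corrected):.3f}")
    print(f"posted/corrected, baseline: {scale('baseline'):.2f}")
    print(f"posted/corrected, prefetch: {scale('prefetch'):.2f}")
```

Both speedups come out at about 1.32, and each posted figure is about 4 times its corrected value, which is consistent with a wrong divisor rather than a different measurement.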

Looks like a nice improvement to me.

Does prefetching 1 line ahead give the best results?  That's only
128/8=16 PTEs.  Assuming a 200 cycle latency, this would allow
for only 12.5 cycles/iteration.  Especially for large (NUMA) machines,
prefetching further out might help more.
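
The 12.5 cycles/iteration figure follows from a simple latency-hiding model, sketched below. The assumptions are not from the patch itself: one PTE is examined per loop iteration, a prefetch takes the full latency to arrive, and memory bandwidth is not a limit. The lines_ahead parameter and the function names are hypothetical, added only to show how the budget would scale with prefetch distance under this model.

```python
def ptes_per_line(line_bytes=128, pte_bytes=8):
    """Number of PTEs that fit in one cache line."""
    return line_bytes // pte_bytes


def cycle_budget(latency_cycles=200, lines_ahead=1,
                 line_bytes=128, pte_bytes=8):
    """Cycles each iteration must take for a prefetch issued
    lines_ahead cache lines early to arrive before it is needed."""
    iterations_covered = lines_ahead * ptes_per_line(line_bytes, pte_bytes)
    return latency_cycles / iterations_covered


if __name__ == "__main__":
    print(ptes_per_line())              # 16 PTEs per 128-byte line
    print(cycle_budget())               # 12.5 cycles/iteration, 1 line ahead
    print(cycle_budget(lines_ahead=2))  # 6.25 cycles/iteration, 2 lines ahead
```

Under this model, a loop that runs faster than the budget per iteration reaches the next line before the prefetch has completed, which is the case for prefetching further ahead.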


Hmm... yeah, it may do. Although I don't think that changes your
cycles/iteration ratio, does it? It just allows for a little bit more
variation.

I just retested, and prefetching 2 lines ahead gives virtually the same
performance.

But actually, my tests are set up so each pte page has only a single
'present' pte (I did it that way to speed up initial faulting time).
So the loop will almost always get stopped by the pte_none tests, and
perhaps each iteration is able to complete in close to or less than
12 cycles.

Nick


-
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
