Raymond Hettinger added the comment:

> With j=11, is the lookup at index 10 in the same cache line?

It might or it might not.  Cache lines are typically 64 bytes, which holds 4 
key/hash pairs, so the odds are 75% that an adjacent entry is in the same 
cache line.
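The 75% figure can be checked with a little arithmetic (a sketch, assuming 64-byte cache lines and 16-byte key/hash entries, i.e. a pointer plus a hash on a 64-bit build):

```python
# Back-of-envelope check of the 75% figure.
CACHE_LINE = 64                        # bytes per cache line (typical)
ENTRY_SIZE = 16                        # bytes per key/hash pair (64-bit build)
PER_LINE = CACHE_LINE // ENTRY_SIZE    # 4 entries fit in one line

# If the table starts on a line boundary, slot j and slot j+1 share a
# line unless j is the last of the 4 slots in its line: 3 cases out of 4.
odds = (PER_LINE - 1) / PER_LINE
print(odds)   # 0.75
```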

If malloc were guaranteed to return 64-byte aligned memory blocks, the odds 
would be 100%, but my understanding is that 16-byte alignment is the norm.
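To see why alignment matters, here is a small sketch (the helper `same_line` is hypothetical, and it assumes 64-byte lines and 16-byte entries): with a 64-byte aligned start, which slots share a line is fixed and knowable from the index alone; with only 16-byte alignment, the line boundaries can fall anywhere within each group of four slots.

```python
def same_line(j, start_offset, line=64, entry=16):
    """True if slot j and slot j+1 fall in the same cache line.

    start_offset is the table's starting address modulo the line size.
    """
    a = (start_offset + j * entry) // line
    b = (start_offset + (j + 1) * entry) // line
    return a == b

# 64-byte aligned start: slots 0-3 share a line, 4-7 the next, and so on.
print([same_line(j, 0) for j in range(4)])    # [True, True, True, False]

# 16-byte aligned start (offset 16): the line break shifts by one slot.
print([same_line(j, 16) for j in range(4)])   # [True, True, False, True]
```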

> I read that CPU likes to prefetch data forward, 
> but I didn't know that they like prefetching backward.

The integrated memory controller can also prefetch backwards, but that won't 
help us.  The prefetching doesn't start until you've done a series of 
monotonically increasing or decreasing accesses.  There is no prefetch on the 
first probe regardless of whether you're going up or down.

The whole problem with large sets is that the data is scattered all over memory 
and is unlikely to be in cache or to benefit from prefetching in the same way 
that lists and deques do (their memory access patterns are consecutive).   

The core concept for the patch is that a cache line fetch may be expensive, so 
you might as well check more than one slot once a line is fetched.  Working 
against us is that we don't have control over where malloc starts a memory 
block (possibly starting in the middle of a cache line).
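The probing idea can be sketched in Python (a rough model, not the actual C implementation; the constant names and values here are assumptions for illustration): after the initial hash probe, scan a handful of adjacent slots, which are likely already in the fetched cache line, before making a scattering jump to escape clusters.

```python
LINEAR_PROBES = 9    # adjacent slots to try per line fetch (assumed value)
PERTURB_SHIFT = 5    # scatter-jump shift, as in open-addressing tables

def probe_sequence(h, mask, limit=20):
    """Return the slot indices a lookup would inspect, in order."""
    seq = []
    perturb = h
    i = h & mask
    while len(seq) < limit:
        seq.append(i)
        # Linear run: adjacent slots are cheap once the line is cached.
        for k in range(1, LINEAR_PROBES + 1):
            if len(seq) >= limit:
                break
            seq.append((i + k) & mask)
        # Scatter jump: break up clusters once the run is exhausted.
        perturb >>= PERTURB_SHIFT
        i = (i * 5 + perturb + 1) & mask
    return seq[:limit]

# First 12 probes for hash 42 in a 128-slot table: a run of adjacent
# slots starting at 42, then a jump elsewhere in the table.
print(probe_sequence(42, 127, limit=12))
```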

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18771>
_______________________________________