[issue18771] Reduce the cost of hash collisions for set objects

2013-09-15 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 9353c611f897 by Raymond Hettinger in branch 'default':
Issue 18771: Make it possible to set the number of linear probes at compile-time.
http://hg.python.org/cpython/rev/9353c611f897
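
A compile-time probe count is typically exposed as a preprocessor macro whose default can be overridden on the compiler command line. The sketch below shows only that general mechanism; the macro name LINEAR_PROBES and the default of 9 are illustrative assumptions, not quoted from the changeset.

/* Sketch of a compile-time probe-count knob -- illustrative only. */
#ifndef LINEAR_PROBES
#define LINEAR_PROBES 9     /* override with e.g.: cc -DLINEAR_PROBES=4 ... */
#endif

#include <stdio.h>

int main(void)
{
    printf("configured linear probes: %d\n", LINEAR_PROBES);
    return 0;
}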

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Tal Einat

Tal Einat added the comment:

The following benchmarks were run with the second version of the patch 
(so2.diff), once before and once after.

OSX 10.8 on a MacBook Pro, 2.5 GHz Core i5, two cores each with 256KB L2 cache, 
3MB L3 cache:

1.32% less time for first benchmark (7.07 before, 6.98 after)
1.31% less time for second benchmark (9.81 before, 9.68 after)

Ubuntu 12.10 64-bit, AMD Athlon II X2 250 Processor (3.0 GHz, 2MB L2 cache, 
256KB total L1 per processor):

5% less time for first benchmark (13.6 before, 12.9 after)
11% less time for second benchmark (18.3 before, 16.3 after)

Win7 64-bit, Intel Xeon E3-1230 V2 (quad-core, 3.30GHz, 256KB L2 cache per 
core, 8MB L3 cache):

1.8% less time for first benchmark (3.90 before, 3.83 after)
3.5% less time for second benchmark (7.27 before, 7.02 after)

However, this processor has Intel SmartCache, which as far as I can tell 
means one processor can use other processors' caches if they are inactive. So I 
ran the benchmarks again with 4x larger data sets (revised benchmark script 
attached):

3.8% less time for first benchmark (19.3 before, 18.6 after)
7.5% less time for second benchmark (33.1 before, 30.6 after)

--
nosy: +taleinat
Added file: http://bugs.python.org/file31372/set_bench_smartcache.py




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Tal Einat

Tal Einat added the comment:

Because of the unusually small differences, I tried running the benchmarks again 
on the MacBook (OSX 10.8). The first run was actually with so.diff; this time 
I used so2.diff.

0.69% less time for first benchmark (7.30 before, 7.25 after)
8.08% less time for second benchmark (10.85 before, 9.97 after)

Strange...

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Christian Heimes

Christian Heimes added the comment:

Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz on Ubuntu 13.04 X86_64 with GCC 4.7.3

1MB L2 cache test: 11.33sec / 10.83sec (4.4% faster)
8MB L3 cache test: 22.16sec / 19.67sec (11% faster)

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Tal Einat

Tal Einat added the comment:

Sorry to be pedantic, but while 19.67sec is about 11.2% less than 22.16sec, it 
is actually about 12.6% faster (22.16/19.67 - 1).
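
To spell out the distinction (with t_before = 22.16 s and t_after = 19.67 s):

\text{time saved} = \frac{t_\text{before} - t_\text{after}}{t_\text{before}} \approx 11.2\%
\qquad
\text{speedup} = \frac{t_\text{before}}{t_\text{after}} - 1
               = \frac{\text{time saved}}{1 - \text{time saved}} \approx 12.66\%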

Regardless, that's a good result! And I'm happy that the modified benchmark is 
useful.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Christian Heimes

Christian Heimes added the comment:

Oh *curse*, you are right. Why do I always confuse the order? :|

It looks like the performance boost increases with CPU cache size as well as 
the CPU's generation. The i7-4770K box at work is less than a week old, and the 
CPU is currently the fastest desktop CPU from Intel.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Wow, thanks for running these tests and posting the results :-)

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset a887596b841f by Raymond Hettinger in branch 'default':
Issue18771:  Reduce the cost of hash collisions for set objects.
http://hg.python.org/cpython/rev/a887596b841f

--
nosy: +python-dev




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
resolution:  -> fixed
status: open -> closed




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Christian Heimes

Christian Heimes added the comment:

How about a Misc/NEWS and a Doc/whatsnew/3.4 entry, too?

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread STINNER Victor

STINNER Victor added the comment:

Couldn't this optimization also be applied to the dict type?

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-19 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Victor, this should work for dictionaries as well.   Fundamentally, the 
concepts of loop-unrolling and exploiting cache locality should apply to any 
hash table implementation.

That said, dicts are more complex than sets due to 1) storing the 
hash/key/value, 2) key-sharing, and 3) different access patterns.  So, it will 
require more effort than just porting over the patch.  Feel free to open a new 
tracker item for that one.  It will take more work but will potentially have a 
much greater pay-off.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Also, cachegrind shows a 3% improvement in the branch misprediction rate (from 
9.9% to 9.6%).

==82814== Mispred rate: 9.9% ( 9.4% + 15.5% )
==82812== Mispred rate: 9.6% ( 9.1% + 14.1% )

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-18 Thread STINNER Victor

STINNER Victor added the comment:

Instead of only a second lookup, could you try, for example, 4 lookups and
align j to fit in a cache line? Or do you expect more collisions with 4
contiguous lookups?

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

> Instead of only a second lookup, could you try for example 4 lookups
> and align j to fit in a cache line?

Accessing 4 entries per probe is a tempting line of development, but will be 
subject to diminishing returns (second and third collisions aren't frequent).

I like the idea of aligning j to fit in a cache line, but the computation would 
need to be cheap and portable (standard C with no pointer tricks that rely on 
non-guaranteed behavior).
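
One cheap, portable way to express that alignment idea (purely a sketch, not part of the patch): with 16-byte entries and 64-byte cache lines, the four slots that share a line form a group whose start index is the probe index with its low two bits cleared, provided the table itself happens to be 64-byte aligned (which malloc does not guarantee):

/* Illustrative only: assumes 16-byte entries, 64-byte cache lines, and a
 * 64-byte-aligned table (malloc does not guarantee such alignment). */
#include <stdio.h>
#include <stddef.h>

#define ENTRIES_PER_LINE 4        /* 64-byte line / 16-byte entry */

static size_t group_start(size_t i, size_t mask)
{
    /* First index of the 4-entry group sharing i's cache line; a probe
       could then scan group_start .. group_start+3 before rehashing. */
    return (i & mask) & ~((size_t)ENTRIES_PER_LINE - 1);
}

int main(void)
{
    size_t mask = 63;                                 /* 64-slot table */
    printf("%zu %zu\n", group_start(11, mask), group_start(13, mask));  /* prints: 8 12 */
    return 0;
}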

Have you had a chance to run the benchmarks on your machine?  I'm curious how 
this works out on other processors and with other compilers.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-18 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Attaching an updated patch (same algorithm but with a little more factoring to 
make the patch smaller).

--
Added file: http://bugs.python.org/file31361/so2.diff




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

New submission from Raymond Hettinger:

I'm working on a patch for the lookkey() functions in Objects/setobject.c.  The 
core idea is to follow the probe sequence as usual but to also check an 
adjacent entry for each probe (i.e. check table[i & mask] as usual and then 
check table[(i & mask) ^ 1] before going on to the next probes, which are 
scattered around memory).

On modern processors (anything in the last decade), this is likely to reduce 
the cost of hash collisions and reduce the number of cache lines fetched from 
memory.
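
To make the scheme concrete, here is a toy, self-contained model of the probing order in plain C.  It is purely illustrative (fixed-size table, nonzero integer keys standing in for hashes, no deletion or resizing) and is not the patch itself:

/* Toy model of the probing scheme described above -- illustrative only,
 * not the CPython patch.  Keys are nonzero uintptr_t values and the
 * "hash" is the key itself, mirroring hash(someint) == someint. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64                       /* must be a power of two */
#define PERTURB_SHIFT 5

static uintptr_t table[TABLE_SIZE];         /* 0 means "empty slot" */

static void toy_insert(uintptr_t key)
{
    size_t mask = TABLE_SIZE - 1;
    size_t perturb = key;
    size_t i, j;
    i = j = (size_t)key & mask;
    while (1) {
        if (table[j] == 0)                  /* usual probe */
            break;
        if (table[j ^ 1] == 0) {            /* new: adjacent slot, likely same cache line */
            j ^= 1;
            break;
        }
        i = i * 5 + perturb + 1;            /* otherwise continue the normal sequence */
        j = i & mask;
        perturb >>= PERTURB_SHIFT;
    }
    table[j] = key;
}

static int toy_contains(uintptr_t key)
{
    size_t mask = TABLE_SIZE - 1;
    size_t perturb = key;
    size_t i, j;
    i = j = (size_t)key & mask;
    while (1) {
        if (table[j] == key)
            return 1;
        if (table[j] == 0)                  /* empty slot ends the chain: not present */
            return 0;
        if (table[j ^ 1] == key)            /* new: check the adjacent slot too */
            return 1;
        if (table[j ^ 1] == 0)
            return 0;
        i = i * 5 + perturb + 1;
        j = i & mask;
        perturb >>= PERTURB_SHIFT;
    }
}

int main(void)
{
    memset(table, 0, sizeof(table));
    /* Keys 1, 65, 129 all map to slot 1; the second lands in slot 0 (1 ^ 1). */
    toy_insert(1);
    toy_insert(65);
    toy_insert(129);
    printf("%d %d %d %d\n", toy_contains(1), toy_contains(65),
           toy_contains(129), toy_contains(2));
    return 0;
}

Compiled and run, this prints 1 1 1 0: the three colliding keys are all found (one of them via the adjacent slot) and the absent key is not.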

+ Cache line benefit:  improves the odds that the adjacent probe will be on the 
same cache line, thereby reducing the total number of cache lines fetched.  
This benefit will work for set insertions as well as set lookups (a partial 
write to a cache line requires that a full cache line be read prior to writing, 
so it is helpful if we've just examined another slot on the same cache line).

+ Parallel execution:  because the second lookup has no data dependency on the 
first lookup, some of its instructions can be executed in parallel by the 
out-of-order engine.

+ Reduced loop overhead:  the second lookup doesn't require a new computation 
of the index *i* (we just do a XOR 1), a new perturb shift, a new masking of 
*i*, or a jump to the top of the loop.  In other words, it has all the usual 
benefits of loop unrolling.

- On the downside, even this partial unrolling will increase the total code 
size, which has the negative effect of blowing other code out of the I-cache 
(typically 32 KB).

? I'm unsure whether the additional if-statements will have a net positive or 
net negative effect on branch prediction rates (positive because each branch 
can be tracked separately or negative because additional branches use up more 
space in the branch prediction tables).  Once the patch is ready, I'll run 
CacheGrind to get a better sense of what is happening.

--
components: Interpreter Core
messages: 195506
nosy: haypo, rhettinger
priority: normal
severity: normal
status: open
title: Reduce the cost of hash collisions for set objects
type: performance
versions: Python 3.4




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Christian Heimes

Changes by Christian Heimes li...@cheimes.de:


--
nosy: +christian.heimes




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

How will it affect the performance of set(range(n))?

--
nosy: +serhiy.storchaka




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
keywords: +patch
Added file: http://bugs.python.org/file31345/so.diff




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Serhiy, I've posted a patch so you can examine the effects in detail.

For something like set(range(n)), I would expect no change because the dataset 
is collision free due to Tim's design where hash(someint) == someint.  That said, 
I'm trying to optimize the general case and don't really mind if rare cases are 
modestly slowed.

The new code only kicks in when there is a collision.  The logic for the 
initial probe is unchanged.   The idea is to pick up some of the cache locality 
benefits of collision resolution by separate chaining, but without incurring the 
per-entry malloc typically associated with chaining.

When adjacent entries are in the same cache line, the extra adjacent probe is 
much cheaper (its cost is nearly zero) than a probe somewhere else in 
memory.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread STINNER Victor

STINNER Victor added the comment:

Do you expect a visible difference on a microbenchmark? Do you have
benchmarks showing the speedup and showing that many collisions are not
too much slower?

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I just made the patch a few minutes ago.  Am just starting to work on 
benchmarks.  I posted here so that you guys could help find the strengths and 
weaknesses of the approach.  

The theory is sound, but it is going to take a good deal of effort to isolate 
the effects (either good or bad) in realistic benchmarks.  The short, tight 
loops in timeit tend to put everything in cache and hide the effects  of good 
memory access patterns.

I would love to have you all team up with me on this one.  I could use help on 
timings, examining disassembly, and runs with CacheGrind.

This is just in the experimental phase, so it is premature to be throwing up 
obstacles like "show me your data" or "what does it do in situation X".  At 
this point, you all could help out by running some timings designed to isolate 
the effects on high-collision data big enough to not fit in L2 cache.  I hope 
you'll assist in evaluating the idea with an open mind.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
assignee:  -> rhettinger
priority: normal -> low




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread STINNER Victor

STINNER Victor added the comment:

> The theory is sound, but it is going to take a good deal of effort to isolate 
> the effects (either good or bad) in realistic benchmarks.

Oh, I was just asking for a microbenchmark on set(). I don't need a
real benchmark (realistic use case) for such an issue :-)

I asked because I don't understand all these complex things inside the CPU
(cache size, data dependency, prefetch, ...), whereas I can compare the
numbers from a microbenchmark :-) And I can try to reproduce it on a
different CPU and a different dataset. Because I don't understand
low-level CPU things, I always fear that a patch works well on a
specific CPU whereas it would behave badly on a completely different
CPU architecture (ex: RISC vs CISC).

All your arguments sound good, and I'm always happy to see people
involved in optimizing Python!

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Here's an excerpt of the patch that gives a pretty good idea of what is being 
changed (there are essentially three lines of new logic):

static void
set_insert_clean(PySetObject *so, PyObject *key, Py_hash_t hash)
{
    setentry *table = so->table;
    setentry *entry;
    size_t perturb = hash;
    size_t mask = (size_t)so->mask;
    size_t i, j;                    // the j variable is new
    i = j = (size_t)hash & mask;
    while (1) {
        entry = &table[j];
        if (entry->key == NULL)
            break;
        entry = &table[j ^ 1];      // this line is new
        if (entry->key == NULL)     // this line is new
            break;                  // this line is new
        i = i * 5 + perturb + 1;
        j = i & mask;
        perturb >>= PERTURB_SHIFT;
    }
    so->fill++;
    entry->key = key;
    entry->hash = hash;
    so->used++;
}

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
Removed message: http://bugs.python.org/msg195530




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Here's an excerpt from the patch that gives a pretty good idea of what is being 
changed (the three lines of new logic are marked):

static void
set_insert_clean(PySetObject *so, PyObject *key, Py_hash_t hash)
{
    setentry *table = so->table;
    setentry *entry;
    size_t perturb = hash;
    size_t mask = (size_t)so->mask;
    size_t i, j;                    // the j variable is new
    i = j = (size_t)hash & mask;
    while (1) {
        entry = &table[j];
        if (entry->key == NULL)
            break;
        entry = &table[j ^ 1];      // this line is new
        if (entry->key == NULL)     // this line is new
            break;                  // this line is new
        i = i * 5 + perturb + 1;
        j = i & mask;
        perturb >>= PERTURB_SHIFT;
    }
    so->fill++;
    entry->key = key;
    entry->hash = hash;
    so->used++;
}

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread STINNER Victor

STINNER Victor added the comment:

> + Cache line benefit:  improves the odds that the adjacent probe will
> be on the same cache line (...)
> + Reduced loop overhead:  the second lookup doesn't require a new
> computation of the index *i* (we just do a XOR 1) (...)

With j=11, is the lookup at index 10 in the same cache line? Does a
cache line start at the address of the first lookup, or is it
rounded to the size of a cache line?

I read that CPUs like to prefetch data forward, but I didn't know that
they like prefetching backward.

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

> With j=11, is the lookup at index 10 in the same cache line?

It might or it might not.  The cache lines are typically 64 bytes which holds 4 
key/hash pairs.  So, the odds are 75% in favor of an adjacent entry being in 
the same cache line.

If malloc were guaranteed to return 64-byte aligned memory blocks, the odds 
would be 100%, but my understanding is that 16-byte alignment is the norm.

> I read that CPUs like to prefetch data forward, 
> but I didn't know that they like prefetching backward.

The integrated memory controller can also prefetch backwards, but that won't 
help us.  The prefetching doesn't start until you've done a series of 
monotonically increasing or decreasing accesses.  There is no prefetch on the 
first probe regardless of whether you're going up or down.

The whole problem with large sets is that the data is scattered all over memory 
and is unlikely to be in cache or to benefit from prefetching in the same way 
that lists and deques do (their memory access patterns are consecutive).   

The core concept for the patch is that a cache line fetch may be expensive so 
you might as well check more than one slot once a line is fetched.  Working 
against us is that we don't have control over where malloc starts a memory 
block (possibly starting in the middle of a cache line).

--




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Here's a simple benchmark to start with.

On my machine (2.4 GHz Core2 Duo with 2MB L3 cache) and compiled with GCC-4.8, 
the benchmark shows a 6% speedup for sets that spill over L2 cache and 11% for 
sets that spill over the L3 cache.

The theoretically expected result is 9.75%:

    26% collision rate (for tables that are 64% full)
  x 75% chance of adjacent entry in same cache line
  x 50% savings (two lookups for the price of one)
  -----
  9.75%
  =====
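
Written out as a single product:

0.26 \times 0.75 \times 0.50 = 0.0975 \approx 9.75\%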

Most programs will have less benefit (they may have smaller tables that fit in 
L1 cache, or tables that are less densely filled, resulting in fewer 
collisions).  My quick timings show either no change or an inconsequential 
improvement in the performance of those programs.

--
Added file: http://bugs.python.org/file31349/set_bench.py




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


Added file: http://bugs.python.org/file31350/insert_clean.s




[issue18771] Reduce the cost of hash collisions for set objects

2013-08-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Attaching the detailed results of a CacheGrind analysis.  Here are the 
high points (all of which were expected):

* The last level data cache miss-rate improved 12% (from 4.9% to 4.3%).
* The first level data cache miss-rate improved 14% (from 5.5% to 4.7%).
* The additional code caused the instruction cache miss-rate to get slightly 
worse (from .32% to .40%).

The last level cache misses are the most important.  Per the cachegrind docs, 
"an L1 miss will typically cost around 10 cycles, [and] an LL miss can cost as 
much as 200 cycles."

--
Added file: http://bugs.python.org/file31352/grind.txt
