On Sat, Aug 31, 2013 at 9:29 PM, raymond.hettinger < python-check...@python.org> wrote:
> http://hg.python.org/cpython/rev/d40a65658ff0 > changeset: 85486:d40a65658ff0 > parent: 85482:4d604f1f0219 > user: Raymond Hettinger <pyt...@rcn.com> > date: Sat Aug 31 21:27:08 2013 -0700 > summary: > Further reduce the cost of hash collisions by inspecting an additional > nearby entry. > Hi Raymond, I'm curious about the benchmarks used to test such micro-optimizations. Do you use one of the existing Python benchmark suites or do you have a particular set of micro-benchmarks you're running this on? Eli > > files: > Objects/setobject.c | 43 +++++++++++++++++++++++++++++--- > 1 files changed, 39 insertions(+), 4 deletions(-) > > > diff --git a/Objects/setobject.c b/Objects/setobject.c > --- a/Objects/setobject.c > +++ b/Objects/setobject.c > @@ -65,10 +65,11 @@ > The initial probe index is computed as hash mod the table size. Subsequent > probe indices are computed as explained in Objects/dictobject.c. > > -To improve cache locality, each probe is done in pairs. > -After the probe is examined, an adjacent entry is then examined as well. > -The likelihood is that an adjacent entry is in the same cache line and > -can be examined more cheaply than another probe elsewhere in memory. > +To improve cache locality, each probe inspects nearby entries before > +moving on to probes elsewhere in memory. Depending on alignment and the > +size of a cache line, the nearby entries are cheaper to inspect than > +other probes elsewhere in memory. This probe strategy reduces the cost > +of hash collisions. > > All arithmetic on hash should ignore overflow. > > @@ -130,6 +131,26 @@ > if (entry->key == dummy && freeslot == NULL) > freeslot = entry; > > + entry = &table[j ^ 2]; > + if (entry->key == NULL) > + break; > + if (entry->key == key) > + return entry; > + if (entry->hash == hash && entry->key != dummy) { > + PyObject *startkey = entry->key; > + Py_INCREF(startkey); > + cmp = PyObject_RichCompareBool(startkey, key, Py_EQ); > + Py_DECREF(startkey); > + if (cmp < 0) > + return NULL; > + if (table != so->table || entry->key != startkey) > + return set_lookkey(so, key, hash); > + if (cmp > 0) > + return entry; > + } > + if (entry->key == dummy && freeslot == NULL) > + freeslot = entry; > + > i = i * 5 + perturb + 1; > j = i & mask; > perturb >>= PERTURB_SHIFT; > @@ -190,6 +211,17 @@ > if (entry->key == dummy && freeslot == NULL) > freeslot = entry; > > + entry = &table[j ^ 2]; > + if (entry->key == NULL) > + break; > + if (entry->key == key > + || (entry->hash == hash > + && entry->key != dummy > + && unicode_eq(entry->key, key))) > + return entry; > + if (entry->key == dummy && freeslot == NULL) > + freeslot = entry; > + > i = i * 5 + perturb + 1; > j = i & mask; > perturb >>= PERTURB_SHIFT; > @@ -258,6 +290,9 @@ > entry = &table[j ^ 1]; > if (entry->key == NULL) > break; > + entry = &table[j ^ 2]; > + if (entry->key == NULL) > + break; > i = i * 5 + perturb + 1; > j = i & mask; > perturb >>= PERTURB_SHIFT; > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > python-check...@python.org > http://mail.python.org/mailman/listinfo/python-checkins > >
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com