Christian Heimes added the comment:
I've modified unicodeobject's unicode_hash() function. On my box, V8's algorithm is roughly 75% slower than Python's for an 800 MB ASCII string (164 msec vs. 94.1 msec in the timings below).
Python's current hash algorithm for bytes and unicode:
    while (--len >= 0)
        x = (_PyHASH_MULTIPLIER * x) ^ (Py_uhash_t) *P++;
$ ./python -m timeit -s "t = 'abcdefgh' * int(1E8)" "hash(t)"
10 loops, best of 3: 94.1 msec per loop
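For reference, here is a minimal pure-Python sketch of that multiplicative loop. This is my simplification, not CPython's actual C code: the prefix seeding and final length XOR are omitted, and the C unsigned wraparound is emulated with a 64-bit mask.

```python
# Simplified model of CPython's multiplicative string hash.
# _PyHASH_MULTIPLIER is 1000003 in CPython; the 64-bit mask stands in
# for Py_uhash_t overflow semantics.
MASK = (1 << 64) - 1
MULTIPLIER = 1000003

def multiplicative_hash(data: bytes) -> int:
    x = len(data)  # simplified seed; the real code seeds from the first byte
    for byte in data:
        x = ((MULTIPLIER * x) ^ byte) & MASK
    return x
```

Each step does one multiply and one XOR per byte, which is why this loop is so cheap on modern CPUs.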
V8's algorithm:
    while (--len >= 0) {
        x += (Py_uhash_t) *P++;
        x += ((x + (Py_uhash_t)len) << 10);
        x ^= (x >> 6);
    }
$ ./python -m timeit -s "t = 'abcdefgh' * int(1E8)" "hash(t)"
10 loops, best of 3: 164 msec per loop
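The V8-style loop can be sketched the same way. Again this is my illustrative model, not the benchmarked C code; `--len` decrements before the body, so the first iteration mixes in `len - 1`, and the mask emulates Py_uhash_t wraparound.

```python
# Simplified model of the V8-style shift-add-xor loop above.
MASK = (1 << 64) - 1  # stands in for Py_uhash_t overflow

def v8_style_hash(data: bytes) -> int:
    x = 0
    remaining = len(data)
    for byte in data:
        remaining -= 1            # mirrors the C "--len" pre-decrement
        x = (x + byte) & MASK
        x = (x + ((x + remaining) << 10)) & MASK
        x ^= (x >> 6)
    return x
```

The extra add, shift, and dependent XOR per byte explain the slowdown relative to the single multiply-and-XOR of the current algorithm.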
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue14621>
_______________________________________