Serhiy Storchaka added the comment:
Stefan, thank you for the suggestion. The test showed that, at least on some
x86 hardware, there is no performance decrease when using memcpy on unaligned
data. This is good news. The code can be left simple, and some doubtful
potential undefined behavior has been removed.
Additional microbenchmarks:
$ ./python -m timeit -n 1 -s "t = memoryview(b'a' * 10**8)" "hash(t)"
$ ./python -m timeit -n 1 -s "t = memoryview(b'a' * 10**8)[1:]" "hash(t)"
$ ./python -m timeit -n 1 -s "t = memoryview(b'a' * 10**8)[8:]" "hash(t)"
                   original   patched     speedup
bytes              181 msec   46 msec     3.9x
UCS1               429 msec   46.2 msec   9.3x
UCS2               179 msec   91.9 msec   1.9x
UCS4               183 msec   184 msec    1x
memoryview()       362 msec   91.7 msec   3.9x
memoryview()[1:]   362 msec   93.2 msec   3.9x
memoryview()[8:]   362 msec   92.4 msec   3.9x
I don't know how it will behave on other platforms.
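
For anyone who wants to check another platform, the memoryview measurements
can be approximated with a small script. This is only a sketch: the helper
name time_hash and the smaller default buffer size are assumptions made for
illustration, not part of the patch. Each measurement builds a fresh view so
that a previously cached hash is not reused.

```python
import timeit


def time_hash(offset, size=10**6, repeat=3):
    """Return the best time in seconds to hash memoryview(data)[offset:] once.

    time_hash is a hypothetical helper; size defaults to 10**6 bytes
    (smaller than the 10**8 used in the commands above) to keep runs short.
    """
    data = b'a' * size
    timings = []
    for _ in range(repeat):
        # A new view per run, so hash() has to do the work each time.
        view = memoryview(data)[offset:]
        timings.append(timeit.timeit(lambda: hash(view), number=1))
    return min(timings)


if __name__ == "__main__":
    # Offsets 1 and 8 shift the start of the buffer, as in the commands above.
    for offset in (0, 1, 8):
        msec = time_hash(offset) * 1e3
        print(f"memoryview()[{offset}:] {msec:.3f} msec")
```

Passing size=10**8 matches the buffer size used in the timeit commands; the
absolute numbers will of course differ from the table.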
----------
Added file: http://bugs.python.org/file27956/fast_hash_3.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue16427>
_______________________________________