Serhiy Storchaka added the comment: Stefan, thank you for the suggestion. The test showed that, at least on some x86 machines, using memcpy on unaligned data causes no performance decrease. This is good news. The code can be left simple, and some doubtful potential undefined behavior has even been removed.
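For readers unfamiliar with the technique, here is a minimal sketch of reading words from unaligned data through memcpy instead of a pointer cast. It is not the code from the patch: load_u64 and sketch_hash are hypothetical names, and the multiplier and mixing step are chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 8 bytes starting at p, which need not be 8-byte aligned.
   Copying into a local with memcpy avoids the undefined behavior of
   casting p to uint64_t* and dereferencing it. Compilers commonly
   reduce this to a single load on x86. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical word-at-a-time hash, for illustration only.
   This is NOT CPython's hash function. */
static uint64_t sketch_hash(const unsigned char *p, size_t len)
{
    uint64_t h = 0;

    assert(p != NULL || len == 0);

    /* Consume whole 8-byte words, whatever the alignment of p. */
    while (len >= sizeof(uint64_t)) {
        h = (h * 1000003u) ^ load_u64(p);
        p += sizeof(uint64_t);
        len -= sizeof(uint64_t);
    }

    /* Consume the remaining tail one byte at a time. */
    for (; len > 0; len--) {
        h = (h * 1000003u) ^ *p++;
    }
    return h;
}
```

Because every word goes through load_u64, the same byte sequence produces the same result at any starting offset, which is the property the [1:] and [8:] slices in the benchmarks exercise.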
Additional microbenchmarks:

$ ./python -m timeit -n 1 -s "t = memoryview(b'a' * 10**8)" "hash(t)"
$ ./python -m timeit -n 1 -s "t = memoryview(b'a' * 10**8)[1:]" "hash(t)"
$ ./python -m timeit -n 1 -s "t = memoryview(b'a' * 10**8)[8:]" "hash(t)"

                   original    patched      speedup
bytes              181 msec    46 msec      3.9x
UCS1               429 msec    46.2 msec    9.3x
UCS2               179 msec    91.9 msec    1.9x
UCS4               183 msec    184 msec     1x
memoryview()       362 msec    91.7 msec    3.9x
memoryview()[1:]   362 msec    93.2 msec    3.9x
memoryview()[8:]   362 msec    92.4 msec    3.9x

I don't know how it will behave on other platforms.

----------
Added file: http://bugs.python.org/file27956/fast_hash_3.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16427>
_______________________________________