Marc-Andre Lemburg added the comment:

On 04.04.2013 10:33, STINNER Victor wrote:
>>> I don't understand why the patch makes the comparison much slower,
>>> since most time is supposed to be spent in memcmp()?
>>
>> Because reading the last character evicts useful data from the CPU cache,
>> just before memcmp() reads it again from memory?
>>
>> In other words, I'm not convinced this is a useful heuristic.

Same here. The heuristic may work for short strings that easily fit into the CPU cache, but as soon as you use it on longer strings, it will result in much slower comparisons.

Whether it results in a speedup also depends a lot on the domain in which you run the comparisons. For example, if you run the heuristic on Python's special method names (such as "__init__"), it won't give you any benefit, because those names all end in the same characters. OTOH, it's easy to construct strings that benefit a lot from it :-)

Something that typically works well in practice is to inline the comparison of the first few characters and then call memcmp() on the remaining ones. This avoids polluting the cache and saves a few cycles of setup cost for memcmp() on short strings.

----------
nosy: +lemburg

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17628>
_______________________________________
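
For illustration, the approach suggested in the comment (compare the first few characters inline, then hand the rest to memcmp()) could be sketched in C roughly as follows. This is only a sketch under stated assumptions, not the code from the patch or from CPython: the helper name bytes_equal and the prefix length of 4 are hypothetical, and both buffers are assumed to have the same length, which a caller would check first.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 4 is an arbitrary prefix length picked for this sketch. */
#define INLINE_PREFIX ((size_t)4)

/* Hypothetical helper (not CPython's actual code): returns 1 if the two
   buffers, both `len` bytes long, hold the same bytes, and 0 otherwise. */
static int
bytes_equal(const void *a_, const void *b_, size_t len)
{
    const unsigned char *a = (const unsigned char *)a_;
    const unsigned char *b = (const unsigned char *)b_;
    size_t n = len < INLINE_PREFIX ? len : INLINE_PREFIX;
    size_t i;

    /* Inline comparison of the leading bytes: an early mismatch is
       detected without paying the call overhead of memcmp(). */
    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return 0;
    }

    /* Short buffers are fully handled by the loop above. */
    if (len == n)
        return 1;

    /* The remaining bytes are read front to back by memcmp(), so no
       byte near the end is touched ahead of time. */
    return memcmp(a + n, b + n, len - n) == 0;
}
```

With names such as "__init__" and "__iter__", the mismatch at the fourth character is caught by the inline loop; longer common prefixes fall through to memcmp().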