Andrew Bennetts <s...@users.sourceforge.net> added the comment:

OK, this time test_set* passes :)

Currently, if you have a large set and a small set, the code does len(large) lookups in the small set. When large is much bigger than small, it is cheaper to copy large and do len(small) lookups in large.

On my laptop, a size difference of 4 times is a clear winner for copy+difference_update over the status quo, even for sets of millions of entries. For more similarly sized sets (even with only a factor of 2 size difference), the cost of allocating a large set that is likely to be shrunk significantly is greater than the benefit. So my patch only switches behaviour for len(x)/4 > len(y).

This patch is complementary to the patch in issue8425, I think.

----------
Added file: http://bugs.python.org/file17293/set-difference-speedup-2.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8685>
_______________________________________
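
The heuristic described in this comment can be sketched in pure Python. This is an illustration only, not the attached patch: the names `difference` and `SIZE_RATIO` are hypothetical, and the only detail taken from the comment is the len(x)/4 > len(y) threshold.

```python
SIZE_RATIO = 4  # assumed name; the value comes from the comment's len(x)/4 > len(y)


def difference(x, y):
    """Return the elements of set x that are not in set y (hypothetical helper)."""
    if len(x) / SIZE_RATIO > len(y):
        # x is much larger than y: copy x, then discard each element of y.
        # This costs one copy of x plus len(y) lookups in the copy.
        result = set(x)
        result.difference_update(y)
        return result
    # Otherwise walk x and keep what is absent from y: len(x) lookups in y.
    return {item for item in x if item not in y}
```

With these assumptions, difference(set(range(100)), {1, 2, 3}) takes the copy path, while difference({1, 2, 3}, {2, 3, 4}) takes the loop path. Both give the same result as x - y.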