Andrew Bennetts <s...@users.sourceforge.net> added the comment:

Ok, this time test_set* passes :)

Currently, if you take the difference of a large set and a small set, the
code will do len(large) lookups in the small set.  When large is much bigger
than small, it is cheaper to copy large and do len(small) lookups in large.
On my laptop a size ratio of 4 is a clear winner for copy+difference_update
over the status quo, even for sets of millions of entries.  For more
similarly sized sets (even a size ratio of only 2) the cost of allocating a
large set that is likely to be shrunk significantly is greater than the
benefit.  So my patch only switches behaviour for len(x)/4 > len(y).
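
A rough pure-Python sketch of the heuristic described above, for illustration
only.  The real change is in the attached diff; the function name
`difference` and the fallback comprehension here are my own stand-ins, not
code from the patch.

```python
def difference(x, y):
    """Return a new set with the elements of x that are not in y.

    Hypothetical illustration of the size-based switch; not the patch itself.
    """
    if len(x) / 4 > len(y):
        # x is more than 4 times larger than y: copy x once, then do
        # len(y) removals from the copy.
        result = set(x)
        result.difference_update(y)
        return result
    # Otherwise keep the status quo: do len(x) lookups in y and keep
    # only the elements that are not found.
    return {item for item in x if item not in y}
```

Both branches give the same result as `x - y`; only the number of lookups
and the size of the intermediate allocation differ.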

This patch is complementary to the patch in issue8425, I think.

----------
Added file: http://bugs.python.org/file17293/set-difference-speedup-2.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8685>
_______________________________________
