New submission from Andrew Bennetts <s...@users.sourceforge.net>: set.difference(s), when s is also a set, basically does::
res = set() for elem in self: if elem not in other: res.add(elem) This is wasteful when len(self) is much greater than len(other): $ python -m timeit -s "s = set(range(100000)); sd = s.difference; empty = set()" "sd(empty)" 100 loops, best of 3: 12.8 msec per loop $ python -m timeit -s "s = set(range(10)); sd = s.difference; empty = set()" "sd(empty)" 1000000 loops, best of 3: 1.18 usec per loop Here's a patch that compares the lengths of self and other before that loop, and if len(self) is greater, swaps them. The new timeit results are: $ python -m timeit -s "s = set(range(100000)); sd = s.difference; empty = set()" "sd(empty)" 1000000 loops, best of 3: 0.289 usec per loop $ python -m timeit -s "s = set(range(10)); sd = s.difference; empty = set()" "sd(empty)" 1000000 loops, best of 3: 0.294 usec per loop ---------- components: Interpreter Core files: set-difference-speedup.diff keywords: patch messages: 105489 nosy: spiv priority: normal severity: normal status: open title: set(range(100000)).difference(set()) is slow type: performance versions: Python 2.7 Added file: http://bugs.python.org/file17292/set-difference-speedup.diff _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue8685> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com