Re: filter list fast
lars_woetmann wrote: > I have a list I filter using another list and I would like this to be > as fast as possible > right now I do like this: > > [x for x in list1 if x not in list2] > > i tried using the method filter: > > filter(lambda x: x not in list2, list1) > > but it didn't make much difference, because of lambda I guess > is there any way I can speed this up Both of these techniques are O(n^2). You can reduce it to O(n log n) by using sets: >>> set2 = set(list2) >>> [x for x in list1 if x not in set2] Checking to see if an item is in a set is much more efficient than a list. --Ben -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
lars_woetmann wrote: > I have a list I filter using another list and I would like this to be > as fast as possible right now I do like this: > > [x for x in list1 if x not in list2] > > i tried using the method filter: > > filter(lambda x: x not in list2, list1) > > but it didn't make much difference, because of lambda I guess > is there any way I can speed this up if list2 is a list object, "not in list2" is an O(N) operation. maybe you should use sets instead ? does the following work better ? set2 = set(list2) result = [x for x in list1 if x not in set2] ? -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
Lars Woetmann wrote: > I have a list I filter using another list and I would like > this to be as fast as possible > right now I do like this: > > [x for x in list1 if x not in list2] > > i tried using the method filter: > > filter(lambda x: x not in list2, list1) > > but it didn't make much difference, because of lambda I guess > is there any way I can speed this up If you use a reasonably new python version, you could use sets: #v+ >>> a = set(range(10)) >>> a set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >>> b = set(range(5, 15)) >>> b set([5, 6, 7, 8, 9, 10, 11, 12, 13, 14]) >>> a.difference(b) set([0, 1, 2, 3, 4]) >>> a-b set([0, 1, 2, 3, 4]) >>> list(a-b) [0, 1, 2, 3, 4] >>> #v- Cheers, -- Klaus Alexander Seistrup SubZeroNet, Copenhagen, Denmark http://magnetic-ink.dk/ -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
> Both of these techniques are O(n^2). You can reduce it to O(n log n) > by using sets: > set2 = set(list2) [x for x in list1 if x not in set2] > > Checking to see if an item is in a set is much more efficient than a > list. Is the set-lookup reliably O(log n)? I was under the impression that it is hash-based, and this should be O(1) usually, but couldbve O(n) worst-case (hash the same for _all_ entries). Regards, Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
Diez B. Roggisch wrote: >>Both of these techniques are O(n^2). You can reduce it to O(n log n) >>by using sets: >> >set2 = set(list2) >[x for x in list1 if x not in set2] >> >>Checking to see if an item is in a set is much more efficient than a >>list. > > Is the set-lookup reliably O(log n)? I was under the impression that it is > hash-based, and this should be O(1) usually, but couldbve O(n) worst-case > (hash the same for _all_ entries). That's largely a theoretical concern. Google for something like '''dict worst-case performance "tim peters"''' to learn more. (The third article there (no doubt obsolete in some ways, given that it was in 2000) says that Python "keeps at least 1/3 of the internal hash table entries unused, making collisions very rarely a problem... It's possible to contrive keys that will cause collisions systematically ... but unlikely to happen by accident in 2.0") -Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
Thanks all, sets work like i charm -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
What was the speed-up ? -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
comparing [x for x in list1 if x not in list2] with set1, set2 = set(list1), set(list2) list(set1-set2) gives something like len(list2) speedup -- 10010 1000 100 11000 the speedup is constant for different len(list1) -- http://mail.python.org/mailman/listinfo/python-list
Re: filter list fast
Thanks. -- http://mail.python.org/mailman/listinfo/python-list