> def rmlist(original, deletions): > return [i for i in original if i not in deletions] > > original will be a list of odd numbers and deletions will be numbers > that are not prime, thus this code will return all items in original > that are not in deletions. For n > 100,000 or so, the program takes a > very long time to run, whereas it's fine for numbers up to 10,000. > > Does anybody know a faster way to do this? (finding the difference all > items in list a that are not in list b)?
Testing for membership in an unsorted list is an O(n) sort of operation...the larger the list, the longer it takes. I presume order doesn't matter, or that the results can be sorted after the fact. If this is the case, it's quite efficient to use sets which provide intersection/difference/union methods. If you pass in sets rather than lists, you can simply return original.difference(deletions) It's almost not worth calling a function for :) There's also an in-place version called difference_update(). Once you've found all the results you want, and done all the set differences you want, you can just pass the resulting set to a list and sort it, if sorted results matter. -tkc -- http://mail.python.org/mailman/listinfo/python-list