Re: how to remove 50000 elements from a 100000 list?
to Andrew Gwozdziewycz: Real humor... Peter Otten: thanks your reminder, in my project, a will a superset of b. so symmetric_difference equals difference. thank you all again! -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
Peter Otten wrote: > Ju Hui wrote: > >> cool! >> thanks you all! >> I choose >> a=set(range(10)) >> b=set(range(5)) >> a.symmetric_difference(b) >> certainly,the real data not range(), I am satisfied the performance of >> set.difference >> >> thank you all again! > > Be warned that this may /add/ items to a: > set("abc").symmetric_difference("bcd") > set(['a', 'd']) > > If your subject is correct you want difference(), not > symmetric_difference(). > > Peter Whoops. My bad. Peter is correct. -Larry -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
It's easy in this case: a = range(5, 10) But, I'm just trying to add some humor to this thread :) On May 5, 2006, at 9:36 AM, Ju Hui wrote: > I want to remove about 5 elements from a list,which has 10 > elements. > sample code like below: > a=range(10) b=range(4) for x in b: > ... a.remove(x) > ... a > [4, 5, 6, 7, 8, 9] > > when a and b is small size, it will finished quickly, but when a and b > have many elements. > such as: a=range(10) b=range(5) for x in b: > ... a.remove(x) > ... > it will very slowly. Shall I change to another data structure and > choos > a better arithmetic? > any suggestion is welcome. > thanks a lot! > > -- > http://mail.python.org/mailman/listinfo/python-list --- Andrew Gwozdziewycz [EMAIL PROTECTED] http://www.23excuses.com http://ihadagreatview.org http://and.rovir.us -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
Ju Hui wrote: > cool! > thanks you all! > I choose > a=set(range(10)) > b=set(range(5)) > a.symmetric_difference(b) > certainly,the real data not range(), I am satisfied the performance of > set.difference > > thank you all again! Be warned that this may /add/ items to a: >>> set("abc").symmetric_difference("bcd") set(['a', 'd']) If your subject is correct you want difference(), not symmetric_difference(). Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
cool! thanks you all! I choose a=set(range(10)) b=set(range(5)) a.symmetric_difference(b) certainly,the real data not range(), I am satisfied the performance of set.difference thank you all again! -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
Tim Chase <[EMAIL PROTECTED]> wrote: >Another attempt might be to try > > >>> a = [x for x in a if x not in b] > >However, this is still doing A*B checks, and will likely >degrade with as their sizes increase. Combine this with the use of sets others have suggested if the order of a matters, ie: >>> bset = set(b) >>> a = [ x for x in a if x not in bset ] which gets you down to O(A) since set membership is O(1). -- \S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/ ___ | "Frankly I have no feelings towards penguins one way or the other" \X/ |-- Arthur C. Clarke her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
> How about a listcomprehension? > > > new_list = [e for e in old_list if predicate(e)] > # Possibly you need this, too: > old_list[:] = new_list forgot the predicate. And you should use a set of objects to remove, as then you'd have O(1) behavior for the in-operator So: to_remove = set(b) new_list = [e for e in a if e not in to_remove] Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
Ju Hui wrote: > I want to remove about 5 elements from a list,which has 10 > elements. > sample code like below: > a=range(10) b=range(4) for x in b: > ... a.remove(x) > ... a > [4, 5, 6, 7, 8, 9] > > when a and b is small size, it will finished quickly, but when a and b > have many elements. > such as: a=range(10) b=range(5) for x in b: > ... a.remove(x) > ... > it will very slowly. Shall I change to another data structure and choos > a better arithmetic? > any suggestion is welcome. > thanks a lot! > A list isn't going to respond very well to random removals of large numbers of elements like this. Remember that it must do linear search to find the value to be removed and then basically build a completely new list with the remaining values. Depending on the data in question, you might be able to leverage things. Are the two lists sorted? If so you can iterate over both of them and build a third list with the results. This is not particularly 'elegant' but it is fast for sorted lists: import time a=range(10) b=range(5) start_time=time.time() for x in b: a.remove(x) stop_time=time.time() print "brute force elapsed time=%.2f seconds" % (stop_time-start_time) start_time=time.time() n=[] i1=0 i2=0 while 1: try: v1=a[i1] except IndexError: break try: v2=b[i2] except IndexError: n.extend(a[i1:]) break if v1 < v2: n.append(v1) i1+=1 continue elif v1 > v2: i2+=1 continue else: i1+=1 i2+=1 continue stop_time=time.time() print "new elapsed time=%.2f seconds" % (stop_time-start_time) start_time=time.time() a=set(a) b=set(b) a.symmetric_difference(b) stop_time=time.time() print "sets elapsed time=%.2f seconds" % (stop_time-start_time) brute force elapsed time=4.13 seconds new elapsed time=0.05 seconds sets elapsed time=0.03 seconds There may be other ways using bisect or some other module. If data is random, unsorted or has duplicates, I would look at in-memory database like SQLlite instead. If the values are unique you can do something like: a=set(range(10)) b=set(range(5)) a.symmetric_difference(b) Which is the fastest way. It all depends on the "real" data which we can't tell from your test using range(). -Larry Bates -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
On May 5, 2006, at 9:36 AM, Ju Hui wrote: >>> a=range(10) >>> b=range(5) >>> for x in b: > ... a.remove(x) > ... > it will very slowly. Shall I change to another data structure and choos > a better arithmetic? > any suggestion is welcome. If removal is an O(n) operation, then removing 1/2 the list will take O(n**2), which you don't want. You'd be better off with the contents of "a" being in a hash table (O(1) removal in practice) or a balanced tree (O(log n) removal). Another possibility: If the a and b lists are ordered in the same way, then you could walk through the lists in order using a merge procedure, generating a new list as you go. After ruling out slow data structures and algorithms, you'll almost certainly be better off using something built in to Python rather than coding your own data structure in Python. Jack Orenstein -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
> but when a and b have many elements. such as: > a=range(10) b=range(5) for x in b: > > ... a.remove(x) > ... > it will very slowly. Well, your problem is rather ambiguous. In your examples, you're removing contiguous ranges. Thus, you should be able to do something like >>> a = range(10) >>> del a[:5] which may have a considerable speedup. However, if B contains a disjoint sets of entries, the simple solution will require A*B checks, making it dependant on the size of A and B (as you've discovered). Assuming the "del" method is considerably faster, you might be able to do some sort of optimization of the disjoint set, using the above-mentioned method. Break B into contiguous pieces, and then pass those as slices to delete. Or if B is a pattern of some sort, you could do >>> a = range(10) >>> del a[::2] #delete even numbers >>> a = range(10) >>> del a[1::2] #delete odd numbers >>> a = range(10) >>> del a[2::5] # delete every 5th element beginning with the 3rd Another attempt might be to try >>> a = [x for x in a if x not in b] However, this is still doing A*B checks, and will likely degrade with as their sizes increase. Just a few ideas. -tkc -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
Try to use set objects: >>> a=set(range(10)) >>> b=set(range(5)) >>> a = a - b -- http://mail.python.org/mailman/listinfo/python-list
Re: how to remove 50000 elements from a 100000 list?
Ju Hui wrote: > I want to remove about 5 elements from a list,which has 10 > elements. > sample code like below: > a=range(10) b=range(4) for x in b: > ... a.remove(x) > ... a > [4, 5, 6, 7, 8, 9] > > when a and b is small size, it will finished quickly, but when a and b > have many elements. > such as: a=range(10) b=range(5) for x in b: > ... a.remove(x) > ... > it will very slowly. Shall I change to another data structure and choos > a better arithmetic? How about a listcomprehension? new_list = [e for e in old_list if predicate(e)] # Possibly you need this, too: old_list[:] = new_list Diez -- http://mail.python.org/mailman/listinfo/python-list
how to remove 50000 elements from a 100000 list?
I want to remove about 5 elements from a list,which has 10 elements. sample code like below: >>> a=range(10) >>> b=range(4) >>> for x in b: ... a.remove(x) ... >>> a [4, 5, 6, 7, 8, 9] when a and b is small size, it will finished quickly, but when a and b have many elements. such as: >>> a=range(10) >>> b=range(5) >>> for x in b: ... a.remove(x) ... it will very slowly. Shall I change to another data structure and choos a better arithmetic? any suggestion is welcome. thanks a lot! -- http://mail.python.org/mailman/listinfo/python-list