On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <[EMAIL PROTECTED]> wrote:
>James Stroud wrote:
>
>>Sorry Maurice, apparently in bash its "ulimit" (no n). I don't use
>>bash, so I don't know all of the differences offhand. Try that.
>>
>>James
>
>Thanks guys,
>
>It doesn't seem to help. I'm thinking that it might be a SOAPpy
>problem. The allocation fails when I grab a list of more than 150k
>elements through SOAP, but allocating a 1 million element list is fine
>in python.
>
>Now I have a performance problem...
>
>Say I have 3 lists (20K elements, 1G elements, and 0 elements), call
>them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in
>'a' into 'c'...
>
> >>> a = range(1, 100000, 5)
> >>> b = range(0, 1000000)
> >>> c = []
> >>> for i in b:
> ...     if i not in a: c.append(i)
> ...
>
>This takes forever to complete. Is there any way to optimize this?
>
Checking whether something is in a list may, on average, require
comparing against half the elements of the list. Checking for
membership in a set should be much faster for any significant-size
set/list. I.e., just changing to

    a = set(range(1, 100000, 5))

should help.

I assume those aren't examples of your real data ;-) You must have a
lot of memory if you are keeping 1G elements there and copying a
significant portion of them. Do you need to do this file-to-file,
keeping a in memory? Perhaps page-file thrashing is part of the time
problem?

Regards,
Bengt Richter
--
http://mail.python.org/mailman/listinfo/python-list
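[Editor's note: for readers following along, the set-based suggestion above
can be sketched like this, using the illustrative sizes from the post (not
the poster's real data) and modern Python 3 syntax:]

```python
# 'a' as a set: membership tests are O(1) on average (hashed lookup),
# versus O(len(a)) for a list scan on each of the 1M iterations.
a = set(range(1, 100000, 5))   # 20,000 elements
b = range(0, 1000000)          # 1,000,000 elements

# Build 'c' with a list comprehension instead of repeated append calls.
c = [i for i in b if i not in a]

print(len(c))  # 1,000,000 - 20,000 = 980000
```

[An equivalent one-liner, if the order of 'c' does not matter, is
`c = set(b) - a`, though that materializes all of 'b' as a set in memory.]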