Jani Tiainen wrote: > I have rather simple 'Address' object that contains streetname, > number, my own status and x,y coordinates for it. I have two lists > both containing approximately 30000 addresses. > > I've defined __eq__ method in my class like this: > > def __eq__(self, other): > return self.xcoord == other.xcoord and \ > self.ycoord == other.ycoord and \ > self.streetname == other.streetname and \ > self.streetno == other.streetno > > But it turns out to be very, very slow. > > Then I setup two lists: > > list_external = getexternal() > list_internal = getinternal() > > Now I need get all all addresses from 'list_external' that are not in > 'list_internal', and mark them as "new". > > I did it like this: > > for addr in list_external: > if addr not in list_internal: > addr.status = 1 # New address > > But in my case running that loop takes about 10 minutes. What I am > doing wrong?
Even if list_external and list_internal contain the same items you need about len(list_external)*(len(list_internal)/2), or 45 million comparisons. You can bring that down to 30000*some_small_factor if you make your addresses hashable and put the internal ones into a dict or set def __hash__(self): return hash((self.xcoord, self.yccord, self.streetname, self.streetno)) def __eq__(self, other): # as above Then set_internal = set(list_internal) for addr in list_external: if addr not in set_internal: addr.status = 1 Note that the attributes relevant for hash and equality must not change during this process. Peter -- http://mail.python.org/mailman/listinfo/python-list