On Sun, Jul 17, 2005, Danny Yoo wrote: >> A related question is where's the trade-off between using ``in'' with a >> list, and a dictionary? I presume that using it with small hashes will >> be faster than dictionries since it doesn't have to calculate the >> hashes. > >Hi Bill, > >Scanning for an elements in a list is a "linear" operation, in the sense >that the time it takes to search is proportional to the size of the list. >(Big list == big search time.)
I just noticed that I said it backwards in my first post, ``using it with small hashes'' should have been ``using it with small lists'' (and my perl background leaks out referring to dictionaries as hashes :-). ... >This doesn't mean that dictionaries are always faster than lists: as you >know, calculating hash values can take time. But the cost of hashing is >usually negligible, since many of Python primitive data types (like >strings) automatically cache their hash values too! This would say that it's better to create the dictionary with string keys rather than tuples, but that seems pretty obvious in any case. The problem I'm working on involves going through a large list of invoices that are now zero balance, to purge those before a certain date that have no payment applications after that date. I have a dictionary of zero- balance invoices containing invoice objects, and each invoice object contains a list of invoice keys applied to it. This internal list may well contain keys that refer to invoices that are either non-zero or have a date after the cutoff date. # begin code snippet global invoices # dictionary of all zero balance invoices with date <= cutoff deleted = True while deleted: deleted = False keys = invoices.keys() for key in keys: # use try/except since the instance may be deleted try: invoice = invoices[keys] except KeyError: continue for appKey in invoice.appKeys: if not appKey in invoices: deleted = True del invoices[key] # this invoice can't be purged for appKey in invice.appKeys: try: del invoices[appKey] except KeyError: pass # finish processing invoices ... >A good book in algorithms will almost always cover the performance >characteristics of those two strategies; there are also a lot of good >resources on the web about them. NIST has two articles on those two: > > http://www.nist.gov/dads/HTML/linearSearch.html > http://www.nist.gov/dads/HTML/hashtab.html Thanks for the references (occassionaly there's something that government does that's actually useful :-). Bill -- INTERNET: [EMAIL PROTECTED] Bill Campbell; Celestial Software LLC UUCP: camco!bill PO Box 820; 6641 E. Mercer Way FAX: (206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676 URL: http://www.celestial.com/ When the customer has beaten upon you long enough, give him what he asks for, instead of what he needs. This is very strong medicine, and is normally only required once. -- The Consultant's Curse: _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor