Hi Lee, Alan and Steven, Thank you very much for your replies!
First, Lee:

>> That does not seem like it will work. What happens when
>> 2 addresses have the same zip code?

--> Sorry I didn't answer that before. When the zipcode is known, that's not a
problem. The data typist simply has to enter the zip code and the street number
and voilà, the street name and city name appear. A big time saver.

When the zipcode is the unknown, I do indeed need street name, apt number, and
city to get the right zip code. Without the street number, I might end up with
a list of zip codes. But having no street number would automatically invalidate
the given address. We couldn't possibly mail a letter without having the apt.
number!

I just ordered a book on sqlite this morning
(http://www.amazon.com/SQLite-Chris-Newman/dp/067232685X/ref=sr_1_2?ie=UTF8&s=books&qid=1256736664&sr=1-2)
It indeed seems like the way to go, also in the wider context of the program.
It makes much more sense to maintain one database table instead of 3 csv files
for the three data typists' output.

Alan: I forwarded your book to my office address. I'll print and read it!
Btw, your private website is nice too. Nice pictures! Do you recognize where
this was taken: http://yfrog.com/n0scotland046j ? You're lucky to live in a
beautiful place like Scotland.

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine,
public order, irrigation, roads, a fresh water system, and public health,
what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

________________________________
From: Lee Harr <miss...@hotmail.com>
To: tutor@python.org
Sent: Sat, October 2, 2010 12:56:21 AM
Subject: Re: [Tutor] (de)serialization questions

>>> I have data about zip codes, street and city names (and perhaps later also of
>>> street numbers). I made a dictionary of the form {zipcode: (street, city)}
>>
>> One dictionary with all of the data?
>>
>> That does not seem like it will work. What happens when
>> 2 addresses have the same zip code?

You did not answer this question.

Did you think about it?

> Maybe my main question is as follows: what permanent object is most suitable to
> store a large amount of entries (maybe too many to fit into the computer's
> memory), which can be looked up very fast.

One thing about Python is that you don't normally need to
think about how your objects are stored (memory management).

It's an advantage in the normal case -- you just use the most
convenient object, and if it's fast enough and small enough
you're good to go.

Of course, that means that if it is not fast enough, or not
small enough, then you've got a bit more work to do.

> Eventually, I want to create two objects:
> 1-one to look up street name and city using zip code

So... you want to have a function like:

def addresses_by_zip(zipcode):
    '''returns list of all addresses in the given zipcode'''
    ...

> 2-one to look up zip code using street name, apartment number and city

and another one like:

def zip_by_address(street_name, apt, city):
    '''returns the zipcode for the given street name, apartment, and city'''
    ...

To me, it sounds like a job for a database (at least behind the scenes),
but you could try just creating a custom Python object that holds
these things:

class Address(object):
    street_number = '345'
    street_name = 'Main St'
    apt = 'B'
    city = 'Springfield'
    zipcode = '99999'

Then create another object that holds a collection of these addresses
and has methods addresses_by_zip(self, zipcode) and
zip_by_address(self, street_number, street_name, apt, city)

> I stored object1 in a marshalled dictionary. Its length is about 450.000 (I live
> in Holland, not THAT many streets). Look-ups are incredibly fast (it has to,
> because it's part of an autocompletion feature of a data entry program). I
> haven't got the street number data needed for object2 yet, but it's going to be
> much larger.
> Many streets have different zip codes for odd or even numbers, or
> the zip codes are divided into street number ranges (for long streets).

Remember that you don't want to try to optimize too soon.

Build a simple working system and see what happens. If it is
too slow or takes up too much memory, fix it.

> You suggest to simply use a file. I like simple solutions, but doesn't that, by
> definition, require a slow, linear search?

You could create an index, but then any database will already have
an indexing function built in.

I'm not saying that rolling your own custom database is a bad idea,
but if you are trying to get some work done (and not just playing around
and learning Python) then it's probably better to use something that is
already proven to work.

If you have some code you are trying out, but are not sure you
are going the right way, post it and let people take a look at it.

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
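
For anyone reading this thread later: the two look-ups discussed above could be
sketched with Python's built-in sqlite3 module roughly as follows. This is only
a minimal illustration, not code from anyone in the thread. The table name,
column names, index names and sample rows are all made up, and
zip_by_address here takes a street number rather than an apartment, which is
an assumption about how the final data will look.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")  # use a file path for a permanent database
conn.execute("""
    CREATE TABLE address (
        street_name   TEXT NOT NULL,
        street_number TEXT NOT NULL,
        city          TEXT NOT NULL,
        zipcode       TEXT NOT NULL
    )
""")
# Indexes let SQLite avoid a linear scan for both kinds of look-up.
conn.execute("CREATE INDEX idx_zip ON address (zipcode)")
conn.execute(
    "CREATE INDEX idx_street ON address (street_name, city, street_number)"
)

rows = [
    ("Main St", "345", "Springfield", "99999"),
    ("Main St", "346", "Springfield", "99998"),
    ("Oak Ave", "12", "Springfield", "99999"),
]
conn.executemany("INSERT INTO address VALUES (?, ?, ?, ?)", rows)
conn.commit()


def addresses_by_zip(zipcode):
    '''returns list of (street_name, street_number, city) in the given zipcode'''
    cur = conn.execute(
        "SELECT street_name, street_number, city FROM address "
        "WHERE zipcode = ? ORDER BY street_name, street_number",
        (zipcode,),
    )
    return cur.fetchall()


def zip_by_address(street_name, street_number, city):
    '''returns list of zipcodes matching the given street, number and city'''
    cur = conn.execute(
        "SELECT DISTINCT zipcode FROM address "
        "WHERE street_name = ? AND street_number = ? AND city = ?",
        (street_name, street_number, city),
    )
    return [row[0] for row in cur.fetchall()]
```

Because several addresses can share one zip code, addresses_by_zip returns a
list, which also covers the "2 addresses have the same zip code" case raised
earlier in the thread.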