On 03/26/2013 12:36 AM, Robert Sjoblom wrote:
Hi again, Tutor List.

I am trying to figure out a problem I've run into. Let me first say
that this is an assignment, so please don't give me any answers, but
just nudge me in the general direction. So the task is this: from a
text file, populate three different dictionaries with various
information. The text file is structured like so:
Georgie Porgie
87%
$$$
Canadian, Pub Food

So name, rating, price range, and food offered. After food offered
follows a blank line before the next restaurant is listed.


There are a number of things about the input file that you haven't specified, and it's useful to create a running description of the assumptions you're making about it. That way, if one of those assumptions turns out to not always be true, you at least have a clue as to what might be wrong.

And in real-life problems, you might want to add code to test every one of those assumptions, and exit with a clean message when the data doesn't meet them.

Examples of such assumptions:

1) the "name" line is unique; no two records have the same name
2) the rating is always exactly two digits followed by a percent sign, even if it's less than 10%.
3) white space may occur before and after the dollar signs on the price-range field, but never on the rating or name lines
4) there will be exactly 5 lines for every record, including the last one in the file.
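
As a rough sketch of what testing those assumptions could look like (the function name check_records and the two regular expressions are made up for illustration, and the one-to-four dollar sign limit is taken from the keys of price_to_names below), something along these lines would report each violation instead of failing somewhere obscure:

```python
import re

RATING_RE = re.compile(r"^\d{2}%$")        # assumption 2: exactly two digits plus '%'
PRICE_RE = re.compile(r"^\s*\${1,4}\s*$")  # assumption 3: one to four '$', optional padding


def check_records(lines):
    """Return a list of problems found; an empty list means every assumption held.

    `lines` is a list of lines with their newline characters already removed.
    """
    problems = []
    seen_names = set()
    if len(lines) % 5 != 0:  # assumption 4: five lines per record
        problems.append("line count %d is not a multiple of 5" % len(lines))
    # Only look at complete five-line groups; a short tail was reported above.
    for start in range(0, len(lines) - 4, 5):
        name, rating, price, cuisines, blank = lines[start:start + 5]
        if name in seen_names:  # assumption 1: names are unique
            problems.append("duplicate name: %r" % name)
        seen_names.add(name)
        if not RATING_RE.match(rating):
            problems.append("bad rating for %r: %r" % (name, rating))
        if not PRICE_RE.match(price):
            problems.append("bad price range for %r: %r" % (name, price))
        if blank.strip():
            problems.append("expected a blank line after %r" % name)
    return problems
```

A caller could print the returned messages and exit cleanly if the list is not empty.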

The three dictionaries are:
name_to_rating = {}
price_to_names = {'$': [], '$$': [], '$$$': [], '$$$$': []}
cuisine_to_names = {}

Now I've poked at this for a while now, and one idea I had, which I
worked on for quite a while, was that since the restaurants all start
at index 0, 5, 10 and so on, I could structure a while loop like this:
with open('textfile.txt') as mdf:
   file_length = len(mdf.readlines())-1
   mdf.seek(0)
   data = mdf.readlines()

   i = 0
   while file_length > 0:
     name_to_rating[data[i]] = int(data[i+1][:2])
     price_to_names[data[i+2].strip()].append(data[i].strip())
     # here's the cuisine_to_names part
     i += 5
     file_length -= 5

And while this works for the first two dictionaries, it seems really
cumbersome -- especially that second expression -- and very, very
brittle. However, even if I was happy with that, I can't figure out
what to do in the situation where:
data[i+3] = 'Canadian, Pub Food' #should be two items, is currently a string.
My problem is that I'm... stupid. I can split the entry into a list
with two items, but even so I don't know how to add the key: value
pair to the dictionary so that the value is a list, which I then later
can append things to.


Nothing stupid about that. Your only shortcoming is assuming it should be a single line doing the assignment. Once you use cuisines.split(something) to make a list of cuisines, you then need to loop over them. And if the cuisine doesn't already exist, you need to create the item, while if it does, you need to append to the item.
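
Since this is an assignment, treat the following only as one possible shape for that loop, not as the expected answer. The function name add_cuisines and the second restaurant, "Example Pub", are invented for the illustration, and splitting on a comma is an assumption about the file.

```python
def add_cuisines(cuisine_to_names, name, cuisines_line):
    """Record `name` under every cuisine listed on `cuisines_line`."""
    for cuisine in cuisines_line.split(","):
        cuisine = cuisine.strip()  # drop the space that follows each comma
        if cuisine not in cuisine_to_names:
            cuisine_to_names[cuisine] = []  # first time seen: create the list
        cuisine_to_names[cuisine].append(name)


cuisine_to_names = {}
add_cuisines(cuisine_to_names, "Georgie Porgie", "Canadian, Pub Food")
add_cuisines(cuisine_to_names, "Example Pub", "Pub Food")  # hypothetical record
print(cuisine_to_names)
```

The if test and the append can also be folded into one call with dict.setdefault(), but the explicit version makes the create-or-append decision easier to see.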

I'm sorry, this sounds terribly confused, I know. I had another idea
to feed each line to a function, because no restaurant name has a
comma in it, and food offered always has a comma in it if the
restaurant offers more than one kind. But again, this seems really
brittle.

I guess we can't use objects (for some reason), but that doesn't
really matter because if I can't extract the data into dictionaries I
wouldn't have much use of an object either way. So yeah, my two
questions are these:
is there a better way to move through the text file other than a
really convoluted expression? And how do I add more than one value to
a key in a dictionary, if the values are added at different times and
there's no list created in the dictionary to begin with?

(I briefly thought about initializing empty lists for each food type in
the dictionary and go with my horrible expressions, but that seems
like a cheap way out of a problem I'd rather tackle in a good way to
begin with)

Much thanks in advance.


First thing I'd do to make those lines clearer is to assign temp names to each of those fields. For example, if you say
    name =

then the other places that use name can be much more readable. Likewise, if a particular name needs to be stripped or split before being assigned, it's in one common place.

So the loop would start with four assignments, capturing usable versions of those four lines. Then you'd have 3 assignments, updating the three dictionaries from those four names. One of those assignments would update multiple dictionary items, so it would actually be a loop.

You mention objects, which is one way to make things easier. But you didn't mention functions. I think it'd be an improvement if each dictionary had a function created to do its updating. Then the loop that you're writing here would be four assignments, followed by 3 function calls.
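
To make that structure concrete, here is a minimal sketch of four assignments followed by three function calls. All of the function names (update_ratings, update_prices, update_cuisines, load) are placeholders of mine, and the parsing choices, such as stripping the percent sign with rstrip(), are just one option among several.

```python
def update_ratings(name_to_rating, name, rating):
    name_to_rating[name] = rating


def update_prices(price_to_names, name, price):
    price_to_names[price].append(name)


def update_cuisines(cuisine_to_names, name, cuisines):
    for cuisine in cuisines:
        # setdefault returns the existing list, or stores and returns a new one
        cuisine_to_names.setdefault(cuisine, []).append(name)


def load(data):
    """Build the three dictionaries from `data`, a list of lines as
    returned by readlines()."""
    name_to_rating = {}
    price_to_names = {'$': [], '$$': [], '$$$': [], '$$$$': []}
    cuisine_to_names = {}
    # Step through the records five lines at a time; the range bound also
    # tolerates a missing blank line after the last record.
    for i in range(0, len(data) - 3, 5):
        # four assignments: all stripping and splitting happens here
        name = data[i].strip()
        rating = int(data[i + 1].strip().rstrip('%'))
        price = data[i + 2].strip()
        cuisines = [c.strip() for c in data[i + 3].split(',')]
        # three calls: one per dictionary
        update_ratings(name_to_rating, name, rating)
        update_prices(price_to_names, name, price)
        update_cuisines(cuisine_to_names, name, cuisines)
    return name_to_rating, price_to_names, cuisine_to_names
```

Because the cleanup is done once at the top of the loop, none of the update functions needs to know anything about newlines or whitespace.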

Finally, you ask if there's a better way than readlines(). I don't think there's any harm in doing it this way, though it could take a lot of memory if the file is really large. But why not do a readline() for each individual variable? Then all the bookkeeping of i+3 etc goes away.
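
A sketch of the readline() approach, assuming every record really is five lines long. The generator name read_records is mine, and io.StringIO stands in for the real file only so the example runs on its own.

```python
import io


def read_records(f):
    """Yield (name, rating, price, cuisines) tuples from an open text file."""
    while True:
        name = f.readline()
        if not name:  # readline() returns '' only at end of file
            break
        rating = f.readline()
        price = f.readline()
        cuisines = f.readline()
        f.readline()  # consume the blank separator line
        yield name.strip(), rating.strip(), price.strip(), cuisines.strip()


# io.StringIO is a stand-in for open('textfile.txt')
sample = io.StringIO("Georgie Porgie\n87%\n$$$\nCanadian, Pub Food\n\n")
for record in read_records(sample):
    print(record)
```

Each field gets its own name as it is read, so there is no index arithmetic left to get wrong. Stray extra blank lines at the end of the file would need additional handling.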



--
DaveA
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
