On 3 October 2012 04:39, Palice Fan <magicwizards...@gmail.com> wrote:

> Hello
> i got stuck with the last bit of my programming practice.
> Can somebody help me?
> Write a program to read through a mail log, and figure out who had the most
> messages in the file. The program looks for “From” lines and takes the
> second parameter on
> those lines as the person who sent the mail.
> The program creates a Python dictionary that maps the sender’s address to
> the total number of
> messages for that person.
> After all the data has been read the program looks through the dictionary
> using a maximum loop
> (see Section 5.7.2) to find who has the most messages and how many
> messages the person has.
>
> Enter a file name: mbox-short.txt
> c...@iupui.edu :5
> Enter a file name: mbox.txt
> zq...@umich.edu :195
>
> Instead of printing off a number beside the email, i got another email and
> i dont know how to fix it.
>

For future reference, please include the source code in the email either as
text or as a text attachment.  A screen capture means I have to retype all
your code (and in this case some data) in order to have a look at your
problem.  Not fun.

To fix your problem you have to reverse engineer what your program is
actually doing.  I'll walk you through the thought process I'd use to
figure that out.

The last statement in your program (which is where the error is apparent)
prints a fixed email address followed by a value that's assigned earlier on
in a loop from the "values" variable.  So you should carefully inspect your
code and ask yourself how an email address, rather than a number, can end up
in the "values" variable and from there in the "max" variable.

By the way, "max" is not recommended as a variable name: max is also a
built-in function in Python, and declaring a variable with the same name
hides the built-in (this is known as shadowing).  IDLE colours it purple,
which should tip you off that there's something special about it.

But anyway, back to your "max" and "values" variables.  We now look back
carefully at the loop to see how or where we might be picking up email
addresses when we should be getting integer counts.  Let's look carefully at
the loop declaration:

for values in messages:

Hmmm, this is looping directly over the dictionary "messages".  What is
returned when you iterate directly over a dict like that?  (Hint: it's not
the values but the keys, i.e. the email addresses.)  Add some print
statements in your loop so you can see what happens when it runs, for
example:

print 'Starting iterating over "messages" dict'
for values in messages:
    print 'Value of "values" this iteration =', values
    if max is None or values > max:
        print 'Updating max...'
        max = values
    print 'Value of "max" after this iteration =', max
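
If you'd like to convince yourself outside your program first, here is a
minimal stand-alone sketch.  The dict contents are made up by me for
illustration (they are not from your mail log), and I've written print with
parentheses and a single argument so it should behave the same under
Python 2 and Python 3:

```python
# Made-up data purely for illustration; not taken from the mail log.
messages = {'alice@example.com': 3, 'bob@example.com': 7}

# Iterating over a dict directly hands you its keys, not its values.
# sorted() is only here to make the output order predictable.
seen = sorted(item for item in messages)
print(seen)

# The counts are only reachable by looking each key up in the dict.
counts = sorted(messages[key] for key in messages)
print(counts)
```

The first print shows only email addresses and the second only numbers.
The print statements above do the same job inside your real program.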

If you apply similar changes to your program and run it, you'll see why
the program doesn't work -- "values" is being assigned the keys (email
addresses) from the dict, not the values.  It should also become clear that
"values" is a bad name for the items being iterated over in the messages
dict, and is perhaps adding to the confusion.  Better would be:

for email_sender in messages:

This would make it clear that the items being iterated over are in fact the
email addresses.  It's always a good idea to use descriptive, specific names
in your programs, not least because you yourself also need to read and
understand your own code.  With that name in place, it becomes obvious that
later in your loop you can't just do:

    if max is None or values > max:
        max = values

(or if we use my suggested renaming)

    if max is None or email_sender > max:
        max = email_sender

Instead you want to retrieve the actual value (count) from the dict for
that specific email sender, e.g.

    if max is None or messages[email_sender] > max:
        max = messages[email_sender]

... and with that I've now basically explained the essence of your first
main problem.
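
Putting the pieces so far together, the corrected loop might look roughly
like the sketch below.  The sample dict is invented, and "max_count" and
"count" are names of my own choosing (picked so the built-in max isn't
shadowed); adapt them to whatever your program uses:

```python
# Hypothetical counts standing in for the "messages" dict in your program.
messages = {
    'alice@example.com': 3,
    'bob@example.com': 7,
    'carol@example.com': 5,
}

max_count = None  # not called "max", so the built-in max() stays usable
for email_sender in messages:
    count = messages[email_sender]  # look up the count stored under this key
    if max_count is None or count > max_count:
        max_count = count

print(max_count)

# Because the built-in was not shadowed, it can serve as a cross-check.
assert max_count == max(messages.values())
```

With this made-up data the loop ends with max_count equal to 7, a number
rather than an email address.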

However, there remains another major flaw.  Why are we assigning and
outputting 'c...@iupui.edu' as the email address with the maximum number of
emails, for any input?  Clearly that can't be right - if the input changes
and another email address has the highest count then this code will output
the wrong result.  So in addition to saving the max count, you must also
save the max sender in the loop.  I think that's enough for now; see if you
can fix your program given the above hints, and if not, post back again.
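
In case you do get stuck, the general pattern for remembering which key
produced the maximum looks something like the sketch below.  It runs on
made-up data rather than your mail log, and "best_sender" and "best_count"
are hypothetical names of mine, so treat it as one possible shape rather
than the answer to the exercise:

```python
# Made-up data; best_sender and best_count are illustrative names only.
messages = {
    'alice@example.com': 3,
    'bob@example.com': 7,
    'carol@example.com': 5,
}

best_sender = None
best_count = None
for email_sender in messages:
    count = messages[email_sender]
    if best_count is None or count > best_count:
        # Update both together so the sender always matches the count.
        best_sender = email_sender
        best_count = count

print('%s :%d' % (best_sender, best_count))
```

The point is that the sender is saved in the same branch that updates the
count, instead of being hard-coded in the final print.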

HTH,

Walter
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
