On 02/08/15 02:20, Ltc Hotspot wrote:
Hi Alan,

I made a mistake and incorrectly assumed that the difference between 54 lines
of output and 27 lines of output was the result of removing duplicate email
addresses.

Apparently, this is not the case and I was wrong :(
The solution to the problem is in the desired line output:

stephen.marqu...@uct.ac.za
lo...@media.berkeley.edu
zq...@umich.edu
rjl...@iupui.edu
zq...@umich.edu
rjl...@iupui.edu
...

OK, only a couple of changes should see to that.

Latest revised code:
fname = raw_input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
addresses = set()

Change this to use a list:

addresses = []

for line in fh:
     if line.startswith('From'):
         line2 = line.strip()
         line3 = line2.split()
         line4 = line3[1]
         addresses.add(line4)

And change this to use the list append() method:

addresses.append(line4)

         count = count + 1
print addresses
print "There were", count, "lines in the file with From as the first word"

I'm not quite sure where the 54/27 divergence comes from, except that
I noticed Emille mention that there were lines beginning 'From:'
too. If that's the case then follow his advice and change the if
test to only check for 'From ' (with the space).

That should be all you need.
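
Putting those changes together, here is a rough sketch of how the loop might look. It is not your exact program: the function name collect_from_addresses and the sample lines are made up for illustration, and I feed it a small list of strings rather than opening mbox-short.txt. It is written so it should run under both Python 2 and Python 3. The last part assumes each message carries both a 'From ' line and a 'From:' line, which would explain a count exactly double the expected one.

```python
def collect_from_addresses(lines):
    """Return (addresses, count) for lines that start with 'From '."""
    addresses = []   # a list keeps duplicates and preserves file order
    count = 0
    for line in lines:
        # 'From ' (with the space) skips the 'From:' header lines
        if line.startswith('From '):
            words = line.strip().split()
            addresses.append(words[1])   # second word is the address
            count = count + 1
    return addresses, count


# Made-up sample lines standing in for the real mbox file
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From: alice@example.com\n",
    "Subject: test\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
    "From: bob@example.com\n",
    "From alice@example.com Fri Jan  4 16:10:39 2008\n",
    "From: alice@example.com\n",
]

addresses, count = collect_from_addresses(sample)
for address in addresses:
    print(address)
print("There were %d lines in the file with From as the first word" % count)

# The looser test also matches 'From:' lines, so it counts every message twice
loose_count = sum(1 for line in sample if line.startswith('From'))
print("startswith('From') matched %d lines" % loose_count)
```

Looping over the list and printing each item gives one address per line, duplicates included, which is the layout shown in the desired output above.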

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


