Hi Alan,

I made a mistake: I incorrectly assumed that the difference between 54 lines of output and 27 lines of output was the result of removing duplicate email addresses, e.g., gsil...@umich.edu, gsil...@umich.edu, c...@iupui.edu, c...@iupui.edu.
Apparently, this is not the case and I was wrong :(

The solution to the problem is in the desired line output:

stephen.marqu...@uct.ac.za
lo...@media.berkeley.edu
zq...@umich.edu
rjl...@iupui.edu
zq...@umich.edu
rjl...@iupui.edu
c...@iupui.edu
c...@iupui.edu
gsil...@umich.edu
gsil...@umich.edu
zq...@umich.edu
gsil...@umich.edu
wagne...@iupui.edu
zq...@umich.edu
antra...@caret.cam.ac.uk
gopal.ramasammyc...@gmail.com
david.horw...@uct.ac.za
david.horw...@uct.ac.za
david.horw...@uct.ac.za
david.horw...@uct.ac.za
stephen.marqu...@uct.ac.za
lo...@media.berkeley.edu
lo...@media.berkeley.edu
r...@media.berkeley.edu
c...@iupui.edu
c...@iupui.edu
c...@iupui.edu
There were 27 lines in the file with From as the first word

Not in the output of a subset.

Latest output:

set(['stephen.marqu...@uct.ac.za', 'lo...@media.berkeley.edu',
'zq...@umich.edu', 'rjl...@iupui.edu', 'c...@iupui.edu',
'gsil...@umich.edu', 'wagne...@iupui.edu', 'antra...@caret.cam.ac.uk',
'gopal.ramasammyc...@gmail.com', 'david.horw...@uct.ac.za',
'r...@media.berkeley.edu']) ← Mismatch
There were 54 lines in the file with From as the first word

Latest revised code:

fname = raw_input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
addresses = set()
for line in fh:
    if line.startswith('From'):
        line2 = line.strip()
        line3 = line2.split()
        line4 = line3[1]
        addresses.add(line4)
        count = count + 1
print addresses
print "There were", count, "lines in the file with From as the first word"

Regards,
Hal

On Sat, Aug 1, 2015 at 5:44 PM, Alan Gauld <alan.ga...@btinternet.com> wrote:

> On 02/08/15 00:07, Ltc Hotspot wrote:
>
>> Question1: The output result is an address or line?
>>
>
> It's your assignment, you tell me.
> But from your previous mails I'm assuming you want addresses?
>
>> Question2: Why are there 54 lines as compared to 27 lines in the desired
>> output?
>>
>
> Because the set removes duplicates? So presumably there were 27
> duplicates?
> (Which is a suspicious coincidence!)
>
>> fname = raw_input("Enter file name: ")
>> if len(fname) < 1 : fname = "mbox-short.txt"
>> fh = open(fname)
>> count = 0
>> addresses = set()
>> for line in fh:
>>     if line.startswith('From'):
>>         line2 = line.strip()
>>         line3 = line2.split()
>>         line4 = line3[1]
>>         addresses.add(line4)
>>         count = count + 1
>> print addresses
>> print "There were", count, "lines in the file with From as the first word"
>>
>
> That looks right in that it does what I think you want it to do.
>
>> The output result:
>> set(['stephen.marqu...@uct.ac.za', 'lo...@media.berkeley.edu',
>> 'zq...@umich.edu', 'rjl...@iupui.edu', 'c...@iupui.edu',
>> 'gsil...@umich.edu', 'wagne...@iupui.edu', 'antra...@caret.cam.ac.uk',
>> 'gopal.ramasammyc...@gmail.com', 'david.horw...@uct.ac.za',
>> 'r...@media.berkeley.edu']) ← Mismatch
>>
>
> That is the set of unique addresses, correct?
>
>> There were 54 lines in the file with From as the first word
>>
>
> And that seems to be the number of lines in the original file
> starting with From. Can you check manually if that is correct?
>
>> The desired output result:
>> stephen.marqu...@uct.ac.za
>> lo...@media.berkeley.edu
>> zq...@umich.edu
>> rjl...@iupui.edu
>> zq...@umich.edu
>> rjl...@iupui.edu
>>
> ...
>
> Now I'm confused again. This has duplicates but you said you
> did not want duplicates? Which is it?
>
> ...
>
>> c...@iupui.edu
>> c...@iupui.edu
>> There were 27 lines in the file with From as the first word
>>
>
> And this is reporting the number of lines in the output
> rather than the file (I think). Which do you want?
>
> It's easy enough to change the code to give the output
> you demonstrate, but that's not what you originally asked
> for. So just make up your mind exactly what it is you want
> out and we can make it work for you.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
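
One possible explanation for the 54-versus-27 gap, offered as an assumption and not something confirmed in this thread: mbox-style files usually carry two sender lines per message, an envelope line beginning with "From " (with a trailing space) and a header line beginning with "From:". If mbox-short.txt follows that pattern, `startswith('From')` would match both kinds and count each message twice. The sketch below uses a made-up three-message sample and a hypothetical helper named `senders` to show the effect. It is written for Python 3, so the `print` calls differ from the Python 2 statements in the code above.

```python
# Hypothetical mbox-style sample; the addresses and dates are invented
# for illustration and are not taken from mbox-short.txt.
SAMPLE = """\
From alice@example.com Sat Jan  5 09:14:16 2008
Return-Path: <postmaster@example.com>
From: alice@example.com
Subject: first message

From bob@example.com Fri Jan  4 18:10:48 2008
From: bob@example.com
Subject: second message

From alice@example.com Fri Jan  4 16:10:39 2008
From: alice@example.com
Subject: third message
"""


def senders(lines, prefix):
    """Return the second word of every line that starts with `prefix`.

    A list is used instead of a set so that order and duplicates survive.
    """
    found = []
    for line in lines:
        if line.startswith(prefix):
            words = line.split()
            if len(words) > 1:  # skip a bare "From" with nothing after it
                found.append(words[1])
    return found


lines = SAMPLE.splitlines()

loose = senders(lines, "From")    # matches both "From " and "From:" lines
strict = senders(lines, "From ")  # matches only the envelope lines

# Print one address per matching line, duplicates included, then the count.
for address in strict:
    print(address)
print("There were", len(strict), "lines in the file with From as the first word")
```

On this sample the loose prefix finds six lines and the strict prefix finds three, the same two-to-one ratio as 54 to 27. Keeping the addresses in a list preserves the repeats that appear in the desired output, while wrapping the same list in `set()` would give the unique addresses instead.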