Ruben,

#1. You can try something like this:

    try:
        with open('my_file.txt') as file:
            pass
    except IOError as e:
        # Does not exist or you do not have read permission
        print "Unable to open file"

#2. I would use a regular expression to push the words into an array, and then you can manipulate the array. I am not sure it is an efficient way, but it should work.

#3. An easy way would be to use a regular expression (the re module).

#4. Once you have the array from #2, you can sort it and print whichever top words you need.

#5. I am not sure of the best way to do this, but you can play with the array from #2.

Thanks,
Askar

From: Pinedo, Ruben A [mailto:rapin...@miners.utep.edu]
Sent: Wednesday, October 16, 2013 2:49 PM
To: tutor@python.org
Subject: [Tutor] Help please

I was given this code and I need to modify it so that it will:

#1. Error handling for the files to ensure reading only .txt file
#2. Print a range of top words... ex: print top 10-20 words
#3. Print only the words with > 3 characters
#4. Modify the printing function to print top 1 or 2 or 3 ....
#5. How many unique words are there in the book of length 1, 2, 3 etc

I am fairly new to Python and am completely lost. I looked in my book for how to do number one, but I cannot figure out what to modify and/or delete to add the print selection.

This is the code:

    import string

    def process_file(filename):
        # Build a histogram (word -> count) from every line of the file
        hist = dict()
        fp = open(filename)
        for line in fp:
            process_line(line, hist)
        return hist

    def process_line(line, hist):
        # Treat hyphens as word separators
        line = line.replace('-', ' ')

        for word in line.split():
            # Strip surrounding punctuation and whitespace, then lowercase
            word = word.strip(string.punctuation + string.whitespace)
            word = word.lower()

            hist[word] = hist.get(word, 0) + 1

    def common_words(hist):
        # Return (count, word) pairs sorted from most to least frequent
        t = []
        for key, value in hist.items():
            t.append((value, key))

        t.sort(reverse=True)
        return t

    def most_common_words(hist, num=100):
        t = common_words(hist)
        print 'The most common words are:'
        for freq, word in t[:num]:
            print freq, '\t', word

    hist = process_file('emma.txt')
    print 'Total num of Words:', sum(hist.values())
    print 'Total num of Unique Words:', len(hist)
    most_common_words(hist, 50)

Any help would be greatly appreciated because I am struggling in this class.

Thank you in advance.

Respectfully,
Ruben Pinedo
Computer Information Systems
College of Business Administration
University of Texas at El Paso
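
For what it is worth, here is one rough way the five requests in this thread could fit together. This is only a sketch under a few assumptions: it is written for Python 3 (print as a function) rather than the Python 2 used in the code above, it uses collections.Counter in place of the hand-built hist dictionary, and the function names read_txt, count_words, top_range and unique_by_length are made up for this illustration. The regular expression treats a word as a run of ASCII letters with an optional inner apostrophe, which may or may not suit the book being read.

```python
import re
from collections import Counter


def read_txt(filename):
    """Return the text of a .txt file, or None if it cannot be used (#1)."""
    if not filename.lower().endswith('.txt'):
        print('Not a .txt file: ' + filename)
        return None
    try:
        with open(filename) as fp:
            return fp.read()
    except IOError:
        # File does not exist or there is no read permission
        print('Unable to open file: ' + filename)
        return None


def count_words(text, min_len=1):
    """Count lowercase words that have at least min_len characters (#3)."""
    # Assumption: a word is letters a-z, optionally with one inner apostrophe
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(w for w in words if len(w) >= min_len)


def top_range(counts, start, stop):
    """Return (word, count) pairs ranked start..stop, 1-based, inclusive (#2, #4)."""
    # Most frequent first; ties are broken alphabetically
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[start - 1:stop]


def unique_by_length(counts):
    """Map each word length to the number of distinct words of that length (#5)."""
    return Counter(len(word) for word in counts)


if __name__ == '__main__':
    text = read_txt('emma.txt')
    if text is not None:
        # Words with more than 3 characters, ranks 10 through 20
        long_words = count_words(text, min_len=4)
        for word, freq in top_range(long_words, 10, 20):
            print('%d\t%s' % (freq, word))
        # Number of unique words of length 1, 2, 3, ...
        lengths = unique_by_length(count_words(text))
        for length, n in sorted(lengths.items()):
            print('%d letters: %d unique words' % (length, n))
```

Calling top_range(counts, 1, 3) would give the top three words, and top_range(counts, 10, 20) the range from the example in #2. Ties are ordered alphabetically here, which differs from the reverse tuple sort in common_words above.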