Re: [Tutor] loop error
On Thu, Dec 20, 2018 at 10:47:44PM +0100, Aine Gormley wrote:

> Hello, could somebody take a quick look at my code? I am unsure why I am
> getting a loop error?

That's hard to do if you don't show us the code :-)

Please COPY AND PASTE (don't try to retype it from memory) the MINIMUM amount of code which actually runs. If you're unsure about cutting the code down to the minimum that demonstrates the error, please feel free to ask.

You can also read this:

http://www.sscce.org/

It's written for Java programmers, but applies to any programming language including Python.

And what is "a loop error"? Please COPY AND PASTE the full exception. It should start with a line:

    Traceback...

and end with some sort of error message.

--
Steve
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] loop error
Greetings Aine!

On Thu, Dec 20, 2018 at 6:57 PM Aine Gormley wrote:
>
> Hello, could somebody take a quick look at my code? I am unsure why I am
> getting a loop error?

This is a plain text only list that does not (typically) allow file attachments, so I do not see any code. If you wish for someone on this list to assist you, you need to copy and paste the relevant code into a plain text email, including a copy and paste of the error messages you are receiving. It is also helpful to mention your operating system and what version of Python you are using.

An even better approach would be to construct the smallest possible runnable example code that reproduces your problem.

Good luck and better thinking!

--
boB
[Tutor] loop error
Hello, could somebody take a quick look at my code? I am unsure why I am getting a loop error?
Re: [Tutor] Python
On Thu, Dec 20, 2018 at 10:49:25AM -0500, Mary Sauerland wrote:

> I want to get rid of words that are less than three characters

> f1_name = "/Users/marysauerland/Documents/file1.txt"
> #the opinions
> f2_name = "/Users/marysauerland/Documents/file2.txt"
> #the constitution

Better than comments are meaningful file names:

    opinions_filename = "/Users/marysauerland/Documents/file1.txt"
    constitution_filename = "/Users/marysauerland/Documents/file2.txt"

> def read_words(words_file):
>     return [word.upper() for line in open(words_file, 'r') for word in
> line.split()]

Don't try to do too much in a single line of code. While technically that should work (I haven't tried it to see that it actually does) it would be better written as:

    def read_words(words_file):
        with open(words_file, 'r') as f:
            return [word.upper() for line in f for word in line.split()]

This also has the advantage of ensuring that the file is closed after the words are read. In your earlier version, it is possible for the file to remain locked in an open state.

Note that in this case Python's definition of "word" may not agree with the human reader's definition of a word. For example, Python, being rather simple-minded, will include punctuation in words so that "HELLO" and "HELLO." count as different words. Oh well, that's something that can be adjusted later. For now, let's just go with the simple-minded definition of a word, and worry about adjusting it to something more specialised later.

> read_words(f1_name)
> #performs the function on the file

The above line of code (and comment) are pointless. The function is called, the file is read, the words are generated, and then immediately thrown away. To use the words, you need to assign them to a variable, as you do below:

> set1 = set(read_words(f1_name))
> #makes each word into a set and removes duplicate words

A meaningful name is better. Also the comment is inaccurate: it is not that *each individual* word is turned into a set, but that the *list* of all the words is turned into a set. So better would be:

    opinions_words = set(read_words(opinions_filename))
    constitution_words = set(read_words(constitution_filename))

This gives us the perfect opportunity to skip short words:

    opinions_words = set(
        word for word in read_words(opinions_filename) if len(word) >= 3)
    constitution_words = set(
        word for word in read_words(constitution_filename) if len(word) >= 3)

Now you have two sets of unique words, each word guaranteed to be at least 3 characters long.

The next thing you try to do is count how many words appear in both sets. You do it with an explicit loop:

> count_same_words = 0
> for word in set1:
>     if word in set2:
>         count_same_words += 1

but the brilliant thing about sets is that they already know how to do this themselves! Let's see the sorts of operations sets understand:

    py> set1 = set("abcdefgh")
    py> set2 = set("defghijk")
    py> set1 & set2  # the intersection (overlap) of both sets
    {'h', 'd', 'f', 'g', 'e'}
    py> set1 | set2  # the union (combination) of both sets
    {'f', 'd', 'c', 'b', 'h', 'i', 'k', 'j', 'a', 'g', 'e'}
    py> set1 ^ set2  # items in one or the other but not both sets
    {'i', 'k', 'c', 'b', 'j', 'a'}
    py> set1 - set2  # items in set1 but not set2
    {'c', 'b', 'a'}

(In the above, "py>" is the Python prompt. On your computer, your prompt is probably set to ">>>".)

Can you see which set operation, one of & | ^ or - , you would use to get the set of words which appear in both sets? Hint: it isn't the - operation. If you wanted to know how many words appear in the constitution but NOT in the opinions, you could write:

    word_count = len(constitution_words - opinions_words)

Does that give you a hint how to approach this?

Steve
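On the earlier point about punctuation making "HELLO" and "HELLO." count as different words, one possible adjustment is sketched below. This is only an illustration under assumptions: the helper name normalise and the sample sentence are made up here, and stripping leading and trailing punctuation with string.punctuation is just one simple-minded definition of a word (it leaves an apostrophe inside a word alone).

```python
import string


def normalise(word):
    """Strip punctuation from both ends of a word and upper-case it.

    Hypothetical helper for illustration; not part of the original code.
    """
    return word.strip(string.punctuation).upper()


# Made-up sample text standing in for a line read from one of the files.
raw_words = "Hello hello. (Hello) it's".split()

# Build the set of unique normalised words, skipping anything that was
# nothing but punctuation (which normalises to an empty string).
words = {normalise(w) for w in raw_words if normalise(w)}

print(sorted(words))
```

With this in place, the three spellings of "hello" collapse into a single entry, so the later set comparisons treat them as the same word.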
Re: [Tutor] Python
Mary,

It is often best to develop and test small parts of the project where you can easily play with it, then move it into more complex configurations like a function body.

Here is your code:

    def read_words(words_file):
        return [word.upper() for line in open(words_file, 'r') for word in line.split()]

I made a file on my local system and this works. Now, you are returning an uppercase version of the current word stored in 'word'. So what is the length of that word?

Here is the modified variation on your code:

    >>> [word.upper() for line in open('TESTINK.txt', 'r') for word in line.split()]
    ['THIS', 'IS', 'LINE', 'ONE', 'AND', 'THIS', 'IS', 'ANOTHER', 'LINE', 'JUST', 'TO', 'TEST', 'WITH.']

Here is yet another modification showing the length, instead:

    >>> [len(word) for line in open('TESTINK.txt', 'r') for word in line.split()]
    [4, 2, 4, 3, 3, 4, 2, 7, 4, 4, 2, 4, 5]

By your rules, you want to only keep those words where "len(word) >= 3". So where in the list comprehension would you add an if condition to get this?

    ['THIS', 'LINE', 'ONE', 'AND', 'THIS', 'ANOTHER', 'LINE', 'JUST', 'TEST', 'WITH.']

Since you read in all your data using the same function, you might even make it take an optional value to cut at, defaulting to 3 or even 0.

-----Original Message-----
From: Tutor On Behalf Of Mary Sauerland
Sent: Thursday, December 20, 2018 10:49 AM
To: tutor@python.org
Subject: [Tutor] Python

Hi,

I want to get rid of words that are less than three characters but I keep getting errors. I tried multiple ways but keep getting errors.

Here is my code:

f1_name = "/Users/marysauerland/Documents/file1.txt"
#the opinions
f2_name = "/Users/marysauerland/Documents/file2.txt"
#the constitution


def read_words(words_file):
    return [word.upper() for line in open(words_file, 'r') for word in line.split()]


read_words(f1_name)
#performs the function on the file
set1 = set(read_words(f1_name))
#makes each word into a set and removes duplicate words
read_words(f2_name)
set2 = set(read_words(f2_name))

count_same_words = 0

for word in set1:
    if word in set2:
        count_same_words += 1
#comparing the set1 (set of unique words in the opinions) with set2 (set of unique words in the constitution) and adding 1 for each matching word found which is just counting the words
print(count_same_words)


Best,

Mary
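The closing suggestion in the reply above, an optional cut-off value for read_words, might look something like the following sketch. The parameter name min_length, its default of 3, and the throwaway demo file are all assumptions made for illustration; the demo text simply mirrors the TESTINK.txt contents implied by the output shown in the reply.

```python
import os
import tempfile


def read_words(words_file, min_length=3):
    """Return upper-cased words with at least min_length characters.

    min_length is a hypothetical parameter; pass 0 to keep every word.
    """
    with open(words_file, "r") as f:
        return [
            word.upper()
            for line in f
            for word in line.split()
            if len(word) >= min_length
        ]


# Write a small throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("This is line one\nand this is another line\njust to test with.\n")
    demo_path = tmp.name

kept = read_words(demo_path)                      # default cut-off of 3
everything = read_words(demo_path, min_length=0)  # no filtering at all
os.remove(demo_path)

print(kept)
print(len(everything))
```

Putting the cut-off in the function keeps the filtering rule in one place, so both files are guaranteed to be treated the same way.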
Re: [Tutor] Python
On 12/20/18 8:49 AM, Mary Sauerland wrote:
> Hi,
>
> I want to get rid of words that are less than three characters but I keep
> getting errors. I tried multiple ways but keep getting errors.

Just a quick note or two:

>
> Here is my code:
>
> f1_name = "/Users/marysauerland/Documents/file1.txt"
> #the opinions
> f2_name = "/Users/marysauerland/Documents/file2.txt"
> #the constitution
>
>
> def read_words(words_file):
>     return [word.upper() for line in open(words_file, 'r') for word in
> line.split()]
>
>
> read_words(f1_name)

^^^ This line is meaningless. "Everything is an object" in Python. Your function returns a list object, which you don't do anything with. You should assign a name to it, like:

    opinions_words = read_words(f1_name)

Since no name is assigned to that object, Python sees it has no references, and it is just lost.

And then...

> #performs the function on the file
> set1 = set(read_words(f1_name))

If you saved the object returned from the earlier call to the function, then you don't need to call the function again; instead you convert the saved list object to a set object. We can't tell whether you have an eventual use for the unfiltered list of words, or only the set of unique words. The answer to that determines how you write this section.

Picking a more descriptive name than set1 would be a good idea (and f1_name as well, and others). When writing software, the hard part is maintenance, where you or others have to go in later and fix or change something. Using meaningful names really helps with that, so it's a good habit to get into.

Since you have sets consisting of words from your two documents, you may as well use set operations to work with them. Do you know the set operation to find all of the members of one set that are also in another set? Hint: in set theory, that is called the intersection.

You say you are trying to remove short words, but there seems to be no code to do that. Instead you seem to be solving a different problem?
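As a rough sketch of the two points above (keep the list the function returns, then use a set operation instead of a hand-written loop), something along these lines would work. The word lists here are made-up stand-ins for what read_words would return, and the variable names are only suggestions.

```python
# Made-up stand-ins for the lists read_words() would return for each file.
opinions_list = ["THE", "COURT", "HOLDS", "THE", "LAW"]
constitution_list = ["WE", "THE", "PEOPLE", "THE", "LAW"]

# Convert the saved lists to sets; the lists stay available if the
# unfiltered, duplicated words are needed later.
opinions_words = set(opinions_list)
constitution_words = set(constitution_list)

# intersection() gives the members of one set that are also in the other;
# it is the method form of the & operator.
shared_words = opinions_words.intersection(constitution_words)
count_same_words = len(shared_words)

print(sorted(shared_words), count_same_words)
```

Note that this only counts shared words; removing short words would still need a separate filtering step.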
Re: [Tutor] Python
On Dec 20, 2018 12:17 PM, "Mary Sauerland" wrote:
>
> Hi,
>
> I want to get rid of words that are less than three characters but I keep getting errors. I tried multiple ways but keep getting errors.

Hi Mary, welcome to the tutor list. We love to help. We are a few volunteers. It is very difficult for us to be mind readers, so please give us more information, especially what the error is that you are getting. I presume it is what we call a traceback. It is important that you copy the entire traceback and paste it into the email. It would also be very helpful if you gave us a sample of the two text files and the output you're expecting.

>
> Here is my code:
>
> f1_name = "/Users/marysauerland/Documents/file1.txt"
> #the opinions
> f2_name = "/Users/marysauerland/Documents/file2.txt"
> #the constitution
>
>
> def read_words(words_file):
>     return [word.upper() for line in open(words_file, 'r') for word in line.split()]
>
>
> read_words(f1_name)
> #performs the function on the file
> set1 = set(read_words(f1_name))
> #makes each word into a set and removes duplicate words
> read_words(f2_name)
> set2 = set(read_words(f2_name))
>
> count_same_words = 0
>
> for word in set1:
>     if word in set2:
>         count_same_words += 1
> #comparing the set1 (set of unique words in the opinions) with set2 (set of unique words in the constitution) and adding 1 for each matching word found which is just counting the words
> print(count_same_words)
>
>
> Best,
>
> Mary
[Tutor] Python
Hi,

I want to get rid of words that are less than three characters but I keep getting errors. I tried multiple ways but keep getting errors.

Here is my code:

f1_name = "/Users/marysauerland/Documents/file1.txt"
#the opinions
f2_name = "/Users/marysauerland/Documents/file2.txt"
#the constitution


def read_words(words_file):
    return [word.upper() for line in open(words_file, 'r') for word in line.split()]


read_words(f1_name)
#performs the function on the file
set1 = set(read_words(f1_name))
#makes each word into a set and removes duplicate words
read_words(f2_name)
set2 = set(read_words(f2_name))

count_same_words = 0

for word in set1:
    if word in set2:
        count_same_words += 1
#comparing the set1 (set of unique words in the opinions) with set2 (set of unique words in the constitution) and adding 1 for each matching word found which is just counting the words
print(count_same_words)


Best,

Mary