Re: [Tutor] Searching through files for values
On 14/08/15 05:07, Jason Brown wrote:

> for file_list in filenames:
>     with open(file_list) as files:
>         for items in vals:
>             for line in files:

Others have commented on your choice of names. I'll add one small general point. Try to match the plurality of your names to the nature of the object. Thus if it is a collection of items use a plural name. If it is a single object use a singular name.

This has the effect that for loops would normally look like:

    for <singular> in <plural>:

This makes no difference to Python but it makes it a lot easier for human readers - including you - to comprehend what is going on and potentially spot errors.

Also your choice of file_list suggests it is a list object but in fact it's not, it's a single file, so simply reversing the name to list_file makes it clearer what the nature of the object is (although see below re using type names).

Applying that to the snippet above it becomes:

    for list_file in filenames:
        with open(list_file) as file:
            for item in vals:
                for line in file:

The final principle is that you should try to name variables after their purpose rather than their type, i.e. describe the content of the data, not its type. Using that principle file might be better named as data or similar - better still what kind of data (dates, widgets, names etc), but you don't tell us that...

And of course principles are just that. There will be cases where ignoring them makes sense too.

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Searching through files for values
Jason Brown wrote:

> (accidentally replied directly to Cameron)
>
> Thanks, Cameron. It looks like that value_file.close() tab was
> accidentally tabbed when I pasted the code here. Thanks for the
> suggestion for using 'with' though! That will be handy.
>
> To test, I tried manually specifying the list:
>
> vals = [ 'value1', 'value2', 'value3' ]
>
> And I still get the same issue. Only the first value in the list is
> looked up.

The problem is in the following snippet:

> with open(file_list) as files:
>     for items in vals:
>         for line in files:
>             if items in line:
>                 print file_list, line

I'll change it to some meaningful names:

    with open(filename) as infile:
        for search_value in vals:
            for line in infile:
                if search_value in line:
                    print filename, "has", search_value, "in line", line.strip()

You open infile once and then iterate over its lines many times, once for every search_value. But unlike a list of lines you can only iterate once over a file:

    $ cat values.txt
    alpha
    beta
    gamma
    $ python
    Python 2.7.6 (default, Jun 22 2015, 17:58:13)
    [GCC 4.8.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> lines = open("values.txt")
    >>> for line in lines: print line.strip()
    ...
    alpha
    beta
    gamma
    >>> for line in lines: print line.strip()
    ...
    >>>

No output in the second loop. The file object remembers the current position and starts its iteration there. Unfortunately you have already reached the end, so there are no more lines.
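The one-pass behaviour described above can be reproduced without a file on disk. The following is a minimal sketch in Python 3 (the thread itself uses Python 2), and it assumes that io.StringIO is an acceptable stand-in for an object returned by open(); the names first_pass and second_pass are made up for the illustration.

```python
import io

# Assumption: io.StringIO stands in for a real file opened with open().
# It is iterated line by line and keeps a current position in the same way.
infile = io.StringIO("alpha\nbeta\ngamma\n")

first_pass = [line.strip() for line in infile]
second_pass = [line.strip() for line in infile]  # position is already at the end

print(first_pass)   # ['alpha', 'beta', 'gamma']
print(second_pass)  # []

# A plain list of lines, by contrast, can be traversed as often as needed.
lines = ["alpha", "beta", "gamma"]
assert list(lines) == list(lines)
```

The second comprehension produces an empty list for the same reason the second interactive loop prints nothing.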
Possible fixes:

(1) Open a new file object for every value:

    for filename in filenames:
        for search_value in vals:
            with open(filename) as infile:
                for line in infile:
                    if search_value in line:
                        print filename, "has", search_value,
                        print "in line", line.strip()

(2) Use seek() to reset the position of the file pointer:

    for filename in filenames:
        with open(filename) as infile:
            for search_value in vals:
                infile.seek(0)
                for line in infile:
                    if search_value in line:
                        print filename, "has", search_value,
                        print "in line", line.strip()

(3) If the file is small or not seekable (think stdin) read its contents into a list and iterate over that:

    for filename in filenames:
        with open(filename) as infile:
            lines = infile.readlines()
        for search_value in vals:
            for line in lines:
                if search_value in line:
                    print filename, "has", search_value,
                    print "in line", line.strip()

(4) Adapt your algorithm to test all search values against a line before you proceed to the next line. This will change the order in which the matches are printed, but will work with both stdin and huge files that don't fit into memory. I'll leave the implementation to you as an exercise ;)
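For readers who want to check their attempt at fix (4), here is one possible sketch, written for Python 3 rather than the Python 2 used in the thread. The function name find_matches, the sample data, and the choice to collect [filename, value] pairs (following the "list of lists" goal in the original question) are assumptions for the illustration, not the poster's own solution.

```python
import io


def find_matches(filename, infile, search_values):
    """Return [filename, search_value] pairs, reading infile only once."""
    matches = []
    for line in infile:                     # single pass over the file
        for search_value in search_values:  # test every value against this line
            if search_value in line:
                matches.append([filename, search_value])
    return matches


# Hypothetical stand-in data; in real use infile would come from open(filename)
# or be sys.stdin.
sample = io.StringIO("x value1 y\nnothing here\nvalue3 and value2\n")
result = find_matches("file1", sample, ["value1", "value2", "value3"])
print(result)
# [['file1', 'value1'], ['file1', 'value2'], ['file1', 'value3']]
```

Because the file is the outer loop, matches come out in line order rather than grouped by search value; sorting the resulting list afterwards restores a stable order if that matters.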
Re: [Tutor] Searching through files for values
(accidentally replied directly to Cameron)

Thanks, Cameron. It looks like that value_file.close() tab was accidentally tabbed when I pasted the code here. Thanks for the suggestion for using 'with' though! That will be handy.

To test, I tried manually specifying the list:

    vals = [ 'value1', 'value2', 'value3' ]

And I still get the same issue. Only the first value in the list is looked up.

Jason

On Thu, Aug 13, 2015 at 7:32 PM, Cameron Simpson wrote:

> On 13Aug2015 16:48, Jason Brown wrote:
>
>> I'm trying to search for list values in a set of files. The goal is to
>> generate a list of lists that can later be sorted. I can only get a match
>> on the first value in the list:
>>
>> contents of value_file:
>> value1
>> value2
>> value3
>> ...
>>
>> The desired output is:
>>
>> file1 value1
>> file1 value2
>> file2 value3
>> file3 value1
>> ...
>>
>> But it's only matching on the first item in vals, so the result is:
>>
>> file1 value1
>> file3 value1
>>
>> The subsequent values are not searched.
>
> That is because the subsequent values are never loaded:
>
>> filenames = [list populated with filenames in a dir tree]
>> vals = []
>> value_file = open(vars)
>> for i in value_file:
>>     vals.append(i.strip())
>>     value_file.close()
>
> You close value_file inside the loop i.e. immediately after the first
> value. Because the file is closed, the loop iteration stops. You need to
> close it outside the loop (after all the values have been loaded):
>
>     value_file = open(vars)
>     for i in value_file:
>         vals.append(i.strip())
>     value_file.close()
>
> It is worth noting that a better way to write this is:
>
>     with open(vars) as value_file:
>         for i in value_file:
>             vals.append(i.strip())
>
> Notice that there is no .close(). The "with" construct is the Python
> syntax to use a context manager, and "open(vars)" returns an open file,
> which is also a context manager. A context manager has enter and exit
> actions which fire unconditionally at the start and end of the "with", even
> if the with is exited with an exception or a control like "return" or
> "break".
>
> The benefit of this is after the "with", the file will _always_ get
> closed. It is also shorter and easier to read.
>
>> for file_list in filenames:
>>     with open(file_list) as files:
>>         for items in vals:
>>             for line in files:
>>                 if items in line:
>>                     print file_list, line
>
> I would remark that "file_list" is not a great variable name. Many people
> would read it as implying that its value is a list. Personally I would have
> just called it "filename", the singular of your "filenames".
>
> Cheers,
> Cameron Simpson
Re: [Tutor] Searching through files for values
On 13Aug2015 16:48, Jason Brown wrote:

> I'm trying to search for list values in a set of files. The goal is to
> generate a list of lists that can later be sorted. I can only get a match
> on the first value in the list:
>
> contents of value_file:
> value1
> value2
> value3
> ...
>
> The desired output is:
>
> file1 value1
> file1 value2
> file2 value3
> file3 value1
> ...
>
> But it's only matching on the first item in vals, so the result is:
>
> file1 value1
> file3 value1
>
> The subsequent values are not searched.

That is because the subsequent values are never loaded:

> filenames = [list populated with filenames in a dir tree]
> vals = []
> value_file = open(vars)
> for i in value_file:
>     vals.append(i.strip())
>     value_file.close()

You close value_file inside the loop i.e. immediately after the first value. Because the file is closed, the loop iteration stops. You need to close it outside the loop (after all the values have been loaded):

    value_file = open(vars)
    for i in value_file:
        vals.append(i.strip())
    value_file.close()

It is worth noting that a better way to write this is:

    with open(vars) as value_file:
        for i in value_file:
            vals.append(i.strip())

Notice that there is no .close(). The "with" construct is the Python syntax to use a context manager, and "open(vars)" returns an open file, which is also a context manager. A context manager has enter and exit actions which fire unconditionally at the start and end of the "with", even if the with is exited with an exception or a control flow statement like "return" or "break".

The benefit of this is that after the "with", the file will _always_ get closed. It is also shorter and easier to read.

> for file_list in filenames:
>     with open(file_list) as files:
>         for items in vals:
>             for line in files:
>                 if items in line:
>                     print file_list, line

I would remark that "file_list" is not a great variable name. Many people would read it as implying that its value is a list. Personally I would have just called it "filename", the singular of your "filenames".

Cheers,
Cameron Simpson
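The claim that the file is closed after the "with" block, even when an exception escapes it, can be checked directly. This is a small Python 3 sketch under stated assumptions: a temporary file stands in for the values file, and the names values_path, closed_after_with and closed_after_error are invented for the example.

```python
import os
import tempfile

# Create a throwaway values file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("value1\nvalue2\nvalue3\n")
    values_path = tmp.name

# Load the values; no explicit close() is needed.
vals = []
with open(values_path) as value_file:
    for line in value_file:
        vals.append(line.strip())
closed_after_with = value_file.closed

# The exit action also runs when the block is left through an exception.
try:
    with open(values_path) as value_file:
        raise RuntimeError("simulated failure inside the with block")
except RuntimeError:
    pass
closed_after_error = value_file.closed

os.remove(values_path)
print(vals, closed_after_with, closed_after_error)
```

Both flags should report that the file was closed, and vals holds all three values because nothing closes the file in the middle of the loop.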
[Tutor] Searching through files for values
Hi,

I'm trying to search for list values in a set of files. The goal is to generate a list of lists that can later be sorted. I can only get a match on the first value in the list:

contents of value_file:
value1
value2
value3
...

The desired output is:

file1 value1
file1 value2
file2 value3
file3 value1
...

But it's only matching on the first item in vals, so the result is:

file1 value1
file3 value1

The subsequent values are not searched.

    filenames = [list populated with filenames in a dir tree]
    vals = []
    value_file = open(vars)
    for i in value_file:
        vals.append(i.strip())
        value_file.close()

    for file_list in filenames:
        with open(file_list) as files:
            for items in vals:
                for line in files:
                    if items in line:
                        print file_list, line

for line in vals: print line returns:
['value1', 'value2', 'value3']

print filenames returns:
['file1', 'file2', 'file3']

Any help would be greatly appreciated.

Thanks,
Jason