Re: [Tutor] Searching through files for values

2015-08-14 Thread Alan Gauld

On 14/08/15 05:07, Jason Brown wrote:


> for file_list in filenames:
>     with open(file_list) as files:
>         for items in vals:
>             for line in files:


Others have commented on your choice of names.
I'll add one small general point.
Try to match the plurality of your names to the
nature of the object. Thus if it is a collection
of items use a plural name.

If it is a single object use a singular name.

This has the effect that for loops would
normally look like:

    for <singular> in <plural>:

This makes no difference to python but it makes it a lot
easier for human readers - including you - to comprehend
what is going on and potentially spot errors.

Also, your choice of file_list suggests it is a list object,
but in fact it's not; it's a single file. Simply reversing
the name to list_file makes the nature of the object clearer
(although see below re using type names).

Applying that to the snippet above it becomes:

for list_file in filenames:
    with open(list_file) as file:
        for item in vals:
            for line in file:

The final principle is that you should try to name variables
after their purpose rather than their type, i.e. describe the
content of the data, not its type.

Using that principle file might be better named as data
or similar - better still what kind of data (dates,
widgets, names etc), but you don't tell us that...
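
As a rough sketch of how these naming suggestions might combine: the names
log_lines, error_codes and find_matches below are invented for illustration,
since the original post does not say what kind of data is involved. The
sketch is written for Python 3.

```python
def find_matches(log_lines, error_codes):
    """Return (error_code, log_line) pairs where the code occurs in the line."""
    matches = []
    for log_line in log_lines:          # singular name for one element...
        for error_code in error_codes:  # ...plural name for the collection
            if error_code in log_line:
                matches.append((error_code, log_line))
    return matches


# Hypothetical sample data, named after its content rather than its type.
log_lines = ["disk E42 failed", "all fine", "E7 and E42 seen"]
error_codes = ["E42", "E7"]

for error_code, log_line in find_matches(log_lines, error_codes):
    print(error_code, "found in:", log_line)
```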

And of course principles are just that. There will be cases
where ignoring them makes sense too.

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Searching through files for values

2015-08-14 Thread Peter Otten
Jason Brown wrote:

> (accidentally replied directly to Cameron)
> 
> Thanks, Cameron.  It looks like the value_file.close() line was
> accidentally indented when I pasted the code here.  Thanks for the
> suggestion to use 'with' though!  That will be handy.
> 
> To test, I tried manually specifying the list:
> 
> vals = [ 'value1', 'value2', 'value3' ]
> 
> And I still get the same issue.  Only the first value in the list is
> looked up.

The problem is in the following snippet:

> with open(file_list) as files:
>     for items in vals:
>         for line in files:
>             if items in line:
>                 print file_list, line
> 

I'll change it to some meaningful names:

with open(filename) as infile:
    for search_value in vals:
        for line in infile:
            if search_value in line:
                print filename, "has", search_value, "in line", line.strip()

You open infile once and then iterate over its lines many times, once for 
every search_value. But unlike a list of lines you can only iterate once 
over a file:

$ cat values.txt
alpha
beta
gamma
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> lines = open("values.txt")
>>> for line in lines: print line.strip()
... 
alpha
beta
gamma
>>> for line in lines: print line.strip()
... 
>>>

No output in the second loop. The file object remembers the current position 
and starts its iteration there. Unfortunately you have already reached the 
end, so there are no more lines. Possible fixes:

(1) Open a new file object for every value:

for filename in filenames:
    for search_value in vals:
        with open(filename) as infile:
            for line in infile:
                if search_value in line:
                    print filename, "has", search_value,
                    print "in line", line.strip()

(2) Use seek() to reset the position of the file pointer:

for filename in filenames:
    with open(filename) as infile:
        for search_value in vals:
            infile.seek(0)
            for line in infile:
                if search_value in line:
                    print filename, "has", search_value,
                    print "in line", line.strip()

(3) If the file is small or not seekable (think stdin) read its contents in 
a list and iterate over that:

for filename in filenames:
    with open(filename) as infile:
        lines = infile.readlines()
    for search_value in vals:
        for line in lines:
            if search_value in line:
                print filename, "has", search_value,
                print "in line", line.strip()

(4) Adapt your algorithm to test all search values against a line before you 
proceed to the next line. This will change the order in which the matches 
are printed, but will work with both stdin and huge files that don't fit 
into memory. I'll leave the implementation to you as an exercise ;)
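
One possible way to approach (4) is sketched below. It is only an
illustration, not necessarily the solution the exercise had in mind: the
helper names find_values and search_files are made up here, and the code is
written for Python 3 (print as a function) rather than the Python 2 used
elsewhere in this thread.

```python
def find_values(lines, search_values):
    """Yield (search_value, stripped_line) for each value found in each line.

    `lines` can be any iterable of strings (an open file, sys.stdin, a list).
    It is consumed exactly once, so no seek() or re-opening is needed.
    """
    for line in lines:
        # Test every search value against the current line before moving on.
        for search_value in search_values:
            if search_value in line:
                yield search_value, line.strip()


def search_files(filenames, search_values):
    """Print every match, reading each file in a single pass."""
    for filename in filenames:
        with open(filename) as infile:
            for search_value, line in find_values(infile, search_values):
                print(filename, "has", search_value, "in line", line)
```

Because the line loop is now the outer loop, matches come out grouped by
line rather than by search value, which is the change in output order
mentioned above.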




Re: [Tutor] Searching through files for values

2015-08-14 Thread Jason Brown
(accidentally replied directly to Cameron)

Thanks, Cameron.  It looks like the value_file.close() line was
accidentally indented when I pasted the code here.  Thanks for the suggestion
to use 'with' though!  That will be handy.

To test, I tried manually specifying the list:

vals = [ 'value1', 'value2', 'value3' ]

And I still get the same issue.  Only the first value in the list is looked
up.

Jason

On Thu, Aug 13, 2015 at 7:32 PM, Cameron Simpson  wrote:

> On 13Aug2015 16:48, Jason Brown  wrote:
>
>> I'm trying to search for list values in a set of files.  The goal is to
>> generate a list of lists that can later be sorted.  I can only get a match
>> on the first value in the list:
>>
>> contents of value_file:
>> value1
>> value2
>> value3
>> ...
>>
>> The desired output is:
>>
>> file1 value1
>> file1 value2
>> file2 value3
>> file3 value1
>> ...
>>
>> But it's only matching on the first item in vals, so the result is:
>>
>> file1 value1
>> file3 value1
>>
>> The subsequent values are not searched.
>>
>
> That is because the subsequent values are never loaded:
>
>> filenames = [list populated with filenames in a dir tree]
>> vals = []
>> value_file = open(vars)
>> for i in value_file:
>>    vals.append(i.strip())
>>    value_file.close()
>>
>
> You close value_file inside the loop, i.e. immediately after the first
> value.  Because the file is closed, the loop iteration stops.  You need to
> close it outside the loop (after all the values have been loaded):
>
>    value_file = open(vars)
>    for i in value_file:
>        vals.append(i.strip())
>    value_file.close()
>
> It is worth noting that a better way to write this is:
>
>    with open(vars) as value_file:
>        for i in value_file:
>            vals.append(i.strip())
>
> Notice that there is no .close(). The "with" construct is the Python
> syntax to use a context manager, and "open(vars)" returns an open file,
> which is also a context manager. A context manager has enter and exit
> actions which fire unconditionally at the start and end of the "with", even
> if the "with" is exited with an exception or a control statement like
> "return" or "break".
>
> The benefit of this is that after the "with", the file will _always_ get
> closed. It is also shorter and easier to read.
>
>> for file_list in filenames:
>>    with open(file_list) as files:
>>        for items in vals:
>>            for line in files:
>>                if items in line:
>>                    print file_list, line
>>
>
> I would remark that "file_list" is not a great variable name. Many people
> would read it as implying that its value is a list. Personally I would have
> just called it "filename", the singular of your "filenames".
>
> Cheers,
> Cameron Simpson 


Re: [Tutor] Searching through files for values

2015-08-13 Thread Cameron Simpson

On 13Aug2015 16:48, Jason Brown  wrote:

> I'm trying to search for list values in a set of files.  The goal is to
> generate a list of lists that can later be sorted.  I can only get a match
> on the first value in the list:
>
> contents of value_file:
> value1
> value2
> value3
> ...
>
> The desired output is:
>
> file1 value1
> file1 value2
> file2 value3
> file3 value1
> ...
>
> But it's only matching on the first item in vals, so the result is:
>
> file1 value1
> file3 value1
>
> The subsequent values are not searched.

That is because the subsequent values are never loaded:


> filenames = [list populated with filenames in a dir tree]
> vals = []
> value_file = open(vars)
> for i in value_file:
>    vals.append(i.strip())
>    value_file.close()


You close value_file inside the loop, i.e. immediately after the first value.
Because the file is closed, the loop iteration stops.  You need to close it
outside the loop (after all the values have been loaded):

   value_file = open(vars)
   for i in value_file:
       vals.append(i.strip())
   value_file.close()

It is worth noting that a better way to write this is:

   with open(vars) as value_file:
       for i in value_file:
           vals.append(i.strip())

Notice that there is no .close(). The "with" construct is the Python syntax to
use a context manager, and "open(vars)" returns an open file, which is also a
context manager. A context manager has enter and exit actions which fire
unconditionally at the start and end of the "with", even if the "with" is
exited with an exception or a control statement like "return" or "break".

The benefit of this is that after the "with", the file will _always_ get
closed. It is also shorter and easier to read.
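
The "always closed" behaviour can be checked with a small experiment. The
following is only a sketch, written for Python 3: it creates a throwaway
file with the tempfile module, and the RuntimeError is a simulated failure
invented for the demonstration.

```python
import os
import tempfile

# Create a small throwaway values file to read from.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("value1\nvalue2\nvalue3\n")
    path = tmp.name

vals = []
try:
    with open(path) as value_file:
        for line in value_file:
            vals.append(line.strip())
            if line.strip() == "value2":
                # Simulated failure part-way through reading the file.
                raise RuntimeError("simulated failure while reading")
except RuntimeError:
    pass

print(vals)               # ['value1', 'value2']
print(value_file.closed)  # True: the "with" closed the file despite the error

os.remove(path)
```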



> for file_list in filenames:
>    with open(file_list) as files:
>        for items in vals:
>            for line in files:
>                if items in line:
>                    print file_list, line


I would remark that "file_list" is not a great variable name. Many people would 
read it as implying that its value is a list. Personally I would have just 
called it "filename", the singular of your "filenames".


Cheers,
Cameron Simpson 


[Tutor] Searching through files for values

2015-08-13 Thread Jason Brown
Hi,

I'm trying to search for list values in a set of files.  The goal is to
generate a list of lists that can later be sorted.  I can only get a match
on the first value in the list:

contents of value_file:
value1
value2
value3
...

The desired output is:

file1 value1
file1 value2
file2 value3
file3 value1
...

But it's only matching on the first item in vals, so the result is:

file1 value1
file3 value1

The subsequent values are not searched.

filenames = [list populated with filenames in a dir tree]
vals = []
value_file = open(vars)
for i in value_file:
   vals.append(i.strip())
   value_file.close()

for file_list in filenames:
    with open(file_list) as files:
        for items in vals:
            for line in files:
                if items in line:
                    print file_list, line



for line in vals:
    print line

returns:
['value1', 'value2', 'value3']

print filenames

returns:
['file1', 'file2', 'file3']


Any help would be greatly appreciated.

Thanks,

Jason