On 12/28/2014 02:12 PM, Dave Angel wrote:
On 12/28/2014 12:27 PM, Seymore4Head wrote:
I need to search through a directory of text files for a string.
Here is a short program I made in the past to search through a single
text file for a line of text.

How can I modify the code to search through a directory of files that
have different filenames, but the same extension?


You have two other replies to your specific question, glob and
os.listdir.  I would also mention the module fileinput:

https://docs.python.org/2/library/fileinput.html

import fileinput
from glob import glob

fnames = glob('*.txt')
for line in fileinput.input(fnames):
     pass # do whatever

If you're not on Windows, I'd mention that the shell will expand the
wildcards for you, so you could get the filenames from argv even
more simply.  See the first example on the above web page.
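
For the question as actually asked (find a string in every file with a given extension), a minimal sketch in Python 3 might look like the following. The helper names files_with_extension and find_string are made up for illustration, and the directory walk uses os.listdir rather than glob just to show the other option mentioned in this thread. It assumes the files are plain text in the platform's default encoding.

```python
import fileinput
import os


def files_with_extension(directory, ext=".txt"):
    """Return sorted paths of the regular files in `directory` ending in `ext`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(ext) and os.path.isfile(os.path.join(directory, name))
    )


def find_string(needle, fnames):
    """Return (filename, line number, line) for every line containing `needle`."""
    hits = []
    if not fnames:
        # An empty list would make fileinput fall back to stdin, so stop here.
        return hits
    with fileinput.input(fnames) as stream:
        for line in stream:
            if needle in line:
                # filelineno() counts lines within the current file only.
                hits.append((stream.filename(), stream.filelineno(), line.rstrip("\n")))
    return hits
```

Something like find_string("spam", files_with_extension(".")) would then list each match in the current directory along with the file and line it came from.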


I'm more concerned that you think the following code you supplied does a
search for a string.  It does something entirely different, involving
making a crude dictionary.  But it could be reduced to just a few lines,
and probably take much less memory, if this is really the code you're
working on.

Note: the changes I suggest should also be tons faster if you have very many words you're parsing this way, since "out not in final" rescans the whole list for every word, while a set lookup does not.


fname = raw_input("Enter file name: ")  #"*.txt"
fh = open(fname)
lst = list()
biglst=[]
for line in fh:
     line=line.rstrip()
     line=line.split()
     biglst+=line
final=[]
for out in biglst:
     if out not in final:
         final.append(out)
final.sort()
print (final)




Something like the following (untested):


import fileinput
from glob import glob

res = set()
fnames = glob('*.txt')
for line in fileinput.input(fnames):
     res.update(line.rstrip().split())

And I should have omitted the rstrip(), which does nothing that split() isn't already going to do.

print sorted(res)
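
To check that the set version really produces the same result as the original list-based loop, here is a small Python 3 sketch that runs both over the same lines. The function names and the sample data are invented for the comparison; only the two techniques come from the code in this thread.

```python
def unique_words_list(lines):
    """Original approach: scan a list with `not in` for every word."""
    final = []
    for line in lines:
        for word in line.split():
            if word not in final:  # linear scan of everything seen so far
                final.append(word)
    final.sort()
    return final


def unique_words_set(lines):
    """Set-based approach: duplicates are dropped by the set itself."""
    res = set()
    for line in lines:
        # split() with no arguments already ignores trailing whitespace,
        # so no rstrip() is needed first.
        res.update(line.split())
    return sorted(res)


sample = ["the quick brown fox\n", "jumps over the lazy dog  \n", "\n"]
print(unique_words_set(sample) == unique_words_list(sample))
```

Both functions return the same sorted list of distinct words; the difference only shows up in running time as the number of distinct words grows.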






--
DaveA
