Jay Loden wrote:
I have the following code in my updates script (gets the five most recent updated files on my site)
def get_fles(exts, upd_dir):
    '''return list of all the files matching any extensions in list exts'''
    fle_list = []
    for each in exts:
        cmd = upd_dir + "*." + each
        ext_ls = glob.glob(cmd)
        fle_list = fle_list + ext_ls
    return filter(notlink, fle_list)
I wanted to just get one list, of all the .htm and .exe files in my upd_dir. I was trying to make a far more elegant solution than what's above, one that could generate a list through a filter. Is there a way to trim the code down to something that does ONE sort through the directory and picks up the .htm and .exe files? (Note: it is not necessary for this to recurse through subdirectories in the upd_dir.) I have cmd defined above because calling
glob.glob(upd_dir + "*." + each) returned the error "cannot concatenate string and list objects". Is this the only way around that, or is there a better way?
Breaking out the expression and assigning it to a variable shouldn't make any difference. Are you sure you didn't have something like this?
glob.glob(upd_dir + "*." + exts)
That would give the error message you cite.
In general, if you have a question about an error, please post the exact error message including the stack trace; it can be very helpful.
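To illustrate the point, here is a minimal sketch of the difference between concatenating with the whole list and with one extension at a time. The directory prefix "updates/" is a made-up value for the example, and the exact wording of the error message depends on the Python version, so it is only captured here, not quoted.

```python
upd_dir = "updates/"      # hypothetical directory prefix, for illustration only
exts = ["htm", "exe"]     # the whole list of extensions

# Concatenating a str with the list itself raises TypeError.
try:
    pattern = upd_dir + "*." + exts
except TypeError as err:
    error_text = str(err)   # wording differs between Python versions

# Concatenating with one extension at a time works, with or without
# an intermediate variable.
patterns = [upd_dir + "*." + ext for ext in exts]
```

Each entry in patterns is a plain string that could be handed to glob.glob().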
If the filter criteria are complex, they deserve their own function:
>>> import os
>>> import os.path
>>> def get_files(exts, upd_dir):
...     def criteria(filename):
...         return filename.split('.')[-1] in exts \
...             and not os.path.islink(filename)
...     return filter(criteria, os.listdir(upd_dir))
...
>>> get_files(('gz', 'conf'), '.')
['dynfun.pdf.gz', 'twander-3.160.tar.gz', 'PyCon2004DocTestUnit.pdf.gz', 'arg23.txt.gz', '.fonts.conf']
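For anyone running a newer Python, the same one-pass idea might look roughly like the sketch below. It is a variation, not the code posted above: it assumes extensions are passed with their leading dot, uses os.path.splitext() in place of split('.'), returns a list comprehension (filter() returns an iterator in Python 3), and joins each name with upd_dir before calling os.path.islink(), since os.listdir() returns bare names that would otherwise be resolved against the current directory.

```python
import os

def get_files(exts, upd_dir):
    """Return names in upd_dir whose extension is in exts, skipping symlinks."""
    wanted = set(exts)  # assumption: extensions include the dot, e.g. '.htm'

    def criteria(filename):
        # splitext('a.tar.gz') gives ('a.tar', '.gz')
        ext = os.path.splitext(filename)[1]
        # Join with upd_dir so islink() checks the right path
        full_path = os.path.join(upd_dir, filename)
        return ext in wanted and not os.path.islink(full_path)

    # One pass over the directory listing, no recursion into subdirectories
    return [name for name in os.listdir(upd_dir) if criteria(name)]
```

It would be called as get_files(['.htm', '.exe'], upd_dir); the order of the result follows whatever os.listdir() returns.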
Javier's solution is good. You could also make a regular expression to look for the desired extensions. Here is one way:
import os, re
def get_files(exts, upd_dir):
    extnMatch = re.compile('(%s)$' % '|'.join(map(re.escape, exts)))

    def criteria(filename):
        return extnMatch.search(filename) \
            and not os.path.islink(filename)

    return filter(criteria, os.listdir(upd_dir))

print get_files(['.java', '.txt'], '.')
I'd better pick apart the extnMatch line a bit. What it does is build a regular expression that matches any of the extensions if they occur at the end of the filename.
>>> exts = ['.java', '.txt']
First I use map() to apply re.escape() to each extension. This escapes any characters in the extension that have special meaning in a regex; specifically the '.':
>>> e1 = map(re.escape, exts)
>>> e1
['\\.java', '\\.txt']
I could also have used a list comprehension, e1 = [re.escape(ext) for ext in exts], but for applying a function like this I usually use map().
Next I join the individual extensions with '|'. In a regex this selects between alternatives.
>>> e2 = '|'.join(e1)
>>> e2
'\\.java|\\.txt'
Finally I put the alternatives in parentheses to group them and add a '$' at the end, meaning 'match the end of the string'.
>>> e3 = '(%s)$' % e2
>>> e3
'(\\.java|\\.txt)$'
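Putting the three steps together, a small self-contained sketch could look like this. The helper name build_ext_regex and the sample filenames are made up for the illustration, and it uses a list comprehension so it behaves the same whether or not map() returns a list.

```python
import re

def build_ext_regex(exts):
    """Compile a regex matching any of the given extensions at end of string."""
    # Step 1: escape regex metacharacters such as '.'
    escaped = [re.escape(ext) for ext in exts]
    # Step 2: join the escaped extensions as alternatives
    alternatives = '|'.join(escaped)
    # Step 3: group the alternatives and anchor them at the end
    return re.compile('(%s)$' % alternatives)

extn_match = build_ext_regex(['.java', '.txt'])

# Hypothetical filenames, for illustration only
names = ['Foo.java', 'notes.txt', 'notes.txt.bak', 'java']
matches = [name for name in names if extn_match.search(name)]
```

Here matches keeps only 'Foo.java' and 'notes.txt': 'notes.txt.bak' has the extension in the middle, not at the end, and 'java' has no dot.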
This solution is definitely harder to explain than Javier's :-)
Kent
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor