> i wrote a function to parse a given directory and make a sorted list
> of files with .txt,.doc extensions .it works,but i want to know if it
> is too bloated..can this be rewritten in more efficient manner?
>
> here it is...
>
> from string import split
> from os.path import isdir,join,normpath
> from os import listdir
>
> def parsefolder(dirname):
>     filenms=[]
>     folder=dirname
>     isadr=isdir(folder)
>     if (isadr):
>         dirlist=listdir(folder)
>         filenm=""
>         for x in dirlist:
>             filenm=x
>             if(filenm.endswith(("txt","doc"))):
>                 nmparts=[]
>                 nmparts=split(filenm,'.' )
>                 if((nmparts[1]=='txt') or (nmparts[1]=='doc')):
>                     filenms.append(filenm)
>         filenms.sort()
>         filenameslist=[]
>         filenameslist=[normpath(join(folder,y)) for y in filenms]
>         numifiles=len(filenameslist)
>         print filenameslist
>         return filenameslist
>
>
> folder='F:/mysys/code/tstfolder'
> parsefolder(folder)

It seems to me that this is awfully baroque, with many superfluous
variables. Is this not the same functionality (minus prints, unused
result-counting, NOPs, and belt-and-suspenders extension-checking) as

    def parsefolder(dirname):
        if not isdir(dirname): return
        return sorted([
            normpath(join(dirname, fname))
            for fname in listdir(dirname)
            if fname.lower().endswith('.txt')
            or fname.lower().endswith('.doc')
            ])

In Python 2.5 (or 2.4 if you implement the any() function, ripped
from the docs[1]), this could be rewritten to be a little more
flexible... something like this (untested):

    def parsefolder(dirname, types=['.doc', '.txt']):
        if not isdir(dirname): return
        return sorted([
            normpath(join(dirname, fname))
            for fname in listdir(dirname)
            if any(
                fname.lower().endswith(s)
                for s in types)
            ])

which would allow you to do both

    parsefolder('/path/to/wherever/')

and

    parsefolder('/path/to/wherever/', ['.xls', '.ppt', '.htm'])

In both cases, you don't define what happens when isdir(dirname)
fails (as written, the function just returns None). Caveat
Implementor.

-tkc

[1] http://docs.python.org/lib/built-in-funcs.html
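
For what it's worth, here is a minimal sketch of the same idea that leans on str.endswith() accepting a tuple of suffixes (Python 2.5 and later), so any() is not needed at all. Two things in it are assumptions on my part rather than anything from the original post: returning an empty list when dirname is not a directory, and using a tuple as the default for types.

```python
from os import listdir
from os.path import isdir, join, normpath

def parsefolder(dirname, types=('.doc', '.txt')):
    # Assumption: a missing or non-directory path yields an empty list
    # rather than None, so callers can always iterate over the result.
    if not isdir(dirname):
        return []
    # Normalize the suffixes once; endswith() needs a tuple, not a list.
    suffixes = tuple(s.lower() for s in types)
    return sorted(
        normpath(join(dirname, fname))
        for fname in listdir(dirname)
        # str.endswith accepts a tuple of suffixes (Python 2.5+).
        if fname.lower().endswith(suffixes)
    )
```

Both call styles shown above still work, e.g. parsefolder('/path/to/wherever/', ['.xls', '.ppt', '.htm']), since the list is converted to a tuple before matching.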