William O'Higgins Witteman wrote:
> On Wed, Sep 13, 2006 at 11:34:25AM -0400, Kent Johnson wrote:
>> William O'Higgins Witteman wrote:
>>> I have to walk a directory tree and examine files within it.  I have a
>>> set of directory names and filename patterns that I must skip while
>>> doing this walk.  How do I create a set of rules to skip files or
>>> directory branches?  I'm looking for something reasonably scalable,
>>> 'cause I'm sure to need to update these rules in the future.
>
> First, thanks to Kent and Dave for their thoughts - a big help and much
> appreciated.  Notes and results below, for archival posterity (so at
> least *I'll* know where to look for it :-)
>
>> def matchesAny(name, tests):
>>     for test in tests:
>>         if fnmatch.fnmatch(name, test):
>>             return True
>>     return False
>
> fnmatch was a good choice for this in my case, because I have to do
> case-insensitive matching of very simple patterns, but re or glob would
> provide more power if needed.
> I originally put the return False inside
> the conditional with else - but that meant that unless my name matched
> on the last test in tests, it would always return False.  Not what I
> wanted.  The above works very nicely without the return False line.

If you leave out the return False you get an implicit return None. None
is interpreted as False in a conditional, so it has the same result. I
would prefer to make it explicit.

>
>> for dirpath, dirnames, filenames in os.walk(baseDir):
>>     # Note use of slice assignment - you have to modify the caller's list
>>     dirnames[:] = [ name for name in dirnames if not matchesAny(name, dirsToSkip) ]
>>
>>     filenames = [name for name in filenames if not matchesAny(name, filesToSkip) ]
>>
>>     for name in filenames:
>>         # whatever file processing you want to do goes here
>
> The above approach was not quite what I needed, because I had a list of
> exclusion criteria and a list of inclusion criteria, but both could be
> applied to the whole path.  Therefore, I used this approach:
>
> for dirpath, dirnames, filenames in os.walk(fs_path):
>     """Filter the list of filenames to exclude elements in ToSkip"""
>     filenames[:] = [name for name in filenames if not matches(os.path.join(dirpath,name),ToSkip)]
>     """Filter the list of filenames to exclude elements not in ToKeep"""
>     filenames[:] = [name for name in filenames if matches(os.path.join(dirpath,name),ToKeep)]
>
>     for fname in filenames:
>         # do what needs to be done

Another way to write this that might be a little easier to read is this:

    for fname in filenames:
        if matches(os.path.join(dirpath,fname),ToSkip):
            continue
        if not matches(os.path.join(dirpath,fname),ToKeep):
            continue
        # process the file

or even

        if matches(os.path.join(dirpath,fname),ToSkip) or not matches(os.path.join(dirpath,fname),ToKeep):
            continue

>
> This is getting me just the results I was looking for, and I have to
> say, I'm pretty pleased.  Thanks again.
Python rocks :-)

Kent
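
For the archive, here is one way the pieces in this thread might fit together. It is only a sketch under a few assumptions: William's matches() was never posted, so I am guessing it behaves like matchesAny() above, and the names wanted_files, to_skip, to_keep and dirs_to_skip are made up for the example. Note that fnmatch.fnmatch() is case-insensitive only where the operating system's filename handling is (it normalizes with os.path.normcase); on a case-sensitive system you would have to lowercase both the name and the pattern yourself.

```python
import fnmatch
import os


def matches(path, patterns):
    """Return True if path matches any fnmatch-style pattern, else False."""
    # any() always returns a real bool, so there is no implicit None here.
    return any(fnmatch.fnmatch(path, pattern) for pattern in patterns)


def wanted_files(base_dir, to_skip, to_keep, dirs_to_skip=()):
    """Yield full paths under base_dir that match to_keep but not to_skip.

    All names here are hypothetical, chosen only for this example.
    """
    for dirpath, dirnames, filenames in os.walk(base_dir):
        # Slice assignment changes the list os.walk() itself is holding,
        # so the pruned directories are never descended into.
        dirnames[:] = [d for d in dirnames if not matches(d, dirs_to_skip)]

        for fname in filenames:
            full_path = os.path.join(dirpath, fname)
            if matches(full_path, to_skip) or not matches(full_path, to_keep):
                continue
            yield full_path
```

Something like wanted_files(fs_path, ToSkip, ToKeep) would then hand back the paths to process one at a time, and updating the rules later only means editing the pattern lists.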