Re: Is there an alternative to os.walk?

2006-10-08 Thread Ant
The idiomatic way of doing the tree traversal is:

import os

def search(a_dir):
    valid_dirs = []
    for dirpath, dirnames, filenames in os.walk(a_dir):
        if dirtest(filenames):
            valid_dirs.append(dirpath)
    return valid_dirs

Also since you are given a list of filenames in the directory, then why
not just check the list of those files for your test files:

def dirtest(filenames):
    testfiles = ['a','b','c']
    for f in testfiles:
        if f not in filenames:
            return False
    # Every test file was found in the listing.
    return True

You'd have to test this to see if it makes a difference in performance,
but it makes for more readable code.
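
As a rough sketch of the same idea, the check can also be written with a
set, which builds a set from the directory listing once instead of scanning
the list for each test file. The name REQUIRED and the sample filenames are
made up for illustration; whether this is measurably faster would need
testing.

```python
# Hypothetical marker files, standing in for the 'a', 'b', 'c' test files.
REQUIRED = frozenset(['a', 'b', 'c'])

def dirtest(filenames):
    """Return True when every required name appears in filenames."""
    return REQUIRED.issubset(filenames)

print(dirtest(['a', 'b', 'c', 'notes.txt']))
print(dirtest(['a', 'b']))
```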

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there an alternative to os.walk?

2006-10-07 Thread Bruce
waylan wrote:
 Bruce wrote:
  Hi all,
  I have a question about traversing file systems, and could use some
  help. Because of directories with many files in them, os.walk appears
  to be rather slow. I'm thinking there is a potential for speed-up since
  I don't need os.walk to report filenames of all the files in every
  directory it visits. Is there some clever way to use os.walk or another
  tool that would provide functionality like os.walk except for the
  listing of the filenames?

 You might want to check out the path module [1] (not os.path). The
 following is from the docs:

  The method path.walk() returns an iterator which steps recursively
  through a whole directory tree. path.walkdirs() and path.walkfiles()
  are the same, but they yield only the directories and only the files,
  respectively.

 Oh, and you can thank Paul Bissex for pointing me to path [2].


 [1]: http://www.jorendorff.com/articles/python/path/
 [2]: http://e-scribe.com/news/289

A little late, but thanks for the replies; they were very useful. Here's
what I do in this case:

import os

def search(a_dir):
    valid_dirs = []
    walker = os.walk(a_dir)
    while 1:
        try:
            dirpath, dirnames, filenames = next(walker)
        except StopIteration:
            break
        # dirtest takes only the directory path, so pass just that.
        if dirtest(dirpath):
            valid_dirs.append(dirpath)
    return valid_dirs

def dirtest(a_dir):
    testfiles = ['a','b','c']
    for f in testfiles:
        if not os.path.exists(os.path.join(a_dir,f)):
            return 0
    return 1

I think you're right - it's not os.walk that makes this slow, it's the
dirtest function that takes so much more time when there are many files
in a directory. Also, thanks for pointing me to the path module; it was
interesting.



Re: Is there an alternative to os.walk?

2006-10-07 Thread Tim Roberts
Bruce [EMAIL PROTECTED] wrote:

 A little late, but thanks for the replies; they were very useful. Here's
 what I do in this case:

 import os

 def search(a_dir):
     valid_dirs = []
     walker = os.walk(a_dir)
     while 1:
         try:
             dirpath, dirnames, filenames = next(walker)
         except StopIteration:
             break
         if dirtest(dirpath):
             valid_dirs.append(dirpath)
     return valid_dirs

 def dirtest(a_dir):
     testfiles = ['a','b','c']
     for f in testfiles:
         if not os.path.exists(os.path.join(a_dir,f)):
             return 0
     return 1

 I think you're right - it's not os.walk that makes this slow, it's the
 dirtest function that takes so much more time when there are many files
 in a directory. Also, thanks for pointing me to the path module; it was
 interesting.

Umm, may I point out that you don't NEED the os.path.exists call, because
you are already being HANDED a list of all the filenames in that directory?
You could dirtest with this much faster routine:

def dirtest(a_dir,filenames):
    for f in ['a','b','c']:
        if f not in filenames:
            return 0
    return 1

-- 
- Tim Roberts, [EMAIL PROTECTED]
  Providenza & Boekelheide, Inc.


Re: Is there an alternative to os.walk?

2006-10-07 Thread hanumizzle
On 10/8/06, Tim Roberts [EMAIL PROTECTED] wrote:

 Umm, may I point out that you don't NEED the os.path.exists call, because
 you are already being HANDED a list of all the filenames in that directory?
 You could dirtest with this much faster routine:

 def dirtest(a_dir,filenames):
     for f in ['a','b','c']:
         if f not in filenames:
             return 0
     return 1

Or False / True for sufficiently new versions of Python. :)
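
For what it's worth, a boolean version might look like the sketch below.
It uses the built-in all() with a generator expression, which is my own
substitution rather than anything posted in the thread; the a_dir parameter
is kept only so the signature matches the routine quoted above.

```python
def dirtest(a_dir, filenames):
    # a_dir is unused here; it stays only to match the quoted signature.
    # all() stops at the first missing name and returns a real bool.
    return all(f in filenames for f in ['a', 'b', 'c'])
```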

-- Theerasak


Is there an alternative to os.walk?

2006-10-04 Thread Bruce
Hi all,
I have a question about traversing file systems, and could use some
help. Because of directories with many files in them, os.walk appears
to be rather slow. I'm thinking there is a potential for speed-up since
I don't need os.walk to report filenames of all the files in every
directory it visits. Is there some clever way to use os.walk or another
tool that would provide functionality like os.walk except for the
listing of the filenames?



Re: Is there an alternative to os.walk?

2006-10-04 Thread Irmen de Jong
Bruce wrote:
 Hi all,
 I have a question about traversing file systems, and could use some
 help. Because of directories with many files in them, os.walk appears
 to be rather slow. 

Provide more info/code. I suspect it is not os.walk itself that is slow,
but rather the code that processes its result...
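
One way to check that suspicion is to time the bare walk against the walk
plus the per-directory work. The following is only a sketch: time_walk and
exists_test are hypothetical names, the 'a', 'b', 'c' files are
placeholders, and the tiny temporary tree is there just so the snippet runs
anywhere. Point it at the real directory tree to get meaningful numbers.

```python
import os
import tempfile
import time

def time_walk(top, per_dir=None):
    """Walk top and return elapsed seconds; per_dir runs once per directory."""
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(top):
        if per_dir is not None:
            per_dir(dirpath, filenames)
    return time.perf_counter() - start

def exists_test(dirpath, filenames):
    # Hypothetical per-directory work: one filesystem call per test file.
    return all(os.path.exists(os.path.join(dirpath, f)) for f in ('a', 'b', 'c'))

# Build a tiny throwaway tree so the sketch is self-contained.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'sub'))
    for name in ('a', 'b', 'c'):
        open(os.path.join(top, 'sub', name), 'w').close()
    print('walk only:      %.6f s' % time_walk(top))
    print('walk plus test: %.6f s' % time_walk(top, exists_test))
```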

 I'm thinking there is a potential for speed-up since
 I don't need os.walk to report filenames of all the files in every
 directory it visits. Is there some clever way to use os.walk or another
 tool that would provide functionality like os.walk except for the
 listing of the filenames?

You may want to take a look at os.path.walk then.

--Irmen


Re: Is there an alternative to os.walk?

2006-10-04 Thread waylan
Bruce wrote:
 Hi all,
 I have a question about traversing file systems, and could use some
 help. Because of directories with many files in them, os.walk appears
 to be rather slow. I'm thinking there is a potential for speed-up since
 I don't need os.walk to report filenames of all the files in every
 directory it visits. Is there some clever way to use os.walk or another
 tool that would provide functionality like os.walk except for the
 listing of the filenames?

You might want to check out the path module [1] (not os.path). The
following is from the docs:

 The method path.walk() returns an iterator which steps recursively
 through a whole directory tree. path.walkdirs() and path.walkfiles()
 are the same, but they yield only the directories and only the files,
 respectively.

Oh, and you can thank Paul Bissex for pointing me to path [2].

[1]: http://www.jorendorff.com/articles/python/path/
[2]: http://e-scribe.com/news/289
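
If installing the path module is not an option, a directories-only walk
can be approximated with the standard library. The sketch below assumes
Python 3.6 or later and uses os.scandir, which is not mentioned anywhere in
this thread. The name walkdirs is borrowed from the docs quoted above, but
this is not the path module's implementation. The operating system still
reads every directory entry; the saving is only that no filename lists are
built or returned. Unreadable directories will raise an error here rather
than being skipped.

```python
import os

def walkdirs(top):
    """Yield every directory below top, recursively, without file lists."""
    with os.scandir(top) as entries:
        # Keep only real subdirectories; symlinked directories are skipped.
        subdirs = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
    for sub in subdirs:
        yield sub
        yield from walkdirs(sub)
```

For example, `for d in walkdirs('.'): print(d)` would print each directory
under the current one.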
