Charles-François Natali neolo...@free.fr added the comment:
It would be nice if the documentation of fwalk() explained why you would
want to use it over walk().
How does the attached patch look?
--
Added file: http://bugs.python.org/file25550/fwalk-doc.diff
Benjamin Peterson benja...@python.org added the comment:
2012/5/12 Charles-François Natali rep...@bugs.python.org:
Charles-François Natali neolo...@free.fr added the comment:
It would be nice if the documentation of fwalk() explained why you would
want to use it over walk().
How does the
Benjamin Peterson benja...@python.org added the comment:
It would be nice if the documentation of fwalk() explained why you would want
to use it over walk().
--
nosy: +benjamin.peterson
___
Python tracker rep...@bugs.python.org
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 773a97b3927d by Charles-François Natali in branch 'default':
Issue #13734: Add os.fwalk(), a directory walking function yielding file
http://hg.python.org/cpython/rev/773a97b3927d
--
nosy: +python-dev
Charles-François Natali neolo...@free.fr added the comment:
Committed, thanks for the comments.
Note to myself (and others that might be interested in the O(1)) version):
we can't simply call openat(dirfd, .., O_RDONLY) to re-open the current
directory's file descriptor after having walked a
Charles-François Natali neolo...@free.fr added the comment:
I think the O(depth) version is fine. The O(1) version is quite more
complicated, difficult to follow, and it seems less robust (it doesn't
use try/finally and therefore might leak fds if the generator isn't
exhausted before being
Nick Coghlan ncogh...@gmail.com added the comment:
I'm also a fan of using the simpler approach unless/until we have solid
evidence that the file descriptor limit could be a problem in practice.
A comment in the code mentioning the concern, along with the fact that there's
an alternate
Nick Coghlan ncogh...@gmail.com added the comment:
Hmm, given the various *at() APIs that sort alphabetically next to their path
based counterparts, perhaps we can make the naming consistency change go the
other way? (i.e. listdirfd() and walkfd()). Even if POSIX puts the fd at the
front, do
Nick Coghlan ncogh...@gmail.com added the comment:
Taking a closer look at the current naming scheme in the os module, fdopen()
appears to be the only current function that uses the 'fd' prefix. All the
other operations that accept a file descriptor just use 'f' as the prefix
(fchmod, fchown,
Antoine Pitrou pit...@free.fr added the comment:
There's something I don't understand in the patch: why does _are_same_file
examine st_mode?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
Charles-François Natali neolo...@free.fr added the comment:
Given that, flistdir() and fwalk() seem like the most consistent choices of
name for APIs that aren't directly
matching an underlying POSIX function name.
Well, that seems OK for me.
I guess the only reason fdlistdir() is named
Antoine Pitrou pit...@free.fr added the comment:
One other thing: is it deliberate to silence errors in the following snippet?
+try:
+names = fdlistdir(topfd)
+except error as err:
+if onerror is not None:
+onerror(err)
+return
Charles-François Natali neolo...@free.fr added the comment:
It's to mimic os.walk()'s behavior:
http://hg.python.org/cpython/file/bf31815548c9/Lib/os.py#l268
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
Charles-François Natali neolo...@free.fr added the comment:
Here are two new versions, both addressing Antoine's and Nick's comments:
- fwalk-3.diff is just an updated version
- fwalk-single_fd.diff doesn't use more than 2 FDs to walk a directory
tree, instead of the depth of directory tree.
Antoine Pitrou pit...@free.fr added the comment:
I was a little worried about the performance impact, so I did some
trivial benchmarks:
- O(depth) fwalk() is actually a tiny bit faster than walk() (it may
be because we don't do as much path lookup)
- O(1) fwalk() is around 20% slower, on a
Charles-François Natali neolo...@free.fr added the comment:
Here's an updated version.
Note that I'm not pushing towards changing the current behavior
pertaining to symlinks to directories, because if we change this, this
will break code.
For example to count the number of lines of all the
Changes by Charles-François Natali neolo...@free.fr:
Removed file: http://bugs.python.org/file24176/walkfd.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
___
Changes by Charles-François Natali neolo...@free.fr:
Removed file: http://bugs.python.org/file24197/fdwalk.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
___
Changes by Charles-François Natali neolo...@free.fr:
Removed file: http://bugs.python.org/file24202/fdwalk-1.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
___
Changes by Charles-François Natali neolo...@free.fr:
Added file: http://bugs.python.org/file24211/fdwalk-2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
___
Charles-François Natali neolo...@free.fr added the comment:
Here's a patch with tests and documentation.
I noticed something surprising:
walk() with followlinks=False returns links to directories as
directories (in dirnames).
I find this surprising, since if you don't follow symlinks, those are
Changes by Jesús Cea Avión j...@jcea.es:
--
nosy: +jcea
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13734
___
___
Python-bugs-list mailing list
Nick Coghlan ncogh...@gmail.com added the comment:
OK, os.walkfd is sounding good:
- accepts a file descriptor, byte sequence or string for top
- produces 4-tuples, with the dirfd added at the end
- documents clearly that the dirfd is normally only valid until the next
iteration step, so you
Nick Coghlan ncogh...@gmail.com added the comment:
Since walkdir is currently entirely based on returning filesystem paths as
strings (just like os.walk()) and hence shares the pervasive symlink attack
vulnerability, I'm particularly interested in the question of whether or not
the various
Nick Coghlan ncogh...@gmail.com added the comment:
Another, possibly better, alternative would be to produce a tuple-subclass that
adds a separate dirfd attribute to the (dirpath, subdirs, files) triple.
I'll stop talking about the walkdir implications here. Instead, I've created a
Antoine Pitrou pit...@free.fr added the comment:
Since walkdir is currently entirely based on returning filesystem
paths as strings (just like os.walk()) and hence shares the pervasive
symlink attack vulnerability, I'm particularly interested in the
question of whether or not the various *at
Charles-François Natali neolo...@free.fr added the comment:
Here's a possible walkfd() implementation.
Example:
$ cat /home/cf/testwalkfd.py
import os
import sys
topfd = os.open(sys.argv[1], os.O_RDONLY)
for rootfd, dirs, files in os.walkfd(topfd):
print(rootfd, dirs, files)
$ ./python
Antoine Pitrou pit...@free.fr added the comment:
Also be aware that symlinks mean sometimes you won't have a dirfd: if
you have a symlink that points to another directory, you can't open that
directory using openat from the symlink's directory. So if you follow
symlinks (or have an
Nick Coghlan ncogh...@gmail.com added the comment:
Thanks for that Charles-François - do you mind if I adapt that for walkdir?
The changes I would make are basically those that Antoine pointed out:
- rather than replacing the dirpath entry, instead yield a 4-tuple that appends
the dirfd
Ross Lagerwall rosslagerw...@gmail.com added the comment:
I'm currently leaning towards the simple 4-tuple approach
I would also take that approach. It seems simplest to me.
--
___
Python tracker rep...@bugs.python.org
New submission from Hynek Schlawack h...@ox.cx:
This is an offspring of #4489 (which is a security bug). The method is AFAIU
intended to be private.
As shown in the discussion of the mentioned #4489, there is a whole family of
attacks that exploit the time window between gathering path names
Nick Coghlan ncogh...@gmail.com added the comment:
I'm working on a library of general directory walking tools that will hopefully
make their way back into the stdlib at some point
(http://walkdir.readthedocs.org).
They're designed to filter and transform the output of os.walk (and similar
Ross Lagerwall rosslagerw...@gmail.com added the comment:
Has there already been done any work? Ross mentioned he wanted to take a stab?
Unfortunately, I'm rather busy at the moment but when I get some free time and
if no one else wants to work on it then I'll take a look.
--
33 matches
Mail list logo