Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Victor Stinner
Modifying os.listdir() to use os.scandir() is not part of the PEP; you should
not do that. If you are worried about performance, try implementing my free
list idea.

You may modify the C code of listdir() to share as much code as possible. I
mean you can implement your idea in C.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Akira Li
Ben Hoyt benh...@gmail.com writes:

 I think if I were doing this from scratch I'd reimplement listdir() in
 Python as return [e.name for e in scandir(path)].
...
 So my basic plan is to have an internal helper function in
 posixmodule.c that either yields DirEntry objects or strings. And then
 listdir() would simply be defined something like return
 list(_scandir(path, yield_strings=True)) in C or in Python.

 My reasoning is that then there'll be much less (if any) code
 duplication between scandir() and listdir().

 Does this sound like a reasonable approach?

Note: listdir() accepts an integer path (an open file descriptor that
refers to a directory) that is passed to fdopendir() on POSIX [4] i.e.,
*you can't use scandir() to replace listdir() in this case* (as I've
already mentioned in [1]). See the corresponding tests from [2].

[1] https://mail.python.org/pipermail/python-dev/2014-July/135296.html
[2] https://mail.python.org/pipermail/python-dev/2014-June/135265.html

From os.listdir() docs [3]:

 This function can also support specifying a file descriptor; the file
 descriptor must refer to a directory.

[3] https://docs.python.org/3.4/library/os.html#os.listdir
[4] http://hg.python.org/cpython/file/3.4/Modules/posixmodule.c#l3736


--
Akira



Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Ben Hoyt
 Note: listdir() accepts an integer path (an open file descriptor that
 refers to a directory) that is passed to fdopendir() on POSIX [4] i.e.,
 *you can't use scandir() to replace listdir() in this case* (as I've
 already mentioned in [1]). See the corresponding tests from [2].

 [1] https://mail.python.org/pipermail/python-dev/2014-July/135296.html
 [2] https://mail.python.org/pipermail/python-dev/2014-June/135265.html

 From os.listdir() docs [3]:

 This function can also support specifying a file descriptor; the file
 descriptor must refer to a directory.

 [3] https://docs.python.org/3.4/library/os.html#os.listdir
 [4] http://hg.python.org/cpython/file/3.4/Modules/posixmodule.c#l3736

Fair point.

Yes, I hadn't realized listdir supported dir_fd (must have been
looking at 2.x docs), though you've pointed it out at [1] above, and I
guess I wasn't thinking about implementation at the time.

It would be easy enough (I think) to have the helper function support
both, but raise an error in the scandir() function if the type of path
is an integer.
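
A rough Python sketch of that shape, purely for illustration: the names
_iter_dir, listdir and scandir below are hypothetical stand-ins for the C
helper and its two callers, and os.scandir() is used here only as a
convenient way to read the directory once. The point is just that one
helper does the read, listdir() asks it for strings, and scandir() rejects
integer paths before delegating.

```python
import os


def _iter_dir(path, yield_strings=False):
    """Hypothetical shared helper: yield names (str) or DirEntry objects."""
    with os.scandir(path) as entries:
        for entry in entries:
            # One directory read serves both callers; only the yielded
            # value differs.
            yield entry.name if yield_strings else entry


def listdir(path="."):
    """listdir() expressed in terms of the shared helper."""
    return list(_iter_dir(path, yield_strings=True))


def scandir(path="."):
    """scandir() wrapper that refuses integer paths up front."""
    if isinstance(path, int):
        # Raised here, not inside the generator, so the caller sees the
        # error immediately rather than on the first next().
        raise TypeError("scandir() does not accept a file descriptor")
    return _iter_dir(path)
```

Because scandir() is a plain function that returns the generator, the
integer check fires at call time instead of being deferred to iteration.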

However, given that we have to support this for listdir() anyway, I
think it's worth reconsidering whether scandir()'s directory argument
can be an integer FD. Given that listdir() already supports it, it
will almost certainly be asked for later anyway for someone who's
porting some listdir code that uses an FD. Thoughts, Victor?

-Ben


Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Victor Stinner
2014-07-22 17:52 GMT+02:00 Ben Hoyt benh...@gmail.com:
 However, given that we have to support this for listdir() anyway, I
 think it's worth reconsidering whether scandir()'s directory argument
 can be an integer FD. Given that listdir() already supports it, it
 will almost certainly be asked for later anyway for someone who's
 porting some listdir code that uses an FD. Thoughts, Victor?

Please focus on what was accepted in the PEP. We should first test
os.scandir(). In a few months, with better feedback, we can consider
extending os.scandir() to support a file descriptor. There are
different issues which should be discussed and decided to implement it
(ex: handle the lifetime of the directory file descriptor).

Victor


Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Nick Coghlan
On 23 Jul 2014 02:18, Victor Stinner victor.stin...@gmail.com wrote:

 2014-07-22 17:52 GMT+02:00 Ben Hoyt benh...@gmail.com:
  However, given that we have to support this for listdir() anyway, I
  think it's worth reconsidering whether scandir()'s directory argument
  can be an integer FD. Given that listdir() already supports it, it
  will almost certainly be asked for later anyway for someone who's
  porting some listdir code that uses an FD. Thoughts, Victor?

 Please focus on what was accepted in the PEP. We should first test
 os.scandir(). In a few months, with better feedback, we can consider
 extending os.scandir() to support a file descriptor. There are
 different issues which should be discussed and decided to implement it
 (ex: handle the lifetime of the directory file descriptor).

As Victor suggests, getting the core version working and incorporated first
is a good way to go. Future enhancements (like accepting a file descriptor)
and refactorings (like eliminating the code duplication with listdir) don't
need to (and hence shouldn't) go into the initial patch.

Cheers,
Nick.


 Victor


[Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!

2014-07-22 Thread Alex Gaynor
Hi all,

I've been happily working on the SSL module backports for Python2 (pursuant to
PEP466), and I've hit something of a snag:

In python3, the SSLSocket keeps a weak reference to the underlying socket,
rather than a strong reference, as Python2 uses.

Unfortunately, due to the way sockets work in Python2, this doesn't work:

On Python2, _socketobject composes around _real_socket from the _socket module,
whereas on Python3, it subclasses _socket.socket. Since you now have a Python-
level class, you can weak reference it.

The question is:

a) Should we backport weak referencing _socket.sockets (changing the structure
   of the module seems overly invasive, albeit completely backwards
   compatible)?
b) Does anyone know why weak references are used in the first place? The commit
   message just alludes to fixing a leak with no reference to an issue.

Anyone who's interested in the state of the branch can see it at:
github.com/alex/cpython on the backport-ssl branch. Note that many many tests
are still failing, and you'll need to apply the patch from
http://bugs.python.org/issue22023 to get it to work.

Thanks,
Alex

PS: Any help in getting http://bugs.python.org/issue22023 landed would be very
much appreciated.



Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Ben Hoyt
Makes sense, thanks. -Ben

On Tue, Jul 22, 2014 at 4:57 PM, Nick Coghlan ncogh...@gmail.com wrote:

 On 23 Jul 2014 02:18, Victor Stinner victor.stin...@gmail.com wrote:

 2014-07-22 17:52 GMT+02:00 Ben Hoyt benh...@gmail.com:
  However, given that we have to support this for listdir() anyway, I
  think it's worth reconsidering whether scandir()'s directory argument
  can be an integer FD. Given that listdir() already supports it, it
  will almost certainly be asked for later anyway for someone who's
  porting some listdir code that uses an FD. Thoughts, Victor?

 Please focus on what was accepted in the PEP. We should first test
 os.scandir(). In a few months, with better feedback, we can consider
 extending os.scandir() to support a file descriptor. There are
 different issues which should be discussed and decided to implement it
 (ex: handle the lifetime of the directory file descriptor).

 As Victor suggests, getting the core version working and incorporated first
 is a good way to go. Future enhancements (like accepting a file descriptor)
 and refactorings (like eliminating the code duplication with listdir) don't
 need to (and hence shouldn't) go into the initial patch.

 Cheers,
 Nick.


 Victor




Re: [Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!

2014-07-22 Thread Antoine Pitrou

Le 22/07/2014 17:03, Alex Gaynor a écrit :


The question is:

a) Should we backport weak referencing _socket.sockets (changing the structure
of the module seems overly invasive, albeit completely backwards
compatible)?
b) Does anyone know why weak references are used in the first place? The commit
message just alludes to fixing a leak with no reference to an issue.


Because :
- the SSLSocket has a strong reference to the ssl object (self._sslobj)
- self._sslobj having a strong reference to the SSLSocket would mean 
both would only get destroyed on a GC collection


I assume that's what leak means here :-)
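
To make the cycle argument concrete, here is a minimal sketch. It assumes
CPython's reference counting, and the Sock and SSLObj classes are
hypothetical stand-ins, not the real ssl module types. With a strong
back-reference the pair survives until the cyclic GC runs; with a weak
back-reference the outer object is freed as soon as its last reference
goes away.

```python
import gc
import weakref


class SSLObj:
    """Stand-in for the inner ssl object; holds a back-reference to its owner."""

    def __init__(self, owner, weak):
        # Either a weak or a strong reference back to the owning socket.
        self.owner = weakref.ref(owner) if weak else owner


class Sock:
    """Stand-in for SSLSocket; holds a strong reference to the inner object."""

    def __init__(self, weak):
        self._sslobj = SSLObj(self, weak)


def freed_without_gc(weak):
    """Return True if a Sock is freed by refcounting alone (no GC pass)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector out of the picture
    try:
        sock = Sock(weak)
        probe = weakref.ref(sock)
        del sock
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()


print("weak back-reference freed immediately:", freed_without_gc(True))
print("strong back-reference freed immediately:", freed_without_gc(False))
```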

As for 2.x, I don't see why you couldn't just continue using a strong 
reference.


Regards

Antoine.




Re: [Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!

2014-07-22 Thread Nick Coghlan
On 23 Jul 2014 07:28, Antoine Pitrou anto...@python.org wrote:

 Le 22/07/2014 17:03, Alex Gaynor a écrit :


 The question is:

 a) Should we backport weak referencing _socket.sockets (changing the
structure
 of the module seems overly invasive, albeit completely backwards
 compatible)?
 b) Does anyone know why weak references are used in the first place? The
commit
 message just alludes to fixing a leak with no reference to an issue.


 Because :
 - the SSLSocket has a strong reference to the ssl object (self._sslobj)
 - self._sslobj having a strong reference to the SSLSocket would mean both
would only get destroyed on a GC collection

 I assume that's what leak means here :-)

 As for 2.x, I don't see why you couldn't just continue using a strong
reference.

As Antoine says, if the cycle already exists in Python 2 (and it sounds
like it does), we can just skip backporting the weak reference change.

I'll also give the Fedora Python list a heads up about your repo to see if
anyone there can help you with the backport.

Cheers,
Nick.


 Regards

 Antoine.





Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Victor Stinner
2014-07-22 4:27 GMT+02:00 Ben Hoyt benh...@gmail.com:
 The PEP is accepted.

 Superb. Could you please update the PEP with the Resolution and
 BDFL-Delegate fields?

Done.

Victor


Re: [Python-Dev] [PEP466] SSLSockets, and sockets, _socketobjects oh my!

2014-07-22 Thread Antoine Pitrou

Le 22/07/2014 17:44, Nick Coghlan a écrit :


 
  As for 2.x, I don't see why you couldn't just continue using a strong
reference.

As Antoine says, if the cycle already exists in Python 2 (and it sounds
like it does), we can just skip backporting the weak reference change.


No, IIRC there shouldn't be a cycle. It's just complicated in a 
different way than 3.x :-)


Regards

Antoine.




Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Akira Li
Ben Hoyt benh...@gmail.com writes:

 Note: listdir() accepts an integer path (an open file descriptor that
 refers to a directory) that is passed to fdopendir() on POSIX [4] i.e.,
 *you can't use scandir() to replace listdir() in this case* (as I've
 already mentioned in [1]). See the corresponding tests from [2].

 [1] https://mail.python.org/pipermail/python-dev/2014-July/135296.html
 [2] https://mail.python.org/pipermail/python-dev/2014-June/135265.html

 From os.listdir() docs [3]:

 This function can also support specifying a file descriptor; the file
 descriptor must refer to a directory.

 [3] https://docs.python.org/3.4/library/os.html#os.listdir
 [4] http://hg.python.org/cpython/file/3.4/Modules/posixmodule.c#l3736

 Fair point.

 Yes, I hadn't realized listdir supported dir_fd (must have been
 looking at 2.x docs), though you've pointed it out at [1] above. and I
 guess I wasn't thinking about implementation at the time.

FYI, dir_fd is related but *different*: compare specifying a file
descriptor [1] vs. paths relative to directory descriptors [2].

NOTE: os.supports_fd and os.supports_dir_fd are different sets. [3]:

  >>> import os
  >>> os.listdir in os.supports_fd
  True
  >>> os.listdir in os.supports_dir_fd
  False


[1] https://docs.python.org/3/library/os.html#path-fd
[2] https://docs.python.org/3/library/os.html#dir-fd
[3] https://mail.python.org/pipermail/python-dev/2014-July/135296.html

To be clear: *listdir() does not support dir_fd* though it can be
emulated using os.open(dir_fd=..).

You can safely ignore the rest of the e-mail until you want to implement
path-fd [1] support for os.scandir() in several months.

Here's a code example that demonstrates both path-fd [1] and dir-fd [2]:

  import contextlib
  import os

  with contextlib.ExitStack() as stack:
      dir_fd = os.open('/etc', os.O_RDONLY)
      stack.callback(os.close, dir_fd)
      fd = os.open('init.d', os.O_RDONLY, dir_fd=dir_fd) # dir-fd [2]
      stack.callback(os.close, fd)
      print("\n".join(os.listdir(fd))) # path-fd [1]

It is the same as os.listdir('/etc/init.d') unless '/etc' is symlinked
to refer to another directory after the first os.open('/etc',..)
call. See also, os.fwalk(dir_fd=..) [4]

[4] https://docs.python.org/3/library/os.html#os.fwalk

 However, given that we have to support this for listdir() anyway, I
 think it's worth reconsidering whether scandir()'s directory argument
 can be an integer FD.

What is entry.path in this case? If input directory is a file descriptor
(an integer) then os.path.join(directory, entry.name) won't work.
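
A small illustration of the problem, as a sketch only: entry_path is a
hypothetical helper that builds the path the way the PEP describes
(joining the directory argument with the entry name), and 3 merely stands
in for an open directory descriptor.

```python
import os


def entry_path(directory, name):
    """Build an entry path by joining the scanned directory and the name."""
    return os.path.join(directory, name)


# A string directory joins as expected.
print(entry_path("/etc", "hosts"))

# An integer "directory" cannot be joined: os.path.join() raises TypeError.
try:
    entry_path(3, "hosts")
except TypeError as exc:
    print("cannot build a path from a descriptor:", exc)
```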

PEP 471 should explicitly reject support for specifying a file
descriptor so that code that uses os.scandir may assume that the
entry.path attribute is always present (no exceptions due
to a failure to read /proc/self/fd/NNN or an error while calling
fcntl(F_GETPATH) or GetFileInformationByHandleEx() -- see
http://stackoverflow.com/q/1188757 ). [5]

[5] https://mail.python.org/pipermail/python-dev/2014-July/135441.html

On the other hand os.fwalk() [4] that supports both path-fd [1] and
dir-fd [2] could be implemented without entry.path property if
os.scandir() supports just path-fd [1]. os.fwalk() provides a safe way
to traverse a directory tree without symlink races e.g., [6]:

  def get_tree_size(directory):
      """Return total size of files in directory and subdirs."""
      return sum(entry.lstat().st_size
                 for root, dirs, files, rootfd in fwalk(directory)
                 for entry in files)

[6] http://legacy.python.org/dev/peps/pep-0471/#examples

where fwalk() is the exact copy of os.fwalk() except that it uses
_fwalk() which is defined in terms of scandir():

  import os

  # adapt os._fwalk() to use scandir() instead of os.listdir()
  def _fwalk(topfd, toppath, topdown, onerror, follow_symlinks):
      # Note: This uses O(depth of the directory tree) file descriptors:
      # if necessary, it can be adapted to only require O(1) FDs, see
      # http://bugs.python.org/issue13734

      entries = scandir(topfd)
      dirs, nondirs = [], []
      for entry in entries: #XXX call onerror on OSError on next() and return?
          # report symlinks to directories as directories (like os.walk)
          #  but no recursion into symlinked subdirectories unless
          #  follow_symlinks is true

          # add dangling symlinks as nondirs (DirEntry.is_dir() doesn't
          #  raise on broken links)
          try:
              (dirs if entry.is_dir() else nondirs).append(entry)
          except FileNotFoundError:
              continue # ignore disappeared files

      if topdown:
          yield toppath, dirs, nondirs, topfd

      for entry in dirs:
          try:
              orig_st = entry.stat(follow_symlinks=follow_symlinks)
              #XXX O_DIRECTORY, O_CLOEXEC, [? O_NOCTTY, O_SEARCH ?]
              dirfd = os.open(entry.name, os.O_RDONLY, dir_fd=topfd)
          except OSError as err:
              if onerror is not None:
  

Re: [Python-Dev] PEP 471 scandir accepted

2014-07-22 Thread Antoine Pitrou

Le 21/07/2014 18:26, Victor Stinner a écrit :


I'm happy because the final API is very close to os.path functions and
pathlib.Path methods. Python stays consistent, which is a great power
of this language!


By the way, http://bugs.python.org/issue19767 could benefit too.

Regards

Antoine.

