On 04:25 pm, eric.pru...@gmail.com wrote:
I'm bumping this PEP again in hopes of getting some feedback.
Thanks,
Eric
On Tue, Sep 8, 2009 at 23:52, Eric Pruitt <eric.pru...@gmail.com> wrote:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to
dead-locking and blocking of the parent Python script while waiting on
data from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous
accounts of people wanting to execute a child process and communicate
with it from time to time, reading only the data that is available
instead of blocking to wait for the program to produce data [1] [2] [3].
The current behavior of the subprocess module is that when a user sends
or receives data via the stdin, stderr and stdout file objects, deadlocks
are common and documented [4] [5]. While communicate can be used to
alleviate some of the buffering issues, it will still cause the parent
process to block while attempting to read data when none is available to
be read from the child process.
Rationale:
There is a documented need for asynchronous, non-blocking functionality
in subprocess.Popen [6] [7] [2] [3]. Including the code would improve the
utility of the Python standard library on both Unix-based and Windows
builds of Python. Practically every I/O object in Python has a file-like
wrapper of some sort. Sockets already act as such and for strings there
is StringIO. Popen can be made to act like a file by simply using the
methods attached to the subprocess.Popen.stderr, stdout and stdin
file-like objects. But when using the read and write methods of those
objects, you do not have the benefit of asynchronous I/O. In the proposed
solution the wrapper wraps the asynchronous methods to mimic a file
object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my
changes including tests and documentation [9] as well as a blog detailing
the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the
subprocess.Popen class as well as a wrapper class for subprocess.Popen
that makes it so that an executed process can take the place of a file by
duplicating all of the methods and attributes that file objects have.
"Non-blocking" and "asynchronous" are actually two different things.
From the rest of this PEP, I think only a non-blocking API is being
introduced. I haven't looked beyond the PEP, though, so I might be
missing something.
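
One way to make the distinction concrete is the sketch below. It is not
taken from the PEP or its reference implementation: the CHILD command and
both function names are made up for illustration, the non-blocking half
assumes a Unix system with os.set_blocking, and the asynchronous half
uses asyncio, which postdates this thread. A non-blocking read returns
immediately whether or not data is ready; an asynchronous read is
requested now and its result is delivered later, when the data arrives.

```python
import asyncio
import os
import subprocess
import sys

# Hypothetical child: silent for half a second, then writes one line.
CHILD = [sys.executable, "-c",
         "import time; time.sleep(0.5); print('hi', flush=True)"]


def nonblocking_attempt():
    """Non-blocking: the read returns at once, with or without data."""
    proc = subprocess.Popen(CHILD, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    os.set_blocking(fd, False)          # Unix only in this sketch
    try:
        data = os.read(fd, 4096)
    except BlockingIOError:
        data = b""  # nothing ready yet; the caller has to try again later
    proc.kill()
    proc.wait()
    proc.stdout.close()
    return data


async def asynchronous_read():
    """Asynchronous: the read is requested now and completes later."""
    proc = await asyncio.create_subprocess_exec(
        *CHILD, stdout=asyncio.subprocess.PIPE)
    data = await proc.stdout.read(4096)  # other tasks may run meanwhile
    await proc.wait()
    return data
```

Calling nonblocking_attempt() right after the child starts should yield
no data, because the child has not written anything yet, while
asyncio.run(asynchronous_read()) should yield the child's line once it
has been written.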
There are two base functions that have been added to the subprocess.Popen
class: Popen.send and Popen._recv, each with two separate
implementations, one for Windows and one for Unix-based systems. The
Windows implementation uses ctypes to access the functions needed to
control pipes in the kernel32 DLL in an asynchronous manner. On
Unix-based systems, the Python interface for file control serves the same
purpose. The different implementations of Popen.send and Popen._recv have
identical arguments to make code that uses these functions work across
multiple platforms.
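
For the Unix side, "the Python interface for file control" presumably
means the fcntl module. The following is a minimal sketch of how a
non-blocking receive could be built on it. The function name recv_some
and its maxsize parameter are hypothetical and not taken from the PEP's
reference implementation, and the Windows branch (ctypes calls into
kernel32) is not shown.

```python
import fcntl
import os
import subprocess
import sys


def recv_some(pipe, maxsize=1024):
    """Read up to maxsize bytes from pipe without blocking (Unix only).

    Returns b"" both when no data is ready and at end of file, so a
    caller cannot tell the two cases apart from the result alone.
    """
    fd = pipe.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    # Temporarily switch the descriptor to non-blocking mode.
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    try:
        return os.read(fd, maxsize)
    except BlockingIOError:
        return b""  # the child has not produced anything yet
    finally:
        # Restore the original flags so other readers are unaffected.
        fcntl.fcntl(fd, fcntl.F_SETFL, flags)


if __name__ == "__main__":
    # Hypothetical usage: poll a child that never writes to stdout.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(1)"],
        stdout=subprocess.PIPE)
    print(recv_some(child.stdout))  # returns immediately
    child.kill()
    child.wait()
    child.stdout.close()
```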
Why does the method for non-blocking read from a pipe start with an "_"?
This is the convention (widely used) for a private API. The name also
doesn't suggest that this is the non-blocking version of reading.
Similarly, the name "send" doesn't suggest that this is the non-blocking
version of writing.
The Popen._recv function requires the pipe name to be passed as an
argument, so there is also a Popen.recv function that selects stdout as
the pipe for Popen._recv by default. Popen.recv_err selects stderr as the
pipe by default. "Popen.recv" and "Popen.recv_err" are much easier to
read and understand than "Popen._recv('stdout' ..." and
"Popen._recv('stderr' ..." respectively.
What about reading from other file descriptors? subprocess.Popen allows
arbitrary file descriptors to be used. Is there any provision here for
reading and writing non-blocking from or to those?
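
The layering the PEP describes (recv and recv_err delegating to _recv
with a pipe name) could look roughly like the sketch below. Only the
three method names and the 'stdout'/'stderr' arguments come from the PEP
text; the class name NonBlockingPopen, the maxsize parameter, and the
Unix-only use of os.set_blocking are assumptions. It also makes the
limitation raised in the question visible: only pipes reachable as Popen
attributes can be read this way, not arbitrary file descriptors.

```python
import os
import subprocess


class NonBlockingPopen(subprocess.Popen):
    """Hypothetical subclass; not the PEP's reference implementation."""

    def _recv(self, which, maxsize=1024):
        # `which` is the attribute name of the pipe: "stdout" or "stderr".
        pipe = getattr(self, which)
        if pipe is None:
            return None  # that stream was not redirected to a pipe
        fd = pipe.fileno()
        was_blocking = os.get_blocking(fd)
        os.set_blocking(fd, False)  # Unix only in this sketch
        try:
            return os.read(fd, maxsize)
        except BlockingIOError:
            return b""  # no data available right now
        finally:
            os.set_blocking(fd, was_blocking)

    def recv(self, maxsize=1024):
        """Non-blocking read from the child's stdout."""
        return self._recv("stdout", maxsize)

    def recv_err(self, maxsize=1024):
        """Non-blocking read from the child's stderr."""
        return self._recv("stderr", maxsize)
```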
Since the Popen._recv function does not wait on data to be produced
before returning a value, it may return empty bytes. Popen.asyncread
handles this issue by returning all data read over a given time interval.
Oh. Popen.asyncread? What's that? This is the first time the PEP
mentions it.
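
Since the PEP text gives no signature for Popen.asyncread, the following
is only a guess at what "returning all data read over a given time
interval" might mean in practice. The function name read_for and its
parameters are hypothetical, and the sketch relies on select working on
pipes, which holds on Unix but not on Windows.

```python
import os
import select
import time


def read_for(pipe, interval=0.2, chunk=4096):
    """Collect everything that arrives on pipe within interval seconds.

    Stops early at end of file. Returns the collected bytes, possibly b"".
    """
    fd = pipe.fileno()
    deadline = time.monotonic() + interval
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # the interval is used up
        # Wait until data is readable or the remaining time runs out.
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            break  # timed out with nothing more to read
        data = os.read(fd, chunk)
        if not data:
            break  # end of file: the child closed its end of the pipe
        chunks.append(data)
    return b"".join(chunks)
```

Unlike a single non-blocking read, a call like this trades a bounded wait
for a better chance of returning something useful, which seems to be the
issue the PEP paragraph is describing.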
The ProcessIOWrapper class uses the asyncread and asyncwrite functions to
allow a process to act like a file, avoiding the blocking issues that can
arise from using the stdout and stdin file objects produced by a
subprocess.Popen call.
What's the ProcessIOWrapper class? And what's the asyncwrite function?
Again, this is the first time it's mentioned.
So, to sum up, I think my main comment is that the PEP seems to be
missing a significant portion of the details of what it's actually
proposing. I suspect that this information is present in the
implementation, which I have not looked at, but it probably belongs in
the PEP.
Jean-Paul