Re: subprocess and stdin.write(), stdout.read()

2015-03-24 Thread Nobody
On Tue, 24 Mar 2015 12:08:24 -0700, Tobiah wrote:

> But if I want to send a string to stdin, how can I do that without
> stdin.write()?

p.communicate(string)

> This seems to work:

Only because the amounts of data involved are small enough to avoid
deadlock.

If both sides write more data in one go than will fit into a pipe buffer,
you will get deadlock. The parent will be blocked waiting for the child to
consume the input, which doesn't happen because the child will be blocked
waiting for the parent to consume its output, which doesn't happen because
the parent will be blocked waiting for the child to consume the input, ...

That's a textbook example of deadlock: each side waiting forever for the
other side to make the first move.

This is exactly why POSIX's popen() function lets you either write to stdin
(mode=="w") or read from stdout (mode=="r") but not both.

> Will this always avoid the deadlock problem?

No.

> This also works:

Again, only because the amounts of data involved are small enough to avoid
deadlock.

> Is that vulnerable to deadlock?

Yes.

> Is there a better way to write to and read from the same process?

Use threads; one for each descriptor (stdin, stdout, stderr).

Non-blocking I/O is an alternative (and that's what .communicate() uses on
Unix), but threads will work on all common desktop and server platforms.
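
A minimal sketch of the threaded approach, assuming Python 3 (so the data
is bytes) and a child with only stdin and stdout piped. The helper name
run_with_threads and the sample payload are made up for illustration; if
stderr were piped too, it would need a reader thread of its own.

```python
import subprocess
import threading


def run_with_threads(cmd, data):
    """Feed `data` (bytes) to the child's stdin from a helper thread
    while the calling thread drains the child's stdout."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        try:
            p.stdin.write(data)
            p.stdin.close()  # flushes and signals EOF to the child
        except BrokenPipeError:
            pass  # the child exited without reading all of its input

    writer = threading.Thread(target=feed)
    writer.start()
    out = p.stdout.read()  # blocks until the child closes its stdout
    writer.join()
    p.wait()
    return out


# Roughly 1.8 MB of input, far more than a pipe buffer holds.
payload = b"foo line\n" * 200000
result = run_with_threads(["grep", "foo"], payload)
print(len(result))
```

Because the write and the read happen in different threads, a full stdin
pipe only stalls the writer thread, never the reader.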

If you need to support platforms which lack threads, either

a) have the parent first write to a file (instead of .stdin), then have
the child read from the file while the parent reads .stdout and .stderr,
or

b) have the parent write to .stdin while the child writes its
stdout/stderr to files (or a file). Once the child completes, have
the parent read the file(s).
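
Option (a) might look roughly like this in Python 3. The helper name
run_via_input_file is hypothetical, and the sketch assumes stderr is not
piped, so stdout is the only pipe the parent has to drain.

```python
import subprocess
import tempfile


def run_via_input_file(cmd, data):
    """Write `data` (bytes) to a temporary file, then let the child read
    that file as its stdin while the parent reads the stdout pipe."""
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.seek(0)  # flush and rewind so the child starts at the beginning
        p = subprocess.Popen(cmd, stdin=f, stdout=subprocess.PIPE)
        out = p.stdout.read()  # the only pipe involved, so no deadlock
        p.wait()
    return out


file_payload = b"foo line\n" * 200000
file_result = run_via_input_file(["grep", "foo"], file_payload)
print(len(file_result))
```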

Using files allows for potentially gigabytes of data to be buffered. With
pipes, the amount may be as low as 512 bytes (the minimum value POSIX
allows for PIPE_BUF) and is often not much more: 4096 bytes, i.e. one
"page" on x86, was long typical, though Linux since 2.6.11 defaults to
64 KiB.
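
If you want to see what your own system does, here is a rough way to
measure it, assuming a Unix-like platform and Python 3.5+ (for
os.set_blocking). The function name pipe_capacity is made up; it fills a
pipe that nobody reads until a non-blocking write would block.

```python
import os


def pipe_capacity():
    """Return how many bytes fit into an unread pipe on this system."""
    r, w = os.pipe()
    os.set_blocking(w, False)  # make a full pipe raise instead of hanging
    total = 0
    try:
        while True:
            # 512-byte writes are within PIPE_BUF, so each one is
            # written in full or not at all.
            total += os.write(w, b"x" * 512)
    except BlockingIOError:
        pass  # the pipe is full
    finally:
        os.close(r)
        os.close(w)
    return total


print(pipe_capacity())
```

Whatever number this prints is the most either side can write "in one go"
before it needs the other side to start reading.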

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: subprocess and stdin.write(), stdout.read()

2015-03-24 Thread Chris Kaynor
On Tue, Mar 24, 2015 at 12:08 PM, Tobiah wrote:
> The docs for the subprocess.Popen() say:
>
> Use communicate() rather than .stdin.write, .stdout.read
> or .stderr.read to avoid deadlocks due to any of the other
> OS pipe buffers filling up and blocking the child process
>
> But if I want to send a string to stdin, how can I do that without
> stdin.write()?
>
> This seems to work:
>
> import subprocess as s
>
> thing = """
> hey
> there
> foo man is here
> hey foo
> man is there
> so foo
> """
> p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
> p.stdin.write(thing)
> print p.communicate()
>
> ##
>
> ('\they foo\n \tfoo there\n', None)
>
>
> Will this always avoid the deadlock problem?

What you should do is use "print p.communicate(thing)". That will
always avoid the deadlock issue.
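
For reference, a Python 3 version of that might look like the sketch
below. It assumes grep is on the PATH; the input has to be bytes in
Python 3 unless the pipes are opened in text mode, and the sample text is
your example with the indentation dropped.

```python
import subprocess

thing = b"hey\nthere\nfoo man is here\nhey foo\nman is there\nso foo\n"

p = subprocess.Popen(["grep", "foo"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# communicate() writes the input, closes stdin, and reads stdout to EOF,
# interleaving the two so neither pipe can fill up and stall the other.
out, err = p.communicate(thing)
print(out.decode())
```

Here err is None simply because stderr was not piped.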

Your code MAY deadlock in some cases: the stdin pipe could fill up
completely while the other process, instead of reading it, is blocked
waiting for you to read its output. This means you must be reading from
stdout AND stderr whenever you might be waiting on the process (such as
when writing to stdin, using .wait(), or looping on .poll()).

Popen.communicate() takes care of that issue internally. You can write
your own variation (useful if you need to process stdout to produce
stdin, for example), but you must then use either select or threads to
make sure stdout and stderr keep being read.
You should also pay attention to the note on communicate() in the docs:
the data it reads is buffered in memory, so if potentially large amounts
of data will be produced, you may need to write your own method to avoid
memory paging/OOM issues due to communicate() filling up the system's
RAM.
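
As an illustration of the select route, here is a sketch using the
selectors module. It is Unix-only (select does not work on pipes on
Windows) and assumes Python 3; run_with_select is a made-up name, and it
still collects all output in memory, so for large outputs you would
process each block where it is appended instead.

```python
import os
import select
import selectors
import subprocess

# A pipe reported as writable accepts at least PIPE_BUF bytes without
# blocking; 512 is the POSIX minimum, used if the constant is missing.
CHUNK = getattr(select, "PIPE_BUF", 512)


def run_with_select(cmd, data):
    """Write `data` (bytes) to the child and read its stdout, doing
    whichever of the two the OS says is possible right now."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, bufsize=0)
    in_fd, out_fd = p.stdin.fileno(), p.stdout.fileno()
    sel = selectors.DefaultSelector()
    sel.register(out_fd, selectors.EVENT_READ)
    if data:
        sel.register(in_fd, selectors.EVENT_WRITE)
    else:
        p.stdin.close()  # nothing to send, signal EOF right away

    view = memoryview(data)
    sent = 0
    chunks = []
    while sel.get_map():  # loop until both descriptors are unregistered
        for key, _ in sel.select():
            if key.fd == in_fd:
                try:
                    sent += os.write(in_fd, view[sent:sent + CHUNK])
                except BrokenPipeError:
                    sent = len(view)  # child stopped reading, give up
                if sent >= len(view):
                    sel.unregister(in_fd)
                    p.stdin.close()  # signals EOF to the child
            else:
                block = os.read(out_fd, 65536)
                if block:
                    chunks.append(block)
                else:  # an empty read means the child closed stdout
                    sel.unregister(out_fd)
                    p.stdout.close()
    p.wait()
    return b"".join(chunks)


select_payload = b"foo line\n" * 100000
select_result = run_with_select(["grep", "foo"], select_payload)
print(len(select_result))
```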

In the example you provide, you will probably never hit the deadlock
as the data being written is small enough that it should never fill
the buffers (typically, they are ~2k). Additionally, if you know the
process never produces output on stdout or stderr, you can ignore them
(but then, why would you pipe them?).

>
> This also works:
>
> p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
> p.stdin.write(thing)
> p.stdin.close()
> print p.stdout.read()
>
> Is that vulnerable to deadlock?  Is there a better way
> to write to and read from the same process?

This is more likely to cause deadlocks: if stderr is piped as well and the
process writes too much to it, the process may stall waiting for you to
read stderr while you are waiting for it to close stdout.


subprocess and stdin.write(), stdout.read()

2015-03-24 Thread Tobiah

The docs for the subprocess.Popen() say:

Use communicate() rather than .stdin.write, .stdout.read
or .stderr.read to avoid deadlocks due to any of the other
OS pipe buffers filling up and blocking the child process

But if I want to send a string to stdin, how can I do that without
stdin.write()?

This seems to work:

import subprocess as s

thing = """
hey
there
foo man is here
hey foo
man is there
so foo
"""
p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
p.stdin.write(thing)
print p.communicate()

##

('\they foo\n \tfoo there\n', None)


Will this always avoid the deadlock problem?

This also works:

p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
p.stdin.write(thing)
p.stdin.close()
print p.stdout.read()

Is that vulnerable to deadlock?  Is there a better way
to write to and read from the same process?

Thanks!

Tobiah