Re: subprocess and stdin.write(), stdout.read()
On Tue, 24 Mar 2015 12:08:24 -0700, Tobiah wrote:

> But if I want to send a string to stdin, how can I do that without
> stdin.write()?

p.communicate(string)

> This seems to work:

Only because the amounts of data involved are small enough to avoid
deadlock. If both sides write more data in one go than will fit into a
pipe buffer, you will get deadlock: the parent will be blocked waiting
for the child to consume the input, which doesn't happen because the
child will be blocked waiting for the parent to consume its output,
which doesn't happen because the parent will be blocked waiting for the
child to consume the input, ...

That's a textbook example of deadlock: each side waiting forever for the
other side to make the first move. This is exactly why POSIX's popen()
function lets you either write to stdin (mode=="w") or read from stdout
(mode=="r") but not both.

> Will this always avoid the deadlock problem?

No.

> This also works:

Again, only because the amounts of data involved are small enough to
avoid deadlock.

> Is that vulnerable to deadlock?

Yes.

> Is there a better way to write to and read from the same process?

Use threads: one for each descriptor (stdin, stdout, stderr).
Non-blocking I/O is an alternative (and that's what .communicate() uses
on Unix), but threads will work on all common desktop and server
platforms.

If you need to support platforms which lack threads, either a) have the
parent first write to a file (instead of .stdin), then have the child
read from the file while the parent reads .stdout and .stderr, or
b) have the parent write to .stdin while the child writes its stdout
and stderr to files (or a file); once the child completes, have the
parent read the file(s).

Using files allows potentially gigabytes of data to be buffered. With
pipes, the amount may be as low as 512 bytes (the minimum value allowed
by POSIX) and will rarely be much more (a typical value is 4096 bytes,
i.e. one "page" on x86).
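A minimal sketch of the threaded approach described above, in Python 3
(run_with_threads is an illustrative name, not a stdlib function; this
assumes a Unix-like system with grep on the PATH):

```python
import subprocess
import threading

def run_with_threads(args, data):
    """Write `data` to the child's stdin while separate threads drain
    stdout and stderr, so no pipe buffer can fill up and deadlock."""
    p = subprocess.Popen(args, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    captured = {}

    def drain(name, stream):
        captured[name] = stream.read()  # blocks until the child closes it
        stream.close()

    readers = [threading.Thread(target=drain, args=(n, s))
               for n, s in (("out", p.stdout), ("err", p.stderr))]
    for t in readers:
        t.start()

    p.stdin.write(data)   # safe: the readers keep the child's output moving
    p.stdin.close()       # EOF lets the child finish
    for t in readers:
        t.join()
    p.wait()
    return captured["out"], captured["err"]

big = b"foo\n" * 100000   # far more than any pipe buffer will hold
out, err = run_with_threads(["grep", "foo"], big)
```

The same write of `big` through plain p.stdin.write() with no reader
threads would wedge, since grep's output would back up and stop it
reading its input.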
-- https://mail.python.org/mailman/listinfo/python-list
Re: subprocess and stdin.write(), stdout.read()
On Tue, Mar 24, 2015 at 12:08 PM, Tobiah wrote:
> The docs for the subprocess.Popen() say:
>
>     Use communicate() rather than .stdin.write, .stdout.read
>     or .stderr.read to avoid deadlocks due to any of the other
>     OS pipe buffers filling up and blocking the child process
>
> But if I want to send a string to stdin, how can I do that without
> stdin.write()?
>
> This seems to work:
>
> import subprocess as s
>
> thing = """
> hey
> there
> foo man is here
> hey foo
> man is there
> so foo
> """
>
> p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
> p.stdin.write(thing)
> print p.communicate()
>
> ##
>
> ('\they foo\n \tfoo there\n', None)
>
> Will this always avoid the deadlock problem?

What you should do is use "print p.communicate(thing)". That will
always avoid the deadlock issue. Your code MAY deadlock in some cases,
because the stdin pipe can fill up completely while the other process
is not reading it: it is waiting for you to read its output.

What this means is that you must be reading from stdout AND stderr
whenever you may be waiting on the process (such as when writing to
stdin, calling .wait(), or looping on .poll()).
subprocess.communicate() takes care of that internally. You can write
your own variation (useful if you need to process stdout to produce
stdin, for example), but then you must use select or threads to be sure
you are reading both stdout and stderr.

You should also pay attention to the note on communicate(): if
potentially large amounts of data will be produced, you may need to
write your own method to avoid memory paging/OOM issues due to
communicate() buffering everything in RAM.

In the example you provide, you will probably never hit the deadlock,
as the data being written is small enough that it should never fill the
pipe buffers (typically a few kilobytes). Additionally, if you know the
process never produces output on stdout or stderr, you can ignore them
(but then, why would you pipe them?).
> This also works:
>
> p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
> p.stdin.write(thing)
> p.stdin.close()
> print p.stdout.read()
>
> Is that vulnerable to deadlock? Is there a better way
> to write to and read from the same process?

This is more likely to cause deadlocks: if the process writes too much
to stderr, it may stall waiting for you to read it, while you are
waiting for it to close stdout.
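A concrete version of the suggested "p.communicate(thing)" fix, in
Python 3 syntax with bytes I/O (assumes a Unix grep):

```python
import subprocess

thing = b"hey\nthere\nfoo man is here\nhey foo\nman is there\nso foo\n"

p = subprocess.Popen(["grep", "foo"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# communicate() writes `thing`, closes stdin, and drains stdout itself,
# so neither side can block the other on a full pipe buffer.
out, err = p.communicate(thing)
print(out)   # the three lines containing "foo"
```

Since stderr was not redirected to a pipe here, communicate() returns
None for it.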
subprocess and stdin.write(), stdout.read()
The docs for the subprocess.Popen() say:

    Use communicate() rather than .stdin.write, .stdout.read
    or .stderr.read to avoid deadlocks due to any of the other
    OS pipe buffers filling up and blocking the child process

But if I want to send a string to stdin, how can I do that without
stdin.write()?

This seems to work:

import subprocess as s

thing = """
hey
there
foo man is here
hey foo
man is there
so foo
"""

p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
p.stdin.write(thing)
print p.communicate()

##

('\they foo\n \tfoo there\n', None)

Will this always avoid the deadlock problem?

This also works:

p = s.Popen(['grep', 'foo'], stdin = s.PIPE, stdout = s.PIPE)
p.stdin.write(thing)
p.stdin.close()
print p.stdout.read()

Is that vulnerable to deadlock? Is there a better way
to write to and read from the same process?

Thanks!

Tobiah
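For later readers on Python 3.7 or newer, the whole exchange in the
question collapses into a single subprocess.run() call, which does the
stdin/stdout plumbing safely (sketch, assuming a Unix grep on the PATH):

```python
import subprocess

thing = "hey\nthere\nfoo man is here\nhey foo\nman is there\nso foo\n"

# run() feeds `input` to the child's stdin and collects stdout/stderr
# without deadlock risk; text=True works with str instead of bytes.
result = subprocess.run(["grep", "foo"], input=thing,
                        capture_output=True, text=True)
print(result.stdout)
```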