On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev <vladi...@thecybershadow.net> wrote:

On Tuesday, 5 March 2013 at 21:55:24 UTC, Steven Schveighoffer wrote:
On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev <vladi...@thecybershadow.net> wrote:

4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?

I don't know how this could happen, can you elaborate? Perhaps an example?

OK! Here's a program based off the pipeProcess/pipeShell example:

---
import std.process2;
import std.stdio;

void main()
{
     auto pipes = pipeProcess("./my_application",
         Redirect.stdout | Redirect.stderr);
     scope(exit) wait(pipes.pid);

     // Store lines of output.
     string[] output;
     foreach (line; pipes.stdout.byLine) output ~= line.idup;

     // Store lines of errors.
     string[] errors;
     foreach (line; pipes.stderr.byLine) errors ~= line.idup;

     writefln("%d lines of stdout, %d lines of stderr",
         output.length, errors.length);
}
---

And here is an accompanying my_application.d:

---
import std.stdio;

enum N = 100;

void main()
{
     foreach (n; 0..N)
     {
         stdout.writeln("stdout");
         stderr.writeln("stderr");
     }
}
---

Now, everything works just fine when N is small. However, if you increase it to 10000, both the test program and my_application get stuck with 0% CPU usage.

The reason for that is that the stderr pipe is clogged: my_application can't write to it, because nothing is reading from the other end. At the same time, the first program is blocked on reading from the stdout pipe, but nothing is coming out, because my_application is blocked on writing to stderr.

Right, the issue there is that File does not make a good socket/pipe interface. I don't know what to do about that.

A while ago (2008 or '09, I believe), I was using Tango's Process object to execute programs on a remote agent, forwarding all the resulting data back over the network. On Linux, I used select to read data as it arrived. On Windows, I think I had to spawn a separate thread to wait for data/child processes.

But Tango did not base its I/O on FILE *, so I think we had more flexibility there.

Suggestions are welcome...


By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows; for some reason I get the exception:

std.process2.ProcessException@std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)

I think Lars is on that.


I've also initially tried writing a different program:

---
import std.file;
import std.process2;
import std.string;

/// Sort an array of strings using the Unix "sort" program.
string[] unixSort(string[] lines)
{
        auto pipes = pipeProcess("sort", Redirect.stdin | Redirect.stdout);
        scope(exit) wait(pipes.pid);

        foreach (line; lines)
                pipes.stdin.writeln(line);
        pipes.stdin.close();

        string[] sortedLines;
        foreach (line; pipes.stdout.byLine())
                sortedLines ~= line.idup;

        return sortedLines;
}

void main()
{
        // For the sake of example, pretend these lines came from
        // some intensive computation, and not actually a file.
        auto lines = readText("input.txt").splitLines();

        auto sortedLines = unixSort(lines);
}
---

However, I couldn't get it to work on either Windows (same exception) or Linux (it just gets stuck, even with a very small input.txt). I have no idea whether I'm doing something wrong (maybe I need to indicate EOF in some way?) or the problem is elsewhere.

Linux should work here.  From what I can tell, you are doing it right.

If I get some time, I'll try and debug this.


We are sort of stuck with File being the stream handler in Phobos, which means we are currently stuck with FILE *. I don't know if there is a way to do partial reads/writes on a FILE *, or to check whether data is available.

I guess you could always get the OS file handles/descriptors and query them directly, although there's also the matter of the internal FILE * buffers.

I think at that point you would have to forgo all the File niceties (writeln, etc.), which would really suck.

But on the read end, this is a very viable option.

-Steve
