Re: SelectableChannels and Process API

Roger Riggs Wed, 15 Apr 2015 11:15:12 -0700

Hi Peter,

I don't know the history behind the stream draining in ProcessImpl.
I understood it to be a performance/scalability issue.
Maybe Martin, Alan, or someone else can fill in the history.


Roger



On 4/15/2015 2:31 AM, Peter Levart wrote:

Hi Roger,

So I started a new thread...


On 04/14/2015 11:33 PM, Roger Riggs wrote:
On 4/14/2015 11:47 AM, Peter Levart wrote:
I have been thinking of another small Process API update. Some people find it odd how redirected in/out/err streams are exposed:
http://blog.headius.com/2013/06/the-pain-of-broken-subprocess.html
yep, I've read that several times.
To be fair, it's mostly, but not entirely correct. The part that says:
" So when the child process exits, the any data waiting to be read from its output stream is drained into a buffer. All of it. In memory.
Did you launch a process that writes a gigabyte of data to its output stream and then terminates? Well, friend, I sure hope you have a gigabyte of memory, because the JDK is going to read that sucker in and there's nothing you can do about it. And let's hope there's not more than 2GB of data, since this code basically just grows a byte[], which in Java can only grow to 2GB. If there's more than 2GB of data on that stream, this logic errors out and the data is lost forever."
...is an exaggeration. This does not happen as the pipe has a bounded buffer. When the subprocess exits, there is at most that much data left in the buffer (64k typically) and only that much is sucked into the Java process before the underlying handle is closed.
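
The bounded-buffer point can be checked with a small experiment. The following is a minimal sketch, not part of the original discussion: it uses an in-JVM `java.nio.channels.Pipe` (rather than a real subprocess pipe) and counts how many bytes a non-blocking writer can push in while nobody reads. The class name `PipeCapacityProbe` and the safety cap are illustrative assumptions; the number reported depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

// Hypothetical helper, for illustration only.
final class PipeCapacityProbe {

    // Safety cap so the loop always terminates, even on an unusual platform.
    private static final long LIMIT = 64L * 1024 * 1024;

    /** Returns how many bytes the pipe accepted while nothing was reading from it. */
    static long probe() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                // Non-blocking mode: write() returns 0 instead of blocking when the buffer is full.
                sink.configureBlocking(false);
                ByteBuffer chunk = ByteBuffer.allocate(4096);
                long total = 0;
                while (total < LIMIT) {
                    chunk.clear();               // offer the whole 4096-byte chunk again
                    int written = sink.write(chunk);
                    if (written == 0) {
                        break;                   // the pipe buffer is full
                    }
                    total += written;
                }
                return total;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("pipe accepted " + probe() + " bytes before filling up");
    }
}
```

A writer in blocking mode would stall at the same point, which is why the amount of data left to drain after the child exits is limited to what fits in the pipe.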
They basically don't like:

- that exposed Input/Output streams are buffered
- that underlying streams are File(Input/Output)Streams which, although the backing OS implementation are not files but pipes, don't expose selectable channels so that non-blocking event-based IO could be performed on them.
- that exposed IO streams are automatically "managed" in UNIX variants of ProcessImpl which needs subtle "hacks" to do it in a perceptively transparent way (delayed close, draining input on exit and making it available after the underlying handle is already closed, ...)
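
To illustrate the kind of non-blocking, event-based IO the second complaint refers to, here is a minimal sketch using the existing `java.nio.channels.Pipe` and `Selector` APIs inside a single JVM. It does not use the prototype's new Process methods; the class name `SelectablePipeDemo` and the method `roundTrip` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class SelectablePipeDemo {

    /** Writes the message into a pipe and reads it back via a Selector. */
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                // The read side is a SelectableChannel: register it for readiness events.
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                // Producer side; assumes the payload is small enough to fit in the pipe buffer.
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    sink.write(out);
                }

                // Consumer side: only read when the selector reports the channel readable.
                ByteBuffer in = ByteBuffer.allocate(payload.length);
                while (in.hasRemaining()) {
                    selector.select();
                    for (SelectionKey key : selector.selectedKeys()) {
                        if (key.isReadable()) {
                            ((Pipe.SourceChannel) key.channel()).read(in);
                        }
                    }
                    selector.selectedKeys().clear();
                }
                return new String(in.array(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello from the pipe"));
    }
}
```

The streams returned by `Process.getInputStream()` and friends cannot be registered with a `Selector` this way, which is the gap the complaint is about.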
So I've been playing with the idea of exposing the "real" pipe channels in the last couple of days. Here's the prototype I came up with:
http://cr.openjdk.java.net/~plevart/jdk9-sandbox/JDK-8046092-branch/Process.PipeChannel/webrev.01/
This adds a new Redirect type to the API and 3 new methods to Process that return Pipe channels when this new Redirect type is used. It's interesting that no native code changes were necessary. The behavior of pipes on Windows is a little different (perhaps because the Pipe NIO API uses sockets under the hood on Windows - why is that? Windows does have a pipe equivalent). What bothers me is that file handles opened on files (when redirecting to/from File) can be closed as soon as the subprocess is started and the subprocess is still able to read/write from the files (like with UNIX). It's not the same with pipe (i.e. socket) handles on Windows. They must be closed only after the subprocess exits.
If this subtle difference between file handles and socket handles on Windows could be dealt with (perhaps some options exist that affect subprocess spawning), then the extra waiting thread would not be needed on Windows.
So what do you think of this API update?
Definitely worthy of a separate thread. It looks promising and addresses some of the issues raised, while moving other problems from the implementation to the application, such as closing of the channels and cleanup. I worry about how the resources are freed if the code spawning the app doesn't do the cleanup. Will it require hooks (like a finalizer) to do the cleanup?
Also, it doesn't help with Martin's goal of being able to implement
emacs in Java since it doesn't provide pty control.
As you are aware, the complexity in Process is to ensure a timely cleanup and
allowing the Process to terminate and release the process resources
when it was done and not having to wait for the stdout/stderr consumer.
I wonder how this automatic stream cleanup really helps in real-world programs. It doesn't help the Process to terminate and release the process resources any sooner, as the process terminates on its own (unless killed) and the OS releases its resources without outside help anyway. Draining and closing the stream after the process has already exited just releases one file handle (the consuming side of the pipe) promptly. This could be left to the user and/or finalizer. Draining after the process has already exited does not help the process to exit any sooner as it happens after the fact. A program that doesn't consume the stream can cause the process to hang forever as the pipe's buffer is bounded (64k typically). So draining and closing after the process has exited only potentially helps for the last 64k of the stream and only to release one file handle in a potentially more timely manner.
OTOH now that ProcessImpl for UNIX does that (and why does the Windows implementation not do that?) sloppy programs might exist that would potentially break if the status quo is not maintained.
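
As a concrete illustration of the "program that doesn't consume the stream" case, the sketch below shows the usual way to avoid it with the existing API: read the child's output until EOF before calling `waitFor()`, so the child never blocks on a full pipe. This is a minimal example under stated assumptions, not code from the prototype; `ChildOutputReader` and `Result` are hypothetical names, and collecting everything into memory is done only to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ChildOutputReader {

    static final class Result {
        final int exitCode;
        final byte[] output;

        Result(int exitCode, byte[] output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command, consuming its merged stdout/stderr until EOF, then waits for exit. */
    static Result run(List<String> command) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true)   // one pipe to read instead of two
                    .start();
            process.getOutputStream().close();   // the child sees EOF on its stdin

            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            try (InputStream in = process.getInputStream()) {
                byte[] buffer = new byte[8192];
                int n;
                // Reading continuously keeps the pipe from filling up and stalling the child.
                while ((n = in.read(buffer)) != -1) {
                    collected.write(buffer, 0, n);
                }
            }
            return new Result(process.waitFor(), collected.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

A real consumer would normally process the data as it arrives rather than buffer it all. Calling `waitFor()` first and reading afterwards is the pattern that can hang once the child has written more than the pipe can hold.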

I think Windows' use of handles makes sure they are open for as long as any process holds a handle, so they don't get prematurely closed.

But new functionality need not be so permissive. I'll take a look at how and if Channel(s) do any kind of automatic cleanup based on reachability and whether this can be bolted on for Process use. I doubt it is possible to drain and close a Channel without disturbing the ongoing Selector IO processing...
Regards, Peter
Thanks, Roger
