I have internalized the idea that finalization is a method of last resort. If there's a chance to free some OS resource by doing some work NOW, do it, and don't leave it for a finalizer to do later.
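
Here is a minimal sketch of what I mean by doing the work now. It assumes a java.nio.channels.Pipe standing in for the subprocess pipe; DrainNow, drainAndClose and demo are made-up names for illustration, and this is not the actual ProcessImpl code. The writer goes away, we copy whatever is left in the pipe into memory, and we close the descriptor right away instead of waiting for a finalizer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class DrainNow {

    // Copies everything still sitting in the pipe into memory, then closes
    // the read end immediately. Hypothetical helper, not a JDK method.
    static byte[] drainAndClose(Pipe.SourceChannel source) throws IOException {
        ByteArrayOutputStream stragglers = new ByteArrayOutputStream();
        // Closing the stream also closes the underlying channel.
        try (InputStream in = Channels.newInputStream(source)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                stragglers.write(buf, 0, n);
            }
        }
        return stragglers.toByteArray();
    }

    // Simulates a writer that leaves data in the pipe buffer and exits.
    static String demo() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink()) {
                ByteBuffer data = ByteBuffer.wrap(
                        "straggler data".getBytes(StandardCharsets.US_ASCII));
                while (data.hasRemaining()) {
                    sink.write(data);
                }
            } // write end closed here, as if the subprocess had exited

            Pipe.SourceChannel source = pipe.source();
            byte[] kept = drainAndClose(source);
            // The data survives in memory; the descriptor is already gone.
            return new String(kept, StandardCharsets.US_ASCII) + "|" + source.isOpen();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The returned byte[] can be wrapped in a ByteArrayInputStream so a later reader still sees the data. The trade is a bounded amount of heap in exchange for giving the fd back immediately.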

On Wed, Apr 15, 2015 at 12:31 PM, Peter Levart <peter.lev...@gmail.com> wrote:

> On 04/15/2015 07:59 PM, Martin Buchholz wrote:
>
> I was at least partly responsible for the pipe buffer cleanup code.
>
> Subprocess terminates, but may have written some data to the pipe buffer
> (typically 4k on Linux). Usually the pipe buffer is empty, but in case
> it's not, you don't want to lose the straggler data, you want to drain it
> and close the file descriptor, because it's easier to manage the memory
> than the fd. Messy, but I didn't see a better way.
>
>
> But the data would stay there (in the pipe's buffer) until it is read by
> the user. The producing end of pipe may already be closed, but the
> consuming end is still open. You would just have to keep the file
> descriptor open and let user drain and close it (or leave it to
> FileInputStream finalizer to close it). Yes, a file descriptor will be
> potentially open some more time, but you wouldn't loose any data. That's
> how Windows implementation works, I think. There's not reaper thread in
> Windows that would trigger asynchronous actions when subprocess exits.
>
> Regards, Peter
>
>
> On Tue, Apr 14, 2015 at 11:31 PM, Peter Levart <peter.lev...@gmail.com>
> wrote:
>
>> Hi Roger,
>>
>> So I started new thread...
>>
>> On 04/14/2015 11:33 PM, Roger Riggs wrote:
>>
>>> On 4/14/2015 11:47 AM, Peter Levart wrote:
>>>
>>>> I have been thinking of another small Process API update. Some people
>>>> find it odd how redirected in/out/err streams are exposed:
>>>>
>>>> http://blog.headius.com/2013/06/the-pain-of-broken-subprocess.html
>>>>
>>> yep, I've read that several times.
>>>
>>
>> To be fair, it's mostly, but not entirely correct. The part that says:
>>
>> " So when the child process exits, the any data waiting to be read from
>> its output stream is drained into a buffer. All of it. In memory.
>>
>> Did you launch a process that writes a gigabyte of data to its output
>> stream and then terminates? Well, friend, I sure hope you have a gigabyte
>> of memory, because the JDK is going to read that sucker in and there's
>> nothing you can do about it. And let's hope there's not more than 2GB of
>> data, since this code basically just grows a byte[], which in Java can only
>> grow to 2GB. If there's more than 2GB of data on that stream, this logic
>> errors out and the data is lost forever."
>>
>> ...is exaggeration. This does not happen as the pipe has a bounded
>> buffer. When subprocess exits, there is at most that much data left in the
>> buffer (64k typically) and only that much is sucked into the Java process
>> and the underlying handle closed.
>>
>>>> They basically don't like:
>>>>
>>>> - that exposed Input/Output streams are buffered
>>>> - that underlying streams are File(Input/Output)Streams which, although
>>>>   the backing OS implementation are not files but pipes, don't expose
>>>>   selectable channels so that non-blocking event-based IO could be
>>>>   performed on them.
>>>> - that exposed IO streams are automatically "managed" in UNIX variants
>>>>   of ProcessImpl which needs subtle "hacks" to do it in a perceptively
>>>>   transparent way (delayed close, draining input on exit and making it
>>>>   available after the underlying handle is already closed, ...)
>>>>
>>>> So I've been playing with the idea of exposing the "real" pipe channels
>>>> in last couple of days. Here's the prototype I came up with:
>>>>
>>>> http://cr.openjdk.java.net/~plevart/jdk9-sandbox/JDK-8046092-branch/Process.PipeChannel/webrev.01/
>>>>
>>>> This adds new Redirect type to the API and 3 new methods to Process
>>>> that return Pipe channels when this new Redirect type is used. It's
>>>> interesting that no native code changes were necessary. The behavior of
>>>> pipes on Windows is a little different (perhaps because the Pipe NIO API
>>>> uses sockets under the hood on Windows - why is that? Windows does have a
>>>> pipe equivalent). What bothers me is that file handles opened on files
>>>> (when redirecting to/from File) can be closed as soon as the subprocess is
>>>> started and the subprocess is still able to read/write from the files (like
>>>> with UNIX). It's not the same with pipe (i.e. socket) handles on Windows.
>>>> They must be closed only after subprocess exits.
>>>>
>>>> If this subtle difference between file handles and socket handles on
>>>> Windows could be dealt with (perhaps some options exist that affect
>>>> subprocess spawning), then the extra waiting thread would not be needed on
>>>> Windows.
>>>>
>>>> So what do you think of this API update?
>>>>
>>> Definitely worthy of a separate thread. It looks promising and addresses
>>> some of the issues raised, while moving other problems from the
>>> implementation to the application. Such as closing of the channels and
>>> cleanup. I worry about how the resources are freed if the code spawning
>>> the app doesn't do the cleanup. Will it require hooks (like a finalizer)
>>> to do the cleanup?
>>> Also, it doesn't help with Martin's goal of being able to implement
>>> emacs in Java since it doesn't provide pty control.
>>> As you are aware the complexity in Process is to ensure a timely cleanup
>>> and allowing the Process to terminate and release the process resources
>>> when it was done and not having to wait for the stdout/stderr consumer.
>>>
>>
>> I wonder how this automatic stream cleanup really helps in real-world
>> programs. It doesn't help the Process to terminate and release the process
>> resources any sooner as the process terminates on it's own (unless killed)
>> and OS releases it's resources without the outside help anyway. Draining
>> and closing the stream after the process has already exited just releases
>> one file handle (the consuming side of the pipe) in a promptly manner. This
>> could be left to the user and/or finalizer. Draining after the process has
>> already exited does not help the process to exit any sooner as it happens
>> after the fact. A program that doesn't consume the stream can cause the
>> process to hang forever as the pipe's buffer is bounded (64k typically). So
>> draining and closing after the process has exited only potentially helps
>> for the last 64k of the stream and only to release one file handle in a
>> potentially more timely manner.
>>
>> OTOH now that ProcessImpl for UNIX does that (and why does Windows
>> implementation not do that?) sloppy programs might exist that would
>> potentially break if the status quo is not maintained.
>>
>> But new functionality need not be so permissive. I'll take a look at how
>> and if Channel(s) do any kind of automatic cleanup based on reachability
>> and whether this can be bolted on for Process use. I doubt it is possible
>> to drain and close a Channel without disturbing the ongoing Selector IO
>> processing...
>>
>> Regards, Peter
>>
>>> Thanks, Roger