On 04/15/2015 07:59 PM, Martin Buchholz wrote:
I was at least partly responsible for the pipe buffer cleanup code.

A subprocess terminates, but may have written some data to the pipe buffer (typically 4k on Linux). Usually the pipe buffer is empty, but in case it's not, you don't want to lose the straggler data; you want to drain it and close the file descriptor, because it's easier to manage the memory than the fd. Messy, but I didn't see a better way.

But the data would stay there (in the pipe's buffer) until it is read by the user. The producing end of the pipe may already be closed, but the consuming end is still open. You would just have to keep the file descriptor open and let the user drain and close it (or leave it to the FileInputStream finalizer to close it). Yes, a file descriptor would potentially stay open a while longer, but you wouldn't lose any data. That's how the Windows implementation works, I think. There's no reaper thread on Windows that would trigger asynchronous actions when the subprocess exits.
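
A minimal sketch of that behaviour, using java.nio.channels.Pipe inside a single JVM as a stand-in for a subprocess pipe (an assumption made for illustration; the class and method names are hypothetical and this is not the ProcessImpl code). The producing end is closed first, and the buffered bytes can still be read from the consuming end afterwards:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class StragglerData {
    // Writes a short message, closes the producing end, then drains the
    // consuming end until EOF. The message is assumed to be small enough
    // to fit into the pipe's buffer, since nobody reads while we write.
    static String readAfterWriterClosed(String message) throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SourceChannel source = pipe.source()) {
            try (Pipe.SinkChannel sink = pipe.sink()) {
                ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) {
                    sink.write(out);
                }
            } // producing end is closed here; the data stays in the buffer

            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            ByteBuffer in = ByteBuffer.allocate(1024);
            while (source.read(in) != -1) { // -1 signals EOF once the buffer is drained
                in.flip();
                collected.write(in.array(), 0, in.limit());
                in.clear();
            }
            return new String(collected.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterWriterClosed("straggler data"));
    }
}
```

The cost of leaving the draining to the user is only that the consuming descriptor stays open until it is read to EOF and closed.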

Regards, Peter


On Tue, Apr 14, 2015 at 11:31 PM, Peter Levart <peter.lev...@gmail.com> wrote:

    Hi Roger,

    So I started a new thread...


    On 04/14/2015 11:33 PM, Roger Riggs wrote:


        On 4/14/2015 11:47 AM, Peter Levart wrote:

            I have been thinking of another small Process API update.
            Some people find it odd how redirected in/out/err streams
            are exposed:

            http://blog.headius.com/2013/06/the-pain-of-broken-subprocess.html

        yep, I've read that several times.


    To be fair, it's mostly, but not entirely correct. The part that says:

    " So when the child process exits, the any data waiting to be read
    from its output stream is drained into a buffer. All of it. In memory.

    Did you launch a process that writes a gigabyte of data to its
    output stream and then terminates? Well, friend, I sure hope you
    have a gigabyte of memory, because the JDK is going to read that
    sucker in and there's nothing you can do about it. And let's hope
    there's not more than 2GB of data, since this code basically just
    grows a byte[], which in Java can only grow to 2GB. If there's
    more than 2GB of data on that stream, this logic errors out and
    the data is lost forever."

    ...is an exaggeration. This does not happen, as the pipe has a
    bounded buffer. When the subprocess exits, there is at most that
    much data left in the buffer (64k typically), and only that much is
    sucked into the Java process before the underlying handle is closed.
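
One way to observe the bounded buffer is to fill a pipe that nobody reads from. The sketch below again uses java.nio.channels.Pipe as a stand-in for a subprocess pipe (an assumption; the class and method names are hypothetical). It puts the writing end into non-blocking mode and counts how many bytes are accepted before a write would block. The number depends on the platform and its configuration:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class PipeCapacity {
    // Writes 1 KiB chunks into a pipe that is never read and returns the
    // number of bytes accepted before the pipe reports that it is full.
    static long fillUntilFull() throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            sink.configureBlocking(false); // a full pipe now yields 0 instead of blocking
            ByteBuffer chunk = ByteBuffer.allocate(1024);
            long total = 0;
            while (true) {
                chunk.clear();
                int written = sink.write(chunk);
                if (written == 0) {
                    return total; // buffer is full; a blocking writer would hang here
                }
                total += written;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("pipe accepted " + fillUntilFull() + " bytes before filling up");
    }
}
```

A subprocess that writes more than this amount while nobody reads will block in its write call, which is why it cannot exit with more than one buffer's worth of unread output pending.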


            They basically don't like:

            - that exposed Input/Output streams are buffered
            - that underlying streams are File(Input/Output)Streams
            which, although the backing OS objects are not files but
            pipes, don't expose selectable channels, so non-blocking
            event-based IO cannot be performed on them.
            - that exposed IO streams are automatically "managed" in
            UNIX variants of ProcessImpl which needs subtle "hacks" to
            do it in a perceptibly transparent way (delayed close,
            draining input on exit and making it available after the
            underlying handle is already closed, ...)
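
For illustration, this is roughly what event-based reading looks like once a pipe is available as a selectable channel. The sketch uses an in-JVM java.nio.channels.Pipe rather than a subprocess, since the methods of the prototype are not reproduced here (an assumption; the class and method names are hypothetical). It assumes a short message that arrives in a single read:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

public class SelectablePipe {
    // Registers the reading end of a pipe with a Selector and reads only
    // after the selector reports the channel as readable.
    static String readWhenReady(String message) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            source.configureBlocking(false); // required before registering
            source.register(selector, SelectionKey.OP_READ);

            // Stand-in for the subprocess producing output.
            sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));

            ByteBuffer in = ByteBuffer.allocate(1024);
            if (selector.select(5_000) > 0) { // wait up to 5 seconds for readiness
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isReadable()) {
                        source.read(in);
                    }
                }
                selector.selectedKeys().clear();
            }
            in.flip();
            return StandardCharsets.UTF_8.decode(in).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readWhenReady("ping"));
    }
}
```

A FileInputStream over the same descriptor offers no equivalent, so each stream needs its own blocking reader thread.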

            So I've been playing with the idea of exposing the "real"
            pipe channels in the last couple of days. Here's the
            prototype I came up with:

            http://cr.openjdk.java.net/~plevart/jdk9-sandbox/JDK-8046092-branch/Process.PipeChannel/webrev.01/


            This adds a new Redirect type to the API and 3 new methods
            to Process that return Pipe channels when this new
            Redirect type is used. It's interesting that no native
            code changes were necessary. The behavior of pipes on
            Windows is a little different (perhaps because the Pipe
            NIO API uses sockets under the hood on Windows - why is
            that? Windows does have a pipe equivalent). What bothers
            me is that file handles opened on files (when redirecting
            to/from File) can be closed as soon as the subprocess is
            started and the subprocess is still able to read/write
            from the files (like with UNIX). It's not the same with
            pipe (i.e. socket) handles on Windows. They must be closed
            only after subprocess exits.

            If this subtle difference between file handles and socket
            handles on Windows could be dealt with (perhaps some
            options exist that affect subprocess spawning), then the
            extra waiting thread would not be needed on Windows.

            So what do you think of this API update?

        Definitely worthy of a separate thread.  It looks promising
        and addresses some of the issues raised, while moving other
        problems from the implementation to the application, such as
        closing of the channels and cleanup.  I worry about how the
        resources are freed if the code spawning the app doesn't do
        the cleanup.  Will it require hooks (like a finalizer) to do
        the cleanup?
        Also, it doesn't help with Martin's goal of being able to
        implement emacs in Java since it doesn't provide pty control.
        As you are aware the complexity in Process is to ensure a
        timely cleanup and allowing the Process to terminate and
        release the process resources when it was done and not having
        to wait for the stdout/stderr consumer.


    I wonder how this automatic stream cleanup really helps in
    real-world programs. It doesn't help the Process to terminate and
    release the process resources any sooner, as the process terminates
    on its own (unless killed) and the OS releases its resources
    without outside help anyway. Draining and closing the stream after
    the process has already exited just promptly releases one file
    handle (the consuming side of the pipe). This could be left to the
    user and/or finalizer. Draining after the process has already
    exited does not help the process to exit any sooner, as it happens
    after the fact. A program that doesn't consume the stream can
    cause the process to hang forever, as the pipe's buffer is bounded
    (64k typically). So draining and closing after the process has
    exited only potentially helps for the last 64k of the stream, and
    only to release one file handle in a potentially more timely
    manner.

    OTOH now that ProcessImpl for UNIX does that (and why does the
    Windows implementation not do that?), sloppy programs might exist
    that would break if the status quo is not maintained.

    But new functionality need not be so permissive. I'll take a look
    at how and if Channel(s) do any kind of automatic cleanup based on
    reachability and whether this can be bolted on for Process use. I
    doubt it is possible to drain and close a Channel without
    disturbing the ongoing Selector IO processing...

    Regards, Peter


        Thanks, Roger




