Re: ProcessReaper: single thread reaper

Roger Riggs Wed, 26 Mar 2014 11:01:55 -0700

Hi  Martin,

My solution to that is to export a public API that can be used

by other subsystems that fork processes. Some peaceful cooperation isrequired.


Roger


On 3/26/14 10:54 AM, Martin Buchholz wrote:

Peter and Roger, please stop going down this road until you have asolution for my show-stopper problem, that in the below you arereaping children that don't belong to java.lang.Process


+    pid = waitpid(-1, &exitValue, 0);

On Wed, Mar 26, 2014 at 10:49 AM, Peter Levart <[email protected]<mailto:[email protected]>> wrote:


    Hi Roger,

    Your current implementation works for the demostrated use case in
    UNIXProcess, where a single call-back is registered for a pid. If
    you wanted to register another, the waitList and exitList might
    have already removed the pid from them as a result of the 1st
    call-back already been serviced before registering the 2nd one.

    So you might want to keep the entry in exitList for a while, to
    accommodate for 2nd and subsequent call-back registration if such
    usage is to be needed.

    This is what I tried to do in the following code:

    http://cr.openjdk.java.net/~plevart/jdk9-dev/ProcessWaiter/webrev.01/
    <http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/ProcessWaiter/webrev.01/>

    I leveraged ConcurrentHashMap and CompletableFuture for the task.
    As an example of usage in UNIXProcess, I modified the linux
    variant only. A single waiter thread is  dispatching exit statuses
    from exited children to CompletableFutures. Various waitFor() etc.
    methods in UNIXProcess are implemented in terms of
    Future.get()/isDone() methods and clean-up is implemented by
    dispatching asynchronously a cleanup task to the thread pool, so
    that multiple threads can help draining buffers. Children that are
    collected by waiter thread but nobody has asked to be notified
    about get expunged from the map after a time-out.

    I haven't done any checking for live pids as you did in
    checkLiveness() because in current usage, a pid that registers a
    Future entry in the map is taken from a successful forkAndExec()
    call so it will definitely be collected by the waitLoop() when the
    child process exits. The only possibility that this does not
    happen is if the same pid was waited for somewhere else (not in
    the waitLoop()). So if this is to be supported, a scan of live
    pids will be necessary from time to time and not only when
    there're no unwaited children (suppose there is a long-lived child
    running indefinitely).

    What do you think?

    Regards, Peter



    On 03/25/2014 10:47 PM, Roger Riggs wrote:

    Hi Martin,

    Two cases, one current and one future.

    In the current case, Process can spawn a process and the process
    can exit before Process can register the callback to get the
    exitValue.
    Peter pointed out this race in his comments.
    The exitValue needs to be saved (for some time yet to be determined)
    to allow the Process to register and get its callback with the
    exitValue.

    The second case is future looking to the case where a child process
    not spawned by Process is exiting.  It might be due to a child being
    inherited from a dieing child or due to some different subprocess
    launcher.

    When the JEP 102 process work happens, it should be possible to
    wait or get called back for those non-spawned processes.

    The single (smaller) number of threads has been requested to handle
    processes that control a large number of children. It could be from
    a thread pool, either dedicated or common.  The common threadpool
    does not expect its tasks to hang for an indefinite period as might
    occur when waiting for along running process to exit.

    Thanks, Roger


    On 3/25/14 2:39 PM, Martin Buchholz wrote:

    What happens if the pid you get back is a subprocess not created by 
ProcessBuilder?
    +    pid = waitpid(-1, &exitValue, 0);
    ---

    What is the advantage of having a single thread?  Are you just
    trying to save threads?  The cost of the reaper threads is much
    lower than usual because of the small stack size.



    On Tue, Mar 25, 2014 at 7:05 AM, Peter Levart
    <[email protected] <mailto:[email protected]>> wrote:

        On 03/24/2014 10:05 PM, roger riggs wrote:

        Hi Rob, Martin, et.al <http://et.al>.

        I've prototyped (for 9) a thread reaper[1] that uses a
        single thread to wait for exiting
        processes and calling back to the process with the exit
        status.
        the interesting part is getting the exit status back to the
        Process that needs it
        It needs more testing and hardening.

        I had not considered using a signal handler for SIGCHLD but
        that's an option,
        though we need to be very careful about thread usage.

        Roger

        p.s. comments on the single thread reaper appreciated (in a
        new thread)
        [1] http://cr.openjdk.java.net/~rriggs/webrev-waitpid/
        <http://cr.openjdk.java.net/%7Erriggs/webrev-waitpid/>

        Hi Roger,

        I think I found a little race. Suppose reaper task is still
        alive and that all consumers have been serviced
        (consumerCount is 0). The reaper task waits for 500 millis
        for next consumer to be registered, but times out. Before
        calling "reaperThread.release()", new consumer comes around
        and registers itself, also calling runReaper(), but since
        reaperThread.release() has not yet been called by old reaper
        task, new reaper task is not submitted to commonPool. The
        old reaper task finishes, leaving one consumer on the
        waitingList with no reaper task to service it. If no new
        consumers get registered, the waiting consumer will never be
        notified...

        The simplest solution for this race, I think, would be to
        have a dedicated long-running thread. It could be spawned
        lazily, but then it would never finish.

        Otherwise a nice solution with two lists (exitList/waitList)
        and avoidance of race with reversed orders between
        - consumer registration: register on waitList 1st then check
        exitList, and
        - exit event dispatch: register on exitList 1st then check
        waitList

        ...but the check you use with consumeCount local variable to
        detect processes spawned by other means (for purposes of
        logging only) has a race:

        thread1: Suppose a new consumer is being registered with
        onExitCall(...), is added on the waitList, but before
        checking exitList().size() and iterating the exitList, ...
        thread2: the reaper task detects that the very same process
        has finished (gets it's pid from waitpid) and adds it's pid
        to exitList. Then before iterating waitList, ...
        thread1: iterates the exitList, finds a match and consumes
        the pid, removing the matching entries from both exitList
        and waitList. Then ...
        thread2: the reaper task iterates waitList, doesn't find a
        matching entry for exitPid, doesn't increment consumeCount
        and voila: debug log("Unexpected process exit for pid:...").


        That's enough races for today.

        Regards, Peter



        On 3/24/2014 12:38 AM, Rob McKenna wrote:

        Hi folks,

        Roger Riggs (cc'd) may want to chip in here as he's
        looking at the reaper thread arrangement in 9 at the moment.

        On another note, I too support the merging of those files.
        I didn't think there was much appetite for it at the time
        so I must admit this fell down my todo list. Looking at
        this bug did remind me that its something worth trying
        though. As per Alan's mail, I'm going to tackle it
        separately if you folks don't mind. I'll have a look at
        Peter's changes (thanks Peter!) as soon as I can and see
        about getting them in.

            -Rob

        On 23/03/14 22:30, Martin Buchholz wrote:




        On Sun, Mar 23, 2014 at 2:34 AM, Martin Buchholz
        <[email protected] <mailto:[email protected]>> wrote:


            We have also thought about whether having reaper
            threads is necessary.  The Unix rule is that child
            processes should be waited for, and some thread needs
            to do that.  There's no way to wait for a set of
            child pids, or to specify a "completion handler".
             Well, you might be able to get the newish waitid()
            to do what you want, but it looks like it's not
            sufficient when java is running inside a process that
            might do independent subprocess creation outside of
            the JVM.


        Actually, I take it back.  With sufficient work, it looks
        like you can get SIGCHLD to give you pid information in
        siginfo_t si_pid, and that can be used to trigger the
        reaping.  It looks like waitpid is "async-signal-safe",
        so we can call it from our signal handler.

        While we're at it we can fix SIGCHLD handling to do
        signal chaining, as with other signals.

Re: ProcessReaper: single thread reaper

Reply via email to