Re: RFR 9: 8077350 Process API Updates Implementation Review

Roger Riggs Mon, 20 Apr 2015 10:19:26 -0700

ok, thanks, roger


On 4/20/2015 11:57 AM, Thomas Stüfe wrote:

Hi Roger,

thanks!

Maybe better:

"When using ProcessHandles avoid assumptions about the state orliveness of the underlying process."-> "When using ProcessHandles avoid assumptions about liveness oridentity of the underlying process."


Regards, Thomas

On Mon, Apr 20, 2015 at 5:53 PM, Roger Riggs <roger.ri...@oracle.com<mailto:roger.ri...@oracle.com>> wrote:


    Hi Thomas,

    I expanded the ProcessHandle class javadoc [1] paragraph to
    include the
    caution about process id reuse.

    Thanks, Roger

    [1]
    http://cr.openjdk.java.net/~rriggs/ph-apidraft/java/lang/ProcessHandle.html
    
<http://cr.openjdk.java.net/%7Erriggs/ph-apidraft/java/lang/ProcessHandle.html>



    On 4/18/2015 2:44 PM, Thomas Stüfe wrote:

    Hi Roger,

    On Fri, Apr 17, 2015 at 8:57 PM, Roger Riggs
    <roger.ri...@oracle.com <mailto:roger.ri...@oracle.com>> wrote:

        Hi David,

        On 4/17/2015 2:44 PM, David M. Lloyd wrote:

            On 04/17/2015 11:53 AM, Roger Riggs wrote:

                Hi Peter,

                On 4/17/2015 4:05 AM, Peter Levart wrote:

                    Hi Roger,

                    Retrieving and caching the process start time as
                    soon as ProcessHandle
                    is instantiated might be a good idea. "destroy"
                    native code would then
                    use pid *and* start time to check the identity of
                    the process before
                    killing it.

                Yes, though it does raise a question about how to
                specify PH.equals().
                Either the start time is explicitly
                mentioned (and is repeatable and testable) or it is
                vague and refers to
                some implementation
                specific value that uniquely identifies the process;
                and is less testable.


            Even with timestamps though, you cannot prevent the PID
            from being reused in the time window after you verify its
            timestamp but before you kill it unless it is a direct
            child process (on UNIX anyway; who knows what Windows
            does...).  I think that this inevitable race should be
            mentioned in the docs regardless of whether the timestamp
            thing is implemented (or doc'd) to prevent unrealistic
            expectations (the api draft link seems to be dead so I
            didn't see if it already includes this kind of language).

        I can add a note about race conditions to the existing class
        level
        javadoc about process information changing asynchronously.

        "Race conditions can exist between checking the status of a
        process and acting upon it."

        But I'm still struck by the oddity of Java needing to provide
        a better interface
        to processes than the native OS does.  In the native OS, a
        pid is a pid and
        the only handle given to applications.  So both the app and
        the os need to
        be conservative about pid re-use.

    The problem is the audience. Every UNIX system programmer knows
    about the dangers of pid recycling. Java programmers don't, and
    your API promises a robustness and simplicity which it
    unfortunately cannot deliver. No one will expect a ProcessHandle
    returned by allChildren suddenly to refer to a totally different
    process. Therefore I also think the dangers should be clearly
    mentioned in the java doc.

    Regards, Thomas

        The link[1] was broken while I replaced it with an update
        reflecting the recent comments.

        Thanks, Roger

        [1] http://cr.openjdk.java.net/~rriggs/phdoc/
        <http://cr.openjdk.java.net/%7Erriggs/phdoc/>



                    At least on Linux (/proc/<pid>/stat) and Solaris
                    (/proc/<pid>/status)
                    it is not necessary to open those special files
                    and read them. Just
                    doing stat() on them and using the st_mtime will
                    get you the process
                    start time. I see AIX shares native code with
                    Linux (unix), so perhaps
                    AIX does the same. Mac OSX and Windows have
                    special calls...

                That is less expensive.  But opening /proc/<pid>/stat
                is necessary for
                allChildren to get
                and filter on the parent and can provide the
                starttime as well.
                stat would be useful for allProcesses which does not
                need the parent.


                    In case OS does not allow retrieving the start
                    time (not owner or
                    similar), it should be cached as some "undefined"
                    value and treated
                    the same when destroying. If while obtaining the
                    ProcessHandle *and*
                    while destroying the process, the start time of
                    the process with a
                    particular pid is "undefined" then there are two
                    options:

                    1 - don't kill the process
                    2 - kill the process

                    They might actually be no different as the reason
                    of not being able to
                    retrieve the start time (not owner) might prevent
                    the process from
                    being killed too, so option 2 could be used to
                    allow killing on any
                    hypothetical platforms that don't support
                    retrieving start time and it
                    is no worse than current implementation anyway.

                Destroy is a best-effort method;  it is not
                guaranteed to result in
                termination, though
                the docs are a bit weak on this point.

                I'd go for option 2, anyone using the API is looking
                for results on
                processes they
                already know something about (mostly), so there's no
                reason to be
                particularly
                conservative about.  If the caller is trying to be
                more conservative,
                they can maintain
                their own extra information about the processes they
                are managing.



                    What do you think?

                I'd like to pick this up as a separate change, and
                get the current one
                in first.

                Roger

Re: RFR 9: 8077350 Process API Updates Implementation Review

Reply via email to