Hi Thomas,

On 11/26/2018 10:05 AM, Thomas Stüfe wrote:
Hi Roger,

On Wed, Nov 21, 2018 at 4:03 PM Roger Riggs <roger.ri...@oracle.com> wrote:
Hi Thomas,

I'd be interested in hearing more about the use cases.
There seem to be many cases where containers are doing the management
of groups of processes.

The function will need to have an equivalent on Windows.

The expressed use case is taking advantage of Posix/Unix signal behavior.
But there are oh so many issues with signals, it's likely to be a big can
of worms.
You mention a desire for other Posix functions; please elaborate.

I think the most useful and sought-after use case would really be the
ability to kill a group of associated processes. That can be
implemented on both Windows and Posix platforms. It would come in very
handy in all kinds of Java test runners, or where Java is used in a
process scheduling role.

Yes, you are right, containers are a typical way nowadays to manage
process groups. But as Goetz stated, the fact that they are used so
much shows there is demand for this kind of functionality.

Secondary features could be suspending a group of processes (as
implemented, see proposal), sending a group of processes to the
foreground of a given terminal, sending any kind of signal to a
process group... however all these can only be reliably implemented on
Posix platforms. I am never sure how much of a deal breaker that is,
considering that we already have functionality which works only on
some platforms, and throwing UnsupportedOperationException seems to be
the standard way not to be bound down by unsupportive platforms. But I
also am a fan of "Write Once Run Everywhere" so I am torn.

Could you elaborate on the can of signal-related worms?
Signals and signal handling are split between the VM, native, and Java code.

Raising signals is not so much of an issue from an implementation point of view.
However, handling signals is a very delicate operation.
Some signals are reserved for the VM, including USR1, USR2, ILL, BUS, etc.
Suspend and resume are delicate too since a suspended VM can't respond
as expected and may deadlock the process or threads.
Signal handling is very context sensitive (which threads handle signals,
etc.) and timing sensitive, so the Java-level signal handlers are
decoupled from the native signal delivery and are just event
notifications, possibly delayed by scheduling.
We've looked at signal handling from time to time and have not come up
with a robust mechanism.

And yes, it's not very portable.

The cases that have come before usually relate to handling SIGINT,
SIGHUP, and SIGTERM to gracefully terminate.
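
For that graceful-termination case, here is a minimal sketch using the
existing Runtime.addShutdownHook API. By default the JVM runs such hooks
when it begins shutting down, including in response to SIGINT, SIGHUP or
SIGTERM on Unix-like systems. The names GracefulShutdown and onShutdown
are illustrative only and not part of any proposal:

```java
public class GracefulShutdown {

    /**
     * Registers cleanup to run when the JVM begins an orderly shutdown.
     * Returns the hook thread so the caller can remove it again.
     */
    static Thread onShutdown(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "graceful-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws InterruptedException {
        onShutdown(() -> System.out.println("shutting down, releasing resources"));
        // Keep the VM alive; press Ctrl-C or send SIGTERM to trigger the hook.
        Thread.sleep(60_000);
    }
}
```

Note that this is only a notification that shutdown has started. The hook
does not see which signal arrived and cannot veto the shutdown.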


One way to go with this would be to limit the feature set to the
kill-aspect, since that is a very limited thing that could be
implemented on Windows too.

If there's a way to scope it reasonably then a JEP is the process to pursue.

I am willing to draft a JEP if it has a reasonable chance of
getting approved, so let's hear opinions. I understand that the barrier
to get features included into the JDK is high, and that is as it
should be. My personal view on this is that this is useful enough for
inclusion.
You've only proposed sending signals and that should be separable from
handling signals, though they should/would be complementary from an API
perspective.

I could see process groups and supporting term/kill would be straightforward
from the API and behavioral aspects.  But I'd be concerned about feature
creep with other signals and the side-effect/semantics expected.

Regards, Roger


Thanks & best Regards, Thomas


Thanks, Roger


On 11/12/2018 12:29 PM, Thomas Stüfe wrote:
Dear all,

may I please hear your thoughts about the following proposal?

We would like to add support for process groups to the JDK: the
ability to put child processes into new or pre-existing process
groups. We added this feature to our proprietary port some time ago
and it has been very useful in cases where the VM acts in a process
scheduling role.

By process groups we mean, of course, standard Unix process groups.
There exists a similar concept on Windows, Job Objects, so at least a
subset of what we propose could be done in a platform independent way.

----

Motivation:

Most importantly, the ability to safely terminate a group of processes.

The established way to do this is, since Java 9, to iterate over a
process tree, calling Process.children() or Process.descendants() on
the root Process, and killing them using Process.destroy().

In practice, that approach is not always a good fit. It leaves out any
orphaned processes; any deceased non-leaf process in the tree makes
its children unreachable. Worst case, if the root process dies, all
children are orphaned and cannot be reached. Another limitation is
that this only works for process trees - parent-child relationships -
but not for unrelated processes one might want to group together. It
also becomes a bit inefficient with many processes, requiring one JNI
call/system call per process to kill.
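
For reference, the tree-walking approach described above could be sketched
roughly as follows, using only the existing Java 9 ProcessHandle API. The
class and method names (ProcessTreeKiller, destroyTree) are made up for
illustration:

```java
import java.util.List;
import java.util.stream.Collectors;

public class ProcessTreeKiller {

    /**
     * Requests termination of the root and of every descendant that is
     * visible at call time. Returns how many termination requests succeeded.
     */
    static int destroyTree(ProcessHandle root) {
        // Take the snapshot before killing anything: once a parent is gone,
        // its children are re-parented and no longer show up as descendants.
        List<ProcessHandle> targets = root.descendants().collect(Collectors.toList());
        int signalled = 0;
        for (ProcessHandle handle : targets) {
            if (handle.destroy()) {   // one termination request per process
                signalled++;
            }
        }
        if (root.destroy()) {
            signalled++;
        }
        return signalled;
    }
}
```

Processes that were already orphaned before the snapshot was taken are
missed, which is the limitation described above.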

Process groups, OTOH, would allow us to group together any number of
unrelated processes. We can then send them bulk signals, e.g.
SIGTERM/SIGKILL with only one system call. And for that to work, the
parent relationships do not matter, so we also reach processes which
have been orphaned.
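
Without JDK support, the closest approximation today is to shell out to
the kill utility. The following is only a sketch under several
assumptions: a POSIX kill command is on the PATH, the process group id is
already known by some other means, and the names GroupSignal, killCommand
and signalGroup are hypothetical. It is not the mechanism used in the
patch:

```java
import java.io.IOException;
import java.util.List;

public class GroupSignal {

    /** Builds the argument list for "kill -s SIGNAL -- -PGID". */
    static List<String> killCommand(String signalName, long pgid) {
        if (pgid <= 1) {
            // -1 would address every process we may signal; refuse that and 0/1.
            throw new IllegalArgumentException("pgid must be > 1: " + pgid);
        }
        // A negative operand addresses the whole process group; "--" stops
        // it from being parsed as an option.
        return List.of("kill", "-s", signalName, "--", "-" + pgid);
    }

    /** Runs the kill utility and reports whether it exited with status 0. */
    static boolean signalGroup(String signalName, long pgid)
            throws IOException, InterruptedException {
        Process kill = new ProcessBuilder(killCommand(signalName, pgid))
                .inheritIO()
                .start();
        return kill.waitFor() == 0;
    }
}
```

This costs an extra fork/exec per signal and does nothing on Windows,
which is part of why a proper API would be preferable.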

There are more things one could do with process groups besides killing
them: suspend/resume them together (SIGSTOP/CONT), or send them to
the background of the controlling terminal.

In fact, one could write one's own shell in Java :)

----

I drew up a tiny patch to demonstrate how this could look. This is
just an example, to have something to play with and talk about:

http://cr.openjdk.java.net/~stuefe/webrevs/processgroup-support/webrev.01/webrev/index.html

and here is a small usage example:

https://github.com/tstuefe/ojdk-repros/blob/master/src/other/RuntimeExecSimpleTestWithProcessGroup.java

The suggested API changes are small:

- A new class ProcessGroup as the platform's notion of a process
group. In this patch, it offers four functions:
    - destroy()/destroyForcibly() terminate or kill the whole process group
    - suspend()/resume() put the group's processes to sleep and wake them up.
    More functionality could be added if needed. This mostly depends on
how tightly we want to be bound by platform limitations on Windows,
where process groups cannot be translated 1:1 to Job Objects.

- ProcessBuilder now has two new attributes:
    - createProcessGroup() is a boolean flag directing the builder to
let sub processes create their own process group, with themselves
being the leader.
    - processGroup() is a reference to a ProcessGroup object; when not
null, subprocesses will join that process group.

- The Process class gets a new query method to retrieve a ProcessGroup
object linked to its process group id.

Using these building blocks, a typical pattern could be:

<example>
          ProcessBuilder processBuilder = new ProcessBuilder(cmd);
          // the next process started becomes a process group leader
          processBuilder.createProcessGroup(true);

          Process leader = processBuilder.start();

          // retrieve the newly created process group
          ProcessGroup pgr = leader.processGroup();
          // subsequent processes shall be members of this process group too
          processBuilder.processGroup(pgr);

          processBuilder.start();
          processBuilder.start();
          ....
</example>

and then call operations on the ProcessGroup object.

----

It is clear to me that this kind of change would probably require a
JEP, if it is desired at all. With this mail I just wanted to gauge
interest.

What do you think?

Thanks & Best Regards, Thomas
