Patricia Shanahan wrote:
Sim IJskes - QCG wrote:
On 26-11-10 15:21, Patricia Shanahan wrote:
What worries me a little is the state the operating system keeps on
terminated processes. We know of the SO_REUSEADDR issue, but you seem
to experience the non-release of resources on process termination.
Formally that is an OS bug, but if it happens it is something we need to
work around. How definite are your reports of problems rerunning a test
because the OS did not release resources on VM exit?
I see two issues here:
1. Terminating every VM.
A working termination is a firm requirement before bypassing orderly
teardown can be attempted.
2. Deleting any temporary files.
I think this comes down to making sure that if we create any files they
are marked deleteOnExit, reducing the problem to terminating every VM.
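A minimal sketch of the deleteOnExit idea, assuming all temporary files are created through one helper. The class and method names (TempFiles, createTempFile) are hypothetical and not part of River. Note that File.deleteOnExit is documented to attempt deletion only on normal VM termination, which is why the file problem reduces to the termination problem.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not an existing River class.
final class TempFiles {
    private TempFiles() {}

    /** Creates a temporary file that the VM will try to delete on normal exit. */
    static File createTempFile(String prefix, String suffix) throws IOException {
        // prefix must be at least three characters long (File.createTempFile contract)
        File f = File.createTempFile(prefix, suffix);
        // Deletion is attempted only if the VM terminates normally.
        f.deleteOnExit();
        return f;
    }
}
```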
Empirically, a Ctrl-C termination often leaves the system in a state in
which subsequent tests fail.
Any idea which VM receives this Ctrl-C? The ant VM, the harness
master, or any random process? There is no such concept as a process
group leader under Windows, is there?
I'm sure the shell initially delivers the signal to the ant job that I
told it to run. I have no idea how and to what extent it gets passed on
to the various processes that are created e.g. to run services. It's a
good question, and I'll think a bit about how to find out.
I've thought of two schemes, and some reasons why we may want to
implement both.
Scheme 1: Modify each class that calls Runtime.getRuntime().exec to
register a shutdown hook to destroy the Process that exec returned.
This has a window between the exec call and the addShutdownHook call
during which termination of the VM doing the process creation would
leave the process as an orphan, with no arrangement to destroy it.
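Scheme 1 might look roughly like the following. This is only a sketch: ExecWithCleanup and execAndRegister are hypothetical names, and the real change would be made at each existing exec call site rather than through a new helper. The comment marks the window described above.

```java
import java.io.IOException;

// Hypothetical wrapper, not an existing River class.
final class ExecWithCleanup {
    private ExecWithCleanup() {}

    /** Starts a child process and registers a shutdown hook that destroys it. */
    static Process execAndRegister(String... command) throws IOException {
        final Process child = Runtime.getRuntime().exec(command);
        // Window: if this VM is killed after exec() returns but before
        // addShutdownHook() completes, the child is left as an orphan.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> child.destroy()));
        return child;
    }
}
```

Shutdown hooks run on normal exit and usually on Ctrl-C, but not when the VM is killed abruptly, so this covers orderly termination only.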
Scheme 2: Require each public static void main(String[]) method in River
to log the pid of its process in a log message with a specific format.
Write a program that scans the log file for those messages, and kills
each process.
This is reliable, in the sense that a thread can be required to write to
the log before doing anything that reserves a resource. It is not so
good for normal shutdown, because it requires log reading.
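A rough sketch of Scheme 2, under several assumptions: the "RIVER-PID" marker format is invented here for illustration, PidLog is a hypothetical class, and ProcessHandle requires Java 9 or later. Each main method would log currentPidLine() before reserving any resource; the main method below is the separate scanning program.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class; the marker format is an assumption, not a River convention.
final class PidLog {
    private static final String PREFIX = "RIVER-PID ";
    private static final Pattern LINE = Pattern.compile("RIVER-PID (\\d{1,18})");

    private PidLog() {}

    /** The message a process would log at the start of main. */
    static String formatPidLine(long pid) {
        return PREFIX + pid;
    }

    /** The marker line for the current VM (Java 9+). */
    static String currentPidLine() {
        return formatPidLine(ProcessHandle.current().pid());
    }

    /** Extracts every logged pid, in log order, ignoring other lines. */
    static List<Long> parsePids(List<String> lines) {
        List<Long> pids = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                pids.add(Long.parseLong(m.group(1)));
            }
        }
        return pids;
    }

    /** Requests termination of each pid that still names a live process. */
    static void killAll(List<Long> pids) {
        for (long pid : pids) {
            // Caveat: the OS may have reused a pid for an unrelated process.
            ProcessHandle.of(pid).ifPresent(ProcessHandle::destroy);
        }
    }

    /** Scanner program: args[0] is the path of the log file to scan. */
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        killAll(parsePids(lines));
    }
}
```

Because pids can be reused after a process exits, a real scanner would probably want an extra check (for example on start time or command line) before killing anything.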
I think Scheme 1 would work well for normal termination. By the time the
test finishes, all the windows between creating a process and creating
the shutdown hook to kill it will have closed. Scheme 2 would be useful
as a backup in the event of a crash or Ctrl-C while processes are being
created. Knowing the PID of each River process in a configuration could
be useful to admins running production systems, as well as in our QA
environment.
Any other ideas?
Patricia