Dear all,
first of all, thank you for the inspiring work on GNU Parallel. I came across
it only very recently and am trying to integrate the data processing of a
public research project with it.
For various reasons, the processing of a single file of multiple GBs is
heavily multithreaded, since it depends on a multitude of tools.
In this particular case we chose to encapsulate the toolchain as a Docker
swarm, which we instantiate multiple times, once per file, as many instances
as necessary.
To leverage the potential of multiple servers for processing, we thought of
employing GNU Parallel to distribute these jobs.
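For illustration, the intended invocation is roughly of the following shape
(the server names and the file list here are placeholders):

$> parallel --sshlogin server1,server2 ./runner.sh {} ::: /data/*.raw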
However, I am facing the issue that, under GNU Parallel, I am unable to drive
the state machine of the swarms properly.
While calling 'docker-compose up' from the wrapper scripts works just fine,
termination via 'docker-compose down' never happens.
Instead, GNU Parallel terminates the wrapper script immediately, and the
signals are never trapped by the script.
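For comparison, signalling the script directly, outside GNU Parallel, does run
the handler as expected (with the trap set on SIGINT/SIGTERM rather than EXIT):

$> ./runner.sh &
$> kill -TERM %1    # "Stopping swarm" is printed by the trap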
To isolate the swarm instances from each other, we have extensive resource
management wrapped around the toolchain.
Hence GNU Parallel cannot call docker-compose directly but depends on the
wrapper script to set up the environment.
Judging from the extensive examples given in the manpage and the tutorial, I
am aware that this is probably not the most typical use case for GNU Parallel.
Nonetheless, I've tried to replicate the issue with a very simple example
below, which yields the same behaviour.
In none of the cases below is the signal handler called. This applies both to
the local host (":") and to remote servers.
Any advice on how to have parallel send the INT/TERM signal and subsequently
wait for the process to terminate gracefully?
I suspect this is an issue of routing signals. What am I missing here?
Version: 20200822
All the best,
Jens
========================runner.sh===============================
#!/bin/bash
function _do_cleanup() {
    echo "Stopping swarm"
    sleep 5    # docker-compose down (and releasing exclusive resources)
    exit 0
}
trap _do_cleanup EXIT    # alternatively SIGINT SIGTERM
echo "Running Docker swarm"
/bin/sleep 100 &    # docker-compose up ... (allocating exclusive resources)
wait    # waiting, with bash in the foreground, to actually receive the signal
exit 0
==============================================================
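For reference, a signal-forwarding variant of runner.sh (a sketch only, not
verified against the real swarm): it traps INT/TERM explicitly and stops the
child before cleaning up.

=================runner.sh (forwarding variant)=================
#!/bin/bash
child=
function _do_cleanup() {
    echo "Stopping swarm"
    # Stop the foreground job first, if it was started
    [ -n "$child" ] && kill -TERM "$child" 2>/dev/null
    sleep 5    # docker-compose down (and releasing exclusive resources)
    exit 0
}
trap _do_cleanup INT TERM
echo "Running Docker swarm"
/bin/sleep 100 &    # docker-compose up ... (allocating exclusive resources)
child=$!
wait "$child"
exit 0
==============================================================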
Option A) with bash
$> parallel --line-buffer --termseq TERM,10000,INT,10000,KILL,25 ./runner.sh ::: 1
Running Docker swarm
^C
Option B) replacing the foreground process via exec
$> parallel --line-buffer --termseq TERM,10000,INT,10000,KILL,25 exec ./runner.sh ::: 1
Running Docker swarm
^C
Option C) having a bash function exported, as in the original use case.
$> function do_runner() { ./runner.sh; }
$> export -f do_runner
$> parallel --env do_runner --line-buffer --termseq TERM,10000,INT,10000,KILL,25 do_runner ::: 1
Running Docker swarm
^C