Repository: aurora Updated Branches: refs/heads/master fb936b545 -> 5410c229f
Send SIGTERM to daemonized processes on shutdown. Problem Processes can deamonize and escape the supervision of a coordinator. Using the Docker Containerizer or the Mesos Containerizer with pid isolation means that the processes will be come reparented to the sh process that launches the executor. For example: ``` root@aurora:/# ps xf PID TTY STAT TIME COMMAND 48 ? Ss 0:00 /bin/bash 86 ? R+ 0:00 _ ps xf 1 ? Ss 0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/va 5 ? Sl 0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/vag 23 ? S 0:00 _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be152 -- 29 ? Ss 0:00 _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15 32 ? S 0:00 | _ /bin/bash -c while true; do echo hello world sleep 10 done 81 ? S 0:00 | _ sleep 10 31 ? Ss 0:00 _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15 33 ? S 0:00 _ /bin/bash -c while true; do echo hello world sleep 10 done 82 ? S 0:00 _ sleep 10 47 ? S 0:00 python ./daemon.py ``` Solution Ensure processes that escape the supervision of the coordinator reparent to the runner who can send signals to them on task tear down. We do this by using the `PR_SET_CHILD_SUBREAPER` flag of `prctl(2)`. After this change the process tree looks like: ``` root@aurora:/# ps xf PID TTY STAT TIME COMMAND 66 ? Ss 0:00 /bin/bash 70 ? R+ 0:00 _ ps xf 1 ? Ss 0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/va 5 ? Sl 0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config /home/vagrant/aurora/examples/vag 23 ? S 0:00 _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b849 -- 33 ? Ss 0:00 _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84 40 ? S 0:00 | _ /bin/bash -c while true; do echo hello world sleep 10 done 63 ? S 0:00 | _ sleep 10 36 ? Ss 0:00 _ /usr/local/bin/python2.7 /mnt/mesos/sandbox/thermos_runner.pex --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84 37 ? S 0:00 | _ /bin/bash -c while true; do echo hello world sleep 10 done 62 ? S 0:00 | _ sleep 10 55 ? S 0:00 _ python ./daemon.py ``` Now the runner is aware of the reparented procesess can can tear it down cleanly with a `SIGTERM`. Testing Done: src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Bugs closed: AURORA-1808 Reviewed at https://reviews.apache.org/r/53418/ Project: http://git-wip-us.apache.org/repos/asf/aurora/repo Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/5410c229 Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/5410c229 Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/5410c229 Branch: refs/heads/master Commit: 5410c229f30d6d8e331cdddc5c84b9b2b5313c01 Parents: fb936b5 Author: Zameer Manji <zma...@apache.org> Authored: Fri Nov 4 13:41:25 2016 -0700 Committer: Zameer Manji <zma...@apache.org> Committed: Fri Nov 4 13:41:25 2016 -0700 ---------------------------------------------------------------------- RELEASE-NOTES.md | 2 + .../apache/thermos/common/process_util.py | 30 +++++++ src/main/python/apache/thermos/core/helper.py | 20 +++++ src/main/python/apache/thermos/core/process.py | 22 +++++- src/main/python/apache/thermos/core/runner.py | 2 + .../aurora/e2e/test_daemonizing_process.aurora | 83 ++++++++++++++++++++ .../sh/org/apache/aurora/e2e/test_end_to_end.sh | 29 ++++++- 7 files changed, 182 insertions(+), 6 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/RELEASE-NOTES.md ---------------------------------------------------------------------- diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md index d89ef2f..94224be 100644 --- a/RELEASE-NOTES.md +++ b/RELEASE-NOTES.md @@ -10,6 +10,8 @@ - The Aurora Scheduler API supports volume mounts per task for the Mesos Containerizer if the scheduler is running with the `-allow_container_volumes` flag. +* The executor will send SIGTERM to processes that self daemonize via double forking. +* The executor now requires Linux kernel 3.4 or later. ### Deprecations and removals: http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/common/process_util.py ---------------------------------------------------------------------- diff --git a/src/main/python/apache/thermos/common/process_util.py b/src/main/python/apache/thermos/common/process_util.py index abd2c0e..c63b9af 100644 --- a/src/main/python/apache/thermos/common/process_util.py +++ b/src/main/python/apache/thermos/common/process_util.py @@ -12,8 +12,11 @@ # limitations under the License. # +import ctypes import os +from twitter.common import log + from gen.apache.aurora.api.constants import TASK_FILESYSTEM_MOUNT_POINT @@ -42,3 +45,30 @@ def wrap_with_mesos_containerizer(cmdline, user, cwd, mesos_containerizer_path): os.path.join(os.environ['MESOS_DIRECTORY'], TASK_FILESYSTEM_MOUNT_POINT), user, bash_wrapper % cmdline)) + + +def setup_child_subreaping(): + """ + This uses the prctl(2) syscall to set the `PR_SET_CHILD_SUBREAPER` flag. This + means if any children processes need to be reparented, they will be reparented + to this process. + + More documentation here: http://man7.org/linux/man-pages/man2/prctl.2.html + and here: https://lwn.net/Articles/474787/ + + Callers should reap terminal children to prevent zombies. + + raises OSError if the underlying prctl call fails. + raises RuntimeError if libc cannot be found. + """ + log.debug("Calling prctl(2) with PR_SET_CHILD_SUBREAPER") + # This constant is taken from prctl.h + PR_SET_CHILD_SUBREAPER = 36 + library_name = ctypes.util.find_library('c') + if library_name is None: + raise RuntimeError("libc not found") + libc = ctypes.CDLL(library_name, use_errno=True) + ret = libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) + if ret != 0: + errno = ctypes.get_errno() + raise OSError(errno, os.strerror(errno)) http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/core/helper.py ---------------------------------------------------------------------- diff --git a/src/main/python/apache/thermos/core/helper.py b/src/main/python/apache/thermos/core/helper.py index 68855e1..0811e84 100644 --- a/src/main/python/apache/thermos/core/helper.py +++ b/src/main/python/apache/thermos/core/helper.py @@ -219,6 +219,26 @@ class TaskRunnerHelper(object): return state.processes[process_name][-1].coordinator_pid @classmethod + def terminate_orphans(cls, state): + """ + Given the state, send SIGTERM to children that are orphaned processes. + + The direct children of the runner will always be coordinators or orphans. + """ + log.debug('TaskRunnerHelper.terminate_orphans()') + process_tree = cls.scan_tree(state) + + coordinator_pids = {p[0] for p in process_tree.values() if p[0]} + children_pids = {c.pid for c in psutil.Process().children()} + orphaned_pids = children_pids - coordinator_pids + + if len(orphaned_pids) > 0: + log.info("Orphaned pids detected: %s", orphaned_pids) + for p in orphaned_pids: + log.debug("SIGTERM pid %s", p) + cls.terminate_pid(p) + + @classmethod def terminate_process(cls, state, process_name): log.debug('TaskRunnerHelper.terminate_process(%s)' % process_name) _, pid, _ = cls._get_process_tuple(state, process_name) http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/core/process.py ---------------------------------------------------------------------- diff --git a/src/main/python/apache/thermos/core/process.py b/src/main/python/apache/thermos/core/process.py index 3ec43e2..13f9ad5 100644 --- a/src/main/python/apache/thermos/core/process.py +++ b/src/main/python/apache/thermos/core/process.py @@ -39,7 +39,7 @@ from twitter.common.lang import Interface from twitter.common.quantity import Amount, Data, Time from twitter.common.recordio import ThriftRecordReader, ThriftRecordWriter -from apache.thermos.common.process_util import wrap_with_mesos_containerizer +from apache.thermos.common.process_util import setup_child_subreaping, wrap_with_mesos_containerizer from gen.apache.aurora.api.constants import TASK_FILESYSTEM_MOUNT_POINT from gen.apache.thermos.ttypes import ProcessState, ProcessStatus, RunnerCkpt @@ -94,6 +94,7 @@ class ProcessBase(object): class CheckpointError(Error): pass class UnspecifiedSandbox(Error): pass class PermissionError(Error): pass + class ForkError(Error): pass CONTROL_WAIT_CHECK_INTERVAL = Amount(100, Time.MILLISECONDS) MAXIMUM_CONTROL_WAIT = Amount(1, Time.MINUTES) @@ -151,8 +152,9 @@ class ProcessBase(object): if self._rotate_log_backups <= 0: raise ValueError('Log backups cannot be less than one.') - def _log(self, msg): - log.debug('[process:%5s=%s]: %s' % (self._pid, self.name(), msg)) + def _log(self, msg, exc_info=None): + log.debug('[process:%5s=%s]: %s' % (self._pid, self.name(), msg), + exc_info=exc_info) def _getpwuid(self): """Returns a tuple of the user (i.e. --user) and current user.""" @@ -283,7 +285,16 @@ class ProcessBase(object): # calls _getpwuid which can raise: # UnknownUserError # PermissionError - self._pid = self._platform.fork() + try: + self._pid = self._platform.fork() # calls setup_child_subreaping which can + # raise OSError or RuntimeError + except (OSError, RuntimeError) as e: + # Reraise the exceptions possible from the fork as Process.Error + # Note only Python 3 has nice exception chaining, so we do our best here + # by logging the original exception and raising ForkError + msg = 'Error trying to fork process %s'.format(self._name) + self._log(msg, exc_info=True) + raise self.ForkError(msg) if self._pid == 0: self._pid = self._platform.getpid() self._wait_for_control() # can raise CheckpointError @@ -312,6 +323,9 @@ class RealPlatform(Platform): self._fork = fork def fork(self): + # Before we fork, ensure we become the parent of any processes that escape + # the cordinator. + setup_child_subreaping() pid = self._fork() if pid == 0: self._sanitize() http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/main/python/apache/thermos/core/runner.py ---------------------------------------------------------------------- diff --git a/src/main/python/apache/thermos/core/runner.py b/src/main/python/apache/thermos/core/runner.py index 7b9013d..1b63c08 100644 --- a/src/main/python/apache/thermos/core/runner.py +++ b/src/main/python/apache/thermos/core/runner.py @@ -873,6 +873,8 @@ class TaskRunner(object): return len(launched) > 0 def _terminate_plan(self, plan): + TaskRunnerHelper.terminate_orphans(self.state) + for process in plan.running: last_run = self._current_process_run(process) if last_run and last_run.state in (ProcessState.FORKED, ProcessState.RUNNING): http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora ---------------------------------------------------------------------- diff --git a/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora b/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora new file mode 100644 index 0000000..3a204f6 --- /dev/null +++ b/src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora @@ -0,0 +1,83 @@ +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + + +hello_loop = Process( + name = 'hello', + cmdline = """ + while true; do + echo hello world + sleep 10 + done + """) + +# Write out a python program that self daemonizes via double forking. +# Asserts that thermos doesn't lose track of double forking proceses. On task +# tear down it should be given a SIGTERM so it can shut down cleanly. +write_program = Process( + name = "write_program", + cmdline = """ + cat >./daemon.py <<EOL +import os +import signal +import sys +import time + +def handler(signum, frame): + os.remove("{{term_file}}") + sys.exit(0) + +def main(): + pid = os.fork() + if pid > 0: + sys.exit(0) + + os.setsid() + os.umask(0) + + pid = os.fork() + if pid > 0: + sys.exit(0) + + signal.signal(signal.SIGTERM, handler) + while True: + time.sleep(1) + +if __name__ == '__main__': + main() + +EOL + """ +) + +run_daemon = Process( + name = 'run_daemon', + cmdline = 'python ./daemon.py' +) + +task = Task( + processes = [hello_loop, write_program, run_daemon], + constraints = order(write_program, run_daemon), + resources = Resources(cpu=1, ram=1*MB, disk=100*MB) +) + +jobs = [ + Service( + cluster = 'devcluster', + environment = 'test', + role = 'vagrant', + name = 'daemonize', + task = task + ) +] http://git-wip-us.apache.org/repos/asf/aurora/blob/5410c229/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh ---------------------------------------------------------------------- diff --git a/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh b/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh index 67702d2..f014b28 100755 --- a/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh +++ b/src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh @@ -58,7 +58,7 @@ check_url_live() { test_file_removed() { local _file=$1 local _success=0 - for i in $(seq 1 10); do + for i in {1..10}; do if [[ ! -e $_file ]]; then _success=1 break @@ -66,7 +66,7 @@ test_file_removed() { sleep 1 done - if [[ "$_success" -ne "1" ]]; then + if [[ $_success -ne 1 ]]; then echo "File was not removed." exit 1 fi @@ -391,6 +391,19 @@ test_ephemeral_daemon_with_final() { test_file_removed $_stop_file # Removed by 'final_process'. } +test_daemonizing_process() { + local _cluster=$1 _role=$2 _env=$3 _job=$4 _config=$5 + local _jobkey="$_cluster/$_role/$_env/$_job" + local _term_file=$(mktemp) + local _extra_args="--bind term_file=$_term_file" + + test_create $_jobkey $_config $_extra_args + test_observer_ui $_cluster $_role $_job + test_job_status $_cluster $_role $_env $_job + test_kill $_jobkey + test_file_removed $_term_file +} + restore_netrc() { mv ~/.netrc.bak ~/.netrc >/dev/null 2>&1 || true } @@ -481,6 +494,8 @@ TEST_CONFIG_UPDATED_FILE=$EXAMPLE_DIR/http_example_updated.aurora TEST_BAD_HEALTHCHECK_CONFIG_UPDATED_FILE=$EXAMPLE_DIR/http_example_bad_healthcheck.aurora TEST_EPHEMERAL_DAEMON_WITH_FINAL_JOB=ephemeral_daemon_with_final TEST_EPHEMERAL_DAEMON_WITH_FINAL_CONFIG_FILE=$TEST_ROOT/ephemeral_daemon_with_final.aurora +TEST_DAEMONIZING_PROCESS_JOB=daemonize +TEST_DAEMONIZING_PROCESS_CONFIG_FILE=$TEST_ROOT/test_daemonizing_process.aurora BASE_ARGS=( $TEST_CLUSTER @@ -509,6 +524,14 @@ TEST_JOB_EPHEMERAL_DAEMON_WITH_FINAL_ARGS=( $TEST_EPHEMERAL_DAEMON_WITH_FINAL_CONFIG_FILE ) +TEST_DAEMONIZING_PROCESS_ARGS=( + $TEST_CLUSTER + $TEST_ROLE + $TEST_ENV + $TEST_DAEMONIZING_PROCESS_JOB + $TEST_DAEMONIZING_PROCESS_CONFIG_FILE +) + trap collect_result EXIT aurorabuild all @@ -535,6 +558,8 @@ test_basic_auth_unauthenticated "${TEST_JOB_ARGS[@]}" test_ephemeral_daemon_with_final "${TEST_JOB_EPHEMERAL_DAEMON_WITH_FINAL_ARGS[@]}" +test_daemonizing_process "${TEST_DAEMONIZING_PROCESS_ARGS[@]}" + /vagrant/src/test/sh/org/apache/aurora/e2e/test_kerberos_end_to_end.sh /vagrant/src/test/sh/org/apache/aurora/e2e/test_bypass_leader_redirect_end_to_end.sh RETCODE=0