[ https://issues.apache.org/jira/browse/AURORA-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

brian wickman updated AURORA-698:
---------------------------------
    Sprint: Twitter Aurora Q2'15 Sprint 3

> aurora executor _shutdown deadline calls should be daemonized
> -------------------------------------------------------------
>
>                 Key: AURORA-698
>                 URL: https://issues.apache.org/jira/browse/AURORA-698
>             Project: Aurora
>          Issue Type: Bug
>          Components: Executor
>            Reporter: brian wickman
>            Assignee: brian wickman
>
> In the aurora executor shutdown method, we have deadline() calls:
> {noformat}
>   def _shutdown(self, status_result):
>     runner_status = self._runner.status
>
>     try:
>       deadline(self._runner.stop, timeout=self.STOP_TIMEOUT)
>     except Timeout:
>       log.error('Failed to stop runner within deadline.')
>
>     try:
>       deadline(self._chained_checker.stop, timeout=self.STOP_TIMEOUT)
>     except Timeout:
>       log.error('Failed to stop all checkers within deadline.')
>
>     # If the runner was alive when _shutdown was called, defer to the status_result,
>     # otherwise the runner's terminal state is the preferred state.
>     exit_status = runner_status or status_result
>
>     self.send_update(
>         self._driver,
>         self._task_id,
>         exit_status.status,
>         status_result.reason)
>
>     self.terminated.set()
>     defer(self._driver.stop, delay=self.PERSISTENCE_WAIT)
> {noformat}
> However if runner.stop fails with a Timeout exception, the spawned
> AnonymousThread is not daemonized and causes the executor to fail to exit.
> This means that the cgroup will not be torn down and if the runner.stop
> actually failed, the process can stay alive even if TASK_KILLED was delivered.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
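
For illustration only, here is a minimal sketch of the behavior the issue title asks for, written against the Python standard library rather than the executor's real helpers. The `Timeout` class, this `deadline()` function, its `daemon` parameter, `stuck_stop()` and `shutdown_sketch()` are all hypothetical stand-ins. They are not the implementation Aurora uses and not the actual fix. The point is that when the worker thread is marked as a daemon, a stop() call that never returns can no longer keep the interpreter alive after the deadline expires.

```python
import threading
import time


class Timeout(Exception):
  """Hypothetical stand-in for the Timeout caught in _shutdown."""


def deadline(closure, timeout, daemon=True):
  """Run closure in a worker thread and wait at most `timeout` seconds.

  Illustrative sketch only, not the real deadline() used by the executor.
  Exceptions raised inside closure are not propagated to the caller here.
  """
  result = {}

  def target():
    result['value'] = closure()

  worker = threading.Thread(target=target)
  # A daemon thread does not block interpreter exit; a non-daemon one does.
  worker.daemon = daemon
  worker.start()
  worker.join(timeout)
  if worker.is_alive():
    raise Timeout('closure did not finish within %s seconds' % timeout)
  return result.get('value')


def stuck_stop():
  # Simulates a runner.stop() that hangs well past the deadline.
  time.sleep(60)


def shutdown_sketch():
  # Mirrors the try/except shape of _shutdown for a single deadline() call.
  try:
    deadline(stuck_stop, timeout=0.1)
  except Timeout:
    return 'timed out'
  return 'stopped'
```

With `daemon=False` in this sketch, the process would stay alive until `stuck_stop()` returned, even though the caller has already handled the Timeout. That matches the symptom described in the issue.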