[ 
https://issues.apache.org/jira/browse/AURORA-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

brian wickman updated AURORA-698:
---------------------------------
    Sprint: Twitter Aurora Q2'15 Sprint 3

> aurora executor _shutdown deadline calls should be daemonized
> -------------------------------------------------------------
>
>                 Key: AURORA-698
>                 URL: https://issues.apache.org/jira/browse/AURORA-698
>             Project: Aurora
>          Issue Type: Bug
>          Components: Executor
>            Reporter: brian wickman
>            Assignee: brian wickman
>
> In the aurora executor shutdown method, we have deadline() calls:
> {noformat}
>   def _shutdown(self, status_result):
>     runner_status = self._runner.status
>     try:
>       deadline(self._runner.stop, timeout=self.STOP_TIMEOUT)
>     except Timeout:
>       log.error('Failed to stop runner within deadline.')
>     try:
>       deadline(self._chained_checker.stop, timeout=self.STOP_TIMEOUT)
>     except Timeout:
>       log.error('Failed to stop all checkers within deadline.')
>     # If the runner was alive when _shutdown was called, defer to the status_result,
>     # otherwise the runner's terminal state is the preferred state.
>     exit_status = runner_status or status_result
>     self.send_update(
>         self._driver,
>         self._task_id,
>         exit_status.status,
>         status_result.reason)
>     self.terminated.set()
>     defer(self._driver.stop, delay=self.PERSISTENCE_WAIT)
> {noformat}
> However, if runner.stop fails with a Timeout exception, the spawned 
> AnonymousThread is not daemonized, which prevents the executor from exiting.  
> As a result, the cgroup is not torn down, and if runner.stop actually 
> failed, the process can stay alive even though TASK_KILLED was delivered.
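
To illustrate the behaviour described above, here is a minimal, self-contained sketch of a deadline helper that runs its closure in a daemon thread. It is not the executor's actual deadline() implementation: the `daemon` parameter, the `deadline-worker` thread name, and the `hung_stop` stand-in are assumptions made only for this example, which uses nothing beyond the standard `threading` module.

```python
import threading


class Timeout(Exception):
    """Raised when the wrapped call does not finish in time."""


def deadline(closure, timeout, daemon=True):
    """Run closure in a helper thread, waiting at most `timeout` seconds.

    Hypothetical stand-in for the deadline() used in _shutdown; the
    signature here is an assumption for illustration only.
    """
    result = {}

    def target():
        try:
            result['value'] = closure()
        except BaseException as e:  # hand the error back to the caller
            result['error'] = e

    worker = threading.Thread(target=target, name='deadline-worker')
    # A daemon thread does not keep the interpreter alive at exit, so a
    # stop() call that hangs can no longer block the process from exiting.
    worker.daemon = daemon
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        raise Timeout('call did not finish within %s seconds' % timeout)
    if 'error' in result:
        raise result['error']
    return result.get('value')


# Stand-in for a runner.stop that hangs (hypothetical).
release = threading.Event()


def hung_stop():
    release.wait(60)


try:
    deadline(hung_stop, timeout=0.1)
except Timeout:
    timed_out = True
else:
    timed_out = False
```

With `daemon=False`, the worker left behind by the timed-out call would keep the process alive until `hung_stop` returned, which is the failure mode reported in this issue. Note that a daemon thread is simply abandoned at interpreter exit, so any cleanup inside the closure is not guaranteed to run.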



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
