Sorry for the duplicate messages. I accidentally hit the wrong keyboard shortcut twice :(
Unfortunately I don't have the log right now. IIRC the executor received the `KILL` event, because the log I saw contained this line:
https://github.com/apache/mesos/blob/7e11a2d39cc642944897d2480105dbfd860fa601/src/launcher/default_executor.cpp#L1236
But it didn't contain this line:
https://github.com/apache/mesos/blob/7e11a2d39cc642944897d2480105dbfd860fa601/src/launcher/default_executor.cpp#L1101
The reason `LAUNCH_NESTED_CONTAINER` got stuck had already been rotated out of the log file by the time I examined it.

On Wed, May 16, 2018 at 6:57 PM, Chun-Hung Hsiao <chhs...@mesosphere.io> wrote:

> Unfortunately I don't have the log right now. IIRC the executor received
> the `KILL` event because the log I saw contained this line:
> https://github.com/apache/mesos/blob/7e11a2d39cc642944897d2480105dbfd860fa601/src/launcher/default_executor.cpp#L1236
> But it didn't contain this line:
>
> On Wed, May 16, 2018 at 6:18 PM, Vinod Kone <vi...@mesosphere.io> wrote:
>
>> Can you paste some logs here too if you have?
>>
>> On Wed, May 16, 2018 at 5:53 PM, Chun-Hung Hsiao (JIRA) <j...@apache.org> wrote:
>>
>> > [ https://issues.apache.org/jira/browse/MESOS-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478318#comment-16478318 ]
>> >
>> > Chun-Hung Hsiao commented on MESOS-8927:
>> > ----------------------------------------
>> >
>> > I'd like to add some notes here. This problem is actually nontrivial,
>> > because AFAIK we don't have a reliable way to kill a container at any state.
>> >
>> > > Default executor cannot kill tasks if `LAUNCH_NESTED_CONTAINER` is stuck.
>> > > -------------------------------------------------------------------------
>> > >
>> > > Key: MESOS-8927
>> > > URL: https://issues.apache.org/jira/browse/MESOS-8927
>> > > Project: Mesos
>> > > Issue Type: Bug
>> > > Components: executor
>> > > Affects Versions: 1.5.1, 1.6.0
>> > > Reporter: Chun-Hung Hsiao
>> > > Priority: Critical
>> > > Labels: default-executor, mesosphere
>> > >
>> > > In the default executor, if the {{LAUNCH_NESTED_CONTAINER}} call never
>> > > returns, {{container->launched}} won't be set, so a follow-up {{KILL}}
>> > > event will be ignored:
>> > > [https://github.com/apache/mesos/blob/40b40d9b73221388e583fc140280f1eb2b48b832/src/launcher/default_executor.cpp#L1091]
>> > > This could lead to tasks stuck in {{TASK_STARTING}}.
>> >
>> > --
>> > This message was sent by Atlassian JIRA
>> > (v7.6.3#76005)
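
For anyone following along, here is a minimal, self-contained C++ sketch of the failure mode the ticket describes. It is not the actual default executor code: `ExecutorModel`, `Container`, `launchReturned` and the task IDs are hypothetical names made up for illustration. Only the `launched` flag is meant to mirror the `container->launched` field mentioned in the ticket, and the sketch assumes a kill is simply dropped while that flag is unset.

```cpp
#include <map>
#include <string>

// Simplified model of a child container tracked by an executor.
// Hypothetical type; not the Mesos data structure.
struct Container {
  bool launched = false;  // Set only after the launch call returns.
  bool killed = false;    // Set once a kill has been acted on.
};

// Hypothetical stand-in for the executor's bookkeeping.
class ExecutorModel {
public:
  // Starts a launch. The container is tracked right away, but
  // `launched` stays false until `launchReturned` is called.
  void launch(const std::string& taskId) {
    containers[taskId] = Container{};
  }

  // Models the response to the launch call arriving.
  void launchReturned(const std::string& taskId) {
    containers.at(taskId).launched = true;
  }

  // Returns true if the kill was acted on, false if it was ignored.
  bool kill(const std::string& taskId) {
    auto it = containers.find(taskId);
    if (it == containers.end() || !it->second.launched) {
      // The launch never returned (or the task is unknown), so the
      // kill is dropped and the task keeps its current state.
      return false;
    }
    it->second.killed = true;
    return true;
  }

private:
  std::map<std::string, Container> containers;
};
```

In this model, calling `launch("t")` followed by `kill("t")` returns false as long as `launchReturned("t")` is never called, which corresponds to a task staying in `TASK_STARTING`. Once `launchReturned("t")` has run, the same `kill("t")` returns true.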