[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404
[ https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030104#comment-16030104 ]

Jie Yu commented on MESOS-7546:
---

commit 55e7ea5ed788acb0e4f810dd4575a5a4479520d1
Author: Gastón Kleiman
Date:   Tue May 30 13:50:06 2017 -0700

    Fixed a bug in 'ComposingContainerizerProcess::wait()'.

    Fixed a bug in the Composing Containerizer that would make it always
    immediately return 'None' when trying to wait on a nested container
    that had already been terminated and whose exit status was
    checkpointed.

    Review: https://reviews.apache.org/r/59537/

> WAIT_NESTED_CONTAINER sometimes returns 404
> ---
>
>                 Key: MESOS-7546
>                 URL: https://issues.apache.org/jira/browse/MESOS-7546
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 1.2.0, 1.3.0
>            Reporter: Gastón Kleiman
>            Assignee: Gastón Kleiman
>            Priority: Critical
>              Labels: containerizer, mesosphere
>
> {{WAIT_NESTED_CONTAINER}} sometimes returns 404s even though the nested
> container has already exited and the parent task/executor is still running.
> This happens when an agent uses more than one containerizer (e.g.,
> {{docker,mesos}}) and the exit status of the nested container has already
> been checkpointed when {{WAIT_NESTED_CONTAINER}} is called.
> The root cause of this is a bug in the {{ComposingContainerizer}} in the
> following lines:
> https://github.com/apache/mesos/blob/1c7ffbeb505b3f5ab759202195f0b946a20cb803/src/slave/containerizer/composing.cpp#L620-L628

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404
[ https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022257#comment-16022257 ]

Anand Mazumdar commented on MESOS-7546:
---

[~gkleiman] Can you add info about the shepherd to this issue?
[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404
[ https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022075#comment-16022075 ]

Jie Yu commented on MESOS-7546:
---

Yup, that sounds very reasonable too!
[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404
[ https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022060#comment-16022060 ]

Gastón Kleiman commented on MESOS-7546:
---

What about using the containerizer that was used to start the root container?
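
The idea of routing a nested container's wait through the containerizer that launched its root container can be sketched roughly as below. This is a toy model and not the actual Mesos code or the patch under review: `ContainerPath`, `rootOf`, `FakeContainerizer`, and `ComposingSketch` are hypothetical names, container IDs are modeled as '/'-separated strings, and checkpointed exit statuses are modeled as an in-memory map.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a container ID: a nested container is written
// as a '/'-separated path such as "executor/check".
typedef std::string ContainerPath;

// Returns the root component of a path ("a/b/c" -> "a", "a" -> "a").
inline ContainerPath rootOf(const ContainerPath& id)
{
  assert(!id.empty());
  return id.substr(0, id.find('/'));
}

// Hypothetical stand-in for one underlying containerizer. The map models
// exit statuses that were checkpointed and therefore remain available
// after the nested container has terminated.
struct FakeContainerizer
{
  std::string name;
  std::map<ContainerPath, int> checkpointedExitStatus;

  // Returns true and fills in '*status' if an exit status is known.
  bool wait(const ContainerPath& id, int* status) const
  {
    std::map<ContainerPath, int>::const_iterator it =
      checkpointedExitStatus.find(id);
    if (it == checkpointedExitStatus.end()) {
      return false;
    }
    *status = it->second;
    return true;
  }
};

// Sketch of a composing layer that only records which containerizer
// launched each ROOT container. Nested containers need no entry of their
// own, so nothing is lost when a nested container terminates.
class ComposingSketch
{
public:
  void launchRoot(const ContainerPath& root, const FakeContainerizer* c)
  {
    roots[root] = c;
  }

  void destroyRoot(const ContainerPath& root)
  {
    roots.erase(root);
  }

  // Delegates to the containerizer of the root container. Returns false
  // only if the root is unknown or that containerizer has no status.
  bool wait(const ContainerPath& id, int* status) const
  {
    std::map<ContainerPath, const FakeContainerizer*>::const_iterator it =
      roots.find(rootOf(id));
    if (it == roots.end()) {
      return false;
    }
    return it->second->wait(id, status);
  }

private:
  std::map<ContainerPath, const FakeContainerizer*> roots;
};
```

In this model, a wait on "executor/check" keeps succeeding after the nested container has exited, as long as the root "executor" is still registered and the exit status was recorded by its containerizer.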
[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404
[ https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022051#comment-16022051 ]

Jie Yu commented on MESOS-7546:
---

A short-term solution would be to not delete the bookkeeping data structures for nested containers in the composing containerizer.