[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404

2017-05-30 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030104#comment-16030104
 ] 

Jie Yu commented on MESOS-7546:
---

commit 55e7ea5ed788acb0e4f810dd4575a5a4479520d1
Author: Gastón Kleiman 
Date:   Tue May 30 13:50:06 2017 -0700

Fixed a bug in 'ComposingContainerizerProcess::wait()'.

Fixed a bug in the Composing Containerizer that would make it always
immediately return 'None' when trying to wait on a nested container
that had already been terminated and whose exit status was checkpointed.

Review: https://reviews.apache.org/r/59537/

> WAIT_NESTED_CONTAINER sometimes returns 404
> ---
>
> Key: MESOS-7546
> URL: https://issues.apache.org/jira/browse/MESOS-7546
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.2.0, 1.3.0
> Reporter: Gastón Kleiman
> Assignee: Gastón Kleiman
> Priority: Critical
> Labels: containerizer, mesosphere
>
> {{WAIT_NESTED_CONTAINER}} sometimes returns 404s even though the nested 
> container has already exited and the parent task/executor is still running.
> This happens when an agent uses more than one containerizer (e.g.,
> {{docker,mesos}}) and {{WAIT_NESTED_CONTAINER}} is called after the exit
> status of the nested container has already been checkpointed.
> The root cause of this is a bug in the {{ComposingContainerizer}} in the 
> following lines: 
> https://github.com/apache/mesos/blob/1c7ffbeb505b3f5ab759202195f0b946a20cb803/src/slave/containerizer/composing.cpp#L620-L628



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404

2017-05-23 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022257#comment-16022257
 ] 

Anand Mazumdar commented on MESOS-7546:
---

[~gkleiman] Can you add info about the shepherd to this issue?



[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404

2017-05-23 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022075#comment-16022075
 ] 

Jie Yu commented on MESOS-7546:
---

Yup, that sounds very reasonable too!



[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404

2017-05-23 Thread Gastón Kleiman (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022060#comment-16022060
 ] 

Gastón Kleiman commented on MESOS-7546:
---

What about using the containerizer that was used to start the root container?
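
To illustrate the proposal above, here is a minimal sketch of how a composing layer could fall back to the root container's containerizer when a nested container is no longer in its own bookkeeping. It is not the Mesos implementation: the types ContainerId, WaitResult, Containerizer and Composing, the functions rootOf, waitBuggy, waitFixed and scenarioShowsFix, and the choice to key the maps by the container ID value alone are all hypothetical simplifications.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a container ID with an optional parent.
struct ContainerId {
  std::string value;
  std::shared_ptr<ContainerId> parent;  // null for a root container
};

// Hypothetical result of a wait: 'known' is false when nothing was found.
struct WaitResult {
  bool known;
  int status;
};

// Walk the parent chain up to the root container.
const ContainerId& rootOf(const ContainerId& id) {
  const ContainerId* current = &id;
  while (current->parent) {
    current = current->parent.get();
  }
  return *current;
}

// Hypothetical backend that can still answer for terminated nested
// containers because their exit status was checkpointed.
struct Containerizer {
  std::map<std::string, int> checkpointed;

  WaitResult wait(const ContainerId& id) const {
    auto it = checkpointed.find(id.value);
    if (it == checkpointed.end()) {
      return WaitResult{false, 0};
    }
    return WaitResult{true, it->second};
  }
};

// Hypothetical composing layer: maps live containers to their backend.
struct Composing {
  std::map<std::string, Containerizer*> containers;

  // Gives up as soon as the container is missing from the bookkeeping.
  WaitResult waitBuggy(const ContainerId& id) const {
    auto it = containers.find(id.value);
    if (it == containers.end()) {
      return WaitResult{false, 0};
    }
    return it->second->wait(id);
  }

  // Falls back to the containerizer that launched the root container.
  WaitResult waitFixed(const ContainerId& id) const {
    auto it = containers.find(id.value);
    if (it == containers.end()) {
      it = containers.find(rootOf(id).value);
    }
    if (it == containers.end()) {
      return WaitResult{false, 0};
    }
    return it->second->wait(id);
  }
};

// Root "executor" is still running; nested "check" has exited with status 0
// and was already dropped from the composing layer's bookkeeping.
bool scenarioShowsFix() {
  Containerizer mesos;
  mesos.checkpointed["check"] = 0;

  Composing composing;
  composing.containers["executor"] = &mesos;

  auto root = std::make_shared<ContainerId>(ContainerId{"executor", nullptr});
  ContainerId nested{"check", root};

  WaitResult buggy = composing.waitBuggy(nested);
  WaitResult fixed = composing.waitFixed(nested);
  return !buggy.known && fixed.known && fixed.status == 0;
}
```

In this sketch the unknown result from waitBuggy stands for the case the agent would report as a 404, while waitFixed reaches the backend that still holds the checkpointed exit status.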



[jira] [Commented] (MESOS-7546) WAIT_NESTED_CONTAINER sometimes returns 404

2017-05-23 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022051#comment-16022051
 ] 

Jie Yu commented on MESOS-7546:
---

A short-term solution would be to not delete the bookkeeping data structures
for nested containers in the composing containerizer.
