[ 
https://issues.apache.org/jira/browse/MESOS-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406116#comment-16406116
 ] 

Alexander Rukletsov edited comment on MESOS-8572 at 3/20/18 10:50 AM:
----------------------------------------------------------------------

[~brat002] reports a similar issue. Below is my loose translation of the 
message he sent over private channels.

"Sometimes a Docker task does not finish correctly on {{docker stop}}. For 
example, in https://pastebin.com/NwgA7d7M, {{docker stop}} hung for 10 days (!). 
A manually issued {{docker stop}} from a terminal hangs for 20-30 seconds and 
then exits cleanly, but does not stop the container. However, if {{kill -9}} is 
sent to the corresponding {{mesos-docker-executor}}, the whole process tree 
terminates correctly and {{docker ps}} no longer lists the container.

The hypothesis is that docker cannot terminate a container while someone is 
listening to its stdin/stderr. Hence it might make sense to send {{SIGTERM}} 
followed by {{SIGKILL}} instead of retrying {{docker stop}}."
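
To make the suggested escalation concrete, here is a minimal sketch of "send {{SIGTERM}}, wait for a grace period, then send {{SIGKILL}}". It is an illustration only, not the Mesos executor's code: the function name, the {{grace_seconds}} parameter and the choice of Python are assumptions, and it signals a single child process rather than a whole process tree.

```python
import signal
import subprocess
import sys


def terminate_with_escalation(proc, grace_seconds=5.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    `proc` is a subprocess.Popen object (hypothetical stand-in for the
    process that should be stopped). Returns the name of the last signal sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        # Give the process a chance to shut down on its own.
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # The process ignored or could not act on SIGTERM; force termination.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "SIGKILL"
```

A caller would pick {{grace_seconds}} to match how long the task is allowed to shut down gracefully. Unlike an unbounded retry of {{docker stop}}, this path always finishes within roughly the grace period, because {{SIGKILL}} cannot be caught or ignored by the target process.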


> Make Docker executor/containerizer resilient to Docker daemon failures.
> -----------------------------------------------------------------------
>
>                 Key: MESOS-8572
>                 URL: https://issues.apache.org/jira/browse/MESOS-8572
>             Project: Mesos
>          Issue Type: Epic
>          Components: containerization, docker, executor
>    Affects Versions: 1.5.0
>            Reporter: Greg Mann
>            Assignee: Greg Mann
>            Priority: Major
>              Labels: mesosphere
>
> Experience has shown that the Docker CLI can hang indefinitely at times. 
> There are many variations of this behavior, and it occurs across many 
> versions of Docker. For these reasons, and since many users of Mesos still 
> make heavy use of the Docker containerizer and the Docker executor, it will 
> improve the user experience to make the Docker containerizer/executor 
> resilient to such Docker daemon failures.


