[ https://issues.apache.org/jira/browse/MESOS-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406116#comment-16406116 ]
Alexander Rukletsov edited comment on MESOS-8572 at 3/20/18 10:50 AM:
----------------------------------------------------------------------

[~brat002] reports a similar issue. Below is my loose translation of the message he sent over private channels.

"Sometimes a Docker task does not finish correctly on {{docker stop}}. For example, in https://pastebin.com/NwgA7d7M, {{docker stop}} hung for 10 days (!). A manually issued {{docker stop}} from a terminal hangs for 20-30 seconds and then exits cleanly, but does not stop the container. However, if {{kill -9}} is sent to the corresponding {{mesos-docker-executor}}, the whole process tree terminates correctly and {{docker ps}} does not list the container any more. The hypothesis is that docker cannot terminate a container while someone is listening to its stdin/stderr. Hence it might make sense to send {{SIGTERM}} followed by {{SIGKILL}} instead of retrying {{docker stop}}."

was (Author: alexr):
[~brat002] reports a very similar issue. Below is my loose translation of the message he sent over private channels.

"Sometimes a Docker task does not finish correctly on {{docker stop}}. For example, in https://pastebin.com/NwgA7d7M, {{docker stop}} hung for 10 days (!). A manually issued {{docker stop}} from a terminal hangs for 20-30 seconds and then exits cleanly, but does not stop the container. However, if {{kill -9}} is sent to the corresponding {{mesos-docker-executor}}, the whole process tree terminates correctly and {{docker ps}} does not list the container any more. The hypothesis is that docker cannot terminate a container while someone is listening to its stdin/stderr. Hence it might make sense to send {{SIGTERM}} followed by {{SIGKILL}} instead of retrying {{docker stop}}."

> Make Docker executor/containerizer resilient to Docker daemon failures.
> -----------------------------------------------------------------------
>
> Key: MESOS-8572
> URL: https://issues.apache.org/jira/browse/MESOS-8572
> Project: Mesos
> Issue Type: Epic
> Components: containerization, docker, executor
> Affects Versions: 1.5.0
> Reporter: Greg Mann
> Assignee: Greg Mann
> Priority: Major
> Labels: mesosphere
>
> Experience has shown that the Docker CLI can hang indefinitely at times.
> There are many variations of this behavior, and it occurs across many
> versions of Docker. For these reasons, and since many users of Mesos still
> make heavy use of the Docker containerizer and the Docker executor, it will
> improve the user experience to make the Docker containerizer/executor
> resilient to such Docker daemon failures.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
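
The suggestion in the comment above, sending {{SIGTERM}} followed by {{SIGKILL}} rather than retrying {{docker stop}}, can be illustrated with a minimal sketch. This is not the Mesos implementation (Mesos is written in C++); it is a Python illustration under stated assumptions: POSIX signals are available, the function name `terminate_with_escalation` and the `grace_seconds` parameter are hypothetical, and only a single child process is handled, not a whole process tree.

```python
import signal
import subprocess


def terminate_with_escalation(proc, grace_seconds=5.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    `proc` is a subprocess.Popen object. Returns the name of the last
    signal that was sent. Hypothetical helper, for illustration only.
    """
    # Ask the process to shut down gracefully first.
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # The process did not exit within the grace period: force it.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "SIGKILL"
```

The grace period bounds how long a hung process can delay termination, which is the property a plain retry of {{docker stop}} lacks. Terminating a whole process tree, as described in the comment, would additionally require signalling the process group or each descendant, which this sketch does not attempt.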