[ https://issues.apache.org/jira/browse/MESOS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442279#comment-16442279 ]
Andrei Budnik edited comment on MESOS-8732 at 4/24/18 12:39 PM:
----------------------------------------------------------------

After making the composing containerizer the default, some tests (e.g. `AgentAPITest.AttachContainerInputValidation`) started to hang because the clock is paused while the docker library uses clock-dependent methods such as `await()` and `delay()`. The hang occurs in [`Docker::validateVersion()`|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L241], which is called from [`Docker::create()`|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L145].

After I added `Clock::resume()` before the call to `version.await(DOCKER_VERSION_WAIT_TIMEOUT)`, the tests hung again, this time in docker recovery: the docker containerizer launches a `docker ps -a` subprocess and [subscribes to its termination|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L1466-L1467]. Because the reaper process [uses `delay()`|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/reap.cpp#L112], recovery of the docker containerizer hangs as well.

> Use composing containerizer by default in tests.
> ------------------------------------------------
>
>                 Key: MESOS-8732
>                 URL: https://issues.apache.org/jira/browse/MESOS-8732
>             Project: Mesos
>          Issue Type: Task
>          Components: containerization
>            Reporter: Andrei Budnik
>            Assignee: Andrei Budnik
>            Priority: Major
>              Labels: containerizer, mesosphere, tests
>
> If we assign "docker,mesos" to the `containerizers` flag for an agent, then
> `ComposingContainerizer` will be used for many tests that do not specify
> the `containerizers` flag. That's the goal of this task.
> I tried to do that by adding [`flags.containerizers =
> "docker,mesos";`|https://github.com/apache/mesos/blob/master/src/tests/mesos.cpp#L273],
> but it turned out that some tests started to hang due to a paused
> clock, because the docker containerizer and the docker library use libprocess clocks.
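
For readers unfamiliar with paused test clocks, the hang described in the comment can be illustrated with a toy model. This is only a sketch in standard C++, not libprocess code: `ManualClock`, `elapseRealTime()` and `reapedAfterTenSeconds()` are hypothetical names invented for the illustration, and the only behavior assumed from the real library is that timers scheduled against a paused clock fire only when the clock is advanced explicitly or resumed.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Toy stand-in for a pausable test clock (hypothetical, not libprocess code).
class ManualClock {
public:
  void pause() { paused_ = true; }
  void resume() { paused_ = false; }

  // Schedule `callback` to run `ms` milliseconds of virtual time from now.
  void delay(std::int64_t ms, std::function<void()> callback) {
    timers_.emplace(now_ + ms, std::move(callback));
  }

  // Explicitly move virtual time forward, as a test would do by hand.
  void advance(std::int64_t ms) {
    now_ += ms;
    fireDueTimers();
  }

  // Model wall-clock time passing: it has no effect while the clock is paused.
  void elapseRealTime(std::int64_t ms) {
    if (!paused_) {
      advance(ms);
    }
  }

private:
  void fireDueTimers() {
    while (!timers_.empty() && timers_.begin()->first <= now_) {
      // Take the callback out before erasing so it may schedule new timers.
      std::function<void()> callback = std::move(timers_.begin()->second);
      timers_.erase(timers_.begin());
      callback();
    }
  }

  bool paused_ = false;
  std::int64_t now_ = 0;
  std::multimap<std::int64_t, std::function<void()>> timers_;
};

// Returns whether a "reaper" timer scheduled 100 ms ahead has fired after
// ten seconds of wall-clock time have passed.
bool reapedAfterTenSeconds(bool pauseClock) {
  ManualClock clock;
  bool reaped = false;
  if (pauseClock) {
    clock.pause();
  }
  clock.delay(100, [&reaped]() { reaped = true; });
  clock.elapseRealTime(10000);
  return reaped;
}
```

In this model `reapedAfterTenSeconds(true)` stays false no matter how much real time passes, which mirrors a test waiting on a subprocess whose termination is detected through a delayed callback. Calling `advance()` or `resume()` is what lets the pending timer fire.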