[ https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260729#comment-16260729 ]

Alexander Rukletsov commented on MESOS-7506:
--------------------------------------------

{noformat}
Commit: 95decd404438abd422794524e01d72a889821566 [95decd4]
Author: Andrei Budnik <abud...@mesosphere.com>
Date: 21 November 2017 at 14:30:47 GMT+1
Committer: Alexander Rukletsov <al...@apache.org>
Commit Date: 21 November 2017 at 14:34:31 GMT+1

    Fixed `wait()` and `destroy()` in composing containerizer.

    Previously, `wait()` and `destroy()` methods of composing containerizer
    returned a future that might be set to `READY` state while the internal
    state of composing containerizer is not yet cleaned up. This patch adds
    a `termination` promise to `Container` struct, which is used to return
    a future from `wait()` and `destroy()` methods. This promise is set to
    `READY` state iff related container is completely destroyed.
    `_destroy()` callback is subscribed to a future from `wait()`, which is
    called on related containerizer, to propagate a value to the
    `termination` promise and do the cleanup.

    Review: https://reviews.apache.org/r/63887/
{noformat}
{noformat}
Commit: 84365a140c3730e2d6579ad500118d6749d2f87f [84365a1]
Author: Andrei Budnik <abud...@mesosphere.com>
Date: 21 November 2017 at 14:31:37 GMT+1
Committer: Alexander Rukletsov <al...@apache.org>
Commit Date: 21 November 2017 at 14:34:32 GMT+1

    Updated composing containerizer tests.

    Review: https://reviews.apache.org/r/63888/
{noformat}

> Multiple tests leave orphan containers.
> ---------------------------------------
>
>                 Key: MESOS-7506
>                 URL: https://issues.apache.org/jira/browse/MESOS-7506
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>         Environment: Ubuntu 16.04
> Fedora 23
> other Linux distros
>            Reporter: Alexander Rukletsov
>            Assignee: Andrei Budnik
>              Labels: containerizer, flaky-test, mesosphere
>         Attachments: KillMultipleTasks-badrun.txt,
> ROOT_IsolatorFlags-badrun.txt, ResourceLimitation-badrun.txt,
> ResourceLimitation-badrun2.txt,
> RestartSlaveRequireExecutorAuthentication-badrun.txt,
> TaskWithFileURI-badrun.txt
>
>
> I've observed a number of flaky tests that leave orphan containers upon
> cleanup. A typical log looks like this:
> {noformat}
> ../../src/tests/cluster.cpp:580: Failure
> Value of: containers->empty()
>   Actual: false
> Expected: true
> Failed to destroy containers: { da3e8aa8-98e7-4e72-a8fd-5d0bae960014 }
> {noformat}
> All currently affected tests:
> {noformat}
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.KillTask/0
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskWithFileURI/0
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.ResourceLimitation/0
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.KillMultipleTasks/0
> SlaveTest.RestartSlaveRequireExecutorAuthentication
> LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
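
For readers following the first commit message above, the ordering it describes (clean up internal state first, only then complete the `termination` promise) can be illustrated with a rough, self-contained sketch. This is not the Mesos code: the real patch uses libprocess promises and futures, whereas the `Termination` struct below is a tiny hand-rolled stand-in, and `FakeContainerizer`, `ComposingSketch`, `launch()`, `tracks()` and the exit status `9` are all hypothetical names and values chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hand-rolled stand-in for a promise/future pair (NOT libprocess): shared
// state that stays "pending" until set() is called.
struct Termination {
  bool ready = false;
  int status = 0;
  void set(int s) { status = s; ready = true; }
};

// Hypothetical underlying containerizer: wait() registers a callback that
// fires when the container exits; destroy() makes the container exit.
class FakeContainerizer {
 public:
  void wait(const std::string& id, std::function<void(int)> callback) {
    waiters_[id] = std::move(callback);
  }

  void destroy(const std::string& id) {
    auto it = waiters_.find(id);
    if (it == waiters_.end()) return;
    std::function<void(int)> callback = std::move(it->second);
    waiters_.erase(it);
    callback(9);  // Assumed exit status, for illustration only.
  }

 private:
  std::map<std::string, std::function<void(int)>> waiters_;
};

// Sketch of the composing side: one termination state per container.
class ComposingSketch {
 public:
  explicit ComposingSketch(FakeContainerizer* underlying)
    : underlying_(underlying) {}

  void launch(const std::string& id) {
    containers_[id] = std::make_shared<Termination>();
    // Subscribe the cleanup callback to the underlying wait().
    underlying_->wait(id, [this, id](int status) { _destroy(id, status); });
  }

  // wait() and destroy() both hand out the same termination state.
  std::shared_ptr<const Termination> wait(const std::string& id) {
    return containers_.at(id);
  }

  std::shared_ptr<const Termination> destroy(const std::string& id) {
    // Take a reference before the entry is erased by _destroy().
    std::shared_ptr<const Termination> termination = containers_.at(id);
    underlying_->destroy(id);
    return termination;
  }

  bool tracks(const std::string& id) const {
    return containers_.count(id) > 0;
  }

 private:
  void _destroy(const std::string& id, int status) {
    auto it = containers_.find(id);
    if (it == containers_.end()) return;
    std::shared_ptr<Termination> termination = it->second;
    containers_.erase(it);     // Clean up internal state first...
    termination->set(status);  // ...then mark the termination as ready.
  }

  FakeContainerizer* underlying_;
  std::map<std::string, std::shared_ptr<Termination>> containers_;
};
```

With this ordering, a caller that sees the state returned by `wait()` or `destroy()` become ready can also rely on `tracks(id)` being false, which is the property the flaky tests in this issue depended on.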