Jan Schlicht created MESOS-9153:
-----------------------------------

             Summary: Failures when isolating cgroups can leak containers
                 Key: MESOS-9153
                 URL: https://issues.apache.org/jira/browse/MESOS-9153
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.5.1
            Reporter: Jan Schlicht
         Attachments: health_check_leak.txt

When the isolation of cgroups fails (e.g., if the cgroup hierarchies have changed, as described in [MESOS-3488|https://issues.apache.org/jira/browse/MESOS-3488]), the container is leaked. This may only affect nested containers.

The attached log contains the {{VLOG(2)}} output of a nested container that is started as part of a command health check for Kafka. I've removed all log lines unrelated to this container. The cgroup hierarchy was also manipulated deliberately in order to run into MESOS-3488.

The Linux launcher fails while the containerizer is in the {{ISOLATING}} state. The containerizer transitions to {{DESTROYING}} and tries to clean up the isolators. The isolators ignore the cleanup requests because the container ID seems to be unknown to them. In the case of the Linux Filesystem Isolator, this means the container directory is never cleaned up.
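
To make the suspected failure mode concrete, here is a minimal toy model in C++. It is not Mesos code: {{ToyIsolator}}, {{createDirectory}}, {{track}}, {{hasDirectory}}, {{simulateFailedLaunch}} and the container ID are all hypothetical names. It also assumes that the container directory already exists before the isolator records the container ID, which the log suggests but which I have not confirmed in the source.

```cpp
#include <set>
#include <string>

// Toy model only; none of these names are Mesos APIs.
class ToyIsolator {
 public:
  // Hypothetical step: the container directory appears on disk.
  void createDirectory(const std::string& id) { directories_.insert(id); }

  // Hypothetical step: the isolator records the container ID.
  void track(const std::string& id) { known_.insert(id); }

  // Cleanup silently ignores IDs the isolator never recorded.
  // Returns true only if the container was known and got cleaned up.
  bool cleanup(const std::string& id) {
    if (known_.count(id) == 0) {
      return false;  // ignored: the directory is left behind
    }
    known_.erase(id);
    directories_.erase(id);
    return true;
  }

  bool hasDirectory(const std::string& id) const {
    return directories_.count(id) > 0;
  }

 private:
  std::set<std::string> known_;        // IDs the isolator knows about
  std::set<std::string> directories_;  // stand-in for directories on disk
};

// Simulates a launch that fails before track() runs, followed by the
// cleanup issued during destruction. Returns true if the directory leaked.
bool simulateFailedLaunch() {
  ToyIsolator isolator;
  const std::string id = "nested-check";  // made-up container ID
  isolator.createDirectory(id);
  // The failure happens here, so track(id) is never called.
  isolator.cleanup(id);  // ignored because the ID is unknown
  return isolator.hasDirectory(id);
}
```

In this model {{simulateFailedLaunch()}} reports a leftover directory, while a container that reached {{track()}} is cleaned up normally. That matches the behavior seen in the attached log, under the assumption stated above.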