[ https://issues.apache.org/jira/browse/MESOS-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373946#comment-15373946 ]
John Garcia commented on MESOS-5836:
------------------------------------

[Bug #124641 filed|https://bugzilla.kernel.org/show_bug.cgi?id=124641]

> Memory cgroup leakage in 4.2, 4.4, 4.5 kernels
> ----------------------------------------------
>
>                 Key: MESOS-5836
>                 URL: https://issues.apache.org/jira/browse/MESOS-5836
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 0.28.1, 0.28.2, 1.0.0, 1.1.0
>            Reporter: John Garcia
>              Labels: mesosphere
>
> We've noticed an issue with kernel versions 4.2, 4.4, and 4.5 where memory
> cgroups are not cleaned up by the system. When the register fills up with
> 65336 cgroups, additional cgroups cannot be created because there are no IDs
> left for the new cgroup, and ENOSPC is returned. This is a concern for the
> Mesos project because Mesos cannot create any further containers in this
> state. We tested Docker 1.8.3, which silently fails to create the memory
> cgroup, resulting in rogue containers with no memory limit.
>
> h3. Steps to reproduce:
> *NOTE: Mesos is not required to reproduce this issue*
> - Start a new instance using kernel 4.2, 4.4, or 4.5 (CoreOS 766-1010, Ubuntu
> 16.04)
> - ssh to the machine
> - {{cat /proc/cgroups}} to determine the number of memory cgroups
> - Run several docker containers using the {{--memory}} or {{-m}} option to
> set a memory limit, either in parallel or in series
> - Stop all containers
> - {{cat /proc/cgroups}} to review the number of memory cgroups and compare it
> to the previous run
> - Optional: Run 65,336 docker containers using memory isolation and then try
> to launch a Mesos container
>
> h3. Differential diagnosis:
> When the cgroup limit is exceeded, subsequent container terminations will
> produce the following error in {{dmesg}}:
> {code}idr_remove called for id=65536 which is not allocated.{code}
> Subsequent attempts to create a cgroup directory will fail:
> {code}/sys/fs/cgroup/memory/mesos $ df .
> Filesystem     1K-blocks  Used Available Use% Mounted on
> cgroup                 0     0         0    - /sys/fs/cgroup/memory
> /sys/fs/cgroup/memory/mesos $ sudo mkdir foo
> mkdir: cannot create directory 'foo': No space left on device{code}
> Subsequently launched Docker containers will start without memory isolation;
> note that no path under {{/sys/fs/cgroup/memory}} appears in the {{find}}
> output:
> {code}/sys/fs/cgroup/memory/mesos $ docker run -m 32m -d example/busybox sleep 10000
> ...
> /sys/fs/cgroup/memory/mesos $ docker ps | grep busybox
> 849c66081229   example/busybox   "sleep 10000"   6 seconds ago   Up 4 seconds   suspicious_mahavira
> /sys/fs/cgroup/memory/mesos $ find /sys/fs/cgroup -name "*849c66081229*"
> /sys/fs/cgroup/blkio/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/freezer/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/devices/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/cpuset/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/net_cls,net_prio/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/systemd/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/memory/mesos $ {code}
> The Mesos containerizer will fail with {{No space left on device}}:
> {code}E0707 20:17:29.091142 105665 slave.cpp:3802] Container 'ef5419cf-9d00-425a-a9ee-a848d330bfb2' for executor 'node-0_executor__42a4fafe-f64d-4b41-91d2-efc20a86a6a3' of framework d6ab251a-064a-46a0-a1c8-9ee559f3b44a-0023 failed to start: Failed to prepare isolator: Failed to create directory '/sys/fs/cgroup/memory/mesos/ef5419cf-9d00-425a-a9ee-a848d330bfb2': No space left on device
> {code}
>
> h3. Remediation
> Once a system is found to be affected, the following command can be used to
> drop all page caches, which allows the system to reap all of the old cgroups
> and return to normal operation.
> {code}echo 1 > /proc/sys/vm/drop_caches{code}
> We suspect that [patch 9184539|https://patchwork.kernel.org/patch/9184539/]
> could fix it, but we have not yet tested it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
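
The before/after comparison of {{cat /proc/cgroups}} in the reproduction steps can be scripted. The following is a minimal Python sketch, not part of the original report. It assumes the usual {{/proc/cgroups}} layout of whitespace-separated columns ({{subsys_name}}, {{hierarchy}}, {{num_cgroups}}, {{enabled}}). The function names {{memory_cgroup_count}} and {{leaked_memory_cgroups}} are hypothetical and chosen for illustration only.

```python
def memory_cgroup_count(proc_cgroups_text: str) -> int:
    """Return the num_cgroups value of the 'memory' row.

    Assumes the /proc/cgroups column order:
    subsys_name, hierarchy, num_cgroups, enabled.
    """
    for line in proc_cgroups_text.splitlines():
        # Skip the '#subsys_name ...' header and any blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "memory":
            return int(fields[2])
    raise ValueError("no 'memory' row found in the supplied text")


def leaked_memory_cgroups(before_text: str, after_text: str) -> int:
    """Difference in memory cgroup count between two snapshots.

    A positive result after all containers have been stopped suggests
    that memory cgroups were not reaped.
    """
    return memory_cgroup_count(after_text) - memory_cgroup_count(before_text)
```

To use it, read {{/proc/cgroups}} once before starting the containers and once after stopping them (for example with {{open("/proc/cgroups").read()}}), then pass both snapshots to {{leaked_memory_cgroups}}. Taking text rather than a file path keeps the parsing testable on machines that do not expose {{/proc/cgroups}}.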