[jira] [Commented] (MESOS-5836) Memory cgroup leakage in 4.2, 4.4, 4.5 kernels

John Garcia (JIRA) Tue, 12 Jul 2016 16:30:32 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373946#comment-15373946
 ]


John Garcia commented on MESOS-5836:
------------------------------------

[Bug #124641 filed|https://bugzilla.kernel.org/show_bug.cgi?id=124641]

> Memory cgroup leakage in 4.2, 4.4, 4.5 kernels
> ----------------------------------------------
>
>                 Key: MESOS-5836
>                 URL: https://issues.apache.org/jira/browse/MESOS-5836
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 0.28.1, 0.28.2, 1.0.0, 1.1.0
>            Reporter: John Garcia
>              Labels: mesosphere
>
> We've noticed an issue with kernel versions 4.2, 4.4, and 4.5 where memory 
> cgroups are not cleaned up by the system. When the ID space fills up with 
> 65336 cgroups, additional cgroups cannot be created because there are no IDs 
> left for the new cgroup, and ENOSPC is returned. This is a concern for the 
> Mesos project because Mesos cannot create any further containers in this 
> state. We also tested Docker 1.8.3, which silently fails to create the 
> memory cgroup, resulting in rogue containers that have no memory limit.
> h3. Steps to reproduce:
> *NOTE: Mesos is not required to reproduce this issue*
> - Start a new instance using kernel 4.2, 4.4, or 4.5 (CoreOS 766-1010, Ubuntu 
> 16.04) 
> - ssh to the machine
> - {{cat /proc/cgroups}} to determine the number of memory cgroups
> - Run several Docker containers using the {{--memory}} or {{-m}} option to 
> set a memory limit, either in parallel or in series
> - Stop all containers
> - {{cat /proc/cgroups}} to review the number of memory cgroups and compare it 
> with the earlier reading
> - Optional: Run 65,336 docker containers using memory isolation and then try 
> to launch a Mesos container
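
The before/after comparison in the steps above can be sketched as two small shell helpers. This is only an illustration: the function names {{memory_cgroup_count}} and {{memory_cgroup_delta}} are hypothetical, and the optional file argument exists purely so the helper can be pointed at a saved copy of {{/proc/cgroups}}.

```shell
# Print the num_cgroups column for the memory controller.
# /proc/cgroups columns: subsys_name hierarchy num_cgroups enabled
memory_cgroup_count() {
    # $1: file to read; defaults to /proc/cgroups (hypothetical parameter)
    awk '$1 == "memory" { print $3 }' "${1:-/proc/cgroups}"
}

# Print how many memory cgroups were gained between two readings.
memory_cgroup_delta() {
    # $1: count taken before the containers ran
    # $2: count taken after all containers have stopped
    echo $(( $2 - $1 ))
}
```

Take one reading with {{memory_cgroup_count}} before starting the containers and another after stopping them, then pass both to {{memory_cgroup_delta}}. A positive result after every container has exited suggests leaked cgroups, assuming nothing else on the machine created memory cgroups in the meantime.
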
> h3. Differential diagnosis:
> When the cgroup limit is exceeded, subsequent container terminations will 
> produce the following error in {{dmesg}}:
> {code}idr_remove called for id=65536 which is not allocated.{code}
> Subsequent attempts to create a cgroup directory will fail:
> {code}/sys/fs/cgroup/memory/mesos $ df .
> Filesystem     1K-blocks  Used Available Use% Mounted on
> cgroup                 0     0         0    - /sys/fs/cgroup/memory
> /sys/fs/cgroup/memory/mesos $ sudo mkdir foo
> mkdir: cannot create directory 'foo': No space left on device{code}
> Subsequently launched Docker containers will fail to utilize memory 
> isolation:
> {code}/sys/fs/cgroup/memory/mesos $ docker run -m 32m -d example/busybox sleep 10000
> ...
> /sys/fs/cgroup/memory/mesos $ docker ps | grep busybox
> 849c66081229        example/busybox        "sleep 10000"        6 seconds ago        Up 4 seconds        suspicious_mahavira
> /sys/fs/cgroup/memory/mesos $ find /sys/fs/cgroup -name "*849c66081229*"
> /sys/fs/cgroup/blkio/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/freezer/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/devices/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/cpuset/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/net_cls,net_prio/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/systemd/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/memory/mesos $ {code}
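
The {{find}} check above can be narrowed to the memory hierarchy alone, which makes a missing memory cgroup easy to test for in a script. A minimal sketch, assuming a cgroup v1 layout mounted at {{/sys/fs/cgroup}} as in the output above; the function name {{has_memory_cgroup}} and the optional root argument are hypothetical.

```shell
# Succeed (exit 0) if a cgroup directory whose name contains the given
# container ID exists under the memory hierarchy; fail (exit 1) otherwise.
has_memory_cgroup() {
    # $1: container ID or ID prefix
    # $2: cgroup mount root; defaults to /sys/fs/cgroup (hypothetical parameter)
    find "${2:-/sys/fs/cgroup}/memory" -type d -name "*$1*" 2>/dev/null | grep -q .
}
```

For the container shown above, {{has_memory_cgroup 849c66081229 || echo "no memory cgroup"}} would be expected to print the warning, since the listing contains no entry under {{/sys/fs/cgroup/memory}}.
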
> Mesos containerizer will fail with {{No space left on device}}:
> {code}E0707 20:17:29.091142 105665 slave.cpp:3802] Container 'ef5419cf-9d00-425a-a9ee-a848d330bfb2' for executor 'node-0_executor__42a4fafe-f64d-4b41-91d2-efc20a86a6a3' of framework d6ab251a-064a-46a0-a1c8-9ee559f3b44a-0023 failed to start: Failed to prepare isolator: Failed to create directory '/sys/fs/cgroup/memory/mesos/ef5419cf-9d00-425a-a9ee-a848d330bfb2': No space left on device
> {code}
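
To see how often an agent has hit this failure, the log can be searched for the message above. A rough sketch, assuming each log entry sits on a single line; the function name {{count_cgroup_enospc}} is hypothetical, and the log file location depends on how the agent was started.

```shell
# Print the number of log lines reporting ENOSPC while creating a
# directory under the memory cgroup hierarchy.
count_cgroup_enospc() {
    # $1: path to a Mesos agent log file (location varies by installation)
    grep -c "Failed to create directory '/sys/fs/cgroup/memory/.*': No space left on device" "$1"
}
```

A non-zero count indicates that at least one container launch failed for this reason on the host.
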
> h3. Remediation
> Once a system is found to be affected, the following command can be used to 
> drop all page caches, which allows the system to reap all of the old cgroups 
> and return to normal operation.
> {code}echo 1 > /proc/sys/vm/drop_caches{code}
> We suspect that [patch 9184539|https://patchwork.kernel.org/patch/9184539/] 
> could fix it, but we have not yet tested it.
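
Until a kernel fix is confirmed, the remediation above could be applied only when the memory cgroup count gets high. A minimal sketch, not a tested workaround: the function name {{drop_caches_if_leaking}} is hypothetical, the threshold is an arbitrary value chosen by the operator, and the two file arguments exist only so the logic can be exercised without root.

```shell
# Drop the page cache when the memory cgroup count reaches a threshold.
# Prints "dropped" or "ok". Writing the real drop_caches file requires root.
drop_caches_if_leaking() {
    # $1: threshold (arbitrary, chosen by the operator)
    # $2: cgroups file; defaults to /proc/cgroups (hypothetical parameter)
    # $3: drop_caches file; defaults to /proc/sys/vm/drop_caches (hypothetical parameter)
    count=$(awk '$1 == "memory" { print $3 }' "${2:-/proc/cgroups}")
    if [ "${count:-0}" -ge "$1" ]; then
        echo 1 > "${3:-/proc/sys/vm/drop_caches}"
        echo "dropped"
    else
        echo "ok"
    fi
}
```

Run as root, for example from cron, {{drop_caches_if_leaking 60000}} would write to {{/proc/sys/vm/drop_caches}} only once the count reaches 60000. Dropping caches has a performance cost, so a threshold avoids doing it more often than needed.
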



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
