Jim Brennan created YARN-8648:
---------------------------------

             Summary: Container cgroups are leaked when using docker
                 Key: YARN-8648
                 URL: https://issues.apache.org/jira/browse/YARN-8648
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Jim Brennan
            Assignee: Jim Brennan


When you run with docker and enable cgroups for cpu, docker creates cgroups for 
all resources on the system, not just for cpu.  For instance, if 
{{yarn.nodemanager.linux-container-executor.cgroups.hierarchy}} is set to 
{{/hadoop-yarn}}, the nodemanager will create a cgroup for each container under 
{{/sys/fs/cgroup/cpu/hadoop-yarn}}.  In the docker case, we pass this path to 
docker via the {{--cgroup-parent}} command line argument.  Docker then creates a 
cgroup for the docker container under that, for instance: 
{{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}.
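
To make the layout concrete, here is a small illustrative sketch (not the actual 
nodemanager code) that just prints the paths involved, assuming the example 
{{/hadoop-yarn}} hierarchy and made-up container and docker container ids:

{code:java}
// Illustrative only: shows how the nodemanager's container cgroup and the
// value passed to docker via --cgroup-parent relate.  All ids below are made up.
public class CgroupParentExample {
  public static void main(String[] args) {
    // yarn.nodemanager.linux-container-executor.cgroups.hierarchy
    String hierarchy = "/hadoop-yarn";
    // hypothetical YARN container id
    String containerId = "container_e01_1500000000000_0001_01_000002";
    // hypothetical docker container id
    String dockerContainerId = "0123456789abcdef";

    // cgroup the nodemanager creates for the container (cpu controller)
    String nmCgroup = "/sys/fs/cgroup/cpu" + hierarchy + "/" + containerId;
    // value handed to docker as --cgroup-parent (relative to each controller root)
    String cgroupParent = hierarchy + "/" + containerId;
    // cgroup docker then creates underneath it for the docker container
    String dockerCgroup = nmCgroup + "/" + dockerContainerId;

    System.out.println("nodemanager cgroup: " + nmCgroup);
    System.out.println("--cgroup-parent   : " + cgroupParent);
    System.out.println("docker cgroup     : " + dockerCgroup);
  }
}
{code}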

When the container exits, docker cleans up the {{docker_container_id}} cgroup, 
and the nodemanager cleans up the {{container_id}} cgroup, so all is good under 
{{/sys/fs/cgroup/cpu/hadoop-yarn}}.
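
For reference, a cgroup is removed by deleting its (empty) directory, so the 
cpu-side cleanup amounts to something like the following sketch, using a 
hypothetical container id rather than the real nodemanager code:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: once docker has removed its leaf cgroup, the now-empty
// container cgroup under the cpu controller can be removed with a plain
// directory delete (rmdir), which is how cgroup directories are destroyed.
public class CpuCgroupCleanupSketch {
  public static void main(String[] args) throws IOException {
    Path containerCgroup = Paths.get("/sys/fs/cgroup/cpu/hadoop-yarn",
        "container_e01_1500000000000_0001_01_000002"); // hypothetical id
    // the delete only succeeds when the cgroup has no child cgroups and no tasks
    Files.deleteIfExists(containerCgroup);
  }
}
{code}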

The problem is that docker also creates that same hierarchy under every other 
resource controller under {{/sys/fs/cgroup}}.  On the rhel7 system I am using, 
these are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, 
perf_event, and systemd.  So for instance, docker creates 
{{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but it 
only cleans up the leaf cgroup {{docker_container_id}}.  Nobody cleans up the 
{{container_id}} cgroups for these other resources.  On one of our busy 
clusters, we found more than 100,000 of these leaked cgroups.
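
A quick way to see the scope of the leak is to scan every controller under 
{{/sys/fs/cgroup}} for leftover {{container_*}} cgroups.  The following is just 
a diagnostic sketch assuming the example {{/hadoop-yarn}} hierarchy above; it is 
not a proposed fix:

{code:java}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Diagnostic sketch: list container_* cgroups left behind under the
// hadoop-yarn hierarchy of every cgroup controller mounted at /sys/fs/cgroup.
public class LeakedCgroupScan {
  public static void main(String[] args) throws IOException {
    Path cgroupRoot = Paths.get("/sys/fs/cgroup");
    try (DirectoryStream<Path> controllers = Files.newDirectoryStream(cgroupRoot)) {
      for (Path controller : controllers) {
        Path yarnHierarchy = controller.resolve("hadoop-yarn");
        if (!Files.isDirectory(yarnHierarchy)) {
          continue;
        }
        try (DirectoryStream<Path> cgroups =
            Files.newDirectoryStream(yarnHierarchy, "container_*")) {
          for (Path cgroup : cgroups) {
            // anything still present after its container has exited is leaked
            System.out.println(cgroup);
          }
        }
      }
    }
  }
}
{code}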

I found this in our 2.8-based version of Hadoop, but I have been able to 
reproduce it with current Hadoop.
