[ https://issues.apache.org/jira/browse/YARN-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402162#comment-16402162 ]
Miklos Szegedi commented on YARN-8031:
--------------------------------------

[~jayceAu], thank you for raising this. If CGroups are already mounted, you should set the mount option to false, as described here:
[https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html]
{code:java}
Discover CGroups mounted already

This should be used on newer systems like RHEL7 or Ubuntu16 or if the administrator mounts CGroups before YARN starts. Set yarn.nodemanager.linux-container-executor.cgroups.mount to false and leave other settings set to their defaults. YARN will locate the mount points in /proc/mounts. Common locations include /sys/fs/cgroup and /cgroup. The default location can vary depending on the Linux distribution in use.
{code}

> NodeManager will fail to start if cpu subsystem is already mounted
> ------------------------------------------------------------------
>
>                 Key: YARN-8031
>                 URL: https://issues.apache.org/jira/browse/YARN-8031
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.0
>            Reporter: JayceAu
>            Priority: Major
>         Attachments: image-2018-03-15-14-47-30-583.png
>
>
> If *yarn.nodemanager.linux-container-executor.cgroups.mount* is set to true
> and the cpu subsystem is not yet mounted, NodeManager mounts the cpu
> subsystem and, if the mount step succeeds, creates the control group, whose
> default name is *hadoop-yarn*. This procedure works well when the cpu
> subsystem is not yet mounted. However, in some situations the cpu subsystem
> is already mounted before NodeManager starts, and NodeManager then fails to
> start because it has no write permission on the *hadoop-yarn* path. For
> example:
> # An OS that uses systemd, such as CentOS 7, mounts the cpu subsystem by
> default at machine startup.
> # A daemon that starts before NodeManager may also rely on the cpu subsystem
> being mounted.
> In our production environment, we limit the cpu usage of the monitoring and
> control agent, which starts on reboot.
> To solve this problem, container-executor must be able to create the
> control group *hadoop-yarn* both when mounting the controller succeeds and
> when the controller is already mounted. In addition, if the cpu subsystem is
> mounted in combination with other subsystems and is already mounted,
> container-executor should use the latest mount point of the cpu subsystem
> instead of the one provided by NodeManager.
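
The documentation quoted in the comment says that YARN locates mount points in /proc/mounts, and the issue description asks for the latest mount point of the cpu subsystem to be used. As a rough illustration of that lookup only, and not YARN's or container-executor's actual implementation, the following Python sketch scans text in /proc/mounts format for cgroup v1 hierarchies that carry a given controller. The function name find_cgroup_mounts and the SAMPLE data are hypothetical; on a real host the text would be read from /proc/mounts.

```python
def find_cgroup_mounts(mounts_text, controller):
    """Return mount points of cgroup (v1) hierarchies that include `controller`.

    Entries are returned in file order, so the last element is the most
    recently listed mount. Hypothetical helper, not part of YARN.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        # Controllers co-mounted in one hierarchy appear together in the options.
        if controller in fields[3].split(","):
            found.append(fields[1])
    return found


# Illustrative sample only; real content varies by distribution.
SAMPLE = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
"""

mounts = find_cgroup_mounts(SAMPLE, "cpu")
print(mounts[-1] if mounts else "cpu controller not mounted")
```

In the sample, the cpu controller is co-mounted with cpuacct, which is the "used in combination with other subsystem" case from the description: the hierarchy is found through its mount options rather than by assuming a fixed path.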