[jira] [Commented] (YARN-8031) NodeManager will fail to start if cpu subsystem is already mounted

2018-03-17 Thread JayceAu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403294#comment-16403294
 ] 

JayceAu commented on YARN-8031:
---

@Miklos Szegedi, after reading the source code and running my own tests, I 
found that if *yarn.nodemanager.linux-container-executor.cgroups.mount* is set 
to false, NM won't create the hierarchy directory hadoop-yarn with the cpu 
controller mounted, which conflicts with what the doc says:

[https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html]
{code:java}
The cgroups hierarchy under which to place YARN proccesses(cannot contain 
commas). If yarn.nodemanager.linux-container-executor.cgroups.mount is false 
(that is, if cgroups have been pre-configured) and the YARN user has write 
access to the parent directory, then the directory will be created. If the 
directory already exists, the administrator has to give YARN write permissions 
to it recursively.
{code}
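
To make the quoted doc text concrete, here is a minimal sketch of the behavior it describes when the mount option is false. It is not NodeManager's actual code (the comment above reports that NM does not behave this way); the function name `ensure_hierarchy` and its parameters are hypothetical, and only the default name hadoop-yarn comes from this thread.

```python
import os


def ensure_hierarchy(controller_mount, hierarchy="hadoop-yarn"):
    """Return the hierarchy path under an already mounted controller.

    Hypothetical helper that mirrors the documented rule:
    - if the directory exists, the caller needs write access to it;
    - otherwise it is created when the parent directory is writable.
    """
    path = os.path.join(controller_mount, hierarchy.strip("/"))
    if os.path.isdir(path):
        if not os.access(path, os.W_OK):
            raise PermissionError(f"{path} exists but is not writable")
        return path
    if not os.access(controller_mount, os.W_OK):
        raise PermissionError(
            f"cannot create {path}: no write access to {controller_mount}"
        )
    os.mkdir(path)
    return path
```

On a real host `controller_mount` would be something like the cpu controller's mount point; the sketch only checks the directory itself and does not check permissions recursively.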

> NodeManager will fail to start if cpu subsystem is already mounted
> --
>
> Key: YARN-8031
> URL: https://issues.apache.org/jira/browse/YARN-8031
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.5.0
> Reporter: JayceAu
> Priority: Major
> Attachments: YARN-8031.001.patch
>
>
> If *yarn.nodemanager.linux-container-executor.cgroups.mount* is set to true 
> and the cpu subsystem is not yet mounted, NodeManager will mount the cpu 
> subsystem and, if the mount step succeeds, create the control group whose 
> default name is *hadoop-yarn*. This procedure works well if the cpu subsystem 
> is not yet mounted. However, in some situations the cpu subsystem is already 
> mounted before NodeManager starts, and NodeManager will fail to start because 
> it lacks write permission to the *hadoop-yarn* path. For example:
>  # An OS that uses systemd, such as CentOS 7, has the cpu subsystem mounted 
> by default on machine startup.
>  # Some daemon that starts before NodeManager may also rely on the cpu 
> subsystem being mounted. In our production environment, we limit the cpu 
> usage of the monitoring and control agent, which starts on reboot.
> To solve this problem, container-executor must be able to create the 
> control group *hadoop-yarn* if mounting the controller succeeds or the 
> controller is already mounted. In addition, if the cpu subsystem is used in 
> combination with another subsystem and is already mounted, container-executor 
> should use the latest mount point of the cpu subsystem instead of the one 
> provided by NodeManager.
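
The fix proposed in the description can be sketched as a small decision function. This is only an illustration of the proposal, not the attached patch or container-executor's code; `choose_cpu_cgroup_path` and its parameters are hypothetical names.

```python
def choose_cpu_cgroup_path(existing_mount, requested_mount,
                           hierarchy="hadoop-yarn"):
    """Return (needs_mount, cgroup_path) for the cpu controller.

    existing_mount: where the cpu controller is already mounted, or None.
    requested_mount: the mount path NodeManager would otherwise provide.
    """
    if existing_mount is not None:
        # Already mounted: skip the mount step and use the existing mount point.
        return False, f"{existing_mount.rstrip('/')}/{hierarchy}"
    # Not mounted yet: mount at the requested path, then create the group there.
    return True, f"{requested_mount.rstrip('/')}/{hierarchy}"
```

In both branches the control group directory is still created afterwards, which is the change the description asks for.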



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-8031) NodeManager will fail to start if cpu subsystem is already mounted

2018-03-16 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402162#comment-16402162
 ] 

Miklos Szegedi commented on YARN-8031:
--

[~jayceAu], thank you for raising this. If you have CGroups already mounted, 
you should set the mount option to false as described here:

[https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html]
{code:java}
Discover CGroups mounted already: This should be used on newer systems 
like RHEL7 or Ubuntu16 or if the administrator mounts CGroups before YARN 
starts. Set yarn.nodemanager.linux-container-executor.cgroups.mount to false 
and leave other settings set to their defaults. YARN will locate the mount 
points in /proc/mounts. Common locations include /sys/fs/cgroup and /cgroup. 
The default location can vary depending on the Linux distribution in use.{code}
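
As a rough illustration of what locating mount points in /proc/mounts involves, the sketch below parses text in that format and maps each cgroup v1 controller to its mount point. It is not YARN's implementation; the function name, the controller set, and the sample text are assumptions made for the example.

```python
# Illustrative sketch only; this is not YARN's implementation.
# Controller names to look for among the mount options (cgroup v1).
KNOWN_CONTROLLERS = {
    "cpu", "cpuacct", "cpuset", "memory", "blkio", "devices",
    "freezer", "net_cls", "net_prio", "perf_event", "hugetlb", "pids",
}


def find_cgroup_mounts(mounts_text):
    """Map each cgroup v1 controller to its mount point.

    mounts_text uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    mounts = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        mount_point = fields[1]
        for option in fields[3].split(","):
            if option in KNOWN_CONTROLLERS:
                # A later entry overwrites an earlier one for the same controller.
                mounts[option] = mount_point
    return mounts


# Made-up sample in /proc/mounts format, for demonstration only.
SAMPLE = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
"""

if __name__ == "__main__":
    print(find_cgroup_mounts(SAMPLE))
```

On a live system the input would come from reading /proc/mounts; keeping the parser a pure function of the text makes it easy to check against samples from different distributions.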

> NodeManager will fail to start if cpu subsystem is already mounted
> --
>
> Key: YARN-8031
> URL: https://issues.apache.org/jira/browse/YARN-8031
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.5.0
> Reporter: JayceAu
> Priority: Major
> Attachments: image-2018-03-15-14-47-30-583.png


