[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-07-28 Thread Dheeren Beborrtha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644869#comment-14644869
 ] 

Dheeren Beborrtha commented on YARN-2918:
-

Thanks Wangda Tan. The problem is that I had a cluster up and running with HDP 
2.2.0. I followed the Hortonworks instructions to add the labels and configure 
the CS queues, and that was easy. A few days went by. Then the cluster went down 
one day, and on restart the RM wouldn't come back up. The only way we could 
bring it back was to use an older version of the CS XML. Unfortunately, this 
was not documented anywhere. 
By the way, where are the node labels persisted? I had to re-add the labels too!

> Don't fail RM if queue's configured labels are not existed in 
> cluster-node-labels
> -
>
> Key: YARN-2918
> URL: https://issues.apache.org/jira/browse/YARN-2918
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Rohith Sharma K S
> Assignee: Wangda Tan
> Labels: 2.6.1-candidate
> Fix For: 2.8.0, 2.7.1
>
> Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch
>
>
> Currently, if an admin sets up labels on queues 
> ({{.accessible-node-labels = ...}}) and the label has not been added to the 
> RM, the queue's initialization will fail and the RM will fail too:
> {noformat}
> 2014-12-03 20:11:50,126 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> ...
> Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, please check.
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.<init>(AbstractCSQueue.java:109)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:120)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
>   at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> {noformat}
> This is not a good user experience. We should stop failing the RM so that 
> the admin can configure queues and labels in the following steps:
> - Configure queue (with label)
> - Start RM
> - Add labels to RM
> - Submit applications
> Right now, the admin has to:
> - Configure queue (without label)
> - Start RM
> - Add labels to RM
> - Refresh queue's config (with label)
> - Submit applications
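
The difference between the two workflows above comes down to whether a queue
label that is missing from the cluster is treated as fatal at queue
initialization. The following is a minimal Python sketch of that choice. It is
a toy model, not the actual YARN code or the attached patch: the function name
check_queue_labels, the strict flag, and the exception type are all
hypothetical.

```python
import logging

logger = logging.getLogger("queue_label_check")


class LabelNotInClusterError(Exception):
    """Hypothetical error: a queue references a label the cluster lacks."""


def check_queue_labels(queue_labels, cluster_labels, strict=True):
    """Return the sorted queue labels that are missing from the cluster.

    strict=True models the reported behaviour (initialization aborts).
    strict=False models the requested behaviour (warn and keep starting).
    """
    missing = sorted(set(queue_labels) - set(cluster_labels))
    if missing and strict:
        raise LabelNotInClusterError(
            "cluster node labels do not include: " + ", ".join(missing))
    for label in missing:
        logger.warning("queue label %r is not registered in the cluster yet",
                       label)
    return missing


if __name__ == "__main__":
    # Queue configured with label "x" before the label was added to the RM.
    print(check_queue_labels({"x"}, set(), strict=False))
```

With strict=False the caller gets the list of unregistered labels back and can
continue starting up, which corresponds to the "configure queue, start RM, add
labels" order requested in the description.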



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-07-27 Thread Dheeren Beborrtha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643650#comment-14643650
 ] 

Dheeren Beborrtha commented on YARN-2918:
-

This is a major issue and a big inconvenience. Can this be backported to Hadoop 
2.6.0?






[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers

2015-07-01 Thread Dheeren Beborrtha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610649#comment-14610649
 ] 

Dheeren Beborrtha commented on YARN-2140:
-

How do you support port-level isolation for Docker containers? 
For example, let's say I would like to run multiple Docker containers on the 
same DataNode. If each of the containers needs to be long-running and needs to 
advertise its ports, what is the mechanism for doing so? 

> Add support for network IO isolation/scheduling for containers
> --
>
> Key: YARN-2140
> URL: https://issues.apache.org/jira/browse/YARN-2140
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wei Yan
> Assignee: Sidharta Seethana
> Attachments: NetworkAsAResourceDesign.pdf
>
>



