[jira] [Commented] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations
[ https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735677#comment-13735677 ] Hadoop QA commented on YARN-1008: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597235/YARN-1008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests: org.apache.hadoop.yarn.logaggregation.TestAggregatedLogsBlock org.apache.hadoop.yarn.api.TestNodeId org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue org.apache.hadoop.yarn.server.TestContainerManagerSecurity {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1689//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1689//console This message is automatically generated. > MiniYARNCluster with multiple nodemanagers, all nodes have same key for > allocations > --- > > Key: YARN-1008 > URL: https://issues.apache.org/jira/browse/YARN-1008 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: YARN-1008.patch, YARN-1008.patch > > > While the NMs are keyed using the NodeId, the allocation is done based on the > hostname. > This makes the different nodes indistinguishable to the scheduler. > There should be an option to enabled the host:port instead just port for > allocations. The nodes reported to the AM should report the 'key' (host or > host:port). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations
[ https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735655#comment-13735655 ] Omkar Vinit Joshi commented on YARN-1008: - [~tucu00] I don't see why we need to change all the NodeId stuff. Can't we simply call the toString() method wherever we need host:port? Why do we need a new configuration and a modification to NodeId? Wherever we are using FiCaSchedulerNode.getHostName(), we could replace it with FiCaSchedulerNode.getNodeId().toString(). Thoughts, please? > MiniYARNCluster with multiple nodemanagers, all nodes have same key for > allocations > --- > > Key: YARN-1008 > URL: https://issues.apache.org/jira/browse/YARN-1008 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: YARN-1008.patch, YARN-1008.patch > > > While the NMs are keyed using the NodeId, the allocation is done based on the > hostname. > This makes the different nodes indistinguishable to the scheduler. > There should be an option to enabled the host:port instead just port for > allocations. The nodes reported to the AM should report the 'key' (host or > host:port). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735649#comment-13735649 ] Hadoop QA commented on YARN-1021: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597233/YARN-1021.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1688//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1688//console This message is automatically generated. > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time and cost consuming, > and it is also very hard to find a large-enough cluster. Hence, a simulator > which can predict how well a scheduler algorithm for some specific workload > would be quite useful. > We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads in a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with reasonable amount of confidence, there-by aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager removing the > network factor by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AMs heartbeat events from within the same JVM. > To keep tracking of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler. 
> The simulator will produce real time metrics while executing, including: > * Resource usages for whole cluster and each queue, which can be utilized to > configure cluster and queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs turn around time, throughput, fairness, capacity guarantee, > etc). > * Several key metrics of scheduler algorithm, such as time cost of each > scheduler operation (allocate, handle, etc), which can be utilized by Hadoop > developers to find the code spots and scalability limits. > The simulator will provide real time charts showing the behavior of the > scheduler and its performance. > A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use simulator to simulate Fair Scheduler and Capacity Scheduler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
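As a concrete illustration of the "scheduler wrapper" idea described above, the sketch below times a single operation through a made-up scheduler interface. It is not the actual hadoop-sls wrapper (whose class names and metrics plumbing differ); it only shows how a decorator can record the time cost of each scheduler operation.
{code}
import java.util.concurrent.TimeUnit;

// SimpleScheduler is a made-up interface, not a YARN class.
interface SimpleScheduler {
  void allocate(String applicationId);
}

// Decorator that records the time cost of a scheduler operation, the same
// idea as the simulator's scheduler wrapper.
public class TimedScheduler implements SimpleScheduler {
  private final SimpleScheduler delegate;

  public TimedScheduler(SimpleScheduler delegate) {
    this.delegate = delegate;
  }

  @Override
  public void allocate(String applicationId) {
    long start = System.nanoTime();
    delegate.allocate(applicationId);
    long micros = TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - start);
    System.out.println("allocate(" + applicationId + ") took " + micros + " us");
  }

  public static void main(String[] args) {
    SimpleScheduler real = new SimpleScheduler() {
      @Override
      public void allocate(String applicationId) {
        // pretend to schedule something
      }
    };
    new TimedScheduler(real).allocate("app_0001");
  }
}
{code}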
[jira] [Updated] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations
[ https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated YARN-1008: - Attachment: YARN-1008.patch Trying again; it applies locally. In short: * I've added a new property, 'nodeName', to NodeId. * I've added a config to the NMs to set it to host or host:port; the default is host (current behavior). * I've renamed SchedulerNode.getHostname to getNodeName and made its implementations use NodeId.getNodeName(). * Added a testcase with the minicluster that verifies the node reports carry the right value according to the setting. * Added a testcase for the FS that verifies nothing is broken, leveraging the strict locality test (I didn't find similar tests in the capacity and fifo schedulers). > MiniYARNCluster with multiple nodemanagers, all nodes have same key for > allocations > --- > > Key: YARN-1008 > URL: https://issues.apache.org/jira/browse/YARN-1008 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: YARN-1008.patch, YARN-1008.patch > > > While the NMs are keyed using the NodeId, the allocation is done based on the > hostname. > This makes the different nodes indistinguishable to the scheduler. > There should be an option to enabled the host:port instead just port for > allocations. The nodes reported to the AM should report the 'key' (host or > host:port). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
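To make the effect of the setting above concrete: when scheduler nodes are keyed by hostname alone, two NodeManagers of a MiniYARNCluster on the same host collapse into one entry, while a host:port key keeps them distinct. The self-contained sketch below illustrates just that; the boolean flag stands in for the NM configuration described in the patch, and is not the real property name.
{code}
import java.util.HashMap;
import java.util.Map;

// Sketch only: why a 'nodeName' of host:port keeps two NodeManagers on the
// same host distinguishable to the scheduler.
public class NodeKeyDemo {
  static String nodeName(String host, int port, boolean usePort) {
    return usePort ? host + ":" + port : host;
  }

  public static void main(String[] args) {
    Map<String, String> byHost = new HashMap<String, String>();
    Map<String, String> byHostPort = new HashMap<String, String>();
    int[] nmPorts = {45454, 45455};           // two NMs on one machine
    for (int port : nmPorts) {
      byHost.put(nodeName("localhost", port, false), "NM@" + port);
      byHostPort.put(nodeName("localhost", port, true), "NM@" + port);
    }
    System.out.println("keyed by host      -> " + byHost.size() + " node(s)");     // 1
    System.out.println("keyed by host:port -> " + byHostPort.size() + " node(s)"); // 2
  }
}
{code}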
[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1021: -- Attachment: (was: YARN-1021.pdf) > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time and cost consuming, > and it is also very hard to find a large-enough cluster. Hence, a simulator > which can predict how well a scheduler algorithm for some specific workload > would be quite useful. > We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads in a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with reasonable amount of confidence, there-by aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager removing the > network factor by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AMs heartbeat events from within the same JVM. > To keep tracking of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler. > The simulator will produce real time metrics while executing, including: > * Resource usages for whole cluster and each queue, which can be utilized to > configure cluster and queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs turn around time, throughput, fairness, capacity guarantee, > etc). > * Several key metrics of scheduler algorithm, such as time cost of each > scheduler operation (allocate, handle, etc), which can be utilized by Hadoop > developers to find the code spots and scalability limits. > The simulator will provide real time charts showing the behavior of the > scheduler and its performance. > A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use simulator to simulate Fair Scheduler and Capacity Scheduler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1021: -- Attachment: YARN-1021.patch Update a patch according to [~tucu00]'s comments. > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time and cost consuming, > and it is also very hard to find a large-enough cluster. Hence, a simulator > which can predict how well a scheduler algorithm for some specific workload > would be quite useful. > We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads in a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with reasonable amount of confidence, there-by aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager removing the > network factor by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AMs heartbeat events from within the same JVM. > To keep tracking of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler. > The simulator will produce real time metrics while executing, including: > * Resource usages for whole cluster and each queue, which can be utilized to > configure cluster and queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs turn around time, throughput, fairness, capacity guarantee, > etc). > * Several key metrics of scheduler algorithm, such as time cost of each > scheduler operation (allocate, handle, etc), which can be utilized by Hadoop > developers to find the code spots and scalability limits. > The simulator will provide real time charts showing the behavior of the > scheduler and its performance. > A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use simulator to simulate Fair Scheduler and Capacity Scheduler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1021: -- Attachment: YARN-1021.pdf > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time and cost consuming, > and it is also very hard to find a large-enough cluster. Hence, a simulator > which can predict how well a scheduler algorithm for some specific workload > would be quite useful. > We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads in a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with reasonable amount of confidence, there-by aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager removing the > network factor by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AMs heartbeat events from within the same JVM. > To keep tracking of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler. > The simulator will produce real time metrics while executing, including: > * Resource usages for whole cluster and each queue, which can be utilized to > configure cluster and queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs turn around time, throughput, fairness, capacity guarantee, > etc). > * Several key metrics of scheduler algorithm, such as time cost of each > scheduler operation (allocate, handle, etc), which can be utilized by Hadoop > developers to find the code spots and scalability limits. > The simulator will provide real time charts showing the behavior of the > scheduler and its performance. > A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use simulator to simulate Fair Scheduler and Capacity Scheduler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735624#comment-13735624 ] Hitesh Shah commented on YARN-978: -- In terms of duplication: diagnostics, host and rpcPort are unique for each attempt and can be useful to know about. I am not sure from a historical point of view whether rpc port is useful but definitely needed for a running attempt. The finalApplicationStatus is only set for the last attempt. The internal *hidden behind proxy* tracker url is different for each attempt but not exposed to the user. The tracker url either points to the current running attempt or to the history server hence a question as to whether it is needed for each attempt. > [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation > -- > > Key: YARN-978 > URL: https://issues.apache.org/jira/browse/YARN-978 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Xuan Gong > Fix For: YARN-321 > > Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch > > > We dont have ApplicationAttemptReport and Protobuf implementation. > Adding that. > Thanks, > Mayank -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735605#comment-13735605 ] Sandy Ryza commented on YARN-899: - Oh, makes sense because the NodeManager uses it as well. In that case the current approach is fine. > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, > YARN-899.4.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations
[ https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735581#comment-13735581 ] Hadoop QA commented on YARN-1008: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597228/YARN-1008.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1687//console This message is automatically generated. > MiniYARNCluster with multiple nodemanagers, all nodes have same key for > allocations > --- > > Key: YARN-1008 > URL: https://issues.apache.org/jira/browse/YARN-1008 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: YARN-1008.patch > > > While the NMs are keyed using the NodeId, the allocation is done based on the > hostname. > This makes the different nodes indistinguishable to the scheduler. > There should be an option to enabled the host:port instead just port for > allocations. The nodes reported to the AM should report the 'key' (host or > host:port). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations
[ https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated YARN-1008: - Attachment: YARN-1008.patch > MiniYARNCluster with multiple nodemanagers, all nodes have same key for > allocations > --- > > Key: YARN-1008 > URL: https://issues.apache.org/jira/browse/YARN-1008 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: YARN-1008.patch > > > While the NMs are keyed using the NodeId, the allocation is done based on the > hostname. > This makes the different nodes indistinguishable to the scheduler. > There should be an option to enabled the host:port instead just port for > allocations. The nodes reported to the AM should report the 'key' (host or > host:port). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735572#comment-13735572 ] Zhijie Shen commented on YARN-978: -- I'm not entirely sure how users would make use of the attempt information, but I think it does no harm to expose the attempt information listed in the patch, which, on the other hand, gives users more options. Compared with the application report, the duplicated fields are: * diagnostics * finalApplicationStatus * host * rpcPort * trackingUrl However, the fields in an application report are extracted from the *current* attempt. Therefore, through the application report, users can only learn this information for the current attempt. In the case of application history, that information would belong to the last attempt once the application is finished. In contrast, an application may have launched several attempts before it finishes, and the aforementioned information may differ for each attempt. Therefore, this information is not really duplicated. bq. For a non-running attempt, what is the tracker url and the original tracking url? Hmm... do you mean the tracking url only works for a running attempt? bq. Failed attempts have a failed yarn state. Failed attempts would register an app state with the RM ( during unregister ) if that attempt is the final attempt Sorry, I meant that all attempts should have a finalApplicationStatus, shouldn't they? > [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation > -- > > Key: YARN-978 > URL: https://issues.apache.org/jira/browse/YARN-978 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Xuan Gong > Fix For: YARN-321 > > Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch > > > We dont have ApplicationAttemptReport and Protobuf implementation. > Adding that. > Thanks, > Mayank -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
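For readers following the field discussion, the sketch below gathers the per-attempt fields under debate into a plain value class. It mirrors the comments above rather than the actual ApplicationAttemptReport API in the YARN-321 branch, whose names and types may differ.
{code}
// Sketch only: the per-attempt information discussed above, as a plain value class.
public class AttemptReportSketch {
  private final String attemptId;
  private final String host;                    // differs per attempt
  private final int rpcPort;                    // differs per attempt
  private final String trackingUrl;             // per attempt, but proxied for users
  private final String diagnostics;             // differs per attempt
  private final String finalApplicationStatus;  // only meaningful for the final attempt

  public AttemptReportSketch(String attemptId, String host, int rpcPort,
      String trackingUrl, String diagnostics, String finalApplicationStatus) {
    this.attemptId = attemptId;
    this.host = host;
    this.rpcPort = rpcPort;
    this.trackingUrl = trackingUrl;
    this.diagnostics = diagnostics;
    this.finalApplicationStatus = finalApplicationStatus;
  }

  @Override
  public String toString() {
    return attemptId + " on " + host + ":" + rpcPort + ", tracking=" + trackingUrl
        + ", status=" + finalApplicationStatus + ", diagnostics=" + diagnostics;
  }
}
{code}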
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735564#comment-13735564 ] Xuan Gong commented on YARN-899: bq. Do we need a QueueACLsManager separate from ApplicationACLsManager? I.e. could we just have a reference to ResourceScheduler in ApplicationACLsManager and check access based on the queue name there? ResourceScheduler is in the hadoop.yarn.server.resourcemanager package, but ApplicationACLsManager is in the hadoop.yarn.common package, so it looks like we cannot add a ResourceScheduler reference to ApplicationACLsManager. > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, > YARN-899.4.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
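The package constraint described above is why a separate, RM-side manager can simply hold a scheduler reference and delegate the check to it. A minimal sketch of that delegation follows; the types are stand-ins, not the real ResourceScheduler or QueueACLsManager classes.
{code}
// Sketch only: a queue-ACL manager that lives next to the scheduler (so it may
// hold a scheduler reference) and delegates the access check to it.
public class QueueAclsManagerSketch {
  public interface QueueAclChecker {
    boolean checkAccess(String user, String aclType, String queueName);
  }

  private final QueueAclChecker scheduler;

  public QueueAclsManagerSketch(QueueAclChecker scheduler) {
    this.scheduler = scheduler;
  }

  public boolean canAdministerQueue(String user, String queueName) {
    return scheduler.checkAccess(user, "ADMINISTER_QUEUE", queueName);
  }
}
{code}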
[jira] [Commented] (YARN-1054) Invalid state transition exception caught when tearing down a (mini) cluster
[ https://issues.apache.org/jira/browse/YARN-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735545#comment-13735545 ] Steve Loughran commented on YARN-1054: -- logs with stack {code} 2013-08-09 16:46:31,208 [main] INFO yarn.cluster.YarnMiniClusterTestBase (YarnMiniClusterTestBase.groovy:describe(121)) - 2013-08-09 16:46:31,208 [main] INFO yarn.cluster.YarnMiniClusterTestBase (YarnMiniClusterTestBase.groovy:describe(122)) - === 2013-08-09 16:46:31,208 [main] INFO yarn.cluster.YarnMiniClusterTestBase (YarnMiniClusterTestBase.groovy:describe(123)) - teardown 2013-08-09 16:46:31,208 [main] INFO yarn.cluster.YarnMiniClusterTestBase (YarnMiniClusterTestBase.groovy:describe(124)) - === 2013-08-09 16:46:31,208 [main] INFO yarn.cluster.YarnMiniClusterTestBase (YarnMiniClusterTestBase.groovy:describe(125)) - 2013-08-09 16:46:31,216 [main] INFO org.mortbay.log (Slf4jLog.java:info(67)) - Stopped SelectChannelConnector@stevel-2.local:0 2013-08-09 16:46:31,319 [main] INFO hadoop.ipc.Server (Server.java:stop(2429)) - Stopping server on 57710 2013-08-09 16:46:31,324 [IPC Server listener on 57710] INFO hadoop.ipc.Server (Server.java:run(720)) - Stopping IPC Server listener on 57710 2013-08-09 16:46:31,326 [IPC Server Responder] INFO hadoop.ipc.Server (Server.java:run(866)) - Stopping IPC Server Responder 2013-08-09 16:46:31,334 [main] INFO hadoop.ipc.Server (Server.java:stop(2429)) - Stopping server on 57711 2013-08-09 16:46:31,336 [IPC Server listener on 57711] INFO hadoop.ipc.Server (Server.java:run(720)) - Stopping IPC Server listener on 57711 2013-08-09 16:46:31,336 [IPC Server Responder] INFO hadoop.ipc.Server (Server.java:run(866)) - Stopping IPC Server Responder 2013-08-09 16:46:31,339 [Public Localizer] INFO containermanager.localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(728)) - Public cache exiting 2013-08-09 16:46:31,340 [main] INFO server.nodemanager.NodeManager (NodeManager.java:cleanupContainers(261)) - Containers still running on SHUTDOWN : [container_1376091973496_0001_01_01, container_1376091973496_0001_01_02] 2013-08-09 16:46:31,342 [main] INFO server.nodemanager.NodeManager (NodeManager.java:cleanupContainers(270)) - Waiting for containers to be killed 2013-08-09 16:46:31,343 [AsyncDispatcher event handler] INFO containermanager.container.Container (ContainerImpl.java:handle(860)) - Container container_1376091973496_0001_01_01 transitioned from RUNNING to KILLING 2013-08-09 16:46:31,343 [AsyncDispatcher event handler] INFO containermanager.container.Container (ContainerImpl.java:handle(860)) - Container container_1376091973496_0001_01_02 transitioned from RUNNING to KILLING 2013-08-09 16:46:31,344 [AsyncDispatcher event handler] INFO containermanager.launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(323)) - Cleaning up container container_1376091973496_0001_01_01 2013-08-09 16:46:31,386 [ContainersLauncher #0] WARN server.nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(207)) - Exit code from container container_1376091973496_0001_01_01 is : 143 2013-08-09 16:46:31,403 [AsyncDispatcher event handler] INFO containermanager.launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(323)) - Cleaning up container container_1376091973496_0001_01_02 2013-08-09 16:46:31,441 [ContainersLauncher #1] WARN server.nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(207)) - Exit code from container container_1376091973496_0001_01_02 is : 143 
2013-08-09 16:46:31,456 [AsyncDispatcher event handler] INFO containermanager.container.Container (ContainerImpl.java:handle(860)) - Container container_1376091973496_0001_01_01 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL 2013-08-09 16:46:31,457 [AsyncDispatcher event handler] WARN containermanager.container.Container (ContainerImpl.java:handle(856)) - Can't handle this event at current state: Current: [CONTAINER_CLEANEDUP_AFTER_KILL], eventType: [CONTAINER_KILLED_ON_REQUEST] org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:853) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:73) at org.apache.
[jira] [Commented] (YARN-979) [YARN-321] Adding application attempt and container to ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735543#comment-13735543 ] Hadoop QA commented on YARN-979: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597216/YARN-979-1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1686//console This message is automatically generated. > [YARN-321] Adding application attempt and container to > ApplicationHistoryProtocol > - > > Key: YARN-979 > URL: https://issues.apache.org/jira/browse/YARN-979 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-979-1.patch > > > Adding application attempt and container to ApplicationHistoryProtocol > Thanks, > Mayank -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1054) Invalid state transition exception caught when tearing down a (mini) cluster
Steve Loughran created YARN-1054: Summary: Invalid state transition exception caught when tearing down a (mini) cluster Key: YARN-1054 URL: https://issues.apache.org/jira/browse/YARN-1054 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Steve Loughran Priority: Minor When I'm tearing down a MiniYARNCluster I get a stack trace warning that an invalid state transition has been attempted {code} [CONTAINER_KILLED_ON_REQUEST]org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_KILLED_ON_REQUEST {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-979) [YARN-321] Adding application attempt and container to ApplicationHistoryProtocol
[ https://issues.apache.org/jira/browse/YARN-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-979: --- Attachment: YARN-979-1.patch Attaching patch for the protocol change and adding supporting classes. Thanks, Mayank > [YARN-321] Adding application attempt and container to > ApplicationHistoryProtocol > - > > Key: YARN-979 > URL: https://issues.apache.org/jira/browse/YARN-979 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: YARN-979-1.patch > > > Adding application attempt and container to ApplicationHistoryProtocol > Thanks, > Mayank -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
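A rough sketch of what "adding application attempt and container" to the protocol might look like is given below. The method and type names are illustrative guesses only; the real ApplicationHistoryProtocol methods, request/response wrappers, and exceptions are defined by the attached patch on the YARN-321 branch.
{code}
import java.util.List;

// Illustrative only: plausible shape of the protocol additions discussed in this JIRA.
interface ApplicationHistoryProtocolSketch {
  AttemptInfo getApplicationAttemptReport(String applicationAttemptId);
  List<AttemptInfo> getApplicationAttempts(String applicationId);
  ContainerInfo getContainerReport(String containerId);
  List<ContainerInfo> getContainers(String applicationAttemptId);
}

// Stand-in value types so the sketch is self-contained.
class AttemptInfo { String attemptId; String host; int rpcPort; }
class ContainerInfo { String containerId; String assignedNode; }
{code}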
[jira] [Commented] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735412#comment-13735412 ] Hitesh Shah commented on YARN-1046: --- [~kkambatl] Looking more at MAPREDUCE-5094, looks like Vinod had a comment that specific tests already exist for verifying that container mem monitoring works. My main concern was that having a path always testing container monitoring would be useful to catch edge cases in different environments. However, as Vinod mentioned, given that as we have specific tests for that, it should be fine to disable both. To summarize, nothing more to be done. > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.1.0-beta > > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735406#comment-13735406 ] Sandy Ryza commented on YARN-1046: -- Vinod's comment on MAPREDUCE-5094 for disabling pmem as well was bq. Let's also disable pmem, no point in doing it for tests. If we want to validate the monitoring, specific tests should do that. > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.1.0-beta > > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735402#comment-13735402 ] Karthik Kambatla commented on YARN-1046: [~hitesh], I was wondering the same. In the end, just followed what MAPREDUCE-5094 did. It does make sense to only disable vmem monitoring. Also, we shouldn't need to do MAPREDUCE-5094 any more. Should I post an addendum that does all this or file a new JIRA? > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.1.0-beta > > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
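For reference, the narrower option being discussed, turning off only virtual-memory checking while leaving physical-memory checking on, would look roughly like this in the mini-cluster's configuration. This assumes the standard yarn.nodemanager.*-check-enabled properties; what MiniYARNCluster actually wires up depends on the version and on whichever follow-up patch lands.
{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch of disabling only the vmem check for tests, as suggested above.
public class MiniClusterMemConfig {
  public static YarnConfiguration create() {
    YarnConfiguration conf = new YarnConfiguration();
    conf.setBoolean("yarn.nodemanager.vmem-check-enabled", false); // avoid spurious vmem kills
    conf.setBoolean("yarn.nodemanager.pmem-check-enabled", true);  // keep pmem monitoring
    return conf;
  }
}
{code}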
[jira] [Updated] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
[ https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1053: Labels: newbie (was: ) > Diagnostic message from ContainerExitEvent is ignored in ContainerImpl > -- > > Key: YARN-1053 > URL: https://issues.apache.org/jira/browse/YARN-1053 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Omkar Vinit Joshi >Assignee: Omkar Vinit Joshi > Labels: newbie > Fix For: 2.1.0-beta, 2.1.1-beta > > Attachments: YARN-1053.20130809.patch > > > If the container launch fails then we send ContainerExitEvent. This event > contains exitCode and diagnostic message. Today we are ignoring diagnostic > message while handling this event inside ContainerImpl. Fixing it as it is > useful in diagnosing the failure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735386#comment-13735386 ] Steve Loughran commented on YARN-1042: -- Maybe -but now that Hadoop topologies can express failure domains, we could pick that up to say "spread these containers across >1 failure domain" without having to execute any cluster-specific tuning. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
[ https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735374#comment-13735374 ] Omkar Vinit Joshi commented on YARN-1053: - simple fix...adding no tests. > Diagnostic message from ContainerExitEvent is ignored in ContainerImpl > -- > > Key: YARN-1053 > URL: https://issues.apache.org/jira/browse/YARN-1053 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Omkar Vinit Joshi >Assignee: Omkar Vinit Joshi > Fix For: 2.1.0-beta, 2.1.1-beta > > Attachments: YARN-1053.20130809.patch > > > If the container launch fails then we send ContainerExitEvent. This event > contains exitCode and diagnostic message. Today we are ignoring diagnostic > message while handling this event inside ContainerImpl. Fixing it as it is > useful in diagnosing the failure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
[ https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1053: Attachment: YARN-1053.20130809.patch > Diagnostic message from ContainerExitEvent is ignored in ContainerImpl > -- > > Key: YARN-1053 > URL: https://issues.apache.org/jira/browse/YARN-1053 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Omkar Vinit Joshi >Assignee: Omkar Vinit Joshi > Fix For: 2.1.0-beta, 2.1.1-beta > > Attachments: YARN-1053.20130809.patch > > > If the container launch fails then we send ContainerExitEvent. This event > contains exitCode and diagnostic message. Today we are ignoring diagnostic > message while handling this event inside ContainerImpl. Fixing it as it is > useful in diagnosing the failure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735367#comment-13735367 ] Hitesh Shah commented on YARN-1046: --- Sorry to chime in late on this. Was there a reason why both vmem and pmem checks were disabled? Having seen such failures reported by other folks at times, in almost all situations ( including this one ), only the vmem limits were exceeded. > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.1.0-beta > > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
Omkar Vinit Joshi created YARN-1053: --- Summary: Diagnostic message from ContainerExitEvent is ignored in ContainerImpl Key: YARN-1053 URL: https://issues.apache.org/jira/browse/YARN-1053 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Fix For: 2.1.0-beta, 2.1.1-beta If the container launch fails then we send ContainerExitEvent. This event contains exitCode and diagnostic message. Today we are ignoring diagnostic message while handling this event inside ContainerImpl. Fixing it as it is useful in diagnosing the failure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1052) Enforce submit application queue ACLs outside the scheduler
Sandy Ryza created YARN-1052: Summary: Enforce submit application queue ACLs outside the scheduler Key: YARN-1052 URL: https://issues.apache.org/jira/browse/YARN-1052 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Per discussion in YARN-899, schedulers should not need to enforce queue ACLs on their own. Currently schedulers do this for application submission, and this should be done in the RM code instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735358#comment-13735358 ] Sandy Ryza commented on YARN-899: - The general approach and patch are looking good. Some comments: Do we need a QueueACLsManager separate from ApplicationACLsManager? I.e. could we just have a reference to ResourceScheduler in ApplicationACLsManager and check access based on the queue name there? The LOG.debugs in the scheduler checkAccess methods should be surrounded by isDebugEnabled() Filed YARN-1052 for moving submit ACLs outside the scheduler. > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, > YARN-899.4.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
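The isDebugEnabled() guard requested above is the usual commons-logging pattern for avoiding message-string construction when debug logging is off. A minimal example, with a simplified checkAccess signature rather than the scheduler's real one:
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Only build the debug message when debug logging is actually enabled.
public class GuardedDebugLogging {
  private static final Log LOG = LogFactory.getLog(GuardedDebugLogging.class);

  public boolean checkAccess(String user, String queue, boolean allowed) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("ACL check: user=" + user + " queue=" + queue + " allowed=" + allowed);
    }
    return allowed;
  }
}
{code}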
[jira] [Commented] (YARN-814) Difficult to diagnose a failed container launch when error due to invalid environment variable
[ https://issues.apache.org/jira/browse/YARN-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735355#comment-13735355 ] Omkar Vinit Joshi commented on YARN-814: We forgot to fix one issue; fixing it in another ticket. In ContainerImpl.java we are ignoring the diagnostic messages:
{code}
/**
 * Transitions upon receiving KILL_CONTAINER:
 * - LOCALIZED -> KILLING
 * - RUNNING -> KILLING
 */
@SuppressWarnings("unchecked") // dispatcher not typed
static class KillTransition
    implements SingleArcTransition<ContainerImpl, ContainerEvent> {
  @Override
  public void transition(ContainerImpl container, ContainerEvent event) {
    // Kill the process/process-grp
    container.dispatcher.getEventHandler().handle(
        new ContainersLauncherEvent(container,
            ContainersLauncherEventType.CLEANUP_CONTAINER));
    ContainerKillEvent killEvent = (ContainerKillEvent) event;
  }
}

/**
 * Transition from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
 * upon receiving CONTAINER_KILLED_ON_REQUEST.
 */
static class ContainerKilledTransition
    implements SingleArcTransition<ContainerImpl, ContainerEvent> {
  @Override
  public void transition(ContainerImpl container, ContainerEvent event) {
    ContainerExitEvent exitEvent = (ContainerExitEvent) event;
    container.exitCode = exitEvent.getExitCode();
    // The process/process-grp is killed. Decrement reference counts and
    // cleanup resources
    container.cleanup();
  }
}
{code}
> Difficult to diagnose a failed container launch when error due to invalid > environment variable > -- > > Key: YARN-814 > URL: https://issues.apache.org/jira/browse/YARN-814 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Hitesh Shah >Assignee: Jian He > Fix For: 2.1.0-beta > > Attachments: YARN-814.1.patch, YARN-814.2.patch, YARN-814.3.patch, > YARN-814.4.patch, YARN-814.5.patch, YARN-814.6.patch, YARN-814.7.patch, > YARN-814.patch > > > The container's launch script sets up environment variables, symlinks etc. > If there is any failure when setting up the basic context ( before the actual > user's process is launched ), nothing is captured by the NM. This makes it > impossible to diagnose the reason for the failure. > To reproduce, set an env var where the value contains characters that throw > syntax errors in bash. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
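One possible shape of the fix (tracked in YARN-1053) is to propagate each event's diagnostic message instead of dropping it. The sketch below uses stand-in types and is not the actual patch; Diagnosable stands in for ContainerImpl's diagnostics accumulator.
{code}
// Sketch only: the two transitions quoted above with the diagnostic message
// propagated rather than ignored.
interface Diagnosable {
  void addDiagnostics(String message);
}

class KillTransitionSketch {
  void onKill(Diagnosable container, String killEventDiagnostic) {
    container.addDiagnostics(killEventDiagnostic);  // previously read but never used
  }
}

class ContainerKilledTransitionSketch {
  void onKilledOnRequest(Diagnosable container, int exitCode, String exitEventDiagnostic) {
    container.addDiagnostics("Exit code: " + exitCode + "; " + exitEventDiagnostic);
  }
}
{code}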
[jira] [Commented] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735321#comment-13735321 ] Luke Lu commented on YARN-1024: --- As many have already pointed out, using vcore to express "unit of computation" causes more confusion than the "predictability" it tries to address. (whatever)core as a unit of thread-affinity/locality is an important concept to preserve. OTOH, I can see the need to define a YCU (YARN Computation Unit) for the properties that Arun tried to address. YCU can be implemented as deployment specific mapping to fractional cores, which preserves both thread-locality and cpu/core-slicing. > Define a virtual core unambigiously > --- > > Key: YARN-1024 > URL: https://issues.apache.org/jira/browse/YARN-1024 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > > We need to clearly define the meaning of a virtual core unambiguously so that > it's easy to migrate applications between clusters. > For e.g. here is Amazon EC2 definition of ECU: > http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it > Essentially we need to clearly define a YARN Virtual Core (YVC). > Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the > equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735317#comment-13735317 ] Hudson commented on YARN-1046: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4241 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4241/]) YARN-1046. Disable mem monitoring by default in MiniYARNCluster (Karthik Kambatla via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512493) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.1.0-beta > > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
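For tests that hit this before the change, the workaround is to turn off the NM memory checks in the configuration handed to MiniYARNCluster; a sketch using the standard pmem/vmem check keys, which is roughly what the patch now makes the default:
{code}
// Disable NM physical/virtual memory enforcement for an in-JVM mini cluster,
// so test containers are not killed for exceeding simulated memory limits.
YarnConfiguration conf = new YarnConfiguration();
conf.setBoolean(YarnConfiguration.NM_PMEM_CHECK_ENABLED, false);
conf.setBoolean(YarnConfiguration.NM_VMEM_CHECK_ENABLED, false);

MiniYARNCluster cluster = new MiniYARNCluster("test-cluster", 1, 1, 1);
cluster.init(conf);   // MiniYARNCluster is a CompositeService
cluster.start();
{code}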
[jira] [Commented] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735310#comment-13735310 ] Sandy Ryza commented on YARN-1046: -- Thanks Karthik! Just committed this to trunk, branch-2, branch-2.1-beta, and branch-2.1.0-beta > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1046: - Assignee: Karthik Kambatla (was: Sandy Ryza) > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-1046) TestDistributedShell fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza reassigned YARN-1046: Assignee: Sandy Ryza (was: Karthik Kambatla) > TestDistributedShell fails intermittently > - > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Sandy Ryza > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1046) Disable mem monitoring by default in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1046: - Summary: Disable mem monitoring by default in MiniYARNCluster (was: TestDistributedShell fails intermittently) > Disable mem monitoring by default in MiniYARNCluster > > > Key: YARN-1046 > URL: https://issues.apache.org/jira/browse/YARN-1046 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Sandy Ryza > Attachments: yarn-1046-1.patch, yarn-1046-2.patch > > > Have been running into this frequently inspite of MAPREDUCE-3709 on centos6 > machines. However, when I try to run it independently on the machines, I have > not been able to reproduce it. > {noformat} > 2013-08-07 19:17:35,048 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - > Container [pid=16556,containerID=container_1375928243488_0001_01_01] is > running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB > physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container. > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735281#comment-13735281 ] Alejandro Abdelnur commented on YARN-1042: -- it seems much of this could be achieved via YARN-796 > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may want to specify affinity to the same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
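For comparison with what the API can express today, node/rack locality can already be pinned per request through AMRMClient, assuming the ContainerRequest constructor that takes a relaxLocality flag; the affinity/anti-affinity asked for here is relative to other containers, which this cannot say. A sketch (host/rack names and the amRMClient variable are placeholders):
{code}
// Current model: a request can name explicit hosts/racks, and relaxLocality=false
// turns that into a hard constraint, but it cannot reference other containers
// ("same rack as container X", "never on the same host as my other masters").
Resource capability = Resource.newInstance(1024, 1);
Priority priority = Priority.newInstance(1);
AMRMClient.ContainerRequest request = new AMRMClient.ContainerRequest(
    capability,
    new String[] { "node1.example.com" },   // placeholder host
    new String[] { "/rack1" },              // placeholder rack
    priority,
    false);                                 // relaxLocality = false
amRMClient.addContainerRequest(request);
{code}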
[jira] [Commented] (YARN-656) In scheduler UI, including reserved memory in "Memory Total" can make it exceed cluster capacity.
[ https://issues.apache.org/jira/browse/YARN-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735270#comment-13735270 ] Hudson commented on YARN-656: - SUCCESS: Integrated in Hadoop-trunk-Commit #4240 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4240/]) YARN-656. In scheduler UI, including reserved memory in Memory Total can make it exceed cluster capacity. (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512480) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ClusterMetricsInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java Reverting 1512475 (labeled as YARN-656). Accidentally committed two patches together (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512478) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ClusterMetricsInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > In scheduler UI, including reserved memory in "Memory Total" can make it > exceed cluster capacity. > - > > Key: YARN-656 > URL: https://issues.apache.org/jira/browse/YARN-656 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.1.0-beta > > Attachments: YARN-656-1.patch, YARN-656.patch > > > "Memory Total" is currently a sum of availableMB, allocatedMB, and > reservedMB. Including reservedMB in this sum can make the total exceed the > capacity of the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
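The arithmetic behind the fix, sketched against the ClusterMetricsInfo fields named in the commit (illustrative, not the literal diff):
{code}
// Before: reserved memory was added into the total, so the scheduler UI's
// "Memory Total" could exceed the cluster's real capacity.
long totalMBBefore = availableMB + allocatedMB + reservedMB;

// After: the total reflects only the capacity the cluster actually has.
long totalMBAfter = availableMB + allocatedMB;
{code}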
[jira] [Commented] (YARN-656) In scheduler UI, including reserved memory in "Memory Total" can make it exceed cluster capacity.
[ https://issues.apache.org/jira/browse/YARN-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735249#comment-13735249 ] Hudson commented on YARN-656: - SUCCESS: Integrated in Hadoop-trunk-Commit #4239 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4239/]) YARN-656. In scheduler UI, including reserved memory in Memory Total can make it exceed cluster capacity. (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512475) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ClusterMetricsInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > In scheduler UI, including reserved memory in "Memory Total" can make it > exceed cluster capacity. > - > > Key: YARN-656 > URL: https://issues.apache.org/jira/browse/YARN-656 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-656-1.patch, YARN-656.patch > > > "Memory Total" is currently a sum of availableMB, allocatedMB, and > reservedMB. Including reservedMB in this sum can make the total exceed the > capacity of the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1030) Adding AHS as service of RM
[ https://issues.apache.org/jira/browse/YARN-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735210#comment-13735210 ] Hadoop QA commented on YARN-1030: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597159/YARN-1030.1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1685//console This message is automatically generated. > Adding AHS as service of RM > --- > > Key: YARN-1030 > URL: https://issues.apache.org/jira/browse/YARN-1030 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Attachments: YARN-1030.1.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1030) Adding AHS as service of RM
[ https://issues.apache.org/jira/browse/YARN-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1030: -- Attachment: YARN-1030.1.patch Created the patch to add AHS as a service of RM, and fix a bug in AHS.serviceInit(). > Adding AHS as service of RM > --- > > Key: YARN-1030 > URL: https://issues.apache.org/jira/browse/YARN-1030 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Attachments: YARN-1030.1.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
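Roughly what "adding AHS as a service of RM" means in YARN's service model, assuming an ApplicationHistoryServer service class and a hypothetical enable flag; the names below are illustrative and not taken from the attached patch:
{code}
// ResourceManager extends CompositeService, so a child service added during
// serviceInit() is initialized/started/stopped with the RM's own lifecycle.
@Override
protected void serviceInit(Configuration conf) throws Exception {
  // ... existing RM initialization ...
  if (conf.getBoolean("yarn.ahs.enabled", false)) {   // hypothetical flag
    addService(new ApplicationHistoryServer());       // assumed service class
  }
  super.serviceInit(conf);
}
{code}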
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735204#comment-13735204 ] Hadoop QA commented on YARN-899: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597148/YARN-899.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 10 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1684//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1684//console This message is automatically generated. > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, > YARN-899.4.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
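For reference, the config option the description points at, set programmatically against a CapacityScheduler configuration; the queue name 'default' and the principals are examples, and the ACL value follows the usual "users groups" space-separated format:
{code}
// Allow user 'alice' and members of group 'admins' to administer root.default.
Configuration conf = new CapacitySchedulerConfiguration();
conf.set("yarn.scheduler.capacity.root.default.acl_administer_queue",
    "alice admins");
{code}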
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735170#comment-13735170 ] Alejandro Abdelnur commented on YARN-1021: -- Patch looks good, some comments:
Documentation:
* 'Usage': you are assuming the user built Hadoop; instead, explain things assuming the user is standing in the Hadoop install directory after expanding the tarball, where share/ and etc/ live.
* 'Usage step 1': mention that sls-runner.xml has all the default values for these properties (it should, if not). Also, mention that the Hadoop RM/scheduler configuration will be taken from the Hadoop conf directory.
* 'Usage step 2': use the path to rumen2sls.sh from the install directory.
* 'Usage step 3': use the path to slsrun.sh from the install directory.
Code:
* AMSimulator: var definitions, one var def per line, don't comma them.
* AMSimulator: middleStep() method, no need to use 'this.' when invoking methods.
* AMSimulator: instead of using SLF4J's MessageFormatter, use JDK MessageFormat as in other places (note you'll have to change from {} to {#} in the templates).
* ResourceSchedulerWrapper: instead of using getRuntime() to add a shutdown hook, it should use Hadoop's ShutdownHookManager (a short sketch of this and the MessageFormat change follows this entry).
html/simulateTraces.html should be renamed to showSimulationTrace.html. The documentation should explain clearly that a SLS run produces a simulation trace that can be seen after running the simulation. > Yarn Scheduler Load Simulator > - > > Key: YARN-1021 > URL: https://issues.apache.org/jira/browse/YARN-1021 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, > YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, > YARN-1021.patch, YARN-1021.pdf > > > The Yarn Scheduler is a fertile area of interest with different > implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, > several optimizations are also made to improve scheduler performance for > different scenarios and workload. Each scheduler algorithm has its own set of > features, and drives scheduling decisions by many factors, such as fairness, > capacity guarantee, resource availability, etc. It is very important to > evaluate a scheduler algorithm very well before we deploy it in a production > cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling > algorithm. Evaluating in a real cluster is always time and cost consuming, > and it is also very hard to find a large-enough cluster. Hence, a simulator > which can predict how well a scheduler algorithm works for some specific workload > would be quite useful. > We want to build a Scheduler Load Simulator to simulate large-scale Yarn > clusters and application loads in a single machine. This would be invaluable > in furthering Yarn by providing a tool for researchers and developers to > prototype new scheduler features and predict their behavior and performance > with reasonable amount of confidence, there-by aiding rapid innovation. > The simulator will exercise the real Yarn ResourceManager removing the > network factor by simulating NodeManagers and ApplicationMasters via handling > and dispatching NM/AMs heartbeat events from within the same JVM. > To keep tracking of scheduler behavior and performance, a scheduler wrapper > will wrap the real scheduler.
> The simulator will produce real time metrics while executing, including: > * Resource usages for whole cluster and each queue, which can be utilized to > configure cluster and queue's capacity. > * The detailed application execution trace (recorded in relation to simulated > time), which can be analyzed to understand/validate the scheduler behavior > (individual jobs turn around time, throughput, fairness, capacity guarantee, > etc). > * Several key metrics of scheduler algorithm, such as time cost of each > scheduler operation (allocate, handle, etc), which can be utilized by Hadoop > developers to find the code spots and scalability limits. > The simulator will provide real time charts showing the behavior of the > scheduler and its performance. > A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing > how to use simulator to simulate Fair Scheduler and Capacity Scheduler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
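To make two of the review suggestions concrete, a small sketch of the ShutdownHookManager and MessageFormat replacements (the hook body, priority, and formatted variables are placeholders):
{code}
// Use Hadoop's org.apache.hadoop.util.ShutdownHookManager instead of
// Runtime.getRuntime().addShutdownHook(), so hook ordering is managed centrally.
ShutdownHookManager.get().addShutdownHook(new Runnable() {
  @Override
  public void run() {
    // placeholder: flush/close the simulator's metrics and trace output here
  }
}, 30);   // priority relative to other Hadoop shutdown hooks

// JDK MessageFormat uses indexed placeholders ({0}, {1}) rather than SLF4J's {}.
String msg = java.text.MessageFormat.format("Simulated {0} NMs and {1} AMs",
    numNMs, numAMs);
{code}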
[jira] [Updated] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-899: --- Attachment: YARN-899.4.patch > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, > YARN-899.4.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735126#comment-13735126 ] Hadoop QA commented on YARN-899: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597140/YARN-899.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 20 new or modified test files. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1683//console This message is automatically generated. > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-899: --- Attachment: YARN-899.3.patch > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735088#comment-13735088 ] Karthik Kambatla commented on YARN-353: --- Not quite sure why we see the findbugs warning; it is in files that I haven't touched. Filed HDFS-5082 to handle removing the zookeeper version info from hadoop-hdfs/pom.xml. > Add Zookeeper-based store implementation for RMStateStore > - > > Key: YARN-353 > URL: https://issues.apache.org/jira/browse/YARN-353 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Hitesh Shah >Assignee: Bikas Saha > Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.12.patch, > yarn-353-12-wip.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, > YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, > YARN-353.8.patch, YARN-353.9.patch > > > Add store that write RM state data to ZK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1045) Improve toString implementation for PBImpls
[ https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735035#comment-13735035 ] Jian He commented on YARN-1045: --- YARN tests all passed, MR tests have some failures which also occur on trunk > Improve toString implementation for PBImpls > --- > > Key: YARN-1045 > URL: https://issues.apache.org/jira/browse/YARN-1045 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.0.4-alpha >Reporter: Siddharth Seth >Assignee: Jian He > Attachments: YARN-1045.1.patch, YARN-1045.patch > > > The generic toString implementation that is used in most of the PBImpls > {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " > ");{code} is rather inefficient - replacing "\n" and "\s" to generate a one > line string. Instead, we can use > {code}TextFormat.shortDebugString(getProto());{code}. > If we can get this into 2.1.0 - great, otherwise the next release. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
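Spelled out, the before/after the description is pointing at, as it would look inside a PBImpl (illustrative; TextFormat here is protobuf's com.google.protobuf.TextFormat):
{code}
// Before: serialize the proto to its multi-line text form, then regex-flatten it.
@Override
public String toString() {
  return getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");
}
{code}
{code}
// After: protobuf can emit a compact single-line form directly.
@Override
public String toString() {
  return TextFormat.shortDebugString(getProto());
}
{code}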
[jira] [Commented] (YARN-896) Roll up for long lived YARN
[ https://issues.apache.org/jira/browse/YARN-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734983#comment-13734983 ] Robert Joseph Evans commented on YARN-896: -- Sorry I have not responded sooner. I have been out on vacation and had a high severity issue that has consumed a lot of my time. [~lmccay] and [~thw] There are many different services that long lived processes need to communicate with. Many of these services use tokens and others may not. Each of these tokens or other credentials are specific to the services being accessed. In some cases like with HBase we probably can take advantage of the existing renewal feature in the RM. With other tokens or credentials it may be different, and may require AM specific support for them. I am not really that concerned with solving the renewal problem for all possible credentials here, although if we can solve this for a lot of common tokens at the same time that would be great. What I care most about is being sure that a long lived YARN application does not necessarily have to stop and restart because an HDFS token cannot be renewed any longer. If there are changes going into the HDFS security model that would make YARN-941 unnecessary that is great. I have not had much time to follow the security discussion so thank you for pointing this out. But it is also a question of time frames. YARN-941 and YARN-1041 would allow for secure, robust, long lived applications on YARN, and do not appear to be that difficult to accomplish. Do you know the time frame for the security rework? > Roll up for long lived YARN > --- > > Key: YARN-896 > URL: https://issues.apache.org/jira/browse/YARN-896 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Robert Joseph Evans > > YARN is intended to be general purpose, but it is missing some features to be > able to truly support long lived applications and long lived containers. > This ticket is intended to > # discuss what is needed to support long lived processes > # track the resulting JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.
[ https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734973#comment-13734973 ] Carlo Curino commented on YARN-1051: This umbrella JIRA proposes an extension of the YARN RM to allow for richer admission-control semantics (besides existing ACL checks). This allows jobs/users to negotiate with the RM at admission-control time for time-bounded, guaranteed allocation of cluster resources (e.g., I need 100 containers for 2 hours at any time before 3pm today). Such requests can be per-job or per-user (maybe we can call this a "session"). It provides the RM with an understanding of future resource demand, and exposes jobs' time and resource constraints, hence enabling the RM to look ahead and plan resource allocation over time (e.g., a job submitted now, but with lots of time before its deadline, might be run after a job showing up later but in a rush to complete). This is an important step towards SLAs on the resources received by a job/user over time, which seems useful for long-running services and workflows, and can help ameliorate some of the gang-scheduling concerns (admission control will guarantee the resources to be available, hence hoarding is not likely to produce deadlocks). This will require:
* additive modifications to the job-submission API (to capture a job's resource demands; a purely illustrative sketch follows this entry)
* an internal API between the admission control / planner (working on the planning aspects) and the scheduler (enforcing the plan, and handling containers etc...)
* changes to the underlying scheduler (we started with the CapacityScheduler) to support queue addition/removal/resizing and cross-queue job migration, but this should ideally be pushed to the YarnScheduler API and be cross-scheduler (from various conversations, this seems to be needed/useful independently)
* changes to the RM tracking data structures to maintain metering of how many resources have been allocated to a job until now (this also enables billing and accounting on the RM side, and other history-aware planning and scheduling)
* implementation of a (simple at first) admission control mechanism that verifies whether a job with a certain Contract can be admitted, and performs basic planning (knapsack-like to start, can be extended to sophisticated economics models)
We will track this in Sub-JIRAs. > YARN Admission Control/Planner: enhancing the resource allocation model with > time. > -- > > Key: YARN-1051 > URL: https://issues.apache.org/jira/browse/YARN-1051 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler, resourcemanager, scheduler >Reporter: Carlo Curino >Assignee: Carlo Curino > > In this umbrella JIRA we propose to extend the YARN RM to handle time > explicitly, allowing users to "reserve" capacity over time. This is an > important step towards SLAs, long-running services, workflows, and helps for > gang scheduling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
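A purely hypothetical sketch of the kind of time-bounded contract described above ("100 containers for 2 hours, any time before 3pm today"); none of these types exist in YARN, and the real API will come out of the sub-JIRAs:
{code}
// Hypothetical reservation contract submitted at admission-control time.
// All names are illustrative.
class ResourceReservationRequest {
  Resource capabilityPerContainer;  // e.g. 2048 MB, 1 vcore
  int numContainers;                // e.g. 100
  long durationMs;                  // e.g. two hours' worth of milliseconds
  long deadline;                    // e.g. today 3pm; the RM plans the actual start
}
{code}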
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734967#comment-13734967 ] Hadoop QA commented on YARN-899: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596979/YARN-899.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1682//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1682//console This message is automatically generated. > Get queue administration ACLs working > - > > Key: YARN-899 > URL: https://issues.apache.org/jira/browse/YARN-899 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Xuan Gong > Attachments: YARN-899.1.patch, YARN-899.2.patch > > > The Capacity Scheduler documents the > yarn.scheduler.capacity.root..acl_administer_queue config option > for controlling who can administer a queue, but it is not hooked up to > anything. The Fair Scheduler could make use of a similar option as well. > This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.
Carlo Curino created YARN-1051: -- Summary: YARN Admission Control/Planner: enhancing the resource allocation model with time. Key: YARN-1051 URL: https://issues.apache.org/jira/browse/YARN-1051 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager, scheduler Reporter: Carlo Curino Assignee: Carlo Curino In this umbrella JIRA we propose to extend the YARN RM to handle time explicitly, allowing users to "reserve" capacity over time. This is an important step towards SLAs, long-running services, workflows, and helps for gang scheduling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1050) Document the Fair Scheduler REST API
Sandy Ryza created YARN-1050: Summary: Document the Fair Scheduler REST API Key: YARN-1050 URL: https://issues.apache.org/jira/browse/YARN-1050 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Sandy Ryza The documentation should be placed here along with the Capacity Scheduler documentation: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734935#comment-13734935 ] Sandy Ryza commented on YARN-589: - Good point - filed YARN-1050 > Expose a REST API for monitoring the fair scheduler > --- > > Key: YARN-589 > URL: https://issues.apache.org/jira/browse/YARN-589 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.1.1-beta > > Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, > YARN-589.patch > > > The fair scheduler should have an HTTP interface that exposes information > such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734834#comment-13734834 ] Hudson commented on YARN-589: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1513 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1513/]) Amending YARN-589. Adding missing file from patch (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512112) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java YARN-589. Expose a REST API for monitoring the fair scheduler (Sandy Ryza). (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512111) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerLeafQueueInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java > Expose a REST API for monitoring the fair scheduler > --- > > Key: YARN-589 > URL: https://issues.apache.org/jira/browse/YARN-589 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.1.1-beta > > Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, > YARN-589.patch > > > The fair scheduler should have an HTTP interface that exposes information > such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
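With this in, fair scheduler state is served from the same RM web-services endpoint the capacity scheduler already uses; a minimal client sketch (the RM address is assumed to be the default localhost:8088):
{code}
// Fetch the scheduler section of the RM REST API as JSON; with the fair
// scheduler active, the payload is built from the FairSchedulerInfo DAOs above.
URL url = new URL("http://localhost:8088/ws/v1/cluster/scheduler");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestProperty("Accept", "application/json");
try (BufferedReader in = new BufferedReader(
    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
  String line;
  while ((line = in.readLine()) != null) {
    System.out.println(line);
  }
}
{code}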
[jira] [Commented] (YARN-1043) YARN Queue metrics are getting pushed to neither file nor Ganglia
[ https://issues.apache.org/jira/browse/YARN-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734835#comment-13734835 ] Hudson commented on YARN-1043: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1513 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1513/]) YARN-1043. Push all metrics consistently. Contributed by Jian He. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512081) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java > YARN Queue metrics are getting pushed to neither file nor Ganglia > - > > Key: YARN-1043 > URL: https://issues.apache.org/jira/browse/YARN-1043 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Yusaku Sako >Assignee: Jian He > Fix For: 2.1.0-beta > > Attachments: YARN-1043.1.patch, YARN-1043.patch > > > YARN Queue metrics are not getting pushed to file or Ganglia via Hadoop > Metrics 2. > QueueMetrics are still accessible via JMX and RM REST API > (:8088/ws/v1/cluster/scheduler). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734756#comment-13734756 ] Thomas Graves commented on YARN-589: sorry for jumping in late on this, do we have another jira for adding documentation? > Expose a REST API for monitoring the fair scheduler > --- > > Key: YARN-589 > URL: https://issues.apache.org/jira/browse/YARN-589 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.1.1-beta > > Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, > YARN-589.patch > > > The fair scheduler should have an HTTP interface that exposes information > such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734731#comment-13734731 ] Hudson commented on YARN-589: - ABORTED: Integrated in Hadoop-Hdfs-trunk #1486 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1486/]) Amending YARN-589. Adding missing file from patch (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512112) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java YARN-589. Expose a REST API for monitoring the fair scheduler (Sandy Ryza). (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512111) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerLeafQueueInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java > Expose a REST API for monitoring the fair scheduler > --- > > Key: YARN-589 > URL: https://issues.apache.org/jira/browse/YARN-589 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.1.1-beta > > Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, > YARN-589.patch > > > The fair scheduler should have an HTTP interface that exposes information > such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1043) YARN Queue metrics are getting pushed to neither file nor Ganglia
[ https://issues.apache.org/jira/browse/YARN-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734732#comment-13734732 ] Hudson commented on YARN-1043: -- ABORTED: Integrated in Hadoop-Hdfs-trunk #1486 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1486/]) YARN-1043. Push all metrics consistently. Contributed by Jian He. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512081) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java > YARN Queue metrics are getting pushed to neither file nor Ganglia > - > > Key: YARN-1043 > URL: https://issues.apache.org/jira/browse/YARN-1043 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Yusaku Sako >Assignee: Jian He > Fix For: 2.1.0-beta > > Attachments: YARN-1043.1.patch, YARN-1043.patch > > > YARN Queue metrics are not getting pushed to file or Ganglia via Hadoop > Metrics 2. > QueueMetrics are still accessible via JMX and RM REST API > (:8088/ws/v1/cluster/scheduler). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734664#comment-13734664 ] Hudson commented on YARN-589: - SUCCESS: Integrated in Hadoop-Yarn-trunk #296 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/296/]) Amending YARN-589. Adding missing file from patch (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512112) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java YARN-589. Expose a REST API for monitoring the fair scheduler (Sandy Ryza). (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512111) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerLeafQueueInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java > Expose a REST API for monitoring the fair scheduler > --- > > Key: YARN-589 > URL: https://issues.apache.org/jira/browse/YARN-589 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.1.1-beta > > Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, > YARN-589.patch > > > The fair scheduler should have an HTTP interface that exposes information > such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1043) YARN Queue metrics are getting pushed to neither file nor Ganglia
[ https://issues.apache.org/jira/browse/YARN-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734665#comment-13734665 ] Hudson commented on YARN-1043: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #296 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/296/]) YARN-1043. Push all metrics consistently. Contributed by Jian He. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512081) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java > YARN Queue metrics are getting pushed to neither file nor Ganglia > - > > Key: YARN-1043 > URL: https://issues.apache.org/jira/browse/YARN-1043 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Yusaku Sako >Assignee: Jian He > Fix For: 2.1.0-beta > > Attachments: YARN-1043.1.patch, YARN-1043.patch > > > YARN Queue metrics are not getting pushed to file or Ganglia via Hadoop > Metrics 2. > QueueMetrics are still accessible via JMX and RM REST API > (:8088/ws/v1/cluster/scheduler). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734570#comment-13734570 ] Hadoop QA commented on YARN-353: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597034/YARN-353.12.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1681//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/1681//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1681//console This message is automatically generated. > Add Zookeeper-based store implementation for RMStateStore > - > > Key: YARN-353 > URL: https://issues.apache.org/jira/browse/YARN-353 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Hitesh Shah >Assignee: Bikas Saha > Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.12.patch, > yarn-353-12-wip.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, > YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, > YARN-353.8.patch, YARN-353.9.patch > > > Add store that write RM state data to ZK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira