[jira] [Commented] (YARN-2943) Add a node-labels page in RM web UI
[ https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260903#comment-14260903 ] Hadoop QA commented on YARN-2943: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689445/YARN-2943.6.patch against trunk revision 249cc90. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6214//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6214//console This message is automatically generated. Add a node-labels page in RM web UI --- Key: YARN-2943 URL: https://issues.apache.org/jira/browse/YARN-2943 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, YARN-2943.1.patch, YARN-2943.2.patch, YARN-2943.3.patch, YARN-2943.4.patch, YARN-2943.5.patch, YARN-2943.6.patch Now we have node labels in the system, but there's no a very convenient to get information like how many active NM(s) assigned to a given label?, how much total resource for a give label?, For a given label, which queues can access it?, etc. It will be better to add a node-labels page in RM web UI, users/admins can have a centralized view to see such information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang Haoran updated YARN-149: - Affects Version/s: (was: 2.4.0) ResourceManager (RM) High-Availability (HA) --- Key: YARN-149 URL: https://issues.apache.org/jira/browse/YARN-149 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Harsh J Labels: patch Attachments: YARN ResourceManager Automatic Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, rm-ha-phase1-draft2.pdf This jira tracks work needed to be done to support one RM instance failing over to another RM instance so that we can have RM HA. Work includes leader election, transfer of control to leader and client re-direction to new leader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261027#comment-14261027 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #57 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/57/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice -- Key: YARN-2938 URL: https://issues.apache.org/jira/browse/YARN-2938 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.7.0 Attachments: FindBugs Report.html, YARN-2938.001.patch, YARN-2938.002.patch, YARN-2938.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261040#comment-14261040 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Yarn-trunk #791 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/791/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice -- Key: YARN-2938 URL: https://issues.apache.org/jira/browse/YARN-2938 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.7.0 Attachments: FindBugs Report.html, YARN-2938.001.patch, YARN-2938.002.patch, YARN-2938.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261088#comment-14261088 ] Rohith commented on YARN-2991: -- Hi [~zjshen], kindly review the analysis and patch TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2797) TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261090#comment-14261090 ] Rohith commented on YARN-2797: -- Make sense, +1 for approach. TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase Key: YARN-2797 URL: https://issues.apache.org/jira/browse/YARN-2797 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Attachments: yarn-2797-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2797) TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261092#comment-14261092 ] Rohith commented on YARN-2797: -- YARN-2991 corresponding jira fixing one of the test case random failure. TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase Key: YARN-2797 URL: https://issues.apache.org/jira/browse/YARN-2797 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Attachments: yarn-2797-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261122#comment-14261122 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1989 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1989/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice -- Key: YARN-2938 URL: https://issues.apache.org/jira/browse/YARN-2938 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.7.0 Attachments: FindBugs Report.html, YARN-2938.001.patch, YARN-2938.002.patch, YARN-2938.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261130#comment-14261130 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #54 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/54/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice -- Key: YARN-2938 URL: https://issues.apache.org/jira/browse/YARN-2938 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.7.0 Attachments: FindBugs Report.html, YARN-2938.001.patch, YARN-2938.002.patch, YARN-2938.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261150#comment-14261150 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #58 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/58/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice -- Key: YARN-2938 URL: https://issues.apache.org/jira/browse/YARN-2938 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.7.0 Attachments: FindBugs Report.html, YARN-2938.001.patch, YARN-2938.002.patch, YARN-2938.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.002.patch Kicking jenkins again as the failures seem unrelated Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261172#comment-14261172 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2008 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2008/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice -- Key: YARN-2938 URL: https://issues.apache.org/jira/browse/YARN-2938 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.7.0 Attachments: FindBugs Report.html, YARN-2938.001.patch, YARN-2938.002.patch, YARN-2938.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2978) RM crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2978: --- Summary: RM crashes with NPE while getting queue info (was: Null pointer in YarnProtos) RM crashes with NPE while getting queue info Key: YARN-2978 URL: https://issues.apache.org/jira/browse/YARN-2978 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2978) ResourceManager crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2978: --- Summary: ResourceManager crashes with NPE while getting queue info (was: RM crashes with NPE while getting queue info) ResourceManager crashes with NPE while getting queue info - Key: YARN-2978 URL: https://issues.apache.org/jira/browse/YARN-2978 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2987) ClientRMService#getQueueInfo doesn't check app ACLs
[ https://issues.apache.org/jira/browse/YARN-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261259#comment-14261259 ] Varun Saxena commented on YARN-2987: [~jianhe], test failure is unrelated. Passing in my local ClientRMService#getQueueInfo doesn't check app ACLs --- Key: YARN-2987 URL: https://issues.apache.org/jira/browse/YARN-2987 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Varun Saxena Attachments: YARN-2987.001.patch, YARN-2987.002.patch ClientRMService#getQueueInfo can return a list of applications belonging to the queue, but doesn't actually check if the user has the permission to view the applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261266#comment-14261266 ] Hadoop QA commented on YARN-2881: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689514/YARN-2881.002.patch against trunk revision 249cc90. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart org.apache.hadoop.yarn.server.resourcemanager.TestRMHA The following test timeouts occurred in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6216//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6216//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6216//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6216//console This message is automatically generated. Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261350#comment-14261350 ] Zhijie Shen commented on YARN-2991: --- After [YARN-1121|https://issues.apache.org/jira/secure/EditComment!default.jspa?id=12666077commentId=13808704], AsyncDispatcher already has the draining logic, but only doesn't provide await() method to users. The bug you have found with DrainDispatcher seems to be handled properly in AsyncDispatcher. How about removing the overriding methods in DrainDispatcher, and only adding await()? TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261360#comment-14261360 ] Rohith commented on YARN-2991: -- bq. The bug you have found with DrainDispatcher seems to be handled properly in AsyncDispatcher Yes, it is handled. bq. How about removing the overriding methods in DrainDispatcher, and only adding await()? This I was thought initially before uploading current patch. But since visibility of instance variable *drained* will change from private to protected only for test framework. Given it is accepted, we can change visibility. TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261367#comment-14261367 ] Rohith commented on YARN-2922: -- Given if patch is fine, could it commit please? Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261385#comment-14261385 ] Varun Saxena commented on YARN-2958: [~jianhe] / [~zjshen], kindly review RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. Can't update RM Delegation Token.); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info(updating RMDelegation token with sequence number: + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error(Error in updating persisted RMDelegationToken with sequence number: + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to code above, even when renewing a DT, the last sequence number is updated in the store, which is wrong. For example, we have the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2( seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM The stored and then recovered last sequence number is 1. It makes the next created DT after RM restarting will conflict with DT 2 on sequence num. Second, the aforementioned bug doesn't happen actually, because the recovered last sequence num has been overwritten at by the correctly one. {code} public void recover(RMState rmState) throws Exception { LOG.info(recovering RMDelegationTokenSecretManager.); // recover RMDTMasterKeys for (DelegationKey dtKey : rmState.getRMDTSecretManagerState() .getMasterKeyState()) { addKey(dtKey); } // recover RMDelegationTokens MapRMDelegationTokenIdentifier, Long rmDelegationTokens = rmState.getRMDTSecretManagerState().getTokenState(); this.delegationTokenSequenceNumber = rmState.getRMDTSecretManagerState().getDTSequenceNumber(); for (Map.EntryRMDelegationTokenIdentifier, Long entry : rmDelegationTokens .entrySet()) { addPersistedDelegationToken(entry.getKey(), entry.getValue()); } } {code} The code above recovers delegationTokenSequenceNumber by reading the last sequence number in the store. It could be wrong. Fortunately, delegationTokenSequenceNumber updates it to the right number. {code} if (identifier.getSequenceNumber() getDelegationTokenSeqNum()) { setDelegationTokenSeqNum(identifier.getSequenceNumber()); } {code} All the stored identifiers will be gone through, and delegationTokenSequenceNumber will be set to the largest sequence number among these identifiers. Therefore, new DT will be assigned a sequence number which is always larger than that of all the recovered DT. To sum up, two negatives make a positive, but it's good to fix the issue. Please let me know if I've missed something here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261388#comment-14261388 ] Rohith commented on YARN-2991: -- I can think of having public boolean isDrained() in AsynDispatcher for getting drained status and do the same logic. I will try to run all the tests with above change for impact and will upload new patch. TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261410#comment-14261410 ] Zhijie Shen commented on YARN-2991: --- bq. I can think of having public boolean isDrained() in AsynDispatcher for getting drained status Sounds good to me. TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2991: - Attachment: 0002-YARN-2991.patch TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261465#comment-14261465 ] Hadoop QA commented on YARN-2991: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689552/0002-YARN-2991.patch against trunk revision 6621c35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6217//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6217//console This message is automatically generated. TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261526#comment-14261526 ] Karthik Kambatla commented on YARN-2997: With work-preserving restart, the NM is required to intimate the RM repeatedly in case the RM goes down and loses this information. I propose we ignore the latter updates, or add code to identify them duplicates and then ignore. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Attachments: YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2217) Shared cache client side changes
[ https://issues.apache.org/jira/browse/YARN-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261542#comment-14261542 ] Karthik Kambatla commented on YARN-2217: Instead of a convenience method to calculate checksum, how about modifying {{use}} to take a Path instead of String. Or, have two {{use}} methods one each of Path and String? We should probably add unit tests too? Shared cache client side changes Key: YARN-2217 URL: https://issues.apache.org/jira/browse/YARN-2217 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2217-trunk-v1.patch, YARN-2217-trunk-v2.patch, YARN-2217-trunk-v3.patch, YARN-2217-trunk-v4.patch, YARN-2217-trunk-v5.patch Implement the client side changes for the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2998) Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-2998: --- Assignee: Anubhav Dhoot Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower - Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2998) Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower
Anubhav Dhoot created YARN-2998: --- Summary: Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Reporter: Anubhav Dhoot -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2998: --- Summary: Abstract out scheduler independent PlanFollower components (was: Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower) Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2998: --- Description: Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2998: Attachment: YARN-2998.001.patch Creates AbstractSchedulerPlanFollower Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261630#comment-14261630 ] Anubhav Dhoot commented on YARN-2881: - Added patch that builds on top of YARN-2998 Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.003.patch Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2814) RM: Clean-up the handling of fatal events
[ https://issues.apache.org/jira/browse/YARN-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2814: --- Attachment: yarn-2814-0.patch yarn-2814-0.patch is a preview of what I was thinking. It removes RMFatalEvent* and associated code and cleans up a few other methods in the RM. The dispatcher, I believe, can be moved to RMActiveServices. Will update the patch again post that investigation. Welcome any early feedback. RM: Clean-up the handling of fatal events --- Key: YARN-2814 URL: https://issues.apache.org/jira/browse/YARN-2814 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-2814-0.patch YARN-2579 fixes a critical issue around handling fatal events in the RM, but does so minimally. This JIRA is to follow through that approach and do more clean-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261644#comment-14261644 ] Zhijie Shen commented on YARN-2991: --- The patch looks good. Did you see any problem when running all test cases? TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261647#comment-14261647 ] Karthik Kambatla commented on YARN-2998: We can leave TestFairReservationSystem and ReservationSystemTestUtil#setupFSAllocationFile for YARN-2881. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2943) Add a node-labels page in RM web UI
[ https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261669#comment-14261669 ] Wangda Tan commented on YARN-2943: -- Verified in a local 3-nodes cluster, both nodes page and node-labels page work fine. Add a node-labels page in RM web UI --- Key: YARN-2943 URL: https://issues.apache.org/jira/browse/YARN-2943 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, YARN-2943.1.patch, YARN-2943.2.patch, YARN-2943.3.patch, YARN-2943.4.patch, YARN-2943.5.patch, YARN-2943.6.patch Now we have node labels in the system, but there's no a very convenient to get information like how many active NM(s) assigned to a given label?, how much total resource for a give label?, For a given label, which queues can access it?, etc. It will be better to add a node-labels page in RM web UI, users/admins can have a centralized view to see such information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261682#comment-14261682 ] Jian He commented on YARN-2997: --- bq. add code to identify them duplicates and then ignore. I think it is now ignored. perhaps we should clarify the logging and move it to debug level. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Attachments: YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261691#comment-14261691 ] Karthik Kambatla commented on YARN-2998: I got confused looking at the wrong method. The only part that doesn't belong here is ReservationSystem#setupFairScheduler. Will post a patch shortly removing that method. Could you include it in YARN-2881. Thanks. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2998: --- Attachment: yarn-2998-2.patch Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2932) Add entry for preemption setting to queue status screen and startup/refresh logging
[ https://issues.apache.org/jira/browse/YARN-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261719#comment-14261719 ] Wangda Tan commented on YARN-2932: -- Hi [~eepayne], Thanks for working on this patch, sorry for late response since I was on vacation in the past few weeks. Just took a look at your patch, some comments: 1) Since the QUEUE_PREEMPTION_DISABLED is an option for CS, I suggest to make it as a member of CapacitySchedulerConfiguration, like {{getUserLimitFactor}}/{{setUserLimit}}, etc. This will void some String operations. 2) Rename {{context}} in {{AbstractCSQueue}} to name like {{csContext}} since we have {{rmContext}} 3) I suggest to add a member var like preemptable to AbstractCSQueue, instead of calling: {code} + @Private + public boolean isPreemptable() { +return context.getConfiguration().isPreemptable(getQueuePath()); + } {code} The implementation of CSConfiguration.isPreemptable(..) seems too complex to me. CSConfiguration should only care about value of configuration file, such logic should put to AbstractCSQueue.setupQueueConfigs(...) 4) It's better to web UI name (preemptable) and configuration name (disable_preemption) consistent. I prefer preemptable personally. 5) {{testIsPreemptable}} should be a part of TestCapacityScheduler instead of putting it to TestProportionalCapacityPreemptionPolicy. 6) In ProportionalCapacityPreemptionPolicy.cloneQueues, preemptable field should get from Queue instead of getting from configuration. Please let me know your thoughts, Wangda Add entry for preemption setting to queue status screen and startup/refresh logging --- Key: YARN-2932 URL: https://issues.apache.org/jira/browse/YARN-2932 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.7.0 Reporter: Eric Payne Assignee: Eric Payne Attachments: YARN-2932.v1.txt YARN-2056 enables the ability to turn preemption on or off on a per-queue level. This JIRA will provide the preemption status for each queue in the {{HOST:8088/cluster/scheduler}} UI and in the RM log during startup/queue refresh. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261720#comment-14261720 ] Karthik Kambatla commented on YARN-2998: The refactoring makes sense. Few review comments: # AbstractSchedulerPlanFollower ## All newly added abstract methods can use a javadoc explaining what the method is supposed to do, so future followers don't have to look at other implementations. ## Rename #calculateTargetCapacity? Not very descriptive. ## Rename #isPlanResourcesLessThanReservations to #arePlanResourcesLessThanReservations or isPlanResourceLessThanReservations # ReservationSystemTestUtil#setupFairScheduler is not used. yarn-2992-2.patch removes it. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2492) (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests
[ https://issues.apache.org/jira/browse/YARN-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261726#comment-14261726 ] Hudson commented on YARN-2492: -- FAILURE: Integrated in Hadoop-trunk-Commit #6796 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6796/]) YARN-2492. Added node-labels page on RM web UI. Contributed by Wangda Tan (jianhe: rev 5f57b904f550515693d93a2959e663b0d0260696) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabel.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnWebParams.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodeLabelsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests Key: YARN-2492 URL: https://issues.apache.org/jira/browse/YARN-2492 Project: Hadoop YARN Issue Type: Task Components: api, client, resourcemanager Reporter: Wangda Tan Since YARN-796 is a sub JIRA of YARN-397, this JIRA is used to create and track sub tasks and attach split patches for YARN-796. *Let's still keep over-all discussions on YARN-796.* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261745#comment-14261745 ] Wangda Tan commented on YARN-2922: -- Patch LGTM also, +1. Thanks, Wangda Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2943) Add a node-labels page in RM web UI
[ https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2943: -- Target Version/s: 2.7.0 (was: 2.7.0, 2.6.1) Add a node-labels page in RM web UI --- Key: YARN-2943 URL: https://issues.apache.org/jira/browse/YARN-2943 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Fix For: 2.7.0 Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, YARN-2943.1.patch, YARN-2943.2.patch, YARN-2943.3.patch, YARN-2943.4.patch, YARN-2943.5.patch, YARN-2943.6.patch Now we have node labels in the system, but there's no a very convenient to get information like how many active NM(s) assigned to a given label?, how much total resource for a give label?, For a given label, which queues can access it?, etc. It will be better to add a node-labels page in RM web UI, users/admins can have a centralized view to see such information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2492) (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests
[ https://issues.apache.org/jira/browse/YARN-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261746#comment-14261746 ] Hudson commented on YARN-2492: -- FAILURE: Integrated in Hadoop-trunk-Commit #6797 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6797/]) Revert YARN-2492(wrong jira number). Added node-labels page on RM web UI. Contributed by Wangda Tan (jianhe: rev 746ad6e989683fe1dfc61a611702c9be7b5cd264) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodeLabelsPage.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabel.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnWebParams.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests Key: YARN-2492 URL: https://issues.apache.org/jira/browse/YARN-2492 Project: Hadoop YARN Issue Type: Task Components: api, client, resourcemanager Reporter: Wangda Tan Since YARN-796 is a sub JIRA of YARN-397, this JIRA is used to create and track sub tasks and attach split patches for YARN-796. *Let's still keep over-all discussions on YARN-796.* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2493) API changes for users
[ https://issues.apache.org/jira/browse/YARN-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261747#comment-14261747 ] Hudson commented on YARN-2493: -- FAILURE: Integrated in Hadoop-trunk-Commit #6797 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6797/]) YARN-2493. Added node-labels page on RM web UI. Contributed by Wangda Tan (jianhe: rev b7442bf92eb6e1ae64a0f9644ffc2eee4597aad5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodeLabelsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnWebParams.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabel.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java API changes for users - Key: YARN-2493 URL: https://issues.apache.org/jira/browse/YARN-2493 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Wangda Tan Assignee: Wangda Tan Fix For: 2.6.0 Attachments: YARN-2493-20141008.1.patch, YARN-2493.patch, YARN-2493.patch, YARN-2493.patch, YARN-2493.patch, YARN-2493.patch This JIRA includes API changes for users of YARN-796, like changes in {{ResourceRequest}}, {{ApplicationSubmissionContext}}, etc. This is a common part of YARN-796. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261750#comment-14261750 ] Hadoop QA commented on YARN-2998: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689587/YARN-2998.001.patch against trunk revision 6621c35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6218//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6218//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6218//console This message is automatically generated. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
Rohith created YARN-2999: Summary: Compilation error in AllocationConfiguration.java in java1.7 while running tests Key: YARN-2999 URL: https://issues.apache.org/jira/browse/YARN-2999 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith In AllocationConfiguration, in the below object creation, generic type must be specified as instance variable,otherwise java1.7 lead compilation error while running tests for RM and NM {{reservableQueues = new HashSet();}} Report : {code} java.lang.Error: Unresolved compilation problem: '' operator is not allowed for source level below 1.7 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.init(AllocationConfiguration.java:150) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.init(MockRM.java:108) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2020) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261752#comment-14261752 ] Rohith commented on YARN-2991: -- I splitted into 2 level for verifying it # Those test cases which uses DrainDispatcher : test completed,all passed. # All test cases yanr/mr: Few tests failed which is not really because of the this patch. NodeManager Tests Report {code} Results : Failed tests: TestDockerContainerExecutorWithMocks.testContainerLaunch:209 expected:0 but was:127 Tests run: 279, Failures: 1, Errors: 0, Skipped: 6 {code} ResourceManager Tests {code} Result : Failed Tests : // Many test related to fair scheduler are there. YARN-2999 Tests run: 986, Failures: 0, Errors: 178, Skipped: 1 {code} TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk -- Key: YARN-2991 URL: https://issues.apache.org/jira/browse/YARN-2991 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen Assignee: Rohith Priority: Blocker Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch {code} Error Message test timed out after 6 milliseconds Stacktrace java.lang.Exception: test timed out after 6 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1281) at java.lang.Thread.join(Thread.java:1355) at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) {code} It happened twice this months: https://builds.apache.org/job/PreCommit-YARN-Build/6096/ https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
[ https://issues.apache.org/jira/browse/YARN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2999: - Attachment: 0001-YARN-2999.patch Straight forward patch to fix this issue. Compilation error in AllocationConfiguration.java in java1.7 while running tests Key: YARN-2999 URL: https://issues.apache.org/jira/browse/YARN-2999 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Attachments: 0001-YARN-2999.patch In AllocationConfiguration, in the below object creation, generic type must be specified as instance variable,otherwise java1.7 lead compilation error while running tests for RM and NM {{reservableQueues = new HashSet();}} Report : {code} java.lang.Error: Unresolved compilation problem: '' operator is not allowed for source level below 1.7 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.init(AllocationConfiguration.java:150) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.init(MockRM.java:108) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2020) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261758#comment-14261758 ] Jian He commented on YARN-2922: --- looks like {{LeafQueue#getTotalResourcePending}} should be changed too. does it make sense to synchronize on the object itself ? Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2998: Attachment: YARN-2998.003.patch Addressed feedback Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, YARN-2998.003.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2987) ClientRMService#getQueueInfo doesn't check app ACLs
[ https://issues.apache.org/jira/browse/YARN-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261769#comment-14261769 ] Hudson commented on YARN-2987: -- FAILURE: Integrated in Hadoop-trunk-Commit #6798 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6798/]) YARN-2987. Fixed ClientRMService#getQueueInfo to check against queue and app ACLs. Contributed by Varun Saxena (jianhe: rev e2351c7ae24cea9b217af4174512d279c55e8efd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java ClientRMService#getQueueInfo doesn't check app ACLs --- Key: YARN-2987 URL: https://issues.apache.org/jira/browse/YARN-2987 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Varun Saxena Fix For: 2.7.0 Attachments: YARN-2987.001.patch, YARN-2987.002.patch ClientRMService#getQueueInfo can return a list of applications belonging to the queue, but doesn't actually check if the user has the permission to view the applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
[ https://issues.apache.org/jira/browse/YARN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261775#comment-14261775 ] Jian He commented on YARN-2999: --- +1 Compilation error in AllocationConfiguration.java in java1.7 while running tests Key: YARN-2999 URL: https://issues.apache.org/jira/browse/YARN-2999 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Attachments: 0001-YARN-2999.patch In AllocationConfiguration, in the below object creation, generic type must be specified as instance variable,otherwise java1.7 lead compilation error while running tests for RM and NM {{reservableQueues = new HashSet();}} Report : {code} java.lang.Error: Unresolved compilation problem: '' operator is not allowed for source level below 1.7 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.init(AllocationConfiguration.java:150) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.init(MockRM.java:108) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2020) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261781#comment-14261781 ] Hadoop QA commented on YARN-2998: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689600/yarn-2998-2.patch against trunk revision 6621c35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6219//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6219//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6219//console This message is automatically generated. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, YARN-2998.003.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261792#comment-14261792 ] Chengbing Liu commented on YARN-2997: - [~kasha] [~jianhe] Thanks for your review. I think NM will call {{getNMContainerStatuses()}} instead of {{getContainerStatuses()}} upon receiving RESYNC from restarted RM. {{getNMContainerStatuses()}} remains unchanged and still reports all the completed containers for non-completed apps. The uploaded patch will let the normal heartbeat (not after receiving RESYNC) send only useful container status information to RM. So I guess the work-preserving RM restart is not affected. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Attachments: YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261797#comment-14261797 ] Jian He commented on YARN-2997: --- bq. The uploaded patch will let the normal heartbeat The intention was to let NM remove containers from its context only after RM acks it has received these containers. More context in YARN-1372 NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Attachments: YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.004.patch Moved code factored from YARN-2998 Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
[ https://issues.apache.org/jira/browse/YARN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261814#comment-14261814 ] Hadoop QA commented on YARN-2999: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689606/0001-YARN-2999.patch against trunk revision b7442bf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6220//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6220//console This message is automatically generated. Compilation error in AllocationConfiguration.java in java1.7 while running tests Key: YARN-2999 URL: https://issues.apache.org/jira/browse/YARN-2999 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Attachments: 0001-YARN-2999.patch In AllocationConfiguration, in the below object creation, generic type must be specified as instance variable,otherwise java1.7 lead compilation error while running tests for RM and NM {{reservableQueues = new HashSet();}} Report : {code} java.lang.Error: Unresolved compilation problem: '' operator is not allowed for source level below 1.7 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.init(AllocationConfiguration.java:150) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.init(MockRM.java:108) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2020) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261825#comment-14261825 ] Chengbing Liu commented on YARN-2997: - [~jianhe] The containers are not removed in this patch, they are just not reported to RM when the following conditions are met: * The application is not finished * The container was completed and was already in {{recentlyStoppedContainers}} * It is a normal heartbeat with RM, not after RM restart Note that the container is not removed from the NM context. In a resync with RM, these completed applications will still be reported for work-preserving recovery. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Attachments: YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261833#comment-14261833 ] Hadoop QA commented on YARN-2998: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689607/YARN-2998.003.patch against trunk revision e2351c7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6221//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6221//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6221//console This message is automatically generated. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, YARN-2998.003.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2560) Diagnostics is delayed to passed to ApplicationReport
[ https://issues.apache.org/jira/browse/YARN-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261855#comment-14261855 ] Jeff Zhang commented on YARN-2560: -- Check RMAppImpl RMAppAttemptImpl and found that FinalApplicationStatus would been retrieved from the currentAttempt, while diagnostics would been retrieved from the RMApp which would be updated until it gets the AttemptFinishedEvent. This is the root cause that make the FinalApplicationStatus and diagnostics inconsistent sometimes. Diagnostics is delayed to passed to ApplicationReport - Key: YARN-2560 URL: https://issues.apache.org/jira/browse/YARN-2560 Project: Hadoop YARN Issue Type: Bug Reporter: Jeff Zhang The diagnostics of Application may be delayed to pass to ApplicationReport. Here's one example when ApplicationStatus has changed to FAILED, but the diagnostics is still empty. And the next call of getApplicationReport could get the diagnostics. {code} while(true) { appReport = yarnClient.getApplicationReport(appId); Thread.sleep(1000); LOG.info(AppStatus: + appReport.getFinalApplicationStatus()); LOG.info(Diagnostics: + appReport.getDiagnostics()); } {code} *Output:* {code} AppStatus:FAILED Diagnostics: // empty // get diagnostics for the next getApplicationReport AppStatus:FAILED Diagnostics: // diagnostics info here {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2922: - Attachment: 0001-YARN-2922.patch Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261869#comment-14261869 ] Karthik Kambatla commented on YARN-2998: +1. Checking this in. Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2998.001.patch, YARN-2998.003.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261870#comment-14261870 ] Rohith commented on YARN-2922: -- You are right, it is possible to ocure ConcurrentModificatinException since {{LeafQueue#getTotalResourcePending}} is public interface. Updated the patch for the same, kindly review Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261878#comment-14261878 ] Hudson commented on YARN-2998: -- FAILURE: Integrated in Hadoop-trunk-Commit #6799 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6799/]) YARN-2998. Abstract out scheduler independent PlanFollower components. (Anubhav Dhoot via kasha) (kasha: rev e7257acd8a7adb74d81cd1d009d4a99f023ed844) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacitySchedulerPlanFollower.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSchedulerPlanFollowerBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java Abstract out scheduler independent PlanFollower components -- Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2998.001.patch, YARN-2998.003.patch, yarn-2998-2.patch Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2814) RM: Clean-up the handling of fatal events
[ https://issues.apache.org/jira/browse/YARN-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261893#comment-14261893 ] Rohith commented on YARN-2814: -- The patch looks good overall, leaving 2nd point one comment # Can RMContext passed to RMStateStore and RM instance can be taken from context? I think point-3 was intended to this RM: Clean-up the handling of fatal events --- Key: YARN-2814 URL: https://issues.apache.org/jira/browse/YARN-2814 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-2814-0.patch YARN-2579 fixes a critical issue around handling fatal events in the RM, but does so minimally. This JIRA is to follow through that approach and do more clean-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261894#comment-14261894 ] Hadoop QA commented on YARN-2881: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689622/YARN-2881.004.patch against trunk revision e7257ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6223//console This message is automatically generated. Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.005.patch Updated for the renamed methods in YARN-2998 Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261935#comment-14261935 ] Hadoop QA commented on YARN-2922: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689633/0001-YARN-2922.patch against trunk revision e2351c7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6222//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6222//console This message is automatically generated. Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261968#comment-14261968 ] Hadoop QA commented on YARN-2881: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689635/YARN-2881.005.patch against trunk revision e7257ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6224//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6224//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6224//console This message is automatically generated. Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261978#comment-14261978 ] Rohith commented on YARN-2922: -- Test failure is unrelated to patch, it is passing locally with the patch Concurrent Modification Exception in LeafQueue when collecting applications --- Key: YARN-2922 URL: https://issues.apache.org/jira/browse/YARN-2922 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Rohith Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2978) ResourceManager crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261983#comment-14261983 ] Varun Saxena commented on YARN-2978: This issue is due to lack of synchronization. NPE happens in following piece of code while calling the line {noformat}getApplications(i).isInitialized(){noformat}. getApplications(int i) gets object at index i from the underlying applications list. This means List#get returns null object from the index location specified. {code:title=YarnProtos.java} public final boolean isInitialized() { for (int i = 0; i getApplicationsCount(); i++) { if (!getApplications(i).isInitialized()) { memoizedIsInitialized = 0; return false; } } } {code} This is because multiple threads can access the same QueueInfo object in case of Capacity scheduler and this can lead to underlying array list which stores applications being modified by different threads, which in turn can lead to NPE. YARN-2979 is also due to lack of synchronization ResourceManager crashes with NPE while getting queue info - Key: YARN-2978 URL: https://issues.apache.org/jira/browse/YARN-2978 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh
Jeff Zhang created YARN-3000: Summary: YARN_PID_DIR should be visible in yarn-env.sh Key: YARN-3000 URL: https://issues.apache.org/jira/browse/YARN-3000 Project: Hadoop YARN Issue Type: Bug Components: scripts Affects Versions: 2.6.0 Reporter: Jeff Zhang Priority: Minor Currently I see YARN_PID_DIR only show in yarn-deamon.sh which is supposed not the place for user to set up enviroment variable. IMO, yarn-env.sh is the place for users to set up enviroment variable just like hadoop-env.sh, so it's better to put YARN_PID_DIR into yarn-env.sh. ( can put it into comment just like YARN_RESOURCEMANAGER_HEAPSIZE ) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh
[ https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261996#comment-14261996 ] Rohith commented on YARN-3000: -- Thanks for reporting issue. Would like to work on this? YARN_PID_DIR should be visible in yarn-env.sh - Key: YARN-3000 URL: https://issues.apache.org/jira/browse/YARN-3000 Project: Hadoop YARN Issue Type: Bug Components: scripts Affects Versions: 2.6.0 Reporter: Jeff Zhang Priority: Minor Currently I see YARN_PID_DIR only show in yarn-deamon.sh which is supposed not the place for user to set up enviroment variable. IMO, yarn-env.sh is the place for users to set up enviroment variable just like hadoop-env.sh, so it's better to put YARN_PID_DIR into yarn-env.sh. ( can put it into comment just like YARN_RESOURCEMANAGER_HEAPSIZE ) -- This message was sent by Atlassian JIRA (v6.3.4#6332)