[jira] [Commented] (YARN-634) Introduce a SerializedException
[ https://issues.apache.org/jira/browse/YARN-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653619#comment-13653619 ] Hadoop QA commented on YARN-634:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582583/YARN-634.patch.2 against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 14 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/908//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/908//console

This message is automatically generated.
Introduce a SerializedException --- Key: YARN-634 URL: https://issues.apache.org/jira/browse/YARN-634 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: YARN-634.patch.2, YARN-634.txt LocalizationProtocol sends an exception over the wire. This currently uses YarnRemoteException. Post YARN-627, this needs to be changed and a new serialized exception is required. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-655) Fair scheduler metrics should subtract allocated memory from available memory
[ https://issues.apache.org/jira/browse/YARN-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654356#comment-13654356 ] Hudson commented on YARN-655: - Integrated in Hadoop-Yarn-trunk #205 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/205/]) YARN-655. Fair scheduler metrics should subtract allocated memory from available memory. (sandyr via tucu) (Revision 1480809) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480809 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java Fair scheduler metrics should subtract allocated memory from available memory - Key: YARN-655 URL: https://issues.apache.org/jira/browse/YARN-655 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.0.5-beta Attachments: YARN-655.patch In the scheduler web UI, cluster metrics reports that the Memory Total goes up when an application is allocated resources. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
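The fix committed above can be sketched as follows: available memory becomes a value derived from total minus allocated, rather than an independently incremented counter, so the web UI's Memory Total no longer grows when containers are granted. This is a minimal illustrative sketch; the class and method names are not the actual QueueMetrics API.

```java
// Sketch of the YARN-655 invariant (illustrative, not the real QueueMetrics):
// "available" is derived, never incremented on its own.
class QueueMetricsSketch {
    private final long totalMB;
    private long allocatedMB;

    QueueMetricsSketch(long totalMB) {
        this.totalMB = totalMB;
    }

    void allocate(long mb) { allocatedMB += mb; }
    void release(long mb)  { allocatedMB -= mb; }

    // Available memory = total - allocated, so totals stay constant
    // as containers come and go.
    long availableMB() { return totalMB - allocatedMB; }
}
```

With this shape, allocating resources moves memory from available to allocated without ever changing the total.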
[jira] [Commented] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654360#comment-13654360 ] Hudson commented on YARN-507: - Integrated in Hadoop-Yarn-trunk #205 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/205/]) YARN-507. Add interface visibility and stability annotations to FS interfaces/classes. (kkambatl via tucu) (Revision 1480799) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480799 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Fix For: 2.0.5-beta Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
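The annotations the patch adds look roughly like this. The real annotations live in org.apache.hadoop.classification; minimal stand-ins are declared here only so the sketch compiles standalone, and FSLeafQueueSketch is an illustrative name.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins for org.apache.hadoop.classification.InterfaceAudience.Private and
// InterfaceStability.Unstable, declared here only so the sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@interface Private {}

@Retention(RetentionPolicy.RUNTIME)
@interface Unstable {}

// The patch tags FairScheduler internals along these lines, marking them as
// non-public, still-evolving API:
@Private
@Unstable
class FSLeafQueueSketch {
    // internal scheduler state would live here
}
```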
[jira] [Commented] (YARN-598) Add virtual cores to queue metrics
[ https://issues.apache.org/jira/browse/YARN-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654438#comment-13654438 ] Hudson commented on YARN-598: - Integrated in Hadoop-Hdfs-trunk #1394 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1394/]) YARN-598. Add virtual cores to queue metrics. (sandyr via tucu) (Revision 1480816) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480816 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java Add virtual cores to queue metrics -- Key: YARN-598 URL: https://issues.apache.org/jira/browse/YARN-598 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.0.5-beta Attachments: YARN-598-1.patch, YARN-598.patch QueueMetrics includes allocatedMB, availableMB, pendingMB, reservedMB. It should have equivalents for CPU. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
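The symmetry the JIRA asks for can be sketched like this: each memory counter in QueueMetrics gains a virtual-core twin that is updated in lockstep. Field names mirror the JIRA text; the class itself is illustrative, not the real QueueMetrics.

```java
// Sketch of YARN-598: CPU equivalents for the existing memory metrics
// (illustrative class, not the actual QueueMetrics implementation).
class QueueMetricsCpuSketch {
    long allocatedMB, availableMB, pendingMB, reservedMB;                  // existing
    long allocatedVCores, availableVCores, pendingVCores, reservedVCores;  // CPU twins

    void setAvailable(long mb, long vcores) {
        availableMB = mb;
        availableVCores = vcores;
    }

    // Memory and vcores are always moved together.
    void allocate(long mb, long vcores) {
        allocatedMB += mb;          availableMB -= mb;
        allocatedVCores += vcores;  availableVCores -= vcores;
    }
}
```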
[jira] [Commented] (YARN-568) FairScheduler: support for work-preserving preemption
[ https://issues.apache.org/jira/browse/YARN-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654437#comment-13654437 ] Hudson commented on YARN-568: - Integrated in Hadoop-Hdfs-trunk #1394 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1394/]) YARN-568. Add support for work preserving preemption to the FairScheduler. Contributed by Carlo Curino and Sandy Ryza (Revision 1480778) Result = FAILURE cdouglas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480778 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FairScheduler: support for work-preserving preemption -- Key: YARN-568 URL: https://issues.apache.org/jira/browse/YARN-568 Project: Hadoop YARN Issue Type: Improvement Components: 
scheduler Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 2.0.5-beta Attachments: YARN-568-1.patch, YARN-568-2.patch, YARN-568-2.patch, YARN-568.patch, YARN-568.patch In the attached patch, we modified the FairScheduler to replace its preemption-by-killing with a work-preserving version of preemption (followed by killing if the AMs do not respond quickly enough). This should allow running the preemption check more often while killing less often (proper tuning to be investigated). Depends on YARN-567 and YARN-45; related to YARN-569.
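The two-phase scheme the patch summary describes can be sketched as follows: containers are first marked for preemption and the AM is warned; only containers whose warning has outlived the grace period are actually killed. All names and bookkeeping here are illustrative, not the real FairScheduler code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of work-preserving preemption (YARN-568 style): warn first, kill late.
class PreemptionSketch {
    private final long gracePeriodMs;
    private final Map<String, Long> warnedAt = new HashMap<>(); // containerId -> warn time

    PreemptionSketch(long gracePeriodMs) {
        this.gracePeriodMs = gracePeriodMs;
    }

    /** Phase 1: warn the AM. Returns true the first time a container is warned. */
    boolean warn(String containerId, long now) {
        return warnedAt.putIfAbsent(containerId, now) == null;
    }

    /** Phase 2: kill only containers whose grace period has expired. */
    List<String> containersToKill(long now) {
        List<String> doomed = new ArrayList<>();
        for (Map.Entry<String, Long> e : warnedAt.entrySet()) {
            if (now - e.getValue() >= gracePeriodMs) {
                doomed.add(e.getKey());
            }
        }
        return doomed;
    }
}
```

Because killing lags warning by a grace period, the preemption check can run frequently while AMs that release resources in time are never killed.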
[jira] [Commented] (YARN-655) Fair scheduler metrics should subtract allocated memory from available memory
[ https://issues.apache.org/jira/browse/YARN-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654441#comment-13654441 ] Hudson commented on YARN-655: - Integrated in Hadoop-Hdfs-trunk #1394 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1394/]) YARN-655. Fair scheduler metrics should subtract allocated memory from available memory. (sandyr via tucu) (Revision 1480809) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480809 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java Fair scheduler metrics should subtract allocated memory from available memory - Key: YARN-655 URL: https://issues.apache.org/jira/browse/YARN-655 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.0.5-beta Attachments: YARN-655.patch In the scheduler web UI, cluster metrics reports that the Memory Total goes up when an application is allocated resources. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-637) FS: maxAssign is not honored
[ https://issues.apache.org/jira/browse/YARN-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654446#comment-13654446 ] Hudson commented on YARN-637: - Integrated in Hadoop-Hdfs-trunk #1394 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1394/]) YARN-637. FS: maxAssign is not honored. (kkambatl via tucu) (Revision 1480802) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480802 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FS: maxAssign is not honored Key: YARN-637 URL: https://issues.apache.org/jira/browse/YARN-637 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Fix For: 2.0.5-beta Attachments: yarn-637.patch maxAssign limits the number of containers that can be assigned in a single heartbeat. Currently, FS doesn't keep track of number of assigned containers to check this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
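The behavior the fix enforces can be sketched as a per-heartbeat cap on the assignment loop. This is a sketch, assuming a non-positive maxAssign means unlimited; it is not the actual FairScheduler code.

```java
// Sketch of honoring maxAssign (YARN-637): count assignments per heartbeat
// and stop once the cap is reached. Illustrative, not FairScheduler code.
class MaxAssignSketch {
    static int assignContainers(int pendingRequests, int maxAssign) {
        int assigned = 0;
        // A non-positive maxAssign is treated as "no limit" in this sketch.
        while (pendingRequests > 0 && (maxAssign <= 0 || assigned < maxAssign)) {
            // ...attempt one container assignment on this node...
            pendingRequests--;
            assigned++;
        }
        return assigned;
    }
}
```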
[jira] [Commented] (YARN-568) FairScheduler: support for work-preserving preemption
[ https://issues.apache.org/jira/browse/YARN-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654481#comment-13654481 ] Hudson commented on YARN-568: - Integrated in Hadoop-Mapreduce-trunk #1421 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1421/]) YARN-568. Add support for work preserving preemption to the FairScheduler. Contributed by Carlo Curino and Sandy Ryza (Revision 1480778) Result = FAILURE cdouglas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480778 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FairScheduler: support for work-preserving preemption -- Key: YARN-568 URL: https://issues.apache.org/jira/browse/YARN-568 Project: Hadoop YARN Issue Type: Improvement 
Components: scheduler Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 2.0.5-beta Attachments: YARN-568-1.patch, YARN-568-2.patch, YARN-568-2.patch, YARN-568.patch, YARN-568.patch In the attached patch, we modified the FairScheduler to substitute its preemption-by-killling with a work-preserving version of preemption (followed by killing if the AMs do not respond quickly enough). This should allows to run preemption checking more often, but kill less often (proper tuning to be investigated). Depends on YARN-567 and YARN-45, is related to YARN-569. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-598) Add virtual cores to queue metrics
[ https://issues.apache.org/jira/browse/YARN-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654482#comment-13654482 ] Hudson commented on YARN-598: - Integrated in Hadoop-Mapreduce-trunk #1421 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1421/]) YARN-598. Add virtual cores to queue metrics. (sandyr via tucu) (Revision 1480816) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480816 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java Add virtual cores to queue metrics -- Key: YARN-598 URL: https://issues.apache.org/jira/browse/YARN-598 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.0.5-beta Attachments: YARN-598-1.patch, YARN-598.patch QueueMetrics includes allocatedMB, availableMB, pendingMB, reservedMB. It should have equivalents for CPU. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-655) Fair scheduler metrics should subtract allocated memory from available memory
[ https://issues.apache.org/jira/browse/YARN-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654485#comment-13654485 ] Hudson commented on YARN-655: - Integrated in Hadoop-Mapreduce-trunk #1421 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1421/]) YARN-655. Fair scheduler metrics should subtract allocated memory from available memory. (sandyr via tucu) (Revision 1480809) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480809 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java Fair scheduler metrics should subtract allocated memory from available memory - Key: YARN-655 URL: https://issues.apache.org/jira/browse/YARN-655 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.0.5-beta Attachments: YARN-655.patch In the scheduler web UI, cluster metrics reports that the Memory Total goes up when an application is allocated resources. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654489#comment-13654489 ] Hudson commented on YARN-507: - Integrated in Hadoop-Mapreduce-trunk #1421 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1421/]) YARN-507. Add interface visibility and stability annotations to FS interfaces/classes. (kkambatl via tucu) (Revision 1480799) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480799 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Fix For: 2.0.5-beta Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-637) FS: maxAssign is not honored
[ https://issues.apache.org/jira/browse/YARN-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654490#comment-13654490 ] Hudson commented on YARN-637: - Integrated in Hadoop-Mapreduce-trunk #1421 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1421/]) YARN-637. FS: maxAssign is not honored. (kkambatl via tucu) (Revision 1480802) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1480802 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java FS: maxAssign is not honored Key: YARN-637 URL: https://issues.apache.org/jira/browse/YARN-637 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Fix For: 2.0.5-beta Attachments: yarn-637.patch maxAssign limits the number of containers that can be assigned in a single heartbeat. Currently, FS doesn't keep track of number of assigned containers to check this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen reassigned YARN-572: Assignee: Zhijie Shen (was: Hitesh Shah) Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654605#comment-13654605 ] Alejandro Abdelnur commented on YARN-392: - Agree with Sandy: the low-level API contract requires a deep understanding of how things work, and the AMRMClient layer already hides/handles much of that complexity, so this particular feature should be handled there in a similar way. [~sandyr] please open a JIRA for it. Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392.patch Currently it's not possible to specify scheduling requests for specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app.
[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654655#comment-13654655 ] Bikas Saha commented on YARN-392: - Of course, changing the AMRMClient to support this would be a logical extension. Does that mean the server can afford not to check for inconsistent requests that will result in a bad state for the server and/or incorrect results for the users? Perhaps only if AMRMClient is the only entity that is ever going to talk to the server. Is that the case? Not doing checks by assuming that pre-conditions will hold is a slippery path, IMO. Currently, when ApplicationMasterService calls scheduler.allocate, the scheduler can throw an exception about invalid allocations, which gets returned to the client. So it's fairly easy to solve this in YARN-394. Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392.patch Currently it's not possible to specify scheduling requests for specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app.
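The kind of server-side check being argued for here can be sketched as a validator that rejects inconsistent requests instead of trusting that AMRMClient sent a coherent set. The specific rule shown (all requests at the same priority must agree on whether locality relaxation is enabled) is illustrative, not the exact YARN-392 contract.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a server-side consistency check on locality-relaxation flags
// (illustrative rule and names, not the actual scheduler validation code).
class LocalityValidationSketch {
    static void validate(int[] priorities, boolean[] relaxLocality) {
        Map<Integer, Boolean> seen = new HashMap<>();
        for (int i = 0; i < priorities.length; i++) {
            Boolean prev = seen.put(priorities[i], relaxLocality[i]);
            if (prev != null && prev != relaxLocality[i]) {
                // Reject instead of silently producing bad scheduler state.
                throw new IllegalArgumentException(
                    "inconsistent relaxLocality at priority " + priorities[i]);
            }
        }
    }
}
```

The point of the comment is exactly this: the exception surfaces back to the client through the allocate call, so the server never has to assume the pre-condition held.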
[jira] [Updated] (YARN-634) Make YarnRemoteException not backed by PB and introduce a SerializedException
[ https://issues.apache.org/jira/browse/YARN-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-634: - Summary: Make YarnRemoteException not backed by PB and introduce a SerializedException (was: Introduce a SerializedException) Make YarnRemoteException not backed by PB and introduce a SerializedException - Key: YARN-634 URL: https://issues.apache.org/jira/browse/YARN-634 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: YARN-634.patch.2, YARN-634.txt LocalizationProtocol sends an exception over the wire. This currently uses YarnRemoteException. Post YARN-627, this needs to be changed and a new serialized exception is required. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654677#comment-13654677 ] Hitesh Shah commented on YARN-660: -- Comments: - DistributedShell#ApplicationMaster#onError should be handled. - Cookie should be a final var and part of StoredContainerRequest's constructor - Don't ContainerRequest and StoredContainerReq require hashCode() and equals() implementations? Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
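The hashCode()/equals() question above is about value-based matching of requests. A minimal, self-contained sketch of what such implementations could look like — the class name and fields here are illustrative stand-ins, not the actual AMRMClient types:

```java
import java.util.Objects;

// Illustrative stand-in for a container request; not the real YARN class.
class ContainerRequestSketch {
    private final int priority;
    private final int memoryMb;
    private final String location;

    ContainerRequestSketch(int priority, int memoryMb, String location) {
        this.priority = priority;
        this.memoryMb = memoryMb;
        this.location = location;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ContainerRequestSketch)) return false;
        ContainerRequestSketch other = (ContainerRequestSketch) o;
        return priority == other.priority
            && memoryMb == other.memoryMb
            && Objects.equals(location, other.location);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals() so requests behave correctly
        // as keys in hash-based collections.
        return Objects.hash(priority, memoryMb, location);
    }

    public static void main(String[] args) {
        ContainerRequestSketch a = new ContainerRequestSketch(1, 1024, "rack1");
        ContainerRequestSketch b = new ContainerRequestSketch(1, 1024, "rack1");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // prints "true"
    }
}
```

Bikas's reply below this message frames the alternative: keeping the default identity-based implementations means only the very same object instance matches, which is what he prefers for stored requests passed around for efficiency.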
[jira] [Commented] (YARN-638) Restore RMDelegationTokens after RM Restart
[ https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654681#comment-13654681 ] Bikas Saha commented on YARN-638: - Should HDFS code change to use this (and basically do what it was doing earlier)? Otherwise the code will still be obfuscated.
{code}
+  protected void removeMasterKey(DelegationKey key) {
+    return;
+  }
{code}
Did not quite get what this is doing?
{code}
 it.remove();
+if (!e.getValue().equals(currentKey))
+  removeMasterKey(e.getValue());
 }
{code}
Recover secret manager before apps? Apps should be able to assume secret manager has valid state.
{code}
 rmAppManager.recover(state);
+
+// recover RMDelegationTokenSecretManager
+rmDTSecretManager.recover(state);
{code}
Can we put all secret manager state in one rmDTState object (like appState) and then have masterKeyState and delegationTokenState inside it?
{code}
+// DTIdentifier -> renewDate
+Map<RMDelegationTokenIdentifier, Long> rmDTState =
+    new HashMap<RMDelegationTokenIdentifier, Long>();
+
+Set<DelegationKey> rmDTMasterKeyState =
+    new HashSet<DelegationKey>(
{code}
Missing javadoc for new APIs in RMStateStore. loadState()/store secret key impl is missing from the filesystem store. Why store a redundant rmStateStore when it's already available from the context?
{code}
+  protected final RMContext rmContext;
+  private RMStateStore rmStateStore;
{code}
Will this just end the current thread or actually stop the RM?
{code}
+} catch (Exception e) {
+  throw new RuntimeException("Error in removing master key", e);
+}
{code}
Don't think we should be implementing these methods. There should be an explicit store.
{code}
+  /**
+   * remove all expired master keys except current key and store the new key
+   */
+  @Override
+  protected void logUpdateMasterKey(DelegationKey newKey) {
+    try {
+      rmStateStore.storeMasterKey(newKey);
+    } catch (Exception e) {
+      throw new RuntimeException("Error in storing master key", e);
+    }
+  }
+
+  @Override
+  protected void logExpireToken(RMDelegationTokenIdentifier ident)
+      throws IOException {
+    try {
+      rmStateStore.removeRMDelegationToken(ident);
+    } catch (Exception e) {
+      throw new IOException("Error in removing RMDelegationToken", e);
+    }
+  }
{code}
Can just pass in a mock context that returns the test state store:
{code}
+  // FOR TEST
+  public void setRMStateStore(RMStateStore rmStore) {
+    this.rmStateStore = rmStore;
+  }
{code}
Is the test ensuring that the recovered tokens can be renewed by the client? We need to carefully check the threads on which store is being called. We cannot block the AsyncDispatcher thread. Restore RMDelegationTokens after RM Restart --- Key: YARN-638 URL: https://issues.apache.org/jira/browse/YARN-638 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Attachments: YARN-638.1.patch, YARN-638.2.patch, YARN-638.3.patch, YARN-638.4.patch, YARN-638.5.patch This is missed in YARN-581. After RM restart, RMDelegationTokens need to be added both in DelegationTokenRenewer (addressed in YARN-581) and in delegationTokenSecretManager -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
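Bikas's suggestion above — putting all secret manager state into one object (like appState) with masterKeyState and delegationTokenState inside it — could be sketched roughly as follows. This is only an illustrative sketch, not the actual patch: the holder class name is hypothetical, and String/Integer are used as placeholders for RMDelegationTokenIdentifier and DelegationKey so the example is self-contained.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical holder grouping delegation-token state in one object,
// analogous to how appState groups per-application state. String stands
// in for RMDelegationTokenIdentifier and Integer for a DelegationKey id.
class RMDTSecretManagerStateSketch {
    // token identifier -> renew date
    private final Map<String, Long> delegationTokenState =
        new HashMap<String, Long>();
    // master keys known to the secret manager
    private final Set<Integer> masterKeyState = new HashSet<Integer>();

    void storeToken(String ident, long renewDate) {
        delegationTokenState.put(ident, renewDate);
    }

    void storeMasterKey(int keyId) {
        masterKeyState.add(keyId);
    }

    Map<String, Long> getTokenState() {
        return delegationTokenState;
    }

    Set<Integer> getMasterKeyState() {
        return masterKeyState;
    }
}
```

Recovery could then hand this single object to the secret manager before apps are recovered, which is the ordering the review asks for: applications can assume the secret manager already holds valid state.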
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654683#comment-13654683 ] Bikas Saha commented on YARN-660: - As of now there has been no need to override the default hashcode and equals impls. For stored containers I think I prefer an object comparison because I would like the same objects to be passed around for efficiency. Making the other changes. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-660: Attachment: YARN-660.2.patch New patch with comments addressed. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654708#comment-13654708 ] Sandy Ryza commented on YARN-392: - I think the server should check weird requests and state. My point was mainly that we haven't been doing these checks up to this point, so I didn't think we should be blocked on it. Especially given that, in this case, the consequences of invalid requests are borne by the apps submitting them. There will be no other bad state inside the RM. That said, there is some sanity checking that we can do at request time, such as making sure that if a rack has disableAllocation turned on, the number of containers requested on nodes under it must sum to at least the rack's own number of containers. I can add this into the patch. Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392.patch Currently it's not possible to specify scheduling requests for specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654709#comment-13654709 ] Hadoop QA commented on YARN-590: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582655/YARN-590-trunk-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/909//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/909//console This message is automatically generated. 
Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Attachments: YARN-590-trunk-1.patch, YARN-590-trunk-2.patch We should log such a message in the NM itself. This helps in debugging issues on the NM directly instead of distributed debugging between the RM and the NM when such an action is received from the RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654713#comment-13654713 ] Hadoop QA commented on YARN-660: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582656/YARN-660.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/910//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/910//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/910//console This message is automatically generated. 
Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-663) Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException
[ https://issues.apache.org/jira/browse/YARN-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654789#comment-13654789 ] Vinod Kumar Vavilapalli commented on YARN-663: -- Looks good, can you please apply it on branch-2 and run tests and report? Tx. Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException Key: YARN-663 URL: https://issues.apache.org/jira/browse/YARN-663 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-663.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654806#comment-13654806 ] Bikas Saha commented on YARN-392: - bq. My point was mainly that we haven't been doing these checks up to this point, so I didn't think we should be blocked on it. Would be great if you could help enumerate cases you know of. We can add them to YARN-394 for tracking. Recently, we started throwing InvalidResourceRequest in the RM for requests that are invalid (more than max resource allowed etc) in YARN-193. So that takes care of one of the known cases where checks were not being performed. The other case is when a Resource request is valid but later becomes invalid, mainly related to nodes being lost. E.g. when a high memory machine is lost. Or when specific resources were requested (this jira) and they become unavailable later on. These cases motivated YARN-394 and are described therein. So we are tracking towards sanity checking IMO. In YARN-142 etc we are changing protocols so that such exceptions are visible to users and they can act on them programmatically. bq. Capacity scheduler changes are targeted for YARN-398. The title of that jira says white listing and black listing of nodes. So you may want to check with [~acmurthy] if the intent of that jira matches what you think it is. Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392.patch Currently it's not possible to specify scheduling requests for specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-663) Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException
[ https://issues.apache.org/jira/browse/YARN-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654849#comment-13654849 ] Xuan Gong commented on YARN-663: All test cases are passing on branch-2 Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException Key: YARN-663 URL: https://issues.apache.org/jira/browse/YARN-663 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-663.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-664) throw InvalidRequestException for requests with different capabilities at the same priority
Sandy Ryza created YARN-664: --- Summary: throw InvalidRequestException for requests with different capabilities at the same priority Key: YARN-664 URL: https://issues.apache.org/jira/browse/YARN-664 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Nothing stops an application from submitting a request with priority=1, location=*, memory=1024 and a request with priority=1, location=rack1, memory=1024. However, this does not make sense under the request model and can cause bad things to happen in the scheduler. It should be possible to detect this at AMRM heartbeat time and throw an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
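The check YARN-664 describes — detecting at AMRM heartbeat time that two requests at the same priority carry different capabilities — could be sketched as follows. This is a simplified, self-contained illustration, not the actual scheduler code: the validator class name is hypothetical, capability is reduced to memory only, and IllegalArgumentException stands in for the proposed InvalidRequestException.

```java
import java.util.HashMap;
import java.util.Map;

// Toy validator: rejects two requests at the same priority whose
// capabilities (here just memory) differ, per the issue description.
class RequestValidatorSketch {
    private final Map<Integer, Integer> memoryByPriority =
        new HashMap<Integer, Integer>();

    void accept(int priority, int memoryMb) {
        Integer existing = memoryByPriority.get(priority);
        if (existing != null && existing != memoryMb) {
            // Stand-in for the proposed InvalidRequestException.
            throw new IllegalArgumentException(
                "Conflicting capability at priority " + priority
                + ": " + existing + "MB vs " + memoryMb + "MB");
        }
        memoryByPriority.put(priority, memoryMb);
    }
}
```

With this shape, the priority=1/location=*/memory=1024 and priority=1/location=rack1/memory=1024 pair from the description is fine (same capability), while a second request at priority 1 with a different memory size is rejected before it can put the scheduler in a bad state.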
[jira] [Updated] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-590: --- Attachment: YARN-590-trunk-3.patch Thanks Vinod for the review. Updated the patch to address all your comments. Thanks, Mayank Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Attachments: YARN-590-trunk-1.patch, YARN-590-trunk-2.patch, YARN-590-trunk-3.patch We should log such a message in the NM itself. This helps in debugging issues on the NM directly instead of distributed debugging between the RM and the NM when such an action is received from the RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-634) Make YarnRemoteException not backed by PB and introduce a SerializedException
[ https://issues.apache.org/jira/browse/YARN-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654875#comment-13654875 ] Vinod Kumar Vavilapalli commented on YARN-634: -- +1, this looks good. Checking it in.. Make YarnRemoteException not backed by PB and introduce a SerializedException - Key: YARN-634 URL: https://issues.apache.org/jira/browse/YARN-634 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: YARN-634.patch.2, YARN-634.txt LocalizationProtocol sends an exception over the wire. This currently uses YarnRemoteException. Post YARN-627, this needs to be changed and a new serialized exception is required. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654896#comment-13654896 ] Hadoop QA commented on YARN-590: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582697/YARN-590-trunk-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/911//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/911//console This message is automatically generated. 
Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Attachments: YARN-590-trunk-1.patch, YARN-590-trunk-2.patch, YARN-590-trunk-3.patch We should log such a message in the NM itself. This helps in debugging issues on the NM directly instead of distributed debugging between the RM and the NM when such an action is received from the RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-634) Make YarnRemoteException not backed by PB and introduce a SerializedException
[ https://issues.apache.org/jira/browse/YARN-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654901#comment-13654901 ] Hudson commented on YARN-634: - Integrated in Hadoop-trunk-Commit #3741 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3741/]) YARN-634. Modified YarnRemoteException to be not backed by PB and introduced a separate SerializedException record. Contributed by Siddharth Seth. MAPREDUCE-5239. Updated MR App to reflect YarnRemoteException changes after YARN-634. Contributed by Siddharth Seth. (Revision 1481205) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1481205 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientRedirect.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/YarnRemoteException.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/impl * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/YarnRemoteExceptionFactoryPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factory/providers/YarnRemoteExceptionFactoryProvider.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/SerializedException.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/SerializedExceptionPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/utils/YarnServerBuilderUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/protocolrecords/LocalResourceStatus.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/protocolrecords/impl/pb/LocalResourceStatusPBImpl.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerResourceFailedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java *
[jira] [Updated] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-660: Attachment: YARN-660.3.patch Fixing findbugs. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-663) Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException
[ https://issues.apache.org/jira/browse/YARN-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654914#comment-13654914 ] Hudson commented on YARN-663: - Integrated in Hadoop-trunk-Commit #3742 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3742/]) YARN-663. Changed ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException. Contributed by Xuan Gong. (Revision 1481215) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1481215 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ResourceTracker.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/ResourceTrackerPBClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/ResourceTrackerPBServiceImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestRPCFactories.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/LocalizationProtocol.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/impl/pb/client/LocalizationProtocolPBClientImpl.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/impl/pb/service/LocalizationProtocolPBServiceImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/LocalRMInterface.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/MockNodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException Key: YARN-663 URL: https://issues.apache.org/jira/browse/YARN-663 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Fix For: 2.0.5-beta Attachments: YARN-663.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements
[ https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654919#comment-13654919 ] Daryn Sharp commented on YARN-617: -- In addition to Vinod: * {{NodeManager}}: Minor, does it really need to log that security is on? {{ContainerManagerImpl}} * {{getContainerTokenIdentifier}}: Why is it conditionalized to call {{selectContainerTokenIdentifier}} on either the remote ugi, or the container? It should be one or the other. Even if security is off, a client will pass a token, and a server will accept it if it has a secret manager. Until YARN-613 is complete, it should always use the ugi token; otherwise, with security on, I can auth with one token but then put another in the launch context. * {{selectContainerTokenIdentifier(Container)}}: you can simply use {{Token#decodeIdentifier()}} * Minor, "ContainerTokenIdentifier cannot be null! Null found for $containerID" seems a bit redundant and confusing to the casual user. Perhaps a more succinct "No ContainerToken found for $containerID". * {{authorizeRequest}} ** {{launchContext.getUser().equals(tokenId.getApplicationSubmitter())}} - this is now a worthless check. The application submitter in the token is the trusted/authoritative value. The launch context user used to be used in lieu of a token, so now with tokens always used, all it does is catch a buggy AM. A malicious AM will just set the context user to match the token. I'd remove the check entirely, if not even remove the user from the launch context. ** {{resource == null}} is added, when resources should be removed from the launch context. The check was there to ensure an AM in a secure cluster couldn't lie in the launch context. The premise of this jira is to not let AMs in an insecure cluster lie, so the value in the context is moot. ** Overall, seems this method should be reduced to verifying the validity of the token, and not that it matches the context values. 
* In general seems unnecessary for every caller of {{authorizerequest}} to have duplicate logic to extract and pass the token to the method. {{authorizerequest}} should locate the token itself. * I'd revert the order of the logic back to authorizing requests before checking if the container exists. By flipping them, I can now probe nodes for locations of containers. I didn't review the test cases due to time constraints. I'll look at them when the main issues Vinod and I have cited are addressed. In unsercure mode, AM can fake resource requirements - Key: YARN-617 URL: https://issues.apache.org/jira/browse/YARN-617 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Priority: Minor Attachments: YARN-617.20130501.1.patch, YARN-617.20130501.patch, YARN-617.20130502.patch, YARN-617-20130507.patch, YARN-617.20130508.patch Without security, it is impossible to completely avoid AMs faking resources. We can at the least make it as difficult as possible by using the same container tokens and the RM-NM shared key mechanism over unauthenticated RM-NM channel. In the minimum, this will avoid accidental bugs in AMs in unsecure mode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
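Daryn's last point, that authorization must come before the existence check, can be sketched with stand-in types (not the real YARN classes) to show why the flipped order lets a caller probe a node for containers:

```java
// Hedged sketch with hypothetical stand-in types: authorize the caller's
// token FIRST, and only then check container existence, so an unauthorized
// caller learns nothing about which containers live on this node.
import java.util.HashMap;
import java.util.Map;

public class AuthorizeOrder {
    static final Map<String, String> containers = new HashMap<>();

    // Returns an error string, or null on success.
    static String stopContainer(String callerToken, String containerId) {
        if (!"valid-token".equals(callerToken)) {
            return "Unauthorized request";               // checked first
        }
        if (!containers.containsKey(containerId)) {
            return "Unknown container " + containerId;   // only revealed to authorized callers
        }
        containers.remove(containerId);
        return null;
    }

    public static void main(String[] args) {
        containers.put("c1", "running");
        System.out.println(stopContainer("bad-token", "c1"));   // Unauthorized request
        System.out.println(stopContainer("valid-token", "c1")); // null (success)
    }
}
```

With the checks reversed, the "Unknown container" response would leak container placement to any unauthenticated caller.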
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654923#comment-13654923 ] Hadoop QA commented on YARN-660: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582705/YARN-660.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/912//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/912//console This message is automatically generated. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654924#comment-13654924 ] Vinod Kumar Vavilapalli commented on YARN-590: -- The latest patch looks good, +1. Checking it in. Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Attachments: YARN-590-trunk-1.patch, YARN-590-trunk-2.patch, YARN-590-trunk-3.patch We should log such a message in the NM itself. It helps in debugging issues on the NM directly, instead of distributed debugging between RM and NM, when such an action is received from the RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654944#comment-13654944 ] Hitesh Shah commented on YARN-660: -- +1. Looks good to me. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654946#comment-13654946 ] Zhijie Shen commented on YARN-572: -- There are three fields that are duplicated in Container and ContainerTokenIdentifier: 1. ContainerId 2. NodeId 3. Resource Though the fields are duplicated, it seems not good to remove them from Container. 1 and 2 are used to compare Container objects, and the getters of all three are referenced in multiple places, tens of times. Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654952#comment-13654952 ] Vinod Kumar Vavilapalli commented on YARN-660: -- I'd like to have a quick look before commit. Looking right now. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654958#comment-13654958 ] Zhijie Shen commented on YARN-572: -- And in some cases, the three fields are set while the container token is null in a Container object, e.g., MRAppBenchmark, LocalContainerAllocator. Therefore, if the three fields are removed from Container, we need to construct a ContainerTokenIdentifier object just to hold the three values in these cases. Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654973#comment-13654973 ] Alejandro Abdelnur commented on YARN-660: -- On the {{ResourceRequestInfo}}'s {{HashSet<T> containerRequests}}, wouldn't it be more intuitive for AMRMClient users if they get the matching requests in the order they were submitted? This could easily be done using a {{LinkedHashSet}}. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
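The {{LinkedHashSet}} suggestion works because, unlike {{HashSet}}, it iterates in insertion order. A minimal illustration (hypothetical request IDs, not the real AMRMClient types):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MatchOrder {
    // Adds the requests to the given set and returns them in the set's
    // iteration order, so the caller can see what each implementation yields.
    static List<String> drain(Set<String> set, String... reqs) {
        Collections.addAll(set, reqs);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // LinkedHashSet preserves submission order; HashSet makes no such guarantee.
        System.out.println(drain(new LinkedHashSet<>(), "req-3", "req-1", "req-2"));
        // -> [req-3, req-1, req-2]
    }
}
```

Both sets give O(1) add/remove/contains; the linked variant only adds a doubly-linked list through the entries, so the ordering comes essentially for free.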
[jira] [Commented] (YARN-664) throw InvalidRequestException for requests with different capabilities at the same priority
[ https://issues.apache.org/jira/browse/YARN-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654981#comment-13654981 ] Hitesh Shah commented on YARN-664: -- Actually, isn't the example a valid request? throw InvalidRequestException for requests with different capabilities at the same priority --- Key: YARN-664 URL: https://issues.apache.org/jira/browse/YARN-664 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Nothing stops an application from submitting a request with priority=1, location=*, memory=1024 and a request with priority=1, location=rack1, memory=1024. However, this does not make sense under the request model and can cause bad things to happen in the scheduler. It should be possible to detect this at AMRM heartbeat time and throw an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-664) throw InvalidRequestException for requests with different capabilities at the same priority
[ https://issues.apache.org/jira/browse/YARN-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13654986#comment-13654986 ] Hitesh Shah commented on YARN-664: -- Original comment still holds for an AM being able to request 1 container of 1 GB and 1 container of 4 GB both at priority 1. throw InvalidRequestException for requests with different capabilities at the same priority --- Key: YARN-664 URL: https://issues.apache.org/jira/browse/YARN-664 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Nothing stops an application from submitting a request with priority=1, location=*, memory=1024 and a request with priority=1, location=rack1, memory=1024. However, this does not make sense under the request model and can cause bad things to happen in the scheduler. It should be possible to detect this at AMRM heartbeat time and throw an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
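The heartbeat-time check the YARN-664 description asks for could look roughly like the following sketch (hypothetical validator, not the actual patch; capability is reduced to memory for brevity):

```java
import java.util.HashMap;
import java.util.Map;

public class RequestValidator {
    // priority -> capability (memory in MB) already registered at that priority
    private final Map<Integer, Integer> capabilityByPriority = new HashMap<>();

    // Rejects a request whose capability conflicts with an earlier request
    // at the same priority, as the jira proposes to do at AMRM heartbeat time.
    void validate(int priority, int memoryMb) {
        Integer existing = capabilityByPriority.putIfAbsent(priority, memoryMb);
        if (existing != null && existing != memoryMb) {
            throw new IllegalArgumentException("Priority " + priority
                + " already maps to capability " + existing + " MB, got " + memoryMb);
        }
    }

    public static void main(String[] args) {
        RequestValidator v = new RequestValidator();
        v.validate(1, 1024); // priority=1, memory=1024 — accepted
        v.validate(1, 1024); // same capability at the same priority — accepted
        v.validate(1, 4096); // conflicting capability — throws
    }
}
```

Note this only flags differing *capabilities*; per Hitesh's first comment, two requests at the same priority with different *locations* (e.g. * vs rack1) are a normal part of the request model and must not be rejected.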
[jira] [Updated] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-392: Attachment: YARN-392-3.patch Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-3.patch, YARN-392.patch Currently its not possible to specify scheduling requests for specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655012#comment-13655012 ] Vinod Kumar Vavilapalli commented on YARN-660: -- Quick comment: h4. Cookies: Seems like the cookies aren't used by the library itself. Is that true? If so, why does the library need to provide StoredContainerRequest, can't the users implement one and pass it on? h4. Generics/typing: Adding in this type information for every AMRMClient is an unnecessary burden for most cases IMO. Users can skip explicit typing, but the compiler will warn unnecessarily. Can't this be done without the explicit typing? I can see that the new API getMatchingRequests is useful, +1 for that. What if we just return a Collection<ContainerRequest> all the time, and get clients to do explicit conversion if they care about adding in more information. Similarly, add and remove APIs already take in a ContainerRequest, you can just continue to pass sub-types of ContainerRequest when you want to. I quickly did this on your latest patch, removing all typing in the library, it works. Have to dig deeper into the client libraries, haven't looked at them at all. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
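Vinod's suggestion, an untyped library API where the base type flows through and callers downcast to their own subtype, can be sketched as follows (simplified stand-in classes, not the real AMRMClient):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class AmrmSketch {
    static class ContainerRequest {
        final int memoryMb;
        ContainerRequest(int memoryMb) { this.memoryMb = memoryMb; }
    }
    // Hypothetical subtype carrying caller-defined "cookie" state; the
    // library never inspects it, so it needs no type parameter for it.
    static class StoredContainerRequest extends ContainerRequest {
        final String cookie;
        StoredContainerRequest(int memoryMb, String cookie) {
            super(memoryMb);
            this.cookie = cookie;
        }
    }

    private final List<ContainerRequest> requests = new ArrayList<>();

    void addContainerRequest(ContainerRequest r) { requests.add(r); }

    // Always returns the base type; callers that stored a subtype cast back.
    Collection<ContainerRequest> getMatchingRequests(int memoryMb) {
        List<ContainerRequest> out = new ArrayList<>();
        for (ContainerRequest r : requests) {
            if (r.memoryMb == memoryMb) out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        AmrmSketch client = new AmrmSketch();
        client.addContainerRequest(new StoredContainerRequest(1024, "task-42"));
        for (ContainerRequest r : client.getMatchingRequests(1024)) {
            // The caller, not the library, knows the concrete subtype.
            System.out.println(((StoredContainerRequest) r).cookie); // task-42
        }
    }
}
```

The trade-off is exactly the one under discussion: the generic version gives callers a checked cast-free API, at the cost of every user carrying a type parameter; the untyped version keeps the common case simple and pushes one cast onto the callers who opted into a subtype.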
[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container
[ https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655014#comment-13655014 ] Daryn Sharp commented on YARN-613: -- bq. That is true in general. And I am not sure how we can even contain such a break-in. I suppose going the way of DataNode to start the server on privileged ports will contain it [1]. If one can get hold of the keytab (owned by the YARN user) I suppose at that point he can launch the container-executor binary too, which will give him root access. So it's all predicated on secure setup to not do stupid things.
Secure ports would help a bit, but it's another pain point to compensate for weakened security. A keytab might be leaked due to weak permissions, or maybe it's not even the keytab in the official path, but a copy a SE left sitting in their home dir. So I might or might not be the yarn user with my stolen keytab. Assuming you are the yarn user, I'm almost positive you can't get root with the container executor - last I looked, it had a hardcoded check to reject root. The main concern I have is that any NM will have the power to forge AM tokens for all other NMs. As the number of nodes in a cluster scales, its vulnerability increases. All I have to do is compromise 1 node out of thousands. I can then forge AM and container tokens, and launch jobs on the thousands of other nodes as any arbitrary user, so I can compromise those hosts too. I steal those users' appTokens from running jobs and now I have access to hdfs and other services. Game over. So here's how I think we can achieve both our goals: a node token. When the RM returns container tokens, it also provides node tokens. The node token is for authentication; the container token authorizes the launch request. Now you can have one AM-NM connection. You can then decide if you want status and stop operations to authenticate and/or authorize via other tokens like AM tokens. If so, pass those tokens in the launch request. 
Now you've explicitly informed the NM of permitted (AM) tokens, instead of giving the NM the power to fabricate other (AM) tokens. Create NM proxy per NM instead of per container --- Key: YARN-613 URL: https://issues.apache.org/jira/browse/YARN-613 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Currently a new NM proxy has to be created per container since the secure authentication is using a containertoken from the container. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
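Daryn's node-token proposal separates two concerns; a toy sketch of the split (purely illustrative token strings, not the real token machinery) might look like:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TwoTokenSketch {
    // Hypothetical NM-side checks: one node token authenticates the AM-NM
    // connection; a separate per-container token authorizes each launch.
    static boolean authenticate(String presented, String expectedNodeToken) {
        return expectedNodeToken.equals(presented);
    }

    static boolean authorizeLaunch(Set<String> issuedContainerTokens, String containerToken) {
        return issuedContainerTokens.contains(containerToken);
    }

    public static void main(String[] args) {
        Set<String> issued = new HashSet<>(Arrays.asList("ct-1", "ct-2"));
        boolean connected = authenticate("nt-nodeA", "nt-nodeA");
        System.out.println(connected && authorizeLaunch(issued, "ct-1")); // true
        System.out.println(connected && authorizeLaunch(issued, "ct-9")); // false
    }
}
```

The point of the split is that a compromised NM's secrets authenticate only that node's connections; they grant no power to mint tokens accepted elsewhere.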
[jira] [Commented] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655018#comment-13655018 ] Hitesh Shah commented on YARN-572: -- Assuming that the ContainerToken is going to be changed into a bytebuffer to hide from the AM, the duplication of fields will be necessary. @Sid, @Vinod, can you confirm on the bytebuffer change? In that case, we can close this jira out. Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655019#comment-13655019 ] Hadoop QA commented on YARN-392: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582732/YARN-392-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/913//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/913//console This message is automatically generated. Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-3.patch, YARN-392.patch Currently its not possible to specify scheduling requests for specific nodes and nowhere else. 
The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-660: Attachment: YARN-660.4.patch New patch uses LinkedHashSet. Good point tucu. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch, YARN-660.4.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655024#comment-13655024 ] Vinod Kumar Vavilapalli commented on YARN-572: -- ContainerToken is in a sense already non-interpretable, see org.apache.hadoop.yarn.api.records.ContainerToken and its super class: the ID is already a ByteBuffer. What we need to do is move ContainerTokenIdentifier off to server-common, so that clients will have absolutely no way of interpreting the ByteBuffer. Arguably, they could still do it, but at that point it clearly isn't supported. Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
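The opacity Vinod describes comes from where the decoder lives, not from the bytes themselves. A simplified sketch (hypothetical field layout, not the real ContainerTokenIdentifier wire format): the server can round-trip fields through a ByteBuffer, while a client that lacks the decoder class sees only opaque bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OpaqueTokenSketch {
    // Server side: serialize identifier fields into a byte array.
    static byte[] encode(String containerId, int memoryMb) {
        byte[] id = containerId.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + id.length + 4);
        buf.putInt(id.length).put(id).putInt(memoryMb);
        return buf.array();
    }

    // Server side only: decode the fields back. If this class lives in
    // server-common, clients never link against it, so the buffer they
    // carry is effectively opaque to them.
    static int decodeMemory(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int len = buf.getInt();
        buf.position(buf.position() + len); // skip the container id bytes
        return buf.getInt();
    }

    public static void main(String[] args) {
        byte[] token = encode("container_01", 2048);
        System.out.println(decodeMemory(token)); // 2048
    }
}
```

As the comment notes, a determined client could still reverse-engineer the layout; moving the identifier class to server-common just makes interpreting it clearly unsupported rather than impossible.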
[jira] [Created] (YARN-665) Support parallel localizations per container
Siddharth Seth created YARN-665: --- Summary: Support parallel localizations per container Key: YARN-665 URL: https://issues.apache.org/jira/browse/YARN-665 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Localization is currently serialized if only a single container per application runs on a node. Depending on the size / number of resources - this is a small performance issue which especially impacts AMs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-666) [Umbrella] Support rolling upgrades in YARN
Siddharth Seth created YARN-666: --- Summary: [Umbrella] Support rolling upgrades in YARN Key: YARN-666 URL: https://issues.apache.org/jira/browse/YARN-666 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN
[ https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655036#comment-13655036 ] Hitesh Shah commented on YARN-666: -- +1 to getting this built out. As they say, the devil is in the details. [Umbrella] Support rolling upgrades in YARN --- Key: YARN-666 URL: https://issues.apache.org/jira/browse/YARN-666 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655037#comment-13655037 ] Hudson commented on YARN-590: - Integrated in Hadoop-trunk-Commit #3743 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3743/]) YARN-590. Added an optional mesage to be returned by ResourceMaanger when RM asks an RM to shutdown/resync etc so that NMs can log this message locally for better debuggability. Contributed by Mayank Bansal. (Revision 1481234) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1481234 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerResponse.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerResponsePBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java Add an optional mesage to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Fix For: 2.0.5-beta Attachments: YARN-590-trunk-1.patch, YARN-590-trunk-2.patch, YARN-590-trunk-3.patch We should log such message in NM itself. Helps in debugging issues on NM directly instead of distributed debugging between RM and NM when such an action is received from RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-666) [Umbrella] Support rolling upgrades in YARN
[ https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated YARN-666: Attachment: YARN_Rolling_Upgrades.pdf Initial writeup on changes that will be required, upgrade scenarios, some notes on compatible changes etc. [Umbrella] Support rolling upgrades in YARN --- Key: YARN-666 URL: https://issues.apache.org/jira/browse/YARN-666 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Attachments: YARN_Rolling_Upgrades.pdf Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
[ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655039#comment-13655039 ] Alejandro Abdelnur commented on YARN-392: - Sandy, patch looks good to me, only NIT is that the ResourceRequest does not have javadocs. Make it possible to schedule to specific nodes without dropping locality Key: YARN-392 URL: https://issues.apache.org/jira/browse/YARN-392 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Sandy Ryza Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392-3.patch, YARN-392.patch Currently its not possible to specify scheduling requests for specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and assigns non-specified machines to the app. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-667) Data persisted by YARN daemons should be versioned
Siddharth Seth created YARN-667: --- Summary: Data persisted by YARN daemons should be versioned Key: YARN-667 URL: https://issues.apache.org/jira/browse/YARN-667 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Includes data persisted for RM restart, NodeManager directory structure and the Aggregated Log Format. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655043#comment-13655043 ] Hadoop QA commented on YARN-660: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582739/YARN-660.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/914//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/914//console This message is automatically generated. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch, YARN-660.4.patch -- This message is automatically generated by JIRA. 
[jira] [Created] (YARN-668) TokenIdentifier serialization should consider Unknown fields
Siddharth Seth created YARN-668: --- Summary: TokenIdentifier serialization should consider Unknown fields Key: YARN-668 URL: https://issues.apache.org/jira/browse/YARN-668 Project: Hadoop YARN Issue Type: Sub-task Reporter: Siddharth Seth This would allow changing of the TokenIdentifier between versions. The current serialization is Writable. A simple way to achieve this would be to have a Proto object as the payload for TokenIdentifiers, instead of individual fields. TokenIdentifier continues to implement Writable to work with the RPC layer - but the payload itself is serialized using PB.
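The idea above - keep the Writable shell for the RPC layer but make the payload opaque so that fields unknown to older code survive version skew - can be sketched roughly as follows. This is an illustrative, hypothetical class (not the actual Hadoop TokenIdentifier API), with a plain byte[] standing in for the PB-serialized message:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: the identifier keeps a Writable-compatible byte
// layout (length prefix + bytes) for the RPC layer, but all of its fields
// live inside one opaque payload. In the real proposal that payload would
// be a serialized protobuf message, so unknown fields round-trip through
// older code instead of being dropped.
class OpaqueTokenIdentifier {
    private final byte[] payload; // stand-in for a PB-serialized message

    OpaqueTokenIdentifier(byte[] payload) {
        this.payload = payload;
    }

    byte[] getPayload() {
        return payload;
    }

    // Writable-style serialization: nothing here needs to understand
    // what the payload contains.
    byte[] toBytes() {
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length);
        buf.putInt(payload.length).put(payload);
        return buf.array();
    }

    static OpaqueTokenIdentifier fromBytes(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte[] p = new byte[buf.getInt()];
        buf.get(p);
        return new OpaqueTokenIdentifier(p);
    }
}
```

The point of the sketch: a version that adds fields only changes what is inside the payload bytes, never the wire framing the RPC layer sees.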
[jira] [Commented] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655046#comment-13655046 ] Zhijie Shen commented on YARN-572: -- ContainerTokenIdentifier, ContainerTokenSelector, ContainerManagerSecurityInfo move to server.security of server-common, BuilderUtils.newContainerToken moves to YarnServerBuilderUtils.newContainerToken Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication.
[jira] [Created] (YARN-670) Add an Exception to indicate 'Maintenance' for NMs and add this to the JavaDoc for appropriate protocols
Siddharth Seth created YARN-670: --- Summary: Add an Exception to indicate 'Maintenance' for NMs and add this to the JavaDoc for appropriate protocols Key: YARN-670 URL: https://issues.apache.org/jira/browse/YARN-670 Project: Hadoop YARN Issue Type: Sub-task Reporter: Siddharth Seth
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655049#comment-13655049 ] Bikas Saha commented on YARN-660: - Vinod, I have iterated through different combinations of generics including the one you suggest. I found that the current version is the least ugly with respect to user code. Both the test code for the AMRMClient as well as my use of it in TEZ gave me that impression. If you write user code that has to always cast/convert back from ContainerRequest to user-defined types, you will see that it looks ugly, and I don't want users to keep having to cast these values when they use specific types. The only user overhead when they use ContainerRequest is to add a ContainerRequest when declaring the member variable and during new. Thereafter everything works. And ContainerRequest is something that they anyway have to include in the code. Look at the diff of TestAMRMClient to see the minimal changes needed to make the existing code compile. Ideally, if there were a typedef in Java then even that could be hidden, but there isn't. IMO, ContainerRequest is useful mostly in cases when the user wants a bunch of containers at *. For most of the other cases, where requests are made with more specificity, StoredContainerRequest is more useful. It's provided by the library because it will be commonly needed and also to support returning matching requests from within the AMRMClient for reasons outlined earlier. Storing and retrieving matching requests in a meaningful manner cannot be done until we limit the number of containers in an individual request to 1. StoredContainerRequest provides a clear type inside the AMRMClient to say what to store and also to limit the container count per request to 1. E.g.
if the AMRMClient saves a simple ContainerRequest - let's say it saves the container request from addContainerRequest(ContainerRequest(P1,R1,Count=4)) - and then the user calls removeContainerRequest(ContainerRequest(P1,R1,Count=3)), it's hard for the AMRMClient to tell whether the stored container request should be removed or not. Cookies are not required but are very useful in reducing user burden. If we think about using AMRMClient inside the MR app master (or see the impl of TEZ), then we will find that for every request we need to save a cookie somewhere (e.g. Scheduler*LaunchEvent) that will be used when the request is matched to a container. Either the client can write a map to maintain and store this relationship, or the library provides a helper cookie to keep the info in one place. Let me know if you have any other comments. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch, YARN-660.4.patch
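The matching/cookie argument above can be illustrated with a small sketch. This is hypothetical code, not the AMRMClient API: because each stored request represents exactly one container, add and remove are unambiguous, and an arbitrary per-request "cookie" travels with the request so the caller can recover its own context (e.g. a task) when a container is matched:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of why count-per-request = 1 makes the store
// unambiguous. Names are illustrative, not the real AMRMClient types.
// Each stored entry is one container's worth of request at a
// (priority, location) key, carrying the caller's cookie of type T.
class RequestStore<T> {
    private final Map<String, Deque<T>> store = new HashMap<>();

    private String key(int priority, String location) {
        return priority + "/" + location;
    }

    // Add one single-container request with its cookie.
    void add(int priority, String location, T cookie) {
        store.computeIfAbsent(key(priority, location), k -> new ArrayDeque<>())
             .add(cookie);
    }

    // Match a returned container back to exactly one stored request,
    // handing the caller back its cookie (or null if nothing pending).
    T removeMatching(int priority, String location) {
        Deque<T> q = store.get(key(priority, location));
        return (q == null || q.isEmpty()) ? null : q.poll();
    }

    int pending(int priority, String location) {
        Deque<T> q = store.get(key(priority, location));
        return q == null ? 0 : q.size();
    }
}
```

With count-4 and count-3 requests as in the example above, there is no clean answer to "which entry does remove target?"; with count-1 entries, each remove retires exactly one unit.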
[jira] [Created] (YARN-672) Consider Localizing Resources within the NM process itself in non-secure deployments
Siddharth Seth created YARN-672: --- Summary: Consider Localizing Resources within the NM process itself in non-secure deployments Key: YARN-672 URL: https://issues.apache.org/jira/browse/YARN-672 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Specifically when the LCE is not used, the localizer could be run as a separate thread within the NM, instead of starting a new process.
[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN
[ https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655062#comment-13655062 ] Karthik Kambatla commented on YARN-666: --- Sid - thanks for creating this. Excited. Just went over the design doc (which BTW is well-articulated) and have the following comments: # Steps to upgrade a YARN cluster: do you think it would make sense to upgrade the NMs first, before upgrading the RM? If something goes wrong (hopefully not), users can fall back to the older version. # Considerations (Upgrading the MR runtime): Until YARN/MR go into separate projects and release cycles, upgrading YARN alone (say 2.1.0 to 2.1.2) shouldn't affect the clients (MR) - no? # Looks like we need to come up with an appropriate policy for YARN data formats in HADOOP-9517. # I am assuming the version check will be similar to the one in HDFS-2983. # Big +1 to drain decommission [Umbrella] Support rolling upgrades in YARN --- Key: YARN-666 URL: https://issues.apache.org/jira/browse/YARN-666 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Attachments: YARN_Rolling_Upgrades.pdf Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes.
[jira] [Created] (YARN-673) Remove yarn-default.xml
Siddharth Seth created YARN-673: --- Summary: Remove yarn-default.xml Key: YARN-673 URL: https://issues.apache.org/jira/browse/YARN-673 Project: Hadoop YARN Issue Type: Improvement Reporter: Siddharth Seth The default configuration files serve 2 purposes: 1. Documenting available config parameters, and their default values. 2. Specifying default values for these parameters. An xml file hidden inside a jar is not necessarily the best way to document parameters. This could be moved into the documentation itself. Default values already exist in code for most parameters. There's no need to specify them in two places. We need to make sure defaults exist for all parameters before attempting this. Having default configuration files just bloats job conf files; over 450 parameters, out of which 20 are likely job-specific params. JobConf files end up being rather big, and the memory footprint of the conf object is large (300KB last I checked).
[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container
[ https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655072#comment-13655072 ] Vinod Kumar Vavilapalli commented on YARN-613: -- bq. A keytab might be leaked due to weak permissions, or maybe it's not even the keytab in the official path, but a copy a SE left sitting in their home dir. So I might or might not be the yarn user with my stolen keytab. Assuming you are the yarn user, I'm almost positive you can't get root with the container executor - last I looked, it had a hardcoded check to reject root. My example of container-executor is wrong, I agree. But my intention was along the lines of: if someone steals an NM keytab, the game is over anyway. Other examples (at least one correct one this time, I hope) - someone who steals an NM keytab can do so many things: - Start a custom NM which advertises infinite resources to the RM, keep heartbeating often and gobble up all containers - Act sane, gobble up some containers, get app-ids, guess and construct new container-ids and send false reports about other containers of the app which are running on other nodes - Just keep heartbeating in a loop and bring down the RM You get the idea. Arguably there are minor checks we can do for the last two, but not the first one. It is unsolvable in general. Now coming to your specific solution. It looks like a good idea but needs minor clarifications/extensions. Let's see if I got what you are proposing: - You have 1 NMToken per NM for the whole cluster and all AMs get the same NMToken for a given node. - Authorization for startContainer is via ContainerTokens - Authorization for stopContainer is via AMToken. Right? That works for startContainer() but won't for stopContainer().
- startContainer() is good: you use the NMToken to authenticate to a node but can only start containers if you have a valid ContainerToken - stopContainer() needs a little more help: again, authentication with the NMToken is good, but we can't just rely on the NMToken to allow an AM access to stop a container. Let's start with a simple authz with no acls - AMs can only kill the containers that they own. To do this, the NM needs to check what the APPID is for this App and then allow access for the corresponding containers. Now, in order to avoid AMs faking AppIds, the NMTokenIdentifier should have NodeId, AppId, and maybe also the user-name for doing more complex app-acls. If we do that, when an NM gets a stopContainer, it gets the user-name and appid and can perform the necessary authorization. So in sum, yes, NMToken sounds like a good idea, but we need it to have per-AM information. Given the above, we should perhaps call this AMNMToken and rename the current AMToken to AMRMToken. Does that sound good? Create NM proxy per NM instead of per container --- Key: YARN-613 URL: https://issues.apache.org/jira/browse/YARN-613 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Currently a new NM proxy has to be created per container since the secure authentication is using a containertoken from the container.
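A rough sketch of the stopContainer() authorization check described above. The class names, fields, and container-id layout here are illustrative, not the actual YARN types: the point is only that the NM trusts the appId carried inside the verified NMTokenIdentifier - never one supplied by the caller - and allows an AM to stop only containers of its own application:

```java
// Hypothetical sketch of per-AM stopContainer authorization.
class StopContainerAuthz {
    // Stand-in for the fields the NMTokenIdentifier would carry
    // (NodeId, AppId, and user-name for richer app-acls).
    static class NMTokenIdent {
        final String nodeId, appId, user;
        NMTokenIdent(String nodeId, String appId, String user) {
            this.nodeId = nodeId; this.appId = appId; this.user = user;
        }
    }

    // Assumed illustrative id layout: containerIds embed their app id,
    // e.g. container_1367447543_0001_01_000002 -> "1367447543_0001".
    static String appIdOf(String containerId) {
        String[] parts = containerId.split("_");
        return parts[1] + "_" + parts[2];
    }

    // An AM may only stop containers belonging to its own application;
    // token.appId comes from the authenticated token, not the request.
    static boolean mayStop(NMTokenIdent token, String containerId) {
        return token.appId.equals(appIdOf(containerId));
    }
}
```

An AM that guesses container-ids of some other application fails this check even though its NMToken authenticates successfully.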
[jira] [Commented] (YARN-673) Remove yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655073#comment-13655073 ] Karthik Kambatla commented on YARN-673: --- While I understand that having two different default values in the default.xml and code could be error-prone, I believe we should be careful about removing yarn-default.xml altogether. Particularly because moving it further from the code would make it even harder to ensure the documentation reflects the same default as the code. Also, it is a lot easier to grep through default.xml for defaults than through the documentation. Remove yarn-default.xml --- Key: YARN-673 URL: https://issues.apache.org/jira/browse/YARN-673 Project: Hadoop YARN Issue Type: Improvement Reporter: Siddharth Seth The default configuration files serve 2 purposes 1. Documenting available config parameters, and their default values. 2. Specifying default values for these parameters. An xml file hidden inside a jar is not necessarily the best way to document parameters. This could be moved into the documentation itself. Default values already exist in code for most parameters. There's no need to specify them in two places. We need to make sure defaults exist for all parameters before attempting this. Having default configuration files just bloats job conf files; over 450 parameters, out of which 20 are likely job-specific params. JobConf files end up being rather big, and the memory footprint of the conf object is large (300KB last I checked).
[jira] [Assigned] (YARN-672) Consider Localizing Resources within the NM process itself in non-secure deployments
[ https://issues.apache.org/jira/browse/YARN-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-672: - Assignee: Karthik Kambatla Consider Localizing Resources within the NM process itself in non-secure deployments Key: YARN-672 URL: https://issues.apache.org/jira/browse/YARN-672 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Karthik Kambatla Specifically when the LCE is not used, the localizer could be run as a separate thread within the NM, instead of starting a new process.
[jira] [Commented] (YARN-672) Consider Localizing Resources within the NM process itself in non-secure deployments
[ https://issues.apache.org/jira/browse/YARN-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655081#comment-13655081 ] Vinod Kumar Vavilapalli commented on YARN-672: -- Should be careful while doing this. What I'd not like is different code-paths for secure and non-secure cases. Consider Localizing Resources within the NM process itself in non-secure deployments Key: YARN-672 URL: https://issues.apache.org/jira/browse/YARN-672 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Karthik Kambatla Specifically when the LCE is not used, the localizer could be run as a separate thread within the NM, instead of starting a new process.
[jira] [Updated] (YARN-672) Consider Localizing Resources within the NM process itself in non-secure deployments
[ https://issues.apache.org/jira/browse/YARN-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-672: - Issue Type: Sub-task (was: Improvement) Parent: YARN-543 Consider Localizing Resources within the NM process itself in non-secure deployments Key: YARN-672 URL: https://issues.apache.org/jira/browse/YARN-672 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.4-alpha Reporter: Siddharth Seth Assignee: Karthik Kambatla Specifically when the LCE is not used, the localizer could be run as a separate thread within the NM, instead of starting a new process.
[jira] [Commented] (YARN-572) Remove duplication of data in Container
[ https://issues.apache.org/jira/browse/YARN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655084#comment-13655084 ] Vinod Kumar Vavilapalli commented on YARN-572: -- Zhijie, if you plan to do that, we should just close this and do that separately. I didn't close it so others (Sid/Bikas who mentioned this before) can have the final call. Remove duplication of data in Container Key: YARN-572 URL: https://issues.apache.org/jira/browse/YARN-572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Shah Assignee: Zhijie Shen Most of the information needed to launch a container is duplicated in both the Container class as well as in the ContainerToken object that the Container object already contains. It would be good to remove this level of duplication.
[jira] [Commented] (YARN-642) Fix up RMWebServices#getNodes
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655089#comment-13655089 ] Sandy Ryza commented on YARN-642: - bq. A general suggestion: Please add a little more information to the title than it is. Understood bq. Even so, as multiple people requested on MAPREDUCE-3760, the default active list should only have active nodes and no unhealthy nodes. The Java/RPC API (ClientRMProtocol#getClusterNodes) returns the list of all active nodes, which includes unhealthy nodes. It seems confusing to me to have them return different things. Thoughts? bq. I don't see a point of multiple comma separated states, the default result when no state is passed should return all active nodes. When a state is passed, only those nodes should be removed. My thought was that being able to specify multiple comma-separated states makes it possible to get a consistent view of the cluster. If users need to make calls serially to get nodes in different states, they might miss some nodes that change state. With Tucu's suggestion of an all option, though, maybe this isn't necessary, so I can take it out. Fix up RMWebServices#getNodes - Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked.
[jira] [Updated] (YARN-642) Fix up /nodes REST API to remove unnecessary param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-642: Summary: Fix up /nodes REST API to remove unnecessary param and be consistent with the Java API (was: Fix up RMWebServices#getNodes) Fix up /nodes REST API to remove unnecessary param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked.
[jira] [Updated] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-642: Summary: Fix up /nodes REST API to have 1 param and be consistent with the Java API (was: Fix up /nodes REST API to remove unnecessary param and be consistent with the Java API) Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked.
[jira] [Commented] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655119#comment-13655119 ] Vinod Kumar Vavilapalli commented on YARN-642: -- bq. With Tucu's suggestion of an all option, though, maybe this isn't necessary, so I can take it out. Yes, we should just do this. bq. The Java/RPC API (ClientRMProtocol#getClusterNodes) returns the list of all active nodes, which includes unhealthy nodes. It seems confusing to me to have them return different things. Thoughts? Good point. I can rationalize what others said on MAPREDUCE-3760. When one goes to the default nodes UI, you are used to seeing active/good nodes since Hadoop-1. I think what we should do is: - by default, show only active nodes - if given a state, show the corresponding nodes - and a special 'all' to show all nodes Thoughts? And clearly, we need to fix the RPC impl to accommodate the above, accepting states etc. Can you file a separate ticket for that? Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked.
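The semantics proposed for a single states parameter could be sketched as below. This is a hypothetical helper, and the enum values are illustrative rather than the real YARN NodeState set: no parameter means active nodes only, "all" means every node, and otherwise the comma-separated list of states is honored:

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.stream.Collectors;

// Hypothetical sketch of parsing a single "states" query parameter for
// the /nodes REST API. State values are illustrative only.
class NodeFilter {
    enum State { RUNNING, UNHEALTHY, DECOMMISSIONED, LOST }

    // What "active" means by default; a stand-in for the real definition.
    static final EnumSet<State> ACTIVE = EnumSet.of(State.RUNNING);

    static EnumSet<State> parse(String statesParam) {
        if (statesParam == null || statesParam.isEmpty()) {
            return EnumSet.copyOf(ACTIVE); // default: active nodes only
        }
        if ("all".equalsIgnoreCase(statesParam)) {
            return EnumSet.allOf(State.class); // special 'all' value
        }
        // otherwise: exactly the comma-separated states requested
        return Arrays.stream(statesParam.split(","))
            .map(s -> State.valueOf(s.trim().toUpperCase()))
            .collect(Collectors.toCollection(() -> EnumSet.noneOf(State.class)));
    }
}
```

The single parameter covers all three behaviors, which is what lets the redundant healthy/unhealthy parameter be dropped.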
[jira] [Commented] (YARN-660) Improve AMRMClient with cookies and matching requests
[ https://issues.apache.org/jira/browse/YARN-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13655134#comment-13655134 ] Vinod Kumar Vavilapalli commented on YARN-660: -- Okay, I took this as an opportunity to learn more about AMRMClient. Have seen other API issues in AMRMClient, will file tickets. Clearly we need a validation of the APIs like you are doing in TEZ. We should definitely try to change the MR AppMaster to optionally use AMRMClient. Coming back to the current issues at hand. Correct me if I am wrong, but it looks like these StoredContainerRequests are only useful if I have a Task in my job which needs a container and the returned container will only be used by that task? If yes: - That is a very narrow use-case. Maybe the simplest use-case, which we can address by building on top of AMRMClient, e.g. a SimpleAMRMClient which only deals with one container at a time. - AMRMClient should be built for the more common use-case - I give you these requests, you tell me when the containers are allocated, and I'll do the second schedule pass according to my own logic. Again, I think I should see a more concrete usage. The current test-case is clearly not doing enough to explain how it is usable - we can implement a simple job which has tasks and needs containers inside this test-case and then illustrate the library usage. Improve AMRMClient with cookies and matching requests - Key: YARN-660 URL: https://issues.apache.org/jira/browse/YARN-660 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.5-beta Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.5-beta Attachments: YARN-660.1.patch, YARN-660.2.patch, YARN-660.3.patch, YARN-660.4.patch