[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153262#comment-14153262 ] Hudson commented on YARN-1769: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1912 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1912/]) YARN-1769. CapacityScheduler: Improve reservations. Contributed by Thomas Graves (jlowe: rev 9c22065109a77681bc2534063eabe8692fbcb3cd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * hadoop-yarn-project/CHANGES.txt > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.6.0 > > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153205#comment-14153205 ] Hudson commented on YARN-1769: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1887 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1887/]) YARN-1769. CapacityScheduler: Improve reservations. Contributed by Thomas Graves (jlowe: rev 9c22065109a77681bc2534063eabe8692fbcb3cd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.6.0 > > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153091#comment-14153091 ] Hudson commented on YARN-1769: -- FAILURE: Integrated in Hadoop-Yarn-trunk #696 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/696/]) YARN-1769. CapacityScheduler: Improve reservations. Contributed by Thomas Graves (jlowe: rev 9c22065109a77681bc2534063eabe8692fbcb3cd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/CHANGES.txt > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.6.0 > > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151726#comment-14151726 ] Hudson commented on YARN-1769: -- FAILURE: Integrated in Hadoop-trunk-Commit #6135 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6135/]) YARN-1769. CapacityScheduler: Improve reservations. Contributed by Thomas Graves (jlowe: rev 9c22065109a77681bc2534063eabe8692fbcb3cd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.6.0 > > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151706#comment-14151706 ] Jason Lowe commented on YARN-1769: -- +1 lgtm. Committing this. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149941#comment-14149941 ] Hadoop QA commented on YARN-1769: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12671525/YARN-1769.patch against trunk revision 3a1f981. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5150//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5150//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149565#comment-14149565 ] Thomas Graves commented on YARN-1769: - Thanks for the review Jason. I'll update the patch and remove some of the logging or make it truly debug. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149497#comment-14149497 ] Jason Lowe commented on YARN-1769: -- Thanks for refreshing the patch, Tom. I think the patch looks good overall, just one main comment about the logging. When we tried an earlier version of this patch out on some clusters we noticed the RM was logging a lot more than before, especially the "we needed to unreserve to be able to allocate" and "was going to reserve but hit user/queue limit" messages. I see the first message is conditional on debug logging despite it being an INFO message which is odd, but there are other places where we always log where we didn't before (e.g.: LeafQueue.assignToQueue). Given how much the RM logs already, I'd be tempted to not add any more logging in this patch by default and instead add all additional log messages in this patch as debug messages. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148098#comment-14148098 ] Thomas Graves commented on YARN-1769: - We've been running this now on cluster for quite a while and its showing great improvements in the time to get larger containers. I would like to put this in. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148059#comment-14148059 ] Hadoop QA commented on YARN-1769: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12671255/YARN-1769.patch against trunk revision e0b1dc5. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5127//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5127//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147879#comment-14147879 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12671237/YARN-1769.patch against trunk revision 6c3cebd. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5126//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147848#comment-14147848 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12671234/YARN-1769.patch against trunk revision 6c3cebd. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5125//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147798#comment-14147798 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12671217/YARN-1769.patch against trunk revision dff95f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestReservations {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5123//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5123//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076769#comment-14076769 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658215/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4460//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4460//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076607#comment-14076607 ] Hadoop QA commented on YARN-1769: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658198/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4458//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4458//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061410#comment-14061410 ] Hadoop QA commented on YARN-1769: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12655618/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4298//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4298//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032821#comment-14032821 ] Hadoop QA commented on YARN-1769: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650615/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4001//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4001//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032725#comment-14032725 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650603/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4000//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/4000//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4000//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032646#comment-14032646 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650590/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3999//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3999//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3999//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028427#comment-14028427 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12649891/YARN-1769.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3964//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010358#comment-14010358 ] Thomas Graves commented on YARN-1769: - TestFairScheduler is failing for other reasons. see https://issues.apache.org/jira/browse/YARN-2105. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010319#comment-14010319 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646981/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3836//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3836//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965515#comment-13965515 ] Arun C Murthy commented on YARN-1769: - Sorry guys, been slammed. I'll take a look at this presently. Tx. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962325#comment-13962325 ] Jason Lowe commented on YARN-1769: -- Thanks, Tom. Latest changes look good to me. Waiting a bit for [~acmurthy] or [~vinodkv] to chime in with their $.02 before committing since they know the CapacityScheduler workings more than most. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962223#comment-13962223 ] Hadoop QA commented on YARN-1769: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639049/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3523//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3523//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962152#comment-13962152 ] Thomas Graves commented on YARN-1769: - Thanks for the review Jason. - LeafQueue.findNodeToUnreserve is adjusting the headroom when it unreserves, but I don't see other unreservations doing a similar calculation. Wondering if this fixup is something that should have been in completedContainer or needs to be done elsewhere? I could easily be missing something here but asking just in case other unreservation situations also need to have the headroom fixed. You have to adjust the headroom otherwise we will have subtracted it twice. When you allocate a container (reservation or normal), it is subtracted from your headroom. Normally then when it goes to schedule a reserved resource, it goes into assignReservedContainer which makes sure it doesn't subtract it twice. In our case we first unreserve so we have to add that headroom back, then we schedule and the normal schedule path will handle subtracting it again. Also note that it is possible to unreserve a container with size greater then what we are now going to take. There are a bunch of issues with headroom, most listed under YARN-1198. I think we should leave any other fixups to that jira. - In LeafQueue.findNodeToUnreserve, isn't it kinda bad if the app thinks it has reservations on the node but the scheduler doesn't know about it? Wondering if the bookkeeping is messed up at that point therefore something a bit more than debug is an appropriate log level and if further fixup is needed. I put this in just as a precaution. I'll change this it log an error. I don't think I want to try to fix it up as that might just cause more issues. I addressed the rest of the comments. I'll upload a new patch shortly. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960541#comment-13960541 ] Jason Lowe commented on YARN-1769: -- The patch no longer applies cleanly after YARN-1512. Other comments on the patch: - Nit: In LeafQueue.assignToQueue we could cache Resource.add(usedResources, required) in a local when we're computing potentialNewCapacity so we don't have to recompute it as part of the potentialNewWithoutReservedCapacity computation - LeafQueue.assignToQueue and LeafQueue.assignToUser don't seem to need the new priority argument, and therefore LeafQueue.checkLimitsToReserve wouldn't seem to need it either once those others are updated. - Should FiCaSchedulerApp getAppToUnreserve really be called getNodeIdToUnreserve or geNodeToUnreserve, since it's returning a node ID rather than an app? - In LeafQueue.findNodeToUnreserve, isn't it kinda bad if the app thinks it has reservations on the node but the scheduler doesn't know about it? Wondering if the bookkeeping is messed up at that point therefore something a bit more than debug is an appropriate log level and if further fixup is needed. - LeafQueue.findNodeToUnreserve is adjusting the headroom when it unreserves, but I don't see other unreservations doing a similar calculation. Wondering if this fixup is something that should have been in completedContainer or needs to be done elsewhere? I could easily be missing something here but asking just in case other unreservation situations also need to have the headroom fixed. - LeafQueue.assignContainer uses the much more expensive scheduler.getConfiguration().getReservationContinueLook() when it should be able to use the reservationsContinueLooking member instead. - LeafQueue.getReservationContinueLooking should be package private - Nit: LeafQueue.assignContainer has some reformatting of the log message after the "// Inform the node" comment which was clearer to read/maintain before since the label and the value were always on a line by themselves. Same goes for the "Reserved container" log towards the end of the method. - Ultra-Nit: ParentQueue.setupQueueConfig's log message should have the reservationsContinueLooking on the previous line to match the style of other label/value pairs in the log message. - ParentQueue.getReservationContinueLooking should be package private. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938352#comment-13938352 ] Jonathan Eagles commented on YARN-1769: --- TestResourceTrackerService test issue is caused by YARN-1591 > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938002#comment-13938002 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635102/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3375//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3375//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935095#comment-13935095 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634719/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3365//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3365//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3365//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935038#comment-13935038 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634711/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3364//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3364//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3364//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933943#comment-13933943 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634513/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3350//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933860#comment-13933860 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634508/YARN-1769.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3349//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933831#comment-13933831 ] Thomas Graves commented on YARN-1769: - The findbugs can be ignored. They are talking about inconsistent synchronization but of the class variable that is sometimes referenced inside synchronized functions and sometimes not. It doesn't matter if that variable is accessed synchronized or not. I'll add it to the excludes files. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933478#comment-13933478 ] Jonathan Eagles commented on YARN-1769: --- Hi, Tom. Can you comment on the findbugs warnings that are introduced as part of this patch when you get a chance? > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919592#comment-13919592 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12632523/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3242//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3242//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3242//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919506#comment-13919506 ] Thomas Graves commented on YARN-1769: - if canAllocContainer is false then you can't reserve another container. This could happen if you don't have any containers to unreserve when you hit the reservation limits and this node doesn't have available containers. if ((!scheduler.getConfiguration().getReservationContinueLook()) // without feature always reserve like previously did || (canAllocContainer) // if we hit our reservation limit and no available space on this node, don't reserve another one || (rmContainer != null)) { // if this was called because node already had reservation, we need to make sure it gets book keeped as re-reservation I can simplify this a bit. I don't really need the !scheduler.getConfiguration().getReservationContinueLook check anymore since canAllocContainer defaults to true in that case. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919161#comment-13919161 ] Sunil G commented on YARN-1769: --- Hi Thomas In LeafQueue assignContainer call, reserve() call will happen from the else logic. Here there is a check as below if ((!scheduler.getConfiguration().getReservationContinueLook()) || (canAllocContainer) || (rmContainer != null)) { Is there any scenrio to happen with out these 3 case. ? Only if for first time allocation, and if application cant assign a container, may be chances are less to reach here. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919154#comment-13919154 ] Sunil G commented on YARN-1769: --- Hi Thomas > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919153#comment-13919153 ] Sunil G commented on YARN-1769: --- Hi Thomas > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918192#comment-13918192 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12632271/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3229//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3229//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3229//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3229//console This message is automatically generated. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Anytime it hits the limit of number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fullfill the request. > The other place for improvement is currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to gets it resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)