[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356346#comment-17356346 ]

Szilard Nemeth commented on YARN-10796:
---------------------------------------

Thanks [~pbacsko] for working on this. The latest patch LGTM, committed to trunk.

> Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-10796
>                 URL: https://issues.apache.org/jira/browse/YARN-10796
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-10796-001.patch, YARN-10796-002.patch, YARN-10796-003.patch
>
>
> If we have a dynamic queue (AutoCreatedLeafQueue) with capacity = 0%, then it
> cannot properly scale even if its max-capacity and the parent's max-capacity
> would allow it.
> Example:
> {noformat}
> Cluster Capacity: 16 GB / 16 cpu (2 nodes, each with 8 GB / 8 cpu)
> Container allocation size: 1G / 1 vcore
> root.dynamic
>     Effective Capacity:     ( 50.0%)
>     Effective Max Capacity: (100.0%)
>     Template:
>         Capacity:          40%
>         Max Capacity:      100%
>         User Limit Factor: 4
> {noformat}
> leaf-queue-template.capacity = 40%
> leaf-queue-template.maximum-capacity = 100%
> leaf-queue-template.maximum-am-resource-percent = 50%
> leaf-queue-template.minimum-user-limit-percent = 100%
> leaf-queue-template.user-limit-factor = 4
> "root.dynamic" has a maximum capacity of 100% and a capacity of 50%.
> Let's assume there are running containers in these dynamic queues (MR sleep
> jobs):
> root.dynamic.user1 = 1 AM + 3 containers (capacity = 40%)
> root.dynamic.user2 = 1 AM + 3 containers (capacity = 40%)
> root.dynamic.user3 = 1 AM + 15 containers (capacity = 0%)
> This scenario will result in an underutilized cluster. There will be approx
> 18% unused capacity. On the other hand, it's still possible to submit a new
> application to root.dynamic.user1 or root.dynamic.user2, and reaching 100%
> utilization is possible.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356214#comment-17356214 ]

Qi Zhu commented on YARN-10796:
-------------------------------

Thanks [~pbacsko], the latest patch LGTM, +1.
I also agree with you that when the capacity is 0 we need to relax the limit to the max as well.
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355919#comment-17355919 ]

Hadoop QA commented on YARN-10796:
----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 13s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 21m 51s | | trunk passed |
| +1 | compile | 1m 2s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 0m 53s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 0m 45s | | trunk passed |
| +1 | mvnsite | 0m 56s | | trunk passed |
| +1 | shadedclient | 16m 45s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 43s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 38s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 19m 55s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 1m 50s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 53s | | the patch passed |
| +1 | compile | 0m 55s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 0m 55s | | the patch passed |
| +1 | compile | 0m 46s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 0m 46s | | the patch passed |
| +1 | checkstyle | 0m 40s | | the patch passed |
| +1 | mvnsite | 0m 49s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 55s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 39s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 36s | | the
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355820#comment-17355820 ]

Benjamin Teke commented on YARN-10796:
--------------------------------------

[~pbacsko] the latest patch looks good to me, +1 (non-binding) from my side as well.
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355805#comment-17355805 ]

Andras Gyori commented on YARN-10796:
-------------------------------------

[~pbacsko] I agree that this is the correct behaviour and therefore I would refrain from introducing a new property. The patch looks good to me.
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355804#comment-17355804 ]

Peter Bacsko commented on YARN-10796:
-------------------------------------

[~gandras] this is a valid concern. The question is, do we accept how it worked before and say "yeah, that's another way of working"? Are there clusters built on the fact that a 0% queue cannot scale out properly, despite the max-capacity setting? Honestly, I don't know. Maybe some people got used to the improper behavior and expect it to work that way, which does happen in real life.

That said, even a zero-capacity queue should be able to occupy the cluster if nothing else is used, provided max-capacity is set appropriately. So I would not go for a new property.
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355783#comment-17355783 ]

Andras Gyori commented on YARN-10796:
-------------------------------------

Thanks [~pbacsko] for the patch and for creating a unit test for UsersManager. I have the same suggestion as [~bteke]. Apart from this, I have one concern regarding this change: it is going to be functionally different from before. I hate to suggest yet another configuration property, because YARN is already heavily bloated, but:
* This is going to change how zero-capacity queues work. It might not be acceptable for all users that zero-capacity queues can allocate resources at all.
* I also found that there can be zero-capacity static queues as well (see ParentQueue#allowZeroCapacitySum).

That being said, a user would probably stop a queue in order to indicate that it is temporarily not accepting any new submissions. These are all speculations, and I would not introduce yet another property if it is not necessary. What is your opinion about it?
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355771#comment-17355771 ]

Peter Bacsko commented on YARN-10796:
-------------------------------------

Thanks [~bteke], this makes sense.
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355758#comment-17355758 ]

Benjamin Teke commented on YARN-10796:
--------------------------------------

[~pbacsko] thanks for the patch. One small thing: since the _originalCapacity.equals(Resources.none())_ case is/should be the same as if the userLimitFactor was disabled (set to -1), I think merging the two ifs would be a bit cleaner. Or even turning the logic around, like:
{code:java}
if (getUserLimitFactor() == -1 || originalCapacity.equals(Resources.none())) {
  maxUserLimit = lQueue.getEffectiveMaxCapacityDown(nodePartition, lQueue.getMinimumAllocation());
} else {
  ...
}
{code}
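For readers of this thread, the merged condition suggested in the comment above can be tried out in isolation. The following is only a minimal sketch under stated assumptions, not the Hadoop UsersManager code: the class `UserLimitSketch`, the method `maxUserLimitMb`, and the use of plain megabyte values instead of YARN `Resource` objects are all hypothetical. In particular, the `else` branch is elided in the comment; multiplying the queue capacity by the user limit factor is an assumption made here purely for illustration.

```java
/**
 * Simplified, hypothetical model of the merged condition discussed above.
 * This is NOT the Hadoop implementation; resources are plain MB values.
 */
public class UserLimitSketch {

    /** Sentinel meaning "user limit factor disabled" (-1 in the comment above). */
    static final float ULF_DISABLED = -1f;

    /**
     * Returns an upper bound for one user's resources in MB.
     *
     * @param queueCapacityMb    stand-in for the queue's effective capacity
     * @param queueMaxCapacityMb stand-in for the queue's effective max capacity
     * @param userLimitFactor    the user limit factor, or -1 if disabled
     */
    static long maxUserLimitMb(long queueCapacityMb,
                               long queueMaxCapacityMb,
                               float userLimitFactor) {
        // Merged condition: a disabled factor and a zero-capacity queue
        // are treated the same way and fall back to the max capacity.
        if (userLimitFactor == ULF_DISABLED || queueCapacityMb == 0) {
            return queueMaxCapacityMb;
        }
        // ASSUMPTION: the elided else branch is modelled as capacity * factor,
        // rounded down. The thread does not show the real branch.
        return (long) Math.floor(queueCapacityMb * (double) userLimitFactor);
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not taken from the patch or its tests.
        System.out.println(maxUserLimitMb(0L, 16384L, 4f));     // zero-capacity queue
        System.out.println(maxUserLimitMb(4096L, 16384L, 2f));  // regular queue
        System.out.println(maxUserLimitMb(4096L, 16384L, -1f)); // factor disabled
    }
}
```

In this model, dropping the zero-capacity check would make the first call compute 0 * 4 = 0, which corresponds to the situation the issue describes: a 0% queue whose user limit prevents it from growing toward its max capacity.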
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355720#comment-17355720 ]

Peter Bacsko commented on YARN-10796:
-------------------------------------

[~bteke], [~gandras], [~snemeth] could you review this please?
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355317#comment-17355317 ]

Hadoop QA commented on YARN-10796:
----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 27m 45s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 23m 14s | | trunk passed |
| +1 | compile | 1m 5s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 0m 54s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 0m 46s | | trunk passed |
| +1 | mvnsite | 0m 57s | | trunk passed |
| +1 | shadedclient | 17m 32s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 46s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 40s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 20m 52s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 1m 53s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 53s | | the patch passed |
| +1 | compile | 0m 58s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 0m 58s | | the patch passed |
| +1 | compile | 0m 47s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 0m 47s | | the patch passed |
| +1 | checkstyle | 0m 41s | | the patch passed |
| +1 | mvnsite | 0m 52s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 18s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 41s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 38s | | the
[jira] [Commented] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355197#comment-17355197 ]
Hadoop QA commented on YARN-10796:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 21s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 31s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 59s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 20m 12s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 1m 52s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 40s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1032/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 9 new + 10 unchanged - 0 fixed = 19 total (was 10) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 59s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:g