[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152729#comment-15152729 ]

Naganarasimha G R commented on YARN-3945:
-----------------------------------------

Thanks for the clarification [~wangda]. Yes, it would be better to limit the scope of this JIRA to #1 and #2. I will update with a new patch and share it.

> maxApplicationsPerUser is wrongly calculated
> --------------------------------------------
>
>                 Key: YARN-3945
>                 URL: https://issues.apache.org/jira/browse/YARN-3945
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3945.20150728-1.patch, YARN-3945.20150729-1.patch, YARN-3945.V1.003.patch
>
>
> maxApplicationsPerUser is currently calculated with the formula
> {{maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) * userLimitFactor)}}
> but the description of userLimit is:
> {quote}
> Each queue enforces a limit on the percentage of resources allocated to a
> user at any given time, if there is demand for resources. The user limit can
> vary between a minimum and maximum value.{color:red} The former (the
> minimum value) is set to this property value {color} and the latter (the
> maximum value) depends on the number of users who have submitted
> applications. For example, suppose the value of this property is 25. If two
> users have submitted applications to a queue, no single user can use more
> than 50% of the queue resources. If a third user submits an application, no
> single user can use more than 33% of the queue resources. With 4 or more
> users, no user can use more than 25% of the queue's resources. A value of 100
> implies no user limits are imposed. The default is 100. Value is specified as
> an integer.
> {quote}
> Configuration related to the minimum limit should not be used in a formula
> that calculates the maximum number of applications for a user.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
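The worked example in the quoted property description (25 as the property value, giving 50%, 33% and 25% for two, three, and four or more users) can be sketched in Java as follows. This is only an illustration of the documented arithmetic; the class and method names are hypothetical and this is not the CapacityScheduler code.

```java
// Hypothetical illustration of the documented user-limit example; not Hadoop code.
final class UserLimitDocExample {

    // Largest share (in percent) of the queue that a single user may hold
    // according to the quoted description: the share shrinks as more users
    // submit applications, but never drops below the configured minimum.
    static int userSharePercent(int activeUsers, int minimumUserLimitPercent) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        // Integer division mirrors the "33%" figure quoted for three users.
        return Math.max(100 / activeUsers, minimumUserLimitPercent);
    }
}
```

With a property value of 25, `userSharePercent` gives 50 for two users, 33 for three, and 25 for four or more, matching the quoted text.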
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149365#comment-15149365 ]

Wangda Tan commented on YARN-3945:
----------------------------------

[~Naganarasimha],

bq. so based on this there will be no elasticity even though the resources are free in some other queue, is this expected ?
I think so.

bq. Are we trying to avoid elasticity because we try to avoid preempting AM's even when preemption is enabled?
That's one purpose. The other is that, when preemption is disabled, we will not suffer from too many AMs being launched when the queue's available resource increases and then comes back.

To move this task forward, I would suggest:
# Resolve the bug that maxApplicationsPerUser should be capped by maxApplicationsPerQueue.
# Computation of the user AM limit should be symmetric to computation of user-limit, and the user AM limit should be capped by the queue's AM limit.
# Avoid flexibility in computing the queue's and user's AM limit (do not consider queue max cap). This needs more discussion.

My understanding is that #1 and #2 are in the scope of this JIRA, and #3 could be done separately. Agree?
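Suggestion #1 in the comment above can be sketched as below: compute the per-user value with the formula quoted in the issue, then cap it by the per-queue value. This is a hedged sketch, not the attached patch; the class and method names are hypothetical.

```java
// Hypothetical sketch of suggestion #1; not the actual YARN-3945 patch.
final class MaxAppsPerUserCapSketch {

    // The formula quoted in the issue description, unchanged.
    static int uncapped(int maxApplicationsPerQueue, int userLimit, float userLimitFactor) {
        return (int) (maxApplicationsPerQueue * (userLimit / 100.0f) * userLimitFactor);
    }

    // Same value, but never larger than the queue's own application limit.
    static int capped(int maxApplicationsPerQueue, int userLimit, float userLimitFactor) {
        return Math.min(uncapped(maxApplicationsPerQueue, userLimit, userLimitFactor),
                maxApplicationsPerQueue);
    }
}
```

For a queue limit of 10 applications with userLimit 100 and userLimitFactor 5, the uncapped formula yields 50 while the capped variant yields 10.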
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123110#comment-15123110 ]

Naganarasimha G R commented on YARN-3945:
-----------------------------------------

Hi [~wangda],
A few doubts here.

bq. However, I don't want to add flexibility to AM resource limit. the hard limit of queue's AM resource usage should be queueConfiguredResource * am-percent.
In this case, {{queueConfiguredResource}} is currently calculated as {{queue's abscapacity * partition resource}}, so there will be no elasticity even though resources are free in some other queue. Is this expected? Are we trying to avoid elasticity because we try to avoid preempting AMs even when preemption is enabled?

{code}
maxUserAMLimit = min{queue-am-limit * min{ULF, 1}, max{queue-am-limit / #activeUsers, queue-am-limit * ULP}}
{code}
In my opinion this will control the AMs launched by different users within a queue, which is similar to the calculation of {{userlimit}}. If it is not possible to change this because of the {{am-limit consistently change}} issue, would it be better to leave it as it is with the wrong calculation, or to consider {{maxUserAMLimit = queueConfiguredResource * am-percent}}? Thoughts?
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122799#comment-15122799 ]

Wangda Tan commented on YARN-3945:
----------------------------------

bq. if ok then will rework on it.
Agree.

I also agree that computing the AM resource limit should be symmetric to computing user-limit. However, I don't want to add flexibility to the AM resource limit. The hard limit of a queue's AM resource usage should be queueConfiguredResource * am-percent. And:
{code}
maxUserAMLimit = min{queue-am-limit * min{ULF, 1}, max{queue-am-limit / #activeUsers, queue-am-limit * ULP}}
{code}
Adding too much flexibility to the am-limit computation will lead to the am-limit changing constantly.
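The two expressions in the comment above (the queue's hard AM limit and maxUserAMLimit) can be written out as a small Java sketch. It assumes resources are modeled as a single scalar and that ULP is a fraction (0.25 for 25%); both are simplifications made here, and the names are hypothetical rather than CapacityScheduler APIs.

```java
// Hypothetical scalar model of the proposed AM limits; not Hadoop code.
final class UserAmLimitSketch {

    // Hard limit of the queue's AM usage: queueConfiguredResource * am-percent.
    static double queueAmLimit(double queueConfiguredResource, double amPercent) {
        return queueConfiguredResource * amPercent;
    }

    // maxUserAMLimit = min{queue-am-limit * min{ULF, 1},
    //                      max{queue-am-limit / #activeUsers, queue-am-limit * ULP}}
    // ulf is the user-limit-factor; ulp is the user-limit-percent as a fraction.
    static double maxUserAmLimit(double queueAmLimit, double ulf, double ulp, int activeUsers) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        double upperBound = queueAmLimit * Math.min(ulf, 1.0);
        double fairShare = Math.max(queueAmLimit / activeUsers, queueAmLimit * ulp);
        return Math.min(upperBound, fairShare);
    }
}
```

Because of the inner `min{ULF, 1}`, the result never exceeds the queue's AM limit, whatever user-limit-factor is configured.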
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122328#comment-15122328 ]

Naganarasimha G R commented on YARN-3945:
-----------------------------------------

Thanks for the reply [~wangda],

bq. I would prefer this proposal.
So this implies {{#apps per user in the queue}} = {{#apps per queue}}. If that is OK, I will rework it.

bq. Changing max value of ULF will be an incompatible changes, since lots of cluster are using very high ULF (e.g. 100).
Sorry, I misunderstood the code: ULF is multiplied with the queue's capacity and not the max capacity, so you are right. We just need to ensure that the final resource limit, after multiplying by ULF, is not greater than the max limit of the queue.

[~wangda], I would like to know whether the query I asked earlier is valid. In *getUserAMResourceLimitPerPartition* we calculate the *AM resource limit* as
{code}
Resources.multiplyAndNormalizeUp(resourceCalculator, queuePartitionResource,
    queueCapacities.getMaxAMResourcePercentage(nodePartition)
        * effectiveUserLimit * userLimitFactor,
    minimumAllocation);
{code}
Should the computation logic not be similar to *computeUserLimit*? IMO the *getUserAMResourceLimitPerPartition* calculation should be:
{code}
queuePartitionAMResource = queuePartitionResource * queueCapacities.getMaxAMResourcePercentage(nodePartition);
maxUserAMLimit = queuePartitionAMResource * userLimitFactor;
userAMLimitResource = currentAMCapacity * max(1 / #activeUsers, 1 * user-limit-percentage%)
queuePartitionUserAMResource = min(userAMLimitResource, maxUserAMLimit);
{code}
I am not able to understand the logic of the multiplication *effectiveUserLimit * userLimitFactor*.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106191#comment-15106191 ]

Wangda Tan commented on YARN-3945:
----------------------------------

bq. i feel better not to consider userLimit and userLimitFactor at all, to reduce the confusion for the number of applications per user.
I would prefer this proposal.

bq. IMO numAppsPerUser can be greater than numAppsPerQueue and user-resource and user-am-resource greater than the queue's resource or queue's AM resource only when userLimitFactor is of really greater value, so is it actually required to be greater than 1, Is it sufficient to restrict this to 1 ?
I think it's better to only cap it by the max possible value of the queue (queue's max capacity / queue's max application number). Users can still set ULF as they want, but we will return the capped value to the user.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104199#comment-15104199 ]

Naganarasimha G R commented on YARN-3945:
-----------------------------------------

Thanks for the comment [~wangda],
But the particular case which I raised in the forum was:
{quote}
Came across one scenario wherein maxApplications at cluster level (2 nodes) was set to a low value like 10 and, based on the capacity configuration for a particular queue, it came to 2. But further, while calculating maxApplicationsPerUser, the formula used is:
maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) * userLimitFactor);
{quote}
I had kept the user limit factor as 1 and the user limit as 25, so it came to zero. In my opinion this is wrong. I feel it is better not to consider *userLimit and userLimitFactor* at all, to reduce confusion about the number of applications per user.

Further, even if we consider user limit and userLimitFactor for *UserAMResource*, the current approach of calculating it in {{getUserAMResourceLimitPerPartition}} is different from {{computeUserLimit}}.
In *getUserAMResourceLimitPerPartition*:
{code}
return Resources.multiplyAndNormalizeUp(resourceCalculator, queuePartitionResource,
    queueCapacities.getMaxAMResourcePercentage(nodePartition)
        * effectiveUserLimit * userLimitFactor,
    minimumAllocation);
{code}
In *computeUserLimit*:
{code}
// Cap final user limit with maxUserLimit
userLimitResource =
    Resources.roundUp(
        resourceCalculator,
        Resources.min(
            resourceCalculator, clusterResource,
            userLimitResource,
            maxUserLimit
        ),
        minimumAllocation);
{code}
IMO it should be min(userLimitResource, maxUserLimit) and not a multiple of it. Thoughts?

bq. Now numAppsPerUser could be more than numAppsPerQueue (because of user-limit). Same to user-resource and user-am-resource, it will be helpful to make sure they're capped by queue's limitation (am-resource, number-am, queue-max-resource, etc.).
IMO numAppsPerUser can be greater than numAppsPerQueue, and user-resource and user-am-resource greater than the queue's resource or the queue's AM resource, only when userLimitFactor has a really large value. So is it actually required to be greater than 1? Is it sufficient to restrict it to 1?
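The zero result reported in the comment above follows directly from the integer cast in the quoted formula. A minimal Java reproduction, using a hypothetical class name and the values from the scenario (queue maxApplications of 2, user limit 25, user limit factor 1):

```java
// Reproduces the arithmetic of the formula quoted in the issue; class name is hypothetical.
final class MaxAppsTruncationExample {

    static int maxApplicationsPerUser(int maxApplications, int userLimit, float userLimitFactor) {
        // 2 * (25 / 100.0f) * 1 = 0.5f, and the (int) cast truncates toward zero.
        return (int) (maxApplications * (userLimit / 100.0f) * userLimitFactor);
    }
}
```

With these inputs the product is 0.5, which the cast truncates to 0, so under this formula no user could submit an application to that queue.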
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093246#comment-15093246 ]

Wangda Tan commented on YARN-3945:
----------------------------------

Hi [~Naganarasimha],
I agree that fixing user limit is a non-trivial fix and should be done in a separate JIRA.

Thinking hard about this issue, I feel maybe we shouldn't tie user-limit to max-application too much:
- We have max-am-percent already, and CS respects it.
- With the above, maybe we don't have to limit the number of applications per user. I feel it's not that important.
- Two different dimensions of limitation (max-am-resource-per-user and max-number-per-user) could lead to under-utilization. (A user could use less AM resource but be unable to get a new AM container allocated because app-number is exceeded.)

I would suggest a simple fix:
Now numAppsPerUser could be more than numAppsPerQueue (because of user-limit). The same goes for user-resource and user-am-resource. It will be helpful to make sure they're capped by the queue's limits (am-resource, number-am, queue-max-resource, etc.). With this, users will not be confused by the web UI reporting that the max resource of a user could exceed the max resource of a queue.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091384#comment-15091384 ]

Hadoop QA commented on YARN-3945:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 7m 53s | trunk passed |
| +1 | compile | 0m 26s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 34s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 12s | trunk passed |
| -1 | javadoc | 0m 22s | hadoop-yarn-server-resourcemanager in trunk failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 27s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 28s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 28s | the patch passed |
| +1 | checkstyle | 0m 13s | the patch passed |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 16s | the patch passed |
| -1 | javadoc | 0m 19s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_91 |
| -1 | unit | 65m 28s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 65m 56s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 149m 7s | |

|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781477/YARN-
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649531#comment-14649531 ]

Wangda Tan commented on YARN-3945:
----------------------------------

Thanks for comments, [~nroberts]!

bq. I don't think we can change it in any significant way at this point without a major configuration switch that clearly indicates you're getting different behavior. I'm sure admins have built up clusters with this tuned in very specific ways, a significant change wouldn't be compatible with their expectations.
I agree that we cannot change the behavior of this option itself, but I think we can add a new option instead.

bq. I don't really agree with this. It may not be doing an ideal job but I think the intent is to introduce fairness between users. It's a progression from 0 being the most fair, and 100+ being more fifo. In your example it's trying to get everyone 50% which isn't likely to happen so in this case it's going to operate mostly fifo. If the intent is to be much more fair across the 10 users, then a much smaller value would be appropriate.
The problem I can see is that it uses #active-user to compute user-limit, and this can lead to unfairness. An example:
A queue has 100 guaranteed resource (capacity=max-capacity=100), and minimum-user-limit=25.
There are 4 users in the queue, using u1=40, u2=30, u3=20, u4=10 resources.
After a while, u3 finishes its application, so there are 20 available resources. Only u2 and u1 are asking for resources.
So the user-limit = max(1/#active-user, 25/100) = 50. It is therefore possible that u2 gets all the available resource, and usage becomes u1=40, u2=50, u3=0, u4=10.
This is very unfair to me, and I think currently we cannot relieve this issue by tuning minimum-user-limit.

bq. Since the scheduler can't predict what an application is going to request in the future, I don't see how a predictable formula is even possible (ignoring the possibility of taking away resources via in-queue preemption). It's not great, but being fair to currently requesting users makes some bit of sense.
The definition of predictable in my mind is: given the resource request of each user and the queue's guaranteed/available/used resource, we can determine how much resource each user can get. The above example shows that we cannot determine this. Thinking more fairly, when there is any available resource, we should give it to users that have demand while also respecting their usage (e.g. we should give the 20 available resource to u4 to make usage u1=40, u2=30, u3=0, u4=30).

bq. user-limit-factor is the max-resource-limit of each user today, right? The second one seems very hard to track. It seems like one of the initial users can stay in the "guaranteed" set as long as he keeps requesting resources. This doesn't seem very fair to the users only getting idle shares.
You're correct, it is not good. How about computing a fair share (the same way the fair scheduler computes fair share) for users within a queue? It would be a new option like (enable-user-fair-share), and users can choose to use minimum-user-limit OR enable-user-fair-share.
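The unfairness example in the comment above can be traced with a small Java sketch. It uses a simplified scalar model in which the user limit is derived only from the number of currently requesting users; the class and method names are hypothetical and the real scheduler logic is more involved.

```java
// Simplified scalar model of the example in the comment; not CapacityScheduler code.
final class ActiveUserLimitExample {

    // user-limit = queue capacity * max(1/#active-user, minimum-user-limit%).
    static int userLimit(int queueCapacity, int activeUsers, int minimumUserLimitPercent) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        int percent = Math.max(100 / activeUsers, minimumUserLimitPercent);
        return queueCapacity * percent / 100;
    }

    // How much more a user may receive: bounded by the free resource in the
    // queue and by the gap between the user's limit and current usage.
    static int grantable(int currentUsage, int userLimit, int available) {
        return Math.max(0, Math.min(available, userLimit - currentUsage));
    }
}
```

With capacity 100, two active users and a minimum of 25, the limit is 50. u2 (usage 30) can then absorb all 20 free units, ending at u1=40, u2=50, u3=0, u4=10 as described.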
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649063#comment-14649063 ]

Naganarasimha G R commented on YARN-3945:

[~nroberts] I missed checking this flow. Thanks for the clarification!
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648418#comment-14648418 ]

Nathan Roberts commented on YARN-3945:

Hi [~leftnoteasy]. Regarding minimum_user_limit_percent:
- I totally agree it is very confusing.
- I don't think we can change it in any significant way at this point without a major configuration switch that clearly indicates you're getting different behavior. I'm sure admins have built up clusters with this tuned in very specific ways; a significant change wouldn't be compatible with their expectations.

bq. User-limit is not a fairness mechanism to balance resources between users, instead, it can lead to bad imbalance. One example is, if we set user-limit = 50, and there are 10 users running, we cannot manage how much resource can be used by each user.

I don't really agree with this. It may not be doing an ideal job, but I think the intent is to introduce fairness between users. It's a progression, with 0 being the most fair and 100+ being more FIFO. In your example it's trying to get everyone 50%, which isn't likely to happen, so in this case it's going to operate mostly FIFO. If the intent is to be much more fair across the 10 users, then a much smaller value would be appropriate.

bq. meaningful since #active-user is changing every minute, it is not a predictable formula.

Since the scheduler can't predict what an application is going to request in the future, I don't see how a predictable formula is even possible (ignoring the possibility of taking away resources via in-queue preemption). It's not great, but being fair to currently requesting users makes some bit of sense.

bq. Instead we may need to consider some notion like fair sharing: user-limit-factor becomes max-resource-limit of each user, and user-limit-percentage becomes something like guaranteed-concurrent-#user, when #user > guaranteed-concurrent-#user, rest users can only get idle shares.

user-limit-factor is the max-resource-limit of each user today, right? The second one seems very hard to track. It seems like one of the initial users can stay in the "guaranteed" set as long as he keeps requesting resources. This doesn't seem very fair to the users only getting idle shares.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648032#comment-14648032 ]

Nathan Roberts commented on YARN-3945:

bq. Though the class doc (ActiveUsersManager) also says the same, implementation-wise I was not sure it is considered that way, as ActiveUsersManager.deactivateApplication (which takes care of decrementing the active users count) is called on application finish only (correct me if I am wrong).

I think it's also done from checkForDeactivation(), which is called when the outstanding resource requests change (either something got allocated, or a resource request gets updated).
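The notion of an "active" user discussed above can be sketched as a small bookkeeping class. This is a toy model for illustration only: the class and method names are hypothetical and it is not the real ActiveUsersManager. It assumes a user counts as active while at least one of that user's applications still has outstanding resource requests, and that the check runs whenever the outstanding requests change.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ActiveUsersSketch {

    // user -> applications of that user that still have pending requests
    private final Map<String, Set<String>> activeAppsByUser = new HashMap<>();

    /** Call whenever an application's number of pending requests changes. */
    void onPendingRequestsChanged(String user, String appId, int pendingRequests) {
        if (pendingRequests > 0) {
            activeAppsByUser.computeIfAbsent(user, u -> new HashSet<>()).add(appId);
            return;
        }
        // Nothing pending any more: the application is no longer active,
        // even though it may still be running.
        Set<String> apps = activeAppsByUser.get(user);
        if (apps != null) {
            apps.remove(appId);
            if (apps.isEmpty()) {
                activeAppsByUser.remove(user);
            }
        }
    }

    int numActiveUsers() {
        return activeAppsByUser.size();
    }
}
```

In this model a user drops out of the active count as soon as the last pending request is satisfied, not only when the application finishes.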
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646975#comment-14646975 ]

Wangda Tan commented on YARN-3945:

I forgot to mention: the maxApplicationsPerUser computation is a byproduct of user-limit. I would like to see if we can reach some consensus about whether or not to change user-limit before fixing maxApplicationsPerUser based on the existing user-limit assumptions.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646964#comment-14646964 ]

Wangda Tan commented on YARN-3945:

Thanks for summarizing, [~Naganarasimha].

I think we *might* need to reconsider the user-limit / user-limit-factor configuration. I can also see that it's hard to understand:
- User-limit is neither a lower bound nor an upper bound.
- User-limit is not a fairness mechanism to balance resources between users; instead, it can lead to bad imbalance. One example: if we set user-limit = 50 and there are 10 users running, we cannot manage how much resource can be used by each user.
- It's really hard to understand. I work on CapacityScheduler almost every day, but sometimes I still forget and need to look at the code to see how it is computed. :-(

Basically, user-limit is computed by:
{{user-limit = min(queue-capacity * user-limit-factor, current-capacity * max(user-limit / 100, 1 / #active-user))}}

But this formula is not that meaningful: since #active-user is changing every minute, it is not a predictable formula.

Instead we may need to consider some notion like fair sharing: user-limit-factor becomes the max-resource-limit of each user, and user-limit-percentage becomes something like guaranteed-concurrent-#user; when #user > guaranteed-concurrent-#user, the remaining users can only get idle shares. With this approach, and considering that we have user-limit preemption within a queue (YARN-2113), we can get a predictable user-limit.

Thoughts? [~nroberts], [~jlowe].
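As a rough illustration of the user-limit formula in the comment above, the following sketch evaluates it for a few inputs. It is not the CapacityScheduler code: the class and parameter names are made up for this example, resources are treated as plain numbers, and the guard for zero active users is an added assumption.

```java
class UserLimitSketch {

    /**
     * user-limit = min(queue-capacity * user-limit-factor,
     *                  current-capacity * max(user-limit / 100, 1 / #active-user))
     */
    static double userLimit(double queueCapacity, double currentCapacity,
                            double userLimitPercent, double userLimitFactor,
                            int activeUsers) {
        // Assumption: treat "no active users" as one user to avoid dividing by zero.
        int users = Math.max(1, activeUsers);
        double share = Math.max(userLimitPercent / 100.0, 1.0 / users);
        return Math.min(queueCapacity * userLimitFactor, currentCapacity * share);
    }

    public static void main(String[] args) {
        // user-limit = 50 with 10 active users: every user is computed a limit
        // of 50, although only two users can actually reach it at once.
        System.out.println(userLimit(100, 100, 50, 1, 10));
        // user-limit = 25 with 2 active users: 1 / #active-user dominates.
        System.out.println(userLimit(100, 100, 25, 1, 2));
    }
}
```

The first call shows the imbalance described in the comment: the computed limit is the same for all ten users, so which of them actually gets resources depends on ordering rather than on the formula.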
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646048#comment-14646048 ]

Naganarasimha G R commented on YARN-3945:

[~wangda] & [~nroberts],
The checkstyle report is not accurate, since my Eclipse code-format template follows the coding guidelines wiki, and the whitespace is not exactly in the lines that were modified, but I can correct it along with the other review comments and doc updates.
Can you please check the implementation and my comments on the doc so that I can modify them as required?
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645959#comment-14645959 ]

Hadoop QA commented on YARN-3945:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 0s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac | 7m 43s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 0m 48s | The applied patch generated 1 new checkstyle issues (total was 92, now 91). |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 52m 13s | Tests passed in hadoop-yarn-server-resourcemanager. |
| | | 90m 5s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747768/YARN-3945.20150729-1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6374ee0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8704/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8704/artifact/patchprocess/whitespace.txt |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8704/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8704/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8704/console |

This message was automatically generated.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645461#comment-14645461 ]

Hadoop QA commented on YARN-3945:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 34s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac | 3m 31s | The patch appears to cause the build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747702/YARN-3945.20150728-1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0712a81 |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8703/console |

This message was automatically generated.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645408#comment-14645408 ]

Naganarasimha G R commented on YARN-3945:

Hi [~wangda],
I have added an initial patch based on the approach I mentioned (which seems to be correct with respect to the calculations in other places), but the docs are not yet updated.

Hi [~nroberts],
Thanks for sharing your views, but I have a few queries.

bq. My understanding is that it tries to guarantee all active applications this percentage of a queue's capacity (configured or current, whichever is larger). Note: an active application is one that is currently requesting resources, a running application that has all the resources it needs, is NOT active. If one application stops asking for additional resources, the other applications can certainly go higher than the 50%. user-limit-factor is what determines the absolute maximum capacity a user can consume within a queue

Though the class doc (ActiveUsersManager) also says the same, implementation-wise I was not sure it is considered that way, as ActiveUsersManager.deactivateApplication (which takes care of decrementing the active users count) is called on application finish only (correct me if I am wrong).
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635246#comment-14635246 ]

Nathan Roberts commented on YARN-3945:

My feeling is that the documentation on minimum-user-limit-percent needs a rewrite. It makes it sound like minimum-user-limit-percent caps the amount of resource at, say, 50% if there are 2 applications submitted to the queue. This isn't the case (afaik).

My understanding is that it tries to guarantee all active applications this percentage of a queue's capacity (configured or current, whichever is larger). Note: an active application is one that is currently requesting resources; a running application that has all the resources it needs is NOT active. If one application stops asking for additional resources, the other applications can certainly go higher than the 50%. user-limit-factor is what determines the absolute maximum capacity a user can consume within a queue.

Basically, minimum-user-limit-percent defines how fair the queue is. The lower the value, the sooner the queue will try to spread resources evenly across all users in the queue. The higher the value, the more FIFO it behaves.
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634306#comment-14634306 ]
Naganarasimha G R commented on YARN-3945:
There is also a typo in the documentation: the highlighted sentence in the description begins with "The" twice. I also feel we need to mention the effective formula for the maximum number of applications per user, based on {{yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent}} and {{yarn.scheduler.capacity.<queue-path>.user-limit-factor}}. Thoughts?
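For reference, the formula quoted in the issue description, {{maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) * userLimitFactor)}}, can be worked through as below. This is a minimal sketch rather than the scheduler's code. It assumes, as the comment above suggests, that userLimit is the value of minimum-user-limit-percent and userLimitFactor is the value of user-limit-factor. The class name and the sample numbers are hypothetical.

```java
final class CurrentMaxAppsSketch {

    /** The formula exactly as quoted in the issue description. */
    static int maxApplicationsPerUser(int maxApplications, int userLimit, float userLimitFactor) {
        return (int) (maxApplications * (userLimit / 100.0f) * userLimitFactor);
    }

    public static void main(String[] args) {
        // Hypothetical queue allowing 10000 applications in total.
        System.out.println(maxApplicationsPerUser(10000, 25, 1.0f));  // 2500
        System.out.println(maxApplicationsPerUser(10000, 25, 2.0f));  // 5000
        System.out.println(maxApplicationsPerUser(10000, 100, 1.0f)); // 10000
    }
}
```

Note that the result does not depend on how many users are active: with userLimit = 25 and userLimitFactor = 1, a single active user is still capped at a quarter of maxApplications.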
[jira] [Commented] (YARN-3945) maxApplicationsPerUser is wrongly calculated
[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634298#comment-14634298 ]
Naganarasimha G R commented on YARN-3945:
As per the suggestion in the forum by [~wangda], the formula should be {{max(1, maxApplications * max(userlimit/100, 1/#activeUsers))}}, but I think it should be something like
{code}
// Treat activeUsers as at least 1 so that the division never divides by zero.
float userLimitRatio = Math.max(userLimit / 100.0f, 1.0f / Math.max(1, activeUsers));
maxApplicationsPerUser = (int) (maxApplications * userLimitRatio * userLimitFactor);
{code}
Thoughts?
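The two proposals in the comment above can be compared side by side. The following is only a sketch of the formulas as written in the comment, not a patch. The class and method names are hypothetical, and activeUsers is clamped to at least 1 on the assumption that a queue with no active users should behave like one with a single user.

```java
final class ProposedMaxAppsSketch {

    /** max(userLimit/100, 1/#activeUsers), with activeUsers treated as at least 1. */
    static float userLimitRatio(int userLimit, int activeUsers) {
        int users = Math.max(1, activeUsers); // avoid division by zero
        return Math.max(userLimit / 100.0f, 1.0f / users);
    }

    /** The forum suggestion: max(1, maxApplications * userLimitRatio). */
    static int forumSuggestion(int maxApplications, int userLimit, int activeUsers) {
        return Math.max(1, (int) (maxApplications * userLimitRatio(userLimit, activeUsers)));
    }

    /** The variant from the comment, which also applies userLimitFactor. */
    static int withUserLimitFactor(int maxApplications, int userLimit,
                                   float userLimitFactor, int activeUsers) {
        return (int) (maxApplications * userLimitRatio(userLimit, activeUsers) * userLimitFactor);
    }

    public static void main(String[] args) {
        // Hypothetical queue: 10000 applications in total, userLimit = 25.
        System.out.println(forumSuggestion(10000, 25, 2));           // 5000
        System.out.println(forumSuggestion(10000, 25, 8));           // 2500
        System.out.println(withUserLimitFactor(10000, 25, 2.0f, 2)); // 10000
    }
}
```

The forum suggestion never returns less than 1, while the second variant can return 0 for a very small maxApplications. The second variant also scales with userLimitFactor, so a factor above 1 raises the per-user cap.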