[jira] [Commented] (MAPREDUCE-3681) capacity scheduler LeafQueues calculate used capacity wrong
[ https://issues.apache.org/jira/browse/MAPREDUCE-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190867#comment-13190867 ]

Hadoop QA commented on MAPREDUCE-3681:
--

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12511466/MAPREDUCE-3681.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1650//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1650//console

This message is automatically generated.

> capacity scheduler LeafQueues calculate used capacity wrong
> ---
>
> Key: MAPREDUCE-3681
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3681
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Thomas Graves
> Assignee: Arun C Murthy
> Priority: Critical
> Attachments: MAPREDUCE-3681.patch, MAPREDUCE-3681.patch, MAPREDUCE-3681.patch
>
>
> In the Capacity scheduler, if you configure the queues to be hierarchical,
> where you have root -> parent queue -> leaf queue, the leaf queue doesn't
> calculate the used capacity properly. It seems to be using the entire cluster
> memory rather than its parent's memory capacity.
> In updateResource in LeafQueue:
>     setUsedCapacity(
>         usedResources.getMemory() / (clusterResource.getMemory() * capacity));
> I think the clusterResource.getMemory() should be something like
> getParentsMemory().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
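To make the difference in the description concrete, here is a minimal sketch of the two calculations as plain arithmetic. It is not the scheduler code: the class name, the method names, the `parentMb` argument (standing in for the suggested "getParentsMemory()", which is only the reporter's proposal) and all of the numbers are assumptions chosen for illustration. With a 100 GB cluster, a parent entitled to 40 GB, and a leaf at 50% of its parent using 10 GB, the quoted formula gives 10 / (100 * 0.5) = 20%, while the suggested one gives 10 / (40 * 0.5) = 50%.

```java
// Illustrative sketch only: plain arithmetic, not the scheduler's LeafQueue or Resource classes.
final class UsedCapacitySketch {

    /** The calculation quoted in the description: cluster memory scaled by the leaf's own capacity. */
    static float usedCapacityQuoted(int usedMb, int clusterMb, float capacity) {
        return usedMb / (clusterMb * capacity);
    }

    /**
     * The reporter's suggestion, with a hypothetical parentMb argument standing in
     * for "getParentsMemory()" (an assumed name, not a confirmed method).
     */
    static float usedCapacitySuggested(int usedMb, int parentMb, float capacity) {
        return usedMb / (parentMb * capacity);
    }

    public static void main(String[] args) {
        int clusterMb = 100 * 1024;  // assumed cluster size: 100 GB
        int parentMb = 40 * 1024;    // assumed: the parent queue is entitled to 40% of the cluster
        int usedMb = 10 * 1024;      // assumed: the leaf currently uses 10 GB
        float leafCapacity = 0.5f;   // assumed: the leaf is configured at 50% of its parent

        System.out.printf("quoted formula:    %.1f%%%n",
                100 * usedCapacityQuoted(usedMb, clusterMb, leafCapacity));
        System.out.printf("suggested formula: %.1f%%%n",
                100 * usedCapacitySuggested(usedMb, parentMb, leafCapacity));
    }
}
```

The two results diverge whenever the parent is entitled to less than the whole cluster, which is exactly the root -> parent queue -> leaf queue case the description is about.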
[jira] [Updated] (MAPREDUCE-3681) capacity scheduler LeafQueues calculate used capacity wrong
[ https://issues.apache.org/jira/browse/MAPREDUCE-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-3681:
-

Attachment: MAPREDUCE-3681.patch

While testing, I had already changed the calculation to

    usedCapacity = usedResources.getMemory() / (clusterResource.getMemory() * parent.getAbsoluteCapacity())

The attached patch is what I was running.
[jira] [Commented] (MAPREDUCE-3681) capacity scheduler LeafQueues calculate used capacity wrong
[ https://issues.apache.org/jira/browse/MAPREDUCE-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190842#comment-13190842 ]

Thomas Graves commented on MAPREDUCE-3681:
--

Thanks for the update, Arun. I was misunderstanding what usedCapacity meant. I think I understand it now, but I don't think the current formula for usedCapacity works in all cases.

Please correct me if I’m off on the definition here. I believe usedCapacity is meant to be the percentage of the parent that I am using, similar to how capacity and max capacity apply to the parent.

I expected usedCapacity to be what utilization is defined as, since that is what I thought users would be interested in: how much of the resources normally allocated to my queue (absoluteCapacity) am I using right now? I think utilization = absoluteUsedCapacity. Is a user normally going to care what percentage of the parent queue they are using? To me, absoluteCapacity is the interesting number, because my queue has X amount of memory allocated to it and I have Y left. I can see admins being interested in usedCapacity, as it lets them quickly see what percentage of the parent is being used by the various leaf queues.
Right now the definitions are:

    utilization  = usedResources.getMemory() / (clusterResource.getMemory() * absoluteCapacity)
    usedCapacity = usedResources.getMemory() / (clusterResource.getMemory() * capacity)

Here is an example which I believe demonstrates usedCapacity being off. Let's say the cluster has 32G of memory in total.

Queue definitions:

    root      - capacity = absoluteCapacity = 100% - 32G
    root.a    - capacity = 50%, absoluteCap = 100% * 50% = 50% - 16G
    root.a.a1 - capacity = 50%, absoluteCap = 100% * 50% * 50% = 25% - 8G
    root.a.a2 - capacity = 50%, absoluteCap = 100% * 50% * 50% = 25% - 8G
    root.b    - capacity = 20%, absoluteCap = 100% * 20% = 20% - 6.4G
    root.b.b1 - capacity = 60%, absoluteCap = 100% * 20% * 60% = 12% - 3.84G, MaxCap = -1 (lets me go over my capacity)
    root.c    - capacity = 30%, absoluteCap = 100% * 30% = 30% - 9.6G
    root.c.c1 - capacity = 75%, absoluteCap = 100% * 30% * 75% = 22.5% - 7.2G

Now I start a job that uses 4G, submitted to a1:

    root      used = 4/(32*1)  = 12.5%, utilization = 4/(32*1)   = 12.5% (4G of 32G)
    root.a    used = 4/(32*.5) = 25%,   utilization = 4/(32*.5)  = 25%   (4G of 16G)
    root.a.a1 used = 4/(32*.5) = 25%,   utilization = 4/(32*.25) = 50%   (4G of 8G)

The a1 numbers just happen to work because a1's capacity equals its parent's absoluteCap.

The job in a1 finishes. Now start a job that uses 4G, submitted to b1:

    root      used = 4/(32*1)  = 12.5%, utilization = 4/(32*1)   = 12.5% (4G of 32G)
    root.b    used = 4/(32*.2) = 62.5%, utilization = 4/(32*.2)  = 62.5% (4G of 6.4G)
    root.b.b1 used = 4/(32*.6) = 20.8% (4G of 6.4G ??), utilization = 4/(32*.12) = 104% (4G of 3.84G)

The 20.8% doesn't make sense to me. I'm using 4G of my parent's 6.4G, so that should be 62.5%.

The job in b1 finishes.
Now start a job that uses 4G, submitted to c1:

    root      used = 4/(32*1)   = 12.5%, utilization = 4/(32*1)  = 12.5% (4G of 32G)
    root.c    used = 4/(32*.3)  = 41.6%, utilization = 4/(32*.3) = 41.6% (4G of 9.6G)
    root.c.c1 used = 4/(32*.75) = 16.6% (4G of 9.6G ??), utilization = 4/(32*.225) = 55.5% (4G of 7.2G)

Again, I think the 16.6% is wrong, as it should be 4G of the 9.6G, or 41.6%.

I think the formula should be:

    usedCapacity = usedResources.getMemory() / (clusterResource.getMemory() * parent.getAbsoluteCapacity())

> capacity scheduler LeafQueues calculate used capacity wrong
> ---
>
> Key: MAPREDUCE-3681
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3681
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Thomas Graves
> Assignee: Arun C Murthy
> Priority: Critical
> Attachments: MAPREDUCE-3681.patch, MAPREDUCE-3681.patch
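The b1 case from the example above can be reproduced with a small model of the queue tree. This is only a sketch under stated assumptions: `SketchQueue` and its method names are hypothetical stand-ins rather than the scheduler's queue classes, memory is expressed in GB as a float for readability, and treating root's parent capacity as 100% is a choice made here so that the proposed formula is defined for root as well.

```java
// Illustrative model only; SketchQueue is a hypothetical stand-in, not a Hadoop class.
final class SketchQueue {
    final String name;
    final SketchQueue parent;  // null for root
    final float capacity;      // fraction of the parent, e.g. 0.6f for 60%

    SketchQueue(String name, SketchQueue parent, float capacity) {
        this.name = name;
        this.parent = parent;
        this.capacity = capacity;
    }

    /** Fraction of the whole cluster: product of the capacities from this queue up to root. */
    float absoluteCapacity() {
        return parent == null ? capacity : parent.absoluteCapacity() * capacity;
    }

    /** usedCapacity as currently defined in the comment: used / (cluster * capacity). */
    float usedCapacityCurrent(float usedGb, float clusterGb) {
        return usedGb / (clusterGb * capacity);
    }

    /** utilization as defined in the comment: used / (cluster * absoluteCapacity). */
    float utilization(float usedGb, float clusterGb) {
        return usedGb / (clusterGb * absoluteCapacity());
    }

    /** Proposed usedCapacity: used / (cluster * parent's absoluteCapacity). */
    float usedCapacityProposed(float usedGb, float clusterGb) {
        // Assumption: root has no parent, so its denominator is the whole cluster.
        float parentAbsolute = parent == null ? 1.0f : parent.absoluteCapacity();
        return usedGb / (clusterGb * parentAbsolute);
    }
}

final class QueueExample {
    public static void main(String[] args) {
        float clusterGb = 32f;  // cluster size from the example
        float usedGb = 4f;      // the 4G job submitted to b1

        SketchQueue root = new SketchQueue("root", null, 1.0f);
        SketchQueue b = new SketchQueue("root.b", root, 0.2f);
        SketchQueue b1 = new SketchQueue("root.b.b1", b, 0.6f);

        for (SketchQueue q : new SketchQueue[] {root, b, b1}) {
            System.out.printf("%-10s current=%.1f%% utilization=%.1f%% proposed=%.1f%%%n",
                    q.name,
                    100 * q.usedCapacityCurrent(usedGb, clusterGb),
                    100 * q.utilization(usedGb, clusterGb),
                    100 * q.usedCapacityProposed(usedGb, clusterGb));
        }
    }
}
```

For root and root.b the current and proposed values coincide, because their own capacity happens to equal their parent's share of the cluster in this sketch or the denominator is the full cluster. Only the leaf, whose capacity (60%) differs from its parent's absolute capacity (20%), shows the discrepancy.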
[jira] [Resolved] (MAPREDUCE-3525) Shuffle benchmark is nearly 1.5x slower in 0.23
[ https://issues.apache.org/jira/browse/MAPREDUCE-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar resolved MAPREDUCE-3525.
--

Resolution: Fixed
Fix Version/s: 0.23.1

Fixed via MAPREDUCE-3641. Thanks for the performance update, Amol.

> Shuffle benchmark is nearly 1.5x slower in 0.23
> ---
>
> Key: MAPREDUCE-3525
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3525
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: mrv2
> Affects Versions: 0.23.1
> Reporter: Karam Singh
> Assignee: Vinod Kumar Vavilapalli
> Priority: Blocker
> Fix For: 0.23.1
>
>
> Shuffle benchmark is nearly 1.5X slower (almost 55% increase) in 0.23 than
> in Hadoop-0.20.204 on a 350-node cluster.