[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Tryuber updated MAPREDUCE-3859:
--
Release Note: Fixed wrong CapacityScheduler resource allocation for high memory consumption jobs
Status: Patch Available (was: Open)

Fix is for MR1 only. Test + fix is in the patch.

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
Attachments: test-to-fail.patch.txt

Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity. Jobs whose map tasks each use 3 slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
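The numbers in the issue description can be reproduced with a toy model. This is only a sketch of the arithmetic, not the scheduler's actual code: the function name `max_slots_consumed` is hypothetical, and the sketch assumes that "20 as extra-capacity" means 20 slots on top of the base 10 (a ceiling of 30). It compares a limit check made against the base capacity with one made against the assumed ceiling.

```python
def max_slots_consumed(limit, slots_per_task, cluster_slots):
    """Greedily start tasks while the next task still fits under `limit`.

    Hypothetical model: a task is admitted only if the slots already
    occupied plus the slots it needs stay within both the queue limit
    and the number of slots in the cluster.
    """
    occupied = 0
    while (occupied + slots_per_task <= limit
           and occupied + slots_per_task <= cluster_slots):
        occupied += slots_per_task
    return occupied


CAPACITY = 10        # base queue capacity from the issue description
EXTRA = 20           # assumed to be additional slots beyond CAPACITY
SLOTS_PER_TASK = 3   # slots used by each map task of the high-memory job
CLUSTER_SLOTS = 100  # assumed: plenty of free slots on the cluster

# Limit check against the base capacity: the 4th task would need 12 > 10.
observed = max_slots_consumed(CAPACITY, SLOTS_PER_TASK, CLUSTER_SLOTS)

# Limit check against the assumed ceiling of capacity plus extra-capacity.
expected = max_slots_consumed(CAPACITY + EXTRA, SLOTS_PER_TASK, CLUSTER_SLOTS)

print(observed, expected)
```

Under these assumptions the first call stops at 9 slots, matching the reported symptom, while the second reaches 30. If "extra-capacity" instead denotes a ceiling of 20 slots in total, pass 20 as the limit.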
[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Tryuber updated MAPREDUCE-3859:
--
Attachment: MAPREDUCE-3859_MR1_fix_and_test.patch.txt

Test case and fix.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661838#comment-13661838 ]

Sergey Tryuber commented on MAPREDUCE-3859:
---
Arun, I've attached a patch for branch-1 with a test case and the fix (thanks for pointing me to the right branch).

"happy to help with YARN/trunk if you want." - yes, please. I had trouble understanding the test cases for the YARN version of CS. I'm not sure the testing architecture is sound: there is one huge capacity scheduler configuration with lots of queues, it is created at the beginning of each test by the Before method, and every test uses that configuration. I think this is not a good choice, because it doesn't allow testing edge cases and is hard to understand (there are no comments at all). So please, could you help me and take care of the fix for YARN?

P.S. Hardcored mocks are great, but personally I'd prefer the old-school approach with inversion of control (strategy pattern) and agile architecture.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13662647#comment-13662647 ]

Sergey Tryuber commented on MAPREDUCE-3859:
---
Thanks, Arun

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
Fix For: 2.0.5-beta, 1.2.1
Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659278#comment-13659278 ]

Sergey Tryuber commented on MAPREDUCE-3859:
---
Mike, that's great! So I think this task can be closed, unless someone from Cloudera (their MR1 in CDH4 is still affected) wants to take care of this issue and port the fix for the old Capacity Scheduler into their sources.

For others who face this issue, below is a brief step-by-step instruction for CDH4.1.2:

* Download the sources from https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads. Note: you need the hadoop-0.20-mapreduce-0.20.2+1265 tarball.
* Unpack it and go to the root directory.
* Apply the changes from the first comment and the test case from the attached patch.
* Also add the following lines:
{code}
reactor.repo=https\://repository.cloudera.com/content/repositories/snapshots
version=2.0.0-mr1-cdh4.1.2
{code}
to the src/contrib/index/ivy/libraries.properties and src/contrib/capacity-scheduler/ivy/libraries.properties files.
* Test the fixes that were made:
{code}
ant test-contrib
{code}
* Build a jar file:
{code}
cd src/contrib/capacity-scheduler/
ant jar
cd -
{code}
* The resulting file will be placed at build/contrib/capacity-scheduler/hadoop-capacity-scheduler-2.0.0-mr1-cdh4.1.2.jar.
* Replace the original file with the fixed one on the node where the JobTracker runs. The original file is located in the /usr/lib/hadoop-0.20-mapreduce/contrib/capacity-scheduler/ directory.
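The step that adds two lines to both libraries.properties files can be scripted. The following is a minimal sketch, not part of the original instructions: the helper name `add_properties` is hypothetical, and it assumes that appending the lines at the end of each file is acceptable and that both files already exist under the unpacked source root.

```python
from pathlib import Path

# The two lines quoted in the instructions above (the backslash is part
# of the properties-file syntax and must be kept).
PROPERTY_LINES = (
    r"reactor.repo=https\://repository.cloudera.com/content/repositories/snapshots",
    "version=2.0.0-mr1-cdh4.1.2",
)

# The two files named in the instructions, relative to the source root.
TARGETS = (
    "src/contrib/index/ivy/libraries.properties",
    "src/contrib/capacity-scheduler/ivy/libraries.properties",
)


def add_properties(source_root):
    """Append any missing PROPERTY_LINES to each target file.

    Returns the list of files that were modified, so running the helper
    a second time changes nothing and returns an empty list.
    """
    changed = []
    for rel in TARGETS:
        path = Path(source_root) / rel
        text = path.read_text()
        present = set(text.splitlines())
        missing = [line for line in PROPERTY_LINES if line not in present]
        if not missing:
            continue
        if text and not text.endswith("\n"):
            text += "\n"  # keep the last existing property on its own line
        path.write_text(text + "\n".join(missing) + "\n")
        changed.append(path)
    return changed
```

Calling `add_properties(".")` from the root directory of the unpacked tarball would apply the change in place. It does not check whether a different `version=` value is already set, so inspect the files afterwards.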
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13658317#comment-13658317 ]

Sergey Tryuber commented on MAPREDUCE-3859:
---
Mike, your test results look a little strange even for 2 slots per reducer: you said that max capacity is 60, so I would expect all 60 slots to be used in this case. Try playing with the user limit factor. Also try setting the initial capacity a little higher than 4 slots. I'm afraid there is another bug, unrelated to this one, in how slots per task interact with initial capacity.

Arun, Matt, today I had a look at trunk (I believe this is what you call the 1.3.0 release, because there is no 1.3 branch) and found a fully reworked capacity scheduler (YARN) there. It uses a new abstraction called Resource instead of slot/task. I dug into it for a couple of hours and got the feeling that this bug is gone there. I even found a test that covers something similar and tried to create my own, but the test case (TestLeafQueue.java) is organized very poorly and basically tests nothing (mocks over mocks, no human-readable logic, and so on). Sorry, I spent a couple of hours trying to rewrite it and realized it would take me several more days, so I gave up. But, once again, the bug seems to be gone in the YARN version of CS, so there is nothing to fix there.

For everyone else affected by this bug (old Capacity Scheduler), please use the hot fix from my first comment. Or, Arun, you can commit that fix and the attached test case (yep, the old CapacityScheduler was covered by test cases much better than the YARN one) to the appropriate branch. I just don't know which branch to use, and I didn't find a contrib module in trunk.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13618011#comment-13618011 ]

Sergey Tryuber commented on MAPREDUCE-3859:
---
What I know is that the bug is still present in CDH4.1 MR1. So we had to patch Capacity Scheduler there as well...