[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081122#comment-13081122 ]

Arun C Murthy commented on MAPREDUCE-2729:
------------------------------------------

Thomas, it doesn't make sense to port this to trunk - please don't bother, unless you want to look at this vis-a-vis MAPREDUCE-279.

> Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.20.205.0
>         Environment: 0.20.1xx-Secondary
>            Reporter: Sherry Chen
>            Assignee: Sherry Chen
>             Fix For: 0.20.205.0
>
>         Attachments: MAPREDUCE-2729.patch
>
>
> In capacity scheduler, number of users in a queue needing slots are calculated based on whether users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need reduce slots until the minimum number of map tasks have been completed.
> Here, we add checking whether reduce is ready to schedule (i.e. if a job has completed enough map tasks) when we increment number of users in a queue needing reduce slots.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
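The change described in the issue summary can be illustrated with a small sketch. This is not the attached patch: the classes `JobState` and `UsersNeedingReduceSlots`, their fields, and both counting methods are hypothetical names invented for illustration, and the readiness rule (a fixed fraction of maps must complete first) is an assumption based on the summary's wording "completed enough map tasks".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a job's scheduling state; not a Hadoop class. */
final class JobState {
    final String user;
    final int totalMaps;
    final int completedMaps;
    final int pendingReduces;
    final double slowstartFraction; // assumed: fraction of maps that must finish first

    JobState(String user, int totalMaps, int completedMaps,
             int pendingReduces, double slowstartFraction) {
        this.user = user;
        this.totalMaps = totalMaps;
        this.completedMaps = completedMaps;
        this.pendingReduces = pendingReduces;
        this.slowstartFraction = slowstartFraction;
    }

    /** True once enough maps have completed for reduces to be schedulable. */
    boolean reduceReady() {
        int mapsNeeded = (int) Math.ceil(slowstartFraction * totalMaps);
        return completedMaps >= mapsNeeded;
    }
}

final class UsersNeedingReduceSlots {
    /** Behaviour reported as the problem: any pending reduce counts the user. */
    static int countByPending(List<JobState> jobs) {
        Set<String> users = new HashSet<>();
        for (JobState job : jobs) {
            if (job.pendingReduces > 0) {
                users.add(job.user);
            }
        }
        return users.size();
    }

    /** Behaviour proposed as the fix: the job must also be ready for reduces. */
    static int countByReady(List<JobState> jobs) {
        Set<String> users = new HashSet<>();
        for (JobState job : jobs) {
            if (job.pendingReduces > 0 && job.reduceReady()) {
                users.add(job.user);
            }
        }
        return users.size();
    }

    public static void main(String[] args) {
        List<JobState> jobs = Arrays.asList(
            new JobState("user1", 100, 10, 4, 0.05),  // past its 5% threshold
            new JobState("user2", 100, 10, 4, 0.50)); // still below its 50% threshold
        System.out.println(countByPending(jobs)); // prints 2
        System.out.println(countByReady(jobs));   // prints 1
    }
}
```

In this sketch both users have pending reduces, but only user1's job has completed enough maps, so only user1 is counted as needing reduce slots once the readiness check is applied.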
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081118#comment-13081118 ]

Thomas Graves commented on MAPREDUCE-2729:

The patch is for the branch-0.20-security branch. I will look at putting it on trunk.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079451#comment-13079451 ]

Sherry Chen commented on MAPREDUCE-2729:

I manually verified this fix on the 10-node cluster.

Verification steps:
1. Replace hadoop-capacity-scheduler.jar with the fixed version on the cluster gateway.
2. Modify capacity-scheduler.xml to ensure a queue has multiple map and reduce task slots.
3. Restart mapred.
4. Submit jobs for a first user that start reduces when 5% (the default) of maps complete, and submit jobs for a second user (same queue as the first user) that start reduces when 50% of maps complete.
5. Verify that the first user gets all of the queue's reduce capacity that the second user has not used yet, which is greater than the user limit.
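Why step 5 expects the first user to exceed the user limit can be shown with a rough calculation. The sketch below is a simplification and not the scheduler's actual code: the class `UserLimitSketch`, the method `userLimit`, and the formula (an even share among counted users, with a floor given by a minimum percentage) are assumptions made for illustration, as are the queue size of 10 reduce slots and the 25% floor.

```java
/** Hypothetical sketch of a per-user slot limit; not the capacity scheduler's code. */
final class UserLimitSketch {
    /**
     * Assumed rule: each counted user may hold an even share of the queue,
     * but never less than a minimum percentage of it.
     */
    static int userLimit(int queueCapacity, int usersNeedingSlots,
                         int minimumUserLimitPercent) {
        if (usersNeedingSlots <= 0) {
            return queueCapacity; // nobody is competing for the queue
        }
        int evenShare = (int) Math.ceil((double) queueCapacity / usersNeedingSlots);
        int floorShare = (int) Math.ceil(queueCapacity * minimumUserLimitPercent / 100.0);
        return Math.min(queueCapacity, Math.max(evenShare, floorShare));
    }

    public static void main(String[] args) {
        int reduceSlots = 10; // assumed queue size for illustration

        // Both users counted because both have pending reduces.
        System.out.println(userLimit(reduceSlots, 2, 25)); // prints 5

        // Only the first user counted, since the second has not reached 50% of maps.
        System.out.println(userLimit(reduceSlots, 1, 25)); // prints 10
    }
}
```

Under these assumptions, counting the second user before its reduces can be scheduled would cap the first user at half the queue while the other half sits idle; counting only ready users lets the first user take the unused capacity.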
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079129#comment-13079129 ]

Arun C Murthy commented on MAPREDUCE-2729:

Sherry - thanks. Can you please confirm that you manually verified this fix on the cluster? Thanks.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079087#comment-13079087 ]

Sherry Chen commented on MAPREDUCE-2729:

Tested on a 10-node mini cluster; the test passed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073803#comment-13073803 ]

Arun C Murthy commented on MAPREDUCE-2729:

To qualify: please run it on a cluster of 5-10 nodes, verify the fix manually and please let me know. Thanks.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073078#comment-13073078 ]

Arun C Murthy commented on MAPREDUCE-2729:

Sherry, you need to verify this on a real cluster to be safe before we commit this...
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073027#comment-13073027 ]

Sherry Chen commented on MAPREDUCE-2729:

Arun, do you mean I need to run tests on a test cluster? I don't have a cluster to do it on.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073009#comment-13073009 ]

Arun C Murthy commented on MAPREDUCE-2729:

Sherry - I meant what tests you ran at scale to ensure this works...
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072987#comment-13072987 ]

Sherry Chen commented on MAPREDUCE-2729:

Arun, I ran unit-tests and test-patch.

Thx,
Sherry
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072714#comment-13072714 ]

Arun C Murthy commented on MAPREDUCE-2729:

Sherry, the patch looks good. What sort of testing have you done?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071465#comment-13071465 ]

Hadoop QA commented on MAPREDUCE-2729:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12487910/MAPREDUCE-2729.patch
  against trunk revision 1150926.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/504//console

This message is automatically generated.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071421#comment-13071421 ]

Sherry Chen commented on MAPREDUCE-2729:

Ant test passed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071333#comment-13071333 ]

Milind Bhandarkar commented on MAPREDUCE-2729:

It would be good to have a notion of a "ready" task, which is separate from a pending task.