[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-08 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081122#comment-13081122
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

Thomas, it doesn't make sense to port this to trunk - please don't bother, 
unless you want to look at this vis-a-vis MAPREDUCE-279.

> Reducers are always counted having "pending tasks" even if they can't be 
> scheduled yet because not enough of their mappers have completed
> -
>
> Key: MAPREDUCE-2729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.20.205.0
> Environment: 0.20.1xx-Secondary
> Reporter: Sherry Chen
> Assignee: Sherry Chen
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2729.patch
>
>
> In the capacity scheduler, the number of users in a queue needing slots is 
> calculated based on whether the users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need 
> reduce slots until the minimum number of map tasks has been completed.
> Here, we add a check for whether reduces are ready to schedule (i.e. whether 
> a job has completed enough map tasks) when we increment the number of users 
> in a queue needing reduce slots.
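
The check described above can be sketched roughly as follows. This is an illustrative, self-contained model and not the code from MAPREDUCE-2729.patch: the `Job` class, its fields, and the method names are hypothetical, and the readiness rule (completed maps at or above a configured fraction of all maps) is an assumption based on the description.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReduceReadySketch {
    // Hypothetical stand-in for a job; this is not a Hadoop class.
    static final class Job {
        final String user;
        final int totalMaps;
        final int completedMaps;
        final int pendingReduces;
        final double slowStartFraction; // e.g. 0.05 means reduces may start at 5% of maps

        Job(String user, int totalMaps, int completedMaps,
            int pendingReduces, double slowStartFraction) {
            this.user = user;
            this.totalMaps = totalMaps;
            this.completedMaps = completedMaps;
            this.pendingReduces = pendingReduces;
            this.slowStartFraction = slowStartFraction;
        }

        // Assumed rule: reduces are schedulable once enough maps have completed.
        boolean reduceReadyToSchedule() {
            return completedMaps >= slowStartFraction * totalMaps;
        }
    }

    // Count a user only if one of their jobs has pending reduces that are
    // actually ready to schedule, rather than merely pending.
    static int usersNeedingReduceSlots(List<Job> jobs) {
        Set<String> users = new HashSet<>();
        for (Job job : jobs) {
            if (job.pendingReduces > 0 && job.reduceReadyToSchedule()) {
                users.add(job.user);
            }
        }
        return users.size();
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job("alice", 100, 10, 4, 0.05),  // 10 of 100 maps done, ready at 5%
            new Job("bob", 100, 10, 4, 0.50));   // 10 of 100 maps done, not ready at 50%
        System.out.println(usersNeedingReduceSlots(jobs)); // prints 1
    }
}
```

Counting pending reduces alone would report two users here; with the readiness check only the first user is counted.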

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-08 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081118#comment-13081118
 ] 

Thomas Graves commented on MAPREDUCE-2729:
--

The patch is for the branch-0.20-security branch.  I will look at putting it on 
trunk.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-04 Thread Sherry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079451#comment-13079451
 ] 

Sherry Chen commented on MAPREDUCE-2729:


I manually verified this fix on a 10-node cluster.

Verification steps:
1. Replace hadoop-capacity-scheduler.jar with the fixed version on the cluster 
gateway.
2. Modify capacity-scheduler.xml to ensure a queue has multiple map and 
reduce task slots.
3. Restart mapred.
4. Submit jobs for one user that start reduces when 5% (the default) of maps 
complete, and submit jobs for a 2nd user (same queue as the 1st user) that 
start reduces when 50% of maps complete.
5. Verify that the 1st user gets all of the queue's reduce capacity that the 
2nd user has not used yet, even when this is greater than the user limit.
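
The expectation in step 5 can be illustrated with a small worked example. This is a simplified model and not the scheduler's actual code: it assumes the per-user reduce limit is the larger of an even share among the users counted as needing reduce slots and a minimum-percent floor. The method name, the 10-slot queue, and the 25% floor are all hypothetical.

```java
public class UserLimitSketch {
    // Integer ceiling division for positive operands.
    static int ceilDiv(int a, int b) {
        return (a + b - 1) / b;
    }

    // Assumed, simplified rule: the limit is the larger of an even share
    // among counted users and a minimum-percent floor of the queue capacity.
    static int reduceLimitPerUser(int queueReduceSlots,
                                  int usersNeedingReduceSlots,
                                  int minUserLimitPercent) {
        int users = Math.max(1, usersNeedingReduceSlots);
        int evenShare = ceilDiv(queueReduceSlots, users);
        int floor = ceilDiv(queueReduceSlots * minUserLimitPercent, 100);
        return Math.max(evenShare, floor);
    }

    public static void main(String[] args) {
        int slots = 10;      // hypothetical reduce capacity of the queue
        int minPercent = 25; // hypothetical minimum user limit percent

        // Both users counted because both have pending reduces.
        System.out.println(reduceLimitPerUser(slots, 2, minPercent)); // prints 5

        // Only the 1st user counted; the 2nd user's reduces are not ready yet.
        System.out.println(reduceLimitPerUser(slots, 1, minPercent)); // prints 10
    }
}
```

Under this model, counting only users whose reduces are ready lets the 1st user take the whole queue while the 2nd user's jobs are still below their 50% threshold.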







[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-03 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079129#comment-13079129
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

Sherry - thanks. Can you please clarify that you manually verified this fix on 
the cluster? Thanks.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-03 Thread Sherry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079087#comment-13079087
 ] 

Sherry Chen commented on MAPREDUCE-2729:


Tested on a 10-node mini cluster; the test passed.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-01 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073803#comment-13073803
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

To qualify: please run it on a cluster of 5-10 nodes, verify the fix manually 
and please let me know. Thanks.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-29 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073078#comment-13073078
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

Sherry, you need to verify this on a real cluster to be safe before we commit 
this...





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-29 Thread Sherry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073027#comment-13073027
 ] 

Sherry Chen commented on MAPREDUCE-2729:


Arun, do you mean I need to run tests in a test cluster? I haven't got any 
cluster to do it.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-29 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073009#comment-13073009
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

Sherry - I meant what tests you ran at scale to ensure this works...





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-29 Thread Sherry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072987#comment-13072987
 ] 

Sherry Chen commented on MAPREDUCE-2729:


Arun, I ran unit-tests and test-patch. Thx, Sherry





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-29 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072714#comment-13072714
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

Sherry, the patch looks good. What sort of testing have you done?





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071465#comment-13071465
 ] 

Hadoop QA commented on MAPREDUCE-2729:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12487910/MAPREDUCE-2729.patch
  against trunk revision 1150926.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/504//console

This message is automatically generated.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-26 Thread Sherry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071421#comment-13071421
 ] 

Sherry Chen commented on MAPREDUCE-2729:


Ant test passed.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-07-26 Thread Milind Bhandarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071333#comment-13071333
 ] 

Milind Bhandarkar commented on MAPREDUCE-2729:
--

It would be good to have a notion of a "ready" task, which is separate from a 
pending task.
