[jira] [Commented] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks

2013-02-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581980#comment-13581980
 ] 

Hadoop QA commented on MAPREDUCE-4693:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12570083/MAPREDUCE-4693.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-tools/hadoop-rumen.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3345//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3345//console

This message is automatically generated.

> Historyserver should provide counters for failed tasks
> --
>
> Key: MAPREDUCE-4693
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: MAPREDUCE-4693.1.patch
>
>
> Currently the historyserver is not providing counters for failed tasks, even 
> though they are available via the AM as long as the job is still running.  
> Those counters are lost when the client needs to redirect to the 
> historyserver after the job completes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks

2013-02-19 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated MAPREDUCE-4693:
-

Status: Patch Available  (was: Open)

> Historyserver should provide counters for failed tasks
> --
>
> Key: MAPREDUCE-4693
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: MAPREDUCE-4693.1.patch
>
>
> Currently the historyserver is not providing counters for failed tasks, even 
> though they are available via the AM as long as the job is still running.  
> Those counters are lost when the client needs to redirect to the 
> historyserver after the job completes.



[jira] [Assigned] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks

2013-02-19 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned MAPREDUCE-4693:


Assignee: Xuan Gong

> Historyserver should provide counters for failed tasks
> --
>
> Key: MAPREDUCE-4693
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: MAPREDUCE-4693.1.patch
>
>
> Currently the historyserver is not providing counters for failed tasks, even 
> though they are available via the AM as long as the job is still running.  
> Those counters are lost when the client needs to redirect to the 
> historyserver after the job completes.



[jira] [Updated] (MAPREDUCE-4693) Historyserver should provide counters for failed tasks

2013-02-19 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated MAPREDUCE-4693:
-

Attachment: MAPREDUCE-4693.1.patch

> Historyserver should provide counters for failed tasks
> --
>
> Key: MAPREDUCE-4693
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4693
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: MAPREDUCE-4693.1.patch
>
>
> Currently the historyserver is not providing counters for failed tasks, even 
> though they are available via the AM as long as the job is still running.  
> Those counters are lost when the client needs to redirect to the 
> historyserver after the job completes.



[jira] [Commented] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2013-02-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581940#comment-13581940
 ] 

Hadoop QA commented on MAPREDUCE-4502:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12570068/MAPREDUCE-4502.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 16 new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3344//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3344//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3344//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3344//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3344//console

This message is automatically generated.

> Multi-level aggregation with combining the result of maps per node/rack
> ---
>
> Key: MAPREDUCE-4502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, mrv2
>Affects Versions: 3.0.0
>Reporter: Tsuyoshi OZAWA
>Assignee: Tsuyoshi OZAWA
> Attachments: design_v2.pdf, MAPREDUCE-4502.1.patch, 
> MAPREDUCE-4525-pof.diff, speculative_draft.pdf
>
>
> Shuffle is expensive in Hadoop despite the existence of the combiner, 
> because the scope of combining is limited to a single MapTask. 
> One way to address this is to aggregate the results of maps per 
> node/rack by launching a combiner.
> This JIRA is to implement the multi-level aggregation infrastructure, 
> including combining per container (MAPREDUCE-3902 is related) and coordinating 
> containers via the application master without breaking the fault tolerance of jobs.
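
A minimal sketch of the per-node combining idea described above, written as a standalone simulation and not taken from the attached patch. All names here (NodeCombineSketch, MapOutput, combinePerNode, shuffledRecords) are hypothetical, and summing counts stands in for an arbitrary combiner:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from the patch.
final class NodeCombineSketch {

    // One record emitted by a map task: the node it ran on, a key, a count.
    static final class MapOutput {
        final String node;
        final String key;
        final long count;

        MapOutput(String node, String key, long count) {
            this.node = node;
            this.key = key;
            this.count = count;
        }
    }

    // Combine all map outputs that share a node, so each node ships at most
    // one record per key to the reducers. Summation stands in for the combiner.
    static Map<String, Map<String, Long>> combinePerNode(List<MapOutput> outputs) {
        Map<String, Map<String, Long>> perNode = new TreeMap<>();
        for (MapOutput o : outputs) {
            perNode.computeIfAbsent(o.node, n -> new TreeMap<>())
                   .merge(o.key, o.count, Long::sum);
        }
        return perNode;
    }

    // Number of records that would still be shuffled after node-level combining.
    static int shuffledRecords(Map<String, Map<String, Long>> perNode) {
        int total = 0;
        for (Map<String, Long> byKey : perNode.values()) {
            total += byKey.size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<MapOutput> outputs = Arrays.asList(
            new MapOutput("node1", "a", 2),   // first map task on node1
            new MapOutput("node1", "a", 3),   // second map task on node1
            new MapOutput("node1", "b", 1),
            new MapOutput("node2", "a", 4));
        Map<String, Map<String, Long>> perNode = combinePerNode(outputs);
        System.out.println(perNode);
        System.out.println(outputs.size() + " records before, "
            + shuffledRecords(perNode) + " after");
    }
}
```

In this toy input the two "a" records produced on node1 collapse into one, which is the saving a combiner scoped to a single MapTask cannot achieve. Rack-level combining would repeat the same step with the rack as the grouping key.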



[jira] [Updated] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2013-02-19 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4502:
--

Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Multi-level aggregation with combining the result of maps per node/rack
> ---
>
> Key: MAPREDUCE-4502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, mrv2
>Affects Versions: 3.0.0
>Reporter: Tsuyoshi OZAWA
>Assignee: Tsuyoshi OZAWA
> Attachments: design_v2.pdf, MAPREDUCE-4502.1.patch, 
> MAPREDUCE-4525-pof.diff, speculative_draft.pdf
>
>
> Shuffle is expensive in Hadoop despite the existence of the combiner, 
> because the scope of combining is limited to a single MapTask. 
> One way to address this is to aggregate the results of maps per 
> node/rack by launching a combiner.
> This JIRA is to implement the multi-level aggregation infrastructure, 
> including combining per container (MAPREDUCE-3902 is related) and coordinating 
> containers via the application master without breaking the fault tolerance of jobs.



[jira] [Work stopped] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2013-02-19 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-4502 stopped by Tsuyoshi OZAWA.

> Multi-level aggregation with combining the result of maps per node/rack
> ---
>
> Key: MAPREDUCE-4502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, mrv2
>Reporter: Tsuyoshi OZAWA
>Assignee: Tsuyoshi OZAWA
> Attachments: design_v2.pdf, MAPREDUCE-4502.1.patch, 
> MAPREDUCE-4525-pof.diff, speculative_draft.pdf
>
>
> Shuffle is expensive in Hadoop despite the existence of the combiner, 
> because the scope of combining is limited to a single MapTask. 
> One way to address this is to aggregate the results of maps per 
> node/rack by launching a combiner.
> This JIRA is to implement the multi-level aggregation infrastructure, 
> including combining per container (MAPREDUCE-3902 is related) and coordinating 
> containers via the application master without breaking the fault tolerance of jobs.



[jira] [Updated] (MAPREDUCE-4502) Multi-level aggregation with combining the result of maps per node/rack

2013-02-19 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4502:
--

Attachment: MAPREDUCE-4502.1.patch

Added patch including test.

> Multi-level aggregation with combining the result of maps per node/rack
> ---
>
> Key: MAPREDUCE-4502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, mrv2
>Reporter: Tsuyoshi OZAWA
>Assignee: Tsuyoshi OZAWA
> Attachments: design_v2.pdf, MAPREDUCE-4502.1.patch, 
> MAPREDUCE-4525-pof.diff, speculative_draft.pdf
>
>
> Shuffle is expensive in Hadoop despite the existence of the combiner, 
> because the scope of combining is limited to a single MapTask. 
> One way to address this is to aggregate the results of maps per 
> node/rack by launching a combiner.
> This JIRA is to implement the multi-level aggregation infrastructure, 
> including combining per container (MAPREDUCE-3902 is related) and coordinating 
> containers via the application master without breaking the fault tolerance of jobs.



[jira] [Created] (MAPREDUCE-5014) Extending DistCp through a custom CopyListing is not possible

2013-02-19 Thread Srikanth Sundarrajan (JIRA)
Srikanth Sundarrajan created MAPREDUCE-5014:
---

 Summary: Extending DistCp through a custom CopyListing is not 
possible
 Key: MAPREDUCE-5014
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5014
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: 0.23.5, 0.23.4, 0.23.3, 0.23.1, 0.23.0, trunk
Reporter: Srikanth Sundarrajan
Assignee: Srikanth Sundarrajan


* While it is possible to implement a custom CopyListing in DistCp, the DistCp 
driver class doesn't allow this custom CopyListing to be used.

* Allow SimpleCopyListing to provide an option to exclude files. (For instance, 
it is useful to exclude FileOutputCommitter.SUCCEEDED_FILE_NAME during a copy, as 
copying it prematurely can indicate that the entire data set is available at the 
destination.)
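
A minimal sketch of the exclude option requested in the second point, assuming a listing is just a list of path strings. It does not use DistCp's CopyListing API; ExcludingListingSketch and buildListing are hypothetical names, and the "_SUCCESS" literal is hard-coded in place of the Hadoop constant so the example runs standalone:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; this is not DistCp's CopyListing API.
final class ExcludingListingSketch {

    // Marker file written on job success ("_SUCCESS" in Hadoop); hard-coded
    // here so the sketch has no Hadoop dependency.
    static final String SUCCEEDED_FILE_NAME = "_SUCCESS";

    // Return the paths whose final component does not match the exclude pattern.
    static List<String> buildListing(List<String> paths, Pattern exclude) {
        List<String> listing = new ArrayList<>();
        for (String path : paths) {
            String name = path.substring(path.lastIndexOf('/') + 1);
            if (!exclude.matcher(name).matches()) {
                listing.add(path);
            }
        }
        return listing;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "/data/out/part-00000", "/data/out/part-00001", "/data/out/_SUCCESS");
        Pattern exclude = Pattern.compile(Pattern.quote(SUCCEEDED_FILE_NAME));
        // The marker file is left out, so the destination never looks complete
        // before the data files have arrived.
        System.out.println(buildListing(source, exclude));
    }
}
```

Taking a pattern instead of a single fixed name is a design choice made for this sketch; the issue itself only asks for an option to exclude files.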



[jira] [Commented] (MAPREDUCE-5013) mapred.JobStatus compatibility: MR2 missing constructors from MR1

2013-02-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581781#comment-13581781
 ] 

Hadoop QA commented on MAPREDUCE-5013:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12570027/MAPREDUCE-5013.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3343//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3343//console

This message is automatically generated.

> mapred.JobStatus compatibility: MR2 missing constructors from MR1
> -
>
> Key: MAPREDUCE-5013
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5013
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5013.patch
>
>
> JobStatus is missing the following constructors in MR2 that were present in 
> MR1
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int, org.apache.hadoop.mapred.JobPriority);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, float, int, org.apache.hadoop.mapred.JobPriority);



[jira] [Updated] (MAPREDUCE-5013) mapred.JobStatus compatibility: MR2 missing constructors from MR1

2013-02-19 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5013:
--

Status: Patch Available  (was: Open)

> mapred.JobStatus compatibility: MR2 missing constructors from MR1
> -
>
> Key: MAPREDUCE-5013
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5013
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5013.patch
>
>
> JobStatus is missing the following constructors in MR2 that were present in 
> MR1
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int, org.apache.hadoop.mapred.JobPriority);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, float, int, org.apache.hadoop.mapred.JobPriority);



[jira] [Updated] (MAPREDUCE-5013) mapred.JobStatus compatibility: MR2 missing constructors from MR1

2013-02-19 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5013:
--

Attachment: MAPREDUCE-5013.patch

> mapred.JobStatus compatibility: MR2 missing constructors from MR1
> -
>
> Key: MAPREDUCE-5013
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5013
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5013.patch
>
>
> JobStatus is missing the following constructors in MR2 that were present in 
> MR1
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int, org.apache.hadoop.mapred.JobPriority);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, float, int, org.apache.hadoop.mapred.JobPriority);



[jira] [Commented] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581770#comment-13581770
 ] 

Jonathan Eagles commented on MAPREDUCE-5009:


Thanks, Robert!

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Fix For: 3.0.0, 0.23.7, 2.0.4-beta
>
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.
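
A minimal model of the failure mode described above, assuming the task remembers a single attempt that is allowed to commit. This is a sketch for illustration, not the code in TaskImpl.java; CommitAttemptSketch and its methods are hypothetical names:

```java
// Hypothetical model of the bug; it does not reproduce the actual fix.
final class CommitAttemptSketch {

    // Attempt currently allowed to commit, or null if none has been chosen.
    private String commitAttempt;

    // The first attempt to ask wins; other attempts are refused while it is set.
    boolean canCommit(String attemptId) {
        if (commitAttempt == null) {
            commitAttempt = attemptId;
        }
        return commitAttempt.equals(attemptId);
    }

    // Buggy behaviour: the kill is handled without releasing the commit slot.
    void killAttemptBuggy(String attemptId) {
        // commitAttempt is left untouched
    }

    // Corrected behaviour: a killed attempt gives up the commit slot.
    void killAttempt(String attemptId) {
        if (attemptId.equals(commitAttempt)) {
            commitAttempt = null;
        }
    }

    public static void main(String[] args) {
        CommitAttemptSketch buggy = new CommitAttemptSketch();
        buggy.canCommit("attempt_0");
        buggy.killAttemptBuggy("attempt_0");
        System.out.println("buggy: retry may commit = " + buggy.canCommit("attempt_1"));

        CommitAttemptSketch fixed = new CommitAttemptSketch();
        fixed.canCommit("attempt_0");
        fixed.killAttempt("attempt_0");
        System.out.println("fixed: retry may commit = " + fixed.canCommit("attempt_1"));
    }
}
```

In the buggy path every later attempt is refused because the slot still names the killed attempt, which corresponds to the reported symptom of all subsequent attempts being killed.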



[jira] [Updated] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-5009:
---

   Resolution: Fixed
Fix Version/s: 2.0.4-beta
   0.23.7
   3.0.0
   Status: Resolved  (was: Patch Available)

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Fix For: 3.0.0, 0.23.7, 2.0.4-beta
>
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.



[jira] [Commented] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581768#comment-13581768
 ] 

Hudson commented on MAPREDUCE-5009:
---

Integrated in Hadoop-trunk-Commit #3365 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3365/])
MAPREDUCE-5009. Killing the Task Attempt slated for commit does not clear 
the value from the Task commitAttempt member (Robert Parker via jeagles) 
(Revision 1447965)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1447965
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java


> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.



[jira] [Updated] (MAPREDUCE-5013) mapred.JobStatus compatibility: MR2 missing constructors from MR1

2013-02-19 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5013:
--

Summary: mapred.JobStatus compatibility: MR2 missing constructors from MR1  
(was: JobStatus compatibility: MR2 missing constructors from MR1)

> mapred.JobStatus compatibility: MR2 missing constructors from MR1
> -
>
> Key: MAPREDUCE-5013
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5013
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> JobStatus is missing the following constructors in MR2 that were present in 
> MR1
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, int);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, int, org.apache.hadoop.mapred.JobPriority);
> public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
> float, float, float, float, int, org.apache.hadoop.mapred.JobPriority);



[jira] [Created] (MAPREDUCE-5013) JobStatus compatibility: MR2 missing constructors from MR1

2013-02-19 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5013:
-

 Summary: JobStatus compatibility: MR2 missing constructors from MR1
 Key: MAPREDUCE-5013
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5013
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


JobStatus is missing the following constructors in MR2 that were present in MR1

public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
float, float, float, int);
public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
float, float, int);
public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
float, float, float, int, org.apache.hadoop.mapred.JobPriority);
public org.apache.hadoop.mapred.JobStatus(org.apache.hadoop.mapred.JobID, 
float, float, float, float, int, org.apache.hadoop.mapred.JobPriority);
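
One way such compatibility constructors could be restored is by chaining each shorter form to the fullest one. The sketch below uses placeholder types instead of the Hadoop classes; the parameter names, the 0.0f defaults, and the NORMAL default priority are assumptions, since the issue lists only the parameter types:

```java
// Hypothetical stand-in for org.apache.hadoop.mapred.JobStatus; the nested
// JobID and JobPriority types are placeholders, not the Hadoop classes.
final class JobStatusSketch {

    static final class JobID {
        final String id;
        JobID(String id) { this.id = id; }
    }

    enum JobPriority { LOW, NORMAL, HIGH }

    final JobID jobId;
    final float setupProgress;
    final float mapProgress;
    final float reduceProgress;
    final float cleanupProgress;
    final int runState;
    final JobPriority priority;

    // (JobID, float, float, float, float, int, JobPriority): the fullest form.
    JobStatusSketch(JobID jobId, float setupProgress, float mapProgress,
                    float reduceProgress, float cleanupProgress,
                    int runState, JobPriority priority) {
        this.jobId = jobId;
        this.setupProgress = setupProgress;
        this.mapProgress = mapProgress;
        this.reduceProgress = reduceProgress;
        this.cleanupProgress = cleanupProgress;
        this.runState = runState;
        this.priority = priority;
    }

    // (JobID, float, float, float, int, JobPriority): setup progress assumed 0.
    JobStatusSketch(JobID jobId, float mapProgress, float reduceProgress,
                    float cleanupProgress, int runState, JobPriority priority) {
        this(jobId, 0.0f, mapProgress, reduceProgress, cleanupProgress,
             runState, priority);
    }

    // (JobID, float, float, float, int): priority assumed NORMAL.
    JobStatusSketch(JobID jobId, float mapProgress, float reduceProgress,
                    float cleanupProgress, int runState) {
        this(jobId, mapProgress, reduceProgress, cleanupProgress, runState,
             JobPriority.NORMAL);
    }

    // (JobID, float, float, int): cleanup progress assumed 0.
    JobStatusSketch(JobID jobId, float mapProgress, float reduceProgress,
                    int runState) {
        this(jobId, mapProgress, reduceProgress, 0.0f, runState);
    }

    public static void main(String[] args) {
        // The run state value 2 is an arbitrary placeholder.
        JobStatusSketch s = new JobStatusSketch(new JobID("job_1"), 0.5f, 0.25f, 2);
        System.out.println("map=" + s.mapProgress + " reduce=" + s.reduceProgress
            + " cleanup=" + s.cleanupProgress + " priority=" + s.priority);
    }
}
```

Chaining keeps the defaults in one place, so a caller compiled against the shorter signatures gets the same object as one that passes every argument explicitly.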



[jira] [Commented] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581674#comment-13581674
 ] 

Jonathan Eagles commented on MAPREDUCE-5009:


+1.

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.



[jira] [Commented] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581605#comment-13581605
 ] 

Hadoop QA commented on MAPREDUCE-5009:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12569987/MAPREDUCE-5009_branch23.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app:

  org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskImpl

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3342//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3342//console

This message is automatically generated.

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.
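
The failure mode described above can be sketched with a simplified model. 
This is not the actual TaskImpl code: the class name, the String attempt ids, 
and both method names are assumptions made for illustration. The model only 
refuses the commit to later attempts instead of killing them, and it shows 
why the commit slot has to be released when the attempt holding it is killed.

```java
import java.util.Objects;

/** Hypothetical, simplified model of a task that tracks which attempt may commit. */
class CommitTrackingTask {
    // Attempt currently allowed to commit; null means no attempt holds the slot.
    private String commitAttempt;

    /** Grants the commit slot to the first attempt that asks; other attempts are refused. */
    synchronized boolean canCommit(String attemptId) {
        if (commitAttempt == null) {
            commitAttempt = attemptId;
        }
        return commitAttempt.equals(attemptId);
    }

    /** Releases the commit slot if the killed attempt was the one holding it. */
    synchronized void onAttemptKilled(String attemptId) {
        if (Objects.equals(commitAttempt, attemptId)) {
            commitAttempt = null;
        }
    }

    public static void main(String[] args) {
        CommitTrackingTask task = new CommitTrackingTask();
        System.out.println(task.canCommit("attempt_0")); // first attempt takes the slot
        task.onAttemptKilled("attempt_0");               // e.g. pre-empted by the RM
        System.out.println(task.canCommit("attempt_1")); // a later attempt can now commit
    }
}
```

Without the reset in onAttemptKilled, every later attempt in this model would 
be refused, which mirrors the reported symptom.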

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Robert Parker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Parker updated MAPREDUCE-5009:
-

Attachment: MAPREDUCE-5009_branch23.patch

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.



[jira] [Commented] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Robert Parker (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581587#comment-13581587
 ] 

Robert Parker commented on MAPREDUCE-5009:
--

Branch 0.23 patch uploaded.

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Attachments: MAPREDUCE-5009_branch23.patch, MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.



[jira] [Commented] (MAPREDUCE-5009) Killing the Task Attempt slated for commit does not clear the value from the Task commitAttempt member

2013-02-19 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581532#comment-13581532
 ] 

Jonathan Eagles commented on MAPREDUCE-5009:


I have verified the fix and test on trunk. Rob, if you provide a patch for 
branch 0.23, I can check it in there as well.

> Killing the Task Attempt slated for commit does not clear the value from the 
> Task commitAttempt member
> --
>
> Key: MAPREDUCE-5009
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5009
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: trunk, 2.0.3-alpha, 0.23.5
>Reporter: Robert Parker
>Assignee: Robert Parker
>Priority: Critical
> Attachments: MAPREDUCE-5009.patch
>
>
> A reduce task attempt was killed by the RM (pre-emptively), but had already 
> been assigned to the commitAttempt member.  This causes all subsequent 
> attempts to be killed by the AM.



[jira] [Commented] (MAPREDUCE-5012) Typo in javadoc for IdentityMapper class

2013-02-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581517#comment-13581517
 ] 

Hadoop QA commented on MAPREDUCE-5012:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12569365/HADOOP-9308.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3341//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3341//console

This message is automatically generated.

> Typo in javadoc for IdentityMapper class
> 
>
> Key: MAPREDUCE-5012
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5012
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Adam Monsen
>Assignee: Adam Monsen
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HADOOP-9308.patch
>
>
> IdentityMapper.map() is incorrectly documented as the "identify" function.



[jira] [Commented] (MAPREDUCE-5012) Typo in javadoc for IdentityMapper class

2013-02-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581519#comment-13581519
 ] 

Hudson commented on MAPREDUCE-5012:
---

Integrated in Hadoop-trunk-Commit #3364 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3364/])
MAPREDUCE-5012. Typo in javadoc for IdentityMapper class. Contributed by 
Adam Monsen. (Revision 1447865)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1447865
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/IdentityMapper.java


> Typo in javadoc for IdentityMapper class
> 
>
> Key: MAPREDUCE-5012
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5012
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Adam Monsen
>Assignee: Adam Monsen
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HADOOP-9308.patch
>
>
> IdentityMapper.map() is incorrectly documented as the "identify" function.



[jira] [Updated] (MAPREDUCE-5012) Typo in javadoc for IdentityMapper class

2013-02-19 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated MAPREDUCE-5012:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed the patch to trunk. Thank you Adam!

> Typo in javadoc for IdentityMapper class
> 
>
> Key: MAPREDUCE-5012
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5012
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Adam Monsen
>Assignee: Adam Monsen
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HADOOP-9308.patch
>
>
> IdentityMapper.map() is incorrectly documented as the "identify" function.



[jira] [Moved] (MAPREDUCE-5012) Typo in javadoc for IdentityMapper class

2013-02-19 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas moved HADOOP-9308 to MAPREDUCE-5012:


Component/s: (was: documentation)
 documentation
Key: MAPREDUCE-5012  (was: HADOOP-9308)
Project: Hadoop Map/Reduce  (was: Hadoop Common)

> Typo in javadoc for IdentityMapper class
> 
>
> Key: MAPREDUCE-5012
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5012
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Adam Monsen
>Assignee: Adam Monsen
>Priority: Trivial
> Attachments: HADOOP-9308.patch
>
>
> IdentityMapper.map() is incorrectly documented as the "identify" function.



[jira] [Updated] (MAPREDUCE-5010) use multithreading to speed up mergeParts and try MapPartitionsCompleteEvent to schedule fetch in reduce

2013-02-19 Thread Li Junjun (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Junjun updated MAPREDUCE-5010:
-

Description: 
Use multithreading to speed up the Merger, and try a MapPartitionsCompleteEvent 
to schedule fetches in reduce.


This is for multicore CPUs; the performance will depend on your hardware and 
config.

In the map task: 

for (int parts = 0; parts < partitions; parts++) {
  // do the merge, append to the final output file (file.out)
}

It uses only one thread!
So I think we can use more threads (conf: mapred.map.mergerthreads) to do the 
merge if you have many cores or CPUs.
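
A minimal sketch of that idea, assuming each partition can be merged 
independently of the others. The class, the mergePartition placeholder, and 
the mergerThreads parameter (standing in for the proposed 
mapred.map.mergerthreads setting) are hypothetical and not taken from the 
Hadoop source. Partitions are merged on a fixed-size pool, and the results 
are still appended in partition order, as the single-threaded loop does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelPartitionMerge {
    /** Placeholder for merging one partition's spill segments (hypothetical). */
    static String mergePartition(int part) {
        return "part-" + part;
    }

    /** Merges all partitions on a thread pool and returns the results in partition order. */
    static List<String> mergeAll(int partitions, int mergerThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(mergerThreads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int part = 0; part < partitions; part++) {
                final int p = part;
                Callable<String> job = () -> mergePartition(p);
                pending.add(pool.submit(job));
            }
            List<String> output = new ArrayList<>();
            for (Future<String> result : pending) {
                output.add(result.get()); // blocks until that partition is merged
            }
            return output;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("merge failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(mergeAll(8, 3));
    }
}
```

Collecting the futures in submission order keeps the final output layout the 
same as in the sequential loop; only the merge work itself runs concurrently.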


Currently, reduce tasks fetch a map's output only after that map task 
completes, which means that when map x completes, all the reduces fetch its 
output at the same time, even though we use
   
   // Randomize the map output locations to prevent 
   // all reduce-tasks swamping the same tasktracker
   List hostList = new ArrayList();
   hostList.addAll(mapLocations.keySet());   
   Collections.shuffle(hostList, this.random);

in the reduce task.
For example, 100 reduces wait for 2 maps to complete, because the cluster's 
map task capacity is 98 but the job has 100 map tasks. 






So I think: while the threads are merging, for example if a map has 8 
partitions and uses 3 threads for the merge, 
when one of the threads completes one partition we can inform the reduce to 
fetch that partition file immediately,
or we can wait until 3 partitions complete and then send the event (conf: 
mapred.map.parts.inform) to reduce the JT's stress,
rather than waiting for the whole map task to complete. Doing this prevents 
all reduce tasks from swamping the same tasktracker more effectively and 
speeds up the reduce process.
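
The batching variant could be sketched as follows. This is only an 
illustration: PartitionCompletionBatcher, partitionMerged, flush, and the 
event sink are hypothetical names, and batchSize stands in for the proposed 
mapred.map.parts.inform setting. Each merger thread reports a finished 
partition, and one event is emitted per full batch, with a final flush for 
the remainder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that groups finished partitions into batched events. */
class PartitionCompletionBatcher {
    private final int batchSize;
    private final Consumer<List<Integer>> eventSink;
    private final List<Integer> ready = new ArrayList<>();

    PartitionCompletionBatcher(int batchSize, Consumer<List<Integer>> eventSink) {
        this.batchSize = batchSize;
        this.eventSink = eventSink;
    }

    /** Called by a merger thread when one partition has been fully merged. */
    synchronized void partitionMerged(int part) {
        ready.add(part);
        if (ready.size() >= batchSize) {
            flush();
        }
    }

    /** Emits whatever is pending; call once more after the last partition. */
    synchronized void flush() {
        if (ready.isEmpty()) {
            return;
        }
        eventSink.accept(new ArrayList<>(ready)); // copy so the sink owns its list
        ready.clear();
    }

    public static void main(String[] args) {
        PartitionCompletionBatcher batcher =
            new PartitionCompletionBatcher(3, parts -> System.out.println("event: " + parts));
        for (int part = 0; part < 8; part++) {
            batcher.partitionMerged(part);
        }
        batcher.flush();
    }
}
```

A batch size of 1 corresponds to informing the reduce immediately; larger 
values trade fetch latency for fewer events.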



Is this acceptable?
Any other good ideas?


  was:
Use multithreading to speed up the Merger, and try a MapPartitionsCompleteEvent 
to schedule fetches in reduce.


This is for multicore CPUs; the performance will depend on your hardware and 
config.

In the map task: 

for (int parts = 0; parts < partitions; parts++) {
  // do the merge, append to the final output file (file.out)
}

It uses only one thread!
So I think we can use more threads (conf: mapred.map.mergerthreads) to do the 
merge if you have many cores or CPUs.


Currently, reduce tasks fetch a map's output only after that map task 
completes, which means that when map x completes, all the reduces fetch its 
output at the same time, even though we use
   
   // Randomize the map output locations to prevent 
   // all reduce-tasks swamping the same tasktracker
   List hostList = new ArrayList();
   hostList.addAll(mapLocations.keySet());   
   Collections.shuffle(hostList, this.random);

in the reduce task.
For example, 100 reduces wait for 2 maps to complete, because the cluster's 
map task capacity is 98 but the job has 100 map tasks. 


So I think: while the threads are merging, for example if a map has 8 
partitions and uses 3 threads for the merge, 
when one of the threads completes one partition we can inform the reduce to 
fetch that partition file immediately,
or we can wait until 3 partitions complete and then send the event (conf: 
mapred.map.parts.inform) to reduce the JT's stress,
rather than waiting for the whole map task to complete. Doing this prevents 
all reduce tasks from swamping the same tasktracker more effectively.



Is this acceptable?
Any other good ideas?



> use multithreading to speed up mergeParts  and try MapPartitionsCompleteEvent 
> to schedule fetch in reduce 
> --
>
> Key: MAPREDUCE-5010
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5010
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 1.0.1
>Reporter: Li Junjun
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: MAPREDUCE-5010.jpg
>
>
> Use multithreading to speed up the Merger, and try a MapPartitionsCompleteEvent 
> to schedule fetches in reduce.
> This is for multicore CPUs; the performance will depend on your hardware and 
> config.
> In the map task: 
> 
> for (int parts = 0; parts < partitions; parts++) {
>   // do the merge, append to the final output file (file.out)
> }
> 
> It uses only one thread!
> So I think we can use more threads (conf: mapred.map.mergerthreads) to do the 
> merge if you have many cores or CPUs.
> Currently, reduce tasks fetch a map's output only after that map task 
> completes, which means that when map x completes, all the reduces fetch its 
> output at the same time, even though we use
>
>// Randomize the map output locations to prevent 
>// all reduce-tasks swamping the same tasktracker
>List hostList = new ArrayList();
>hostList.addAll(mapLocations.keySet());   
>Collections.shuffle(hostList, this.random);
> 
> in the reduce task.
> For example, 100 reduces wait for 2 maps to complete, because the cluster's map tas

[jira] [Commented] (MAPREDUCE-4974) Optimising the LineRecordReader initialize() method

2013-02-19 Thread Gelesh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581170#comment-13581170
 ] 

Gelesh commented on MAPREDUCE-4974:
---

[~tlipcon], [~snihalani], [~kkambatl]
Please share your thoughts

> Optimising the LineRecordReader initialize() method
> ---
>
> Key: MAPREDUCE-4974
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4974
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1, mrv2, performance
>Affects Versions: 2.0.2-alpha, 0.23.5
> Environment: Hadoop Linux
>Reporter: Arun A K
>Assignee: Gelesh
>  Labels: patch, performance
> Attachments: MAPREDUCE-4974.1.patch, MAPREDUCE-4974.2.patch, 
> MAPREDUCE-4974.3.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I found there is scope for optimizing the code in initialize() if we 
> instantiate compressionCodecs & codec only when the input is compressed.
> Meanwhile, Gelesh George Omathil suggested avoiding the null check of 
> key & value. This would save time, since the null check is done for every 
> key/value pair generated. The intention is to instantiate only once and 
> avoid an NPE as well. Hopefully both can be met if we initialize key & value 
> in the initialize() method. We have both worked on it.
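
The two changes proposed above can be illustrated with a simplified, 
self-contained reader. It is not the Hadoop LineRecordReader: the class, its 
fields, the ".gz" suffix check, and the in-memory line list are all 
assumptions made for the sketch. The codec marker is set only for compressed 
input, and key and value are allocated once in initialize(), so 
nextKeyValue() needs no per-record null checks.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical line reader showing one-time setup in initialize(). */
class SketchLineReader {
    private long[] key;            // single-element holder for the line's start offset
    private StringBuilder value;   // reused buffer for the current line
    private String codecName;      // stays null when the input is not compressed
    private Iterator<String> lines;
    private long offset;

    void initialize(String fileName, List<String> content) {
        // Create codec state only when the input is actually compressed.
        if (fileName.endsWith(".gz")) {
            codecName = "gzip";
        }
        // Allocate key and value once, so nextKeyValue() never checks for null.
        key = new long[1];
        value = new StringBuilder();
        lines = content.iterator();
        offset = 0;
    }

    boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        String line = lines.next();
        key[0] = offset;
        value.setLength(0);
        value.append(line);
        offset += line.length() + 1; // +1 for the newline separator
        return true;
    }

    long getCurrentKey() { return key[0]; }
    String getCurrentValue() { return value.toString(); }
    boolean isCompressed() { return codecName != null; }

    public static void main(String[] args) {
        SketchLineReader reader = new SketchLineReader();
        reader.initialize("input.txt", Arrays.asList("first line", "second line"));
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
    }
}
```

The sketch relies on initialize() being called before the first 
nextKeyValue(); that ordering is what makes dropping the null checks safe.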
