[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983871#comment-14983871
 ] 

Hadoop QA commented on MAPREDUCE-5003:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 23s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 14s {color} | {color:red} hadoop-mapreduce-client-app in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 15s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 18s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 18s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} Patch generated 2 new checkstyle issues in hadoop-mapreduce-project/hadoop-mapreduce-client (total was 604, now 603). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 38s {color} | {color:red} hadoop-mapreduce-client-app in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 32s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 16s {color} | {color:red} hadoop-mapreduce-client-app in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 52s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s {color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 7s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.mapreduce.filecache.TestClientDistributedCacheManager |
| JDK v1.8.0_60 Timed out junit tests | org.apache.hadoop.mapreduce.v2.app.TestFetchFailure |
|   | org.apache.hadoop.mapreduce.v2.app.TestMRApp |
| JDK v1.7.0_79 Failed junit tests | hadoop.mapreduce.filecache.TestClientDistributedCacheManager |
| JDK v1.7.0_79 Timed out junit tests | org.apache.hadoop.mapredu

[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete

2015-10-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-5003:

Attachment: MAPREDUCE-5003.10.patch

.10 patch fixes some checkstyle issues.
The broken TestJobHistoryEventHandler test is not related to my change; it may 
be transient, since it passes on my local machine with my patch applied. The 
broken TestRecovery test also appears to be transient, because it passes on my 
local machine with my patch applied. I updated testMultipleCrashes in 
TestRecovery to improve its stability.
testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken 
even without my patch applied. Will file a jira for that broken test.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, 
> MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, 
> MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, 
> MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, 
> MAPREDUCE-5003.9.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task 
> attempt entries for *all* task attempts launched by the prior app attempt 
> even if those task attempts did not complete.  The attempts would have to be 
> marked as killed or something similar to indicate they are no longer running.  
> Having records for the task attempts enables the user to see what nodes were 
> associated with the attempts and potentially access their logs.
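
As an editorial illustration, a hedged sketch of what such recovery could look like; every name below is hypothetical and this is not taken from the attached patches:

{code}
// Hedged, hypothetical sketch -- not the attached patches: while replaying
// the prior app attempt's history, also recreate attempts that never
// reached a terminal state and mark them KILLED.
for (TaskAttemptInfo info : priorAttemptHistory.getAllTaskAttempts()) {
  if (info.getTaskStatus() == null) {              // attempt never completed
    TaskAttempt attempt = recreateAttemptFromHistory(info);  // hypothetical
    markKilled(attempt);   // indicate the attempt is no longer running
    // node and container info are preserved, so users can still find logs
  }
}
{code}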



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983790#comment-14983790
 ] 

Hadoop QA commented on MAPREDUCE-6512:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 2s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 13s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 5s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 12s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 2s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 2s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 54s {color} | {color:red} Patch generated 1 new checkstyle issues in root (total was 149, now 149). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 24s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 29s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 135m 24s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 5s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 2s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 137m 13s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 32s {color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 329m 14s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.mapreduce.filecache.TestClientDistributedCacheManager |
|   | hadoop.mapred.TestMerge |
|   |

[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983777#comment-14983777
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2552 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2552/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 
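
As an editorial aside, the failure mode described above is the classic static-state pitfall. A hypothetical simplification (illustrative names, not the actual DistCp code):

{code}
// Hypothetical simplification, not the actual DistCp code: per-job state
// kept in static fields is set up once per JVM, so a second job silently
// reuses the first job's chunk path and finds nothing to copy.
class ChunkState {
  static Path chunkFilePath;                     // shared across all jobs
  static void configure(String jobStagingDir) {
    if (chunkFilePath == null) {                 // only the first job sets it
      chunkFilePath = new Path(jobStagingDir, "chunkDir");
    }
  }
}
{code}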



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983762#comment-14983762
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #558 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/558/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983760#comment-14983760
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #2495 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2495/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983734#comment-14983734
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #610 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/610/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983708#comment-14983708
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #1345 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1345/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6302) Preempt reducers after a configurable timeout irrespective of headroom

2015-10-30 Thread Jooseong Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983656#comment-14983656
 ] 

Jooseong Kim commented on MAPREDUCE-6302:
-

I think this usually happens when the RM sends out an overestimated headroom.
One thing we could do is to skip scheduleReduces() if we ended up preempting 
reducers through preemptReducesIfNeeded().
Since the headroom is overestimated, scheduleReduces() may schedule more 
reducers, which will then need to be preempted again.
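
A hedged sketch of that control flow (hypothetical signature: the real preemptReducesIfNeeded() returns void, so it would need to report what it did):

{code}
// Hedged sketch, hypothetical signature: skip growing the reducer ask in
// the same heartbeat in which reducers were just preempted.
boolean preempted = preemptReducesIfNeeded();  // assume it reports its action
if (!preempted) {
  scheduleReduces();  // only schedule reducers when nothing was preempted
}
// else: wait for the next heartbeat, so an overestimated headroom does not
// immediately re-schedule reducers that would have to be preempted again.
{code}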

> Preempt reducers after a configurable timeout irrespective of headroom
> --
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: mai shurong
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, 
> log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, 
> mr-6302-5.patch, mr-6302-6.patch, mr-6302-7.patch, mr-6302-prelim.patch, 
> mr-6302_branch-2.patch, queue_with_max163cores.png, 
> queue_with_max263cores.png, queue_with_max333cores.png
>
>
> I submit a  big job, which has 500 maps and 350 reduce, to a 
> queue(fairscheduler) with 300 max cores. When the big mapreduce job is 
> running 100% maps, the 300 reduces have occupied 300 max cores in the queue. 
> And then, a map fails and retry, waiting for a core, while the 300 reduces 
> are waiting for failed map to finish. So a deadlock occur. As a result, the 
> job is blocked, and the later job in the queue cannot run because no 
> available cores in the queue.
> I think there is the similar issue for memory of a queue .



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6514) Job hangs as ask is not updated after ramping down of all reducers

2015-10-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-6514:
---
Status: Open  (was: Patch Available)

h4. Comment on current patch
You should look at the {{rampDownReduces()}} API and use it instead of 
hand-rolling {{decContainerReq}}. I actually think that once we do this, you 
can remove {{clearAllPendingReduceRequests()}} altogether.
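
For reference, a hedged sketch of the ramp-down with the ask kept in sync; it mirrors the snippet quoted in the issue below and RMContainerRequestor's decContainerReq, but it is not the actual patch:

{code}
// Hedged sketch, not the actual patch: move scheduled reduces back to
// pending AND withdraw each request from the ask, so the RM stops
// allocating reduce containers the AM would only ignore.
for (ContainerRequest req : scheduledRequests.reduces.values()) {
  pendingReduces.add(req);
  decContainerReq(req);  // shrinks the ask sent on the next heartbeat
}
scheduledRequests.reduces.clear();
{code}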

I am looking at branch-2 and I think the current patch is better served on top 
of MAPREDUCE-6302 (and so only in 2.8+) given the numerous changes made there. 
The patch obviously doesn't apply to branch-2.7, which you set as the target 
version (2.7.2). Canceling the patch.

h4. Meta thought
If MAPREDUCE-6513 goes through per my latest proposal there, there is no need 
for canceling all the reduce asks and thus this patch, no? 

h4. Release
IAC, this has been a long-standing problem (though I'm very surprised nobody 
caught it till now), so I'd propose we move this out to 2.7.3 or, better, 2.8+ 
so I can make progress on the 2.7.2 release. Thoughts?

> Job hangs as ask is not updated after ramping down of all reducers
> --
>
> Key: MAPREDUCE-6514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6514
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6514.01.patch
>
>
> In RMContainerAllocator#preemptReducesIfNeeded, we simply clear the scheduled 
> reduces map and move those reducers to pending. This is not reflected in the 
> ask, so the RM keeps on assigning containers while the AM is unable to assign 
> them, as no reducer is scheduled (see the logs below the code).
> If the ask is updated immediately, the RM will be able to schedule mappers 
> immediately, which anyway is the intention when we ramp down reducers; the 
> scheduler need not allocate containers for ramped-down reducers.
> If not handled, this can lead to map starvation, as pointed out in 
> MAPREDUCE-6513.
> {code}
>  LOG.info("Ramping down all scheduled reduces:"
> + scheduledRequests.reduces.size());
> for (ContainerRequest req : scheduledRequests.reduces.values()) {
>   pendingReduces.add(req);
> }
> scheduledRequests.reduces.clear();
> {code}
> {noformat}
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not 
> assigned : container_1437451211867_1485_01_000215
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign 
> container Container: [ContainerId: container_1437451211867_1485_01_000216, 
> NodeId: hdszzdcxdat6g06u04p:26009, NodeHttpAddress: 
> hdszzdcxdat6g06u04p:26010, Resource: , Priority: 10, 
> Token: Token { kind: ContainerToken, service: 10.2.33.236:26009 }, ] for a 
> reduce as either  container memory less than required 4096 or no pending 
> reduce tasks - reduces.isEmpty=true
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not 
> assigned : container_1437451211867_1485_01_000216
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign 
> container Container: [ContainerId: container_1437451211867_1485_01_000217, 
> NodeId: hdszzdcxdat6g06u06p:26009, NodeHttpAddress: 
> hdszzdcxdat6g06u06p:26010, Resource: , Priority: 10, 
> Token: Token { kind: ContainerToken, service: 10.2.33.239:26009 }, ] for a 
> reduce as either  container memory less than required 4096 or no pending 
> reduce tasks - reduces.isEmpty=true
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983575#comment-14983575
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #622 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/622/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983540#comment-14983540
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-trunk-Commit #8736 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8736/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2015-10-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983421#comment-14983421
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-6513:


Went through the discussion. Here's what we should do, mostly agreeing with 
what [~chen317] says.
 - Node failure should not be counted towards task-attempt count. So, yes, 
let's continue to mark such tasks as killed.
 - Rescheduling of this killed task can (and must) take higher priority 
independent of whether it is marked as killed or failed. In fact, this was how 
we originally designed the failed-map-should-have-higher-priority concept. In 
spirit, fail-fast-map actually meant maps which retroactively failed, like in 
this case.
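
A hedged sketch of the rescheduling idea; PRIORITY_FAST_FAIL_MAP exists in RMContainerAllocator, but the surrounding request-construction details here are hypothetical:

{code}
// Hedged, partly hypothetical sketch: when a map attempt is killed because
// its node was lost, re-request it at the fast-fail priority so the RM
// allocates it ahead of already-running reducers.
ContainerRequest rescheduled =
    new ContainerRequest(killedMapRequest, PRIORITY_FAST_FAIL_MAP);
addContainerReq(rescheduled);  // goes back into the ask immediately
{code}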

[~varun_saxena], I can take a stab at this if you don't have cycles. Let me 
know either way.

IAC, this has been a long-standing problem (though I'm very surprised nobody 
caught this till now), so I'd propose we move this out into 2.7.3 so I can make 
progress on the 2.7.2 release. Thoughts? /cc [~Jobo]

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob
>Assignee: Varun Saxena
>Priority: Critical
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the maps on this node changed to 
> the KILLED state.
> Currently the maps which were running on the unstable node are rescheduled, 
> and all are in the scheduled state waiting for the RM to assign containers. 
> Ask requests for the maps were seen until the node went bad (all of those 
> failed); there are no ask requests after this. But the AM keeps on preempting 
> the reducers (it keeps recycling them).
> Finally, the reducers are waiting for the mappers to complete, and the 
> mappers never got containers.
> My question is:
> 
> Why were map requests not sent by the AM once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983413#comment-14983413
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #1344 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1344/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983385#comment-14983385
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #557 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/557/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983369#comment-14983369
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #609 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/609/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983325#comment-14983325
 ] 

Allen Wittenauer commented on MAPREDUCE-6451:
-

I was over here trying to figure out why test-patch said it was good when trunk 
was failing. haha.  :D

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983321#comment-14983321
 ] 

Kihwal Lee commented on MAPREDUCE-6451:
---

bq. did you forget the DynamicInputChunkContext class when you commit?
It is a Friday. :)  Fixed it. Thanks for reporting.

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983309#comment-14983309
 ] 

Hudson commented on MAPREDUCE-6451:
---

FAILURE: Integrated in Hadoop-trunk-Commit #8735 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8735/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java


> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983302#comment-14983302
 ] 

Mingliang Liu commented on MAPREDUCE-6451:
--

Hi [~kihwal], did you forget the {{DynamicInputChunkContext}} class when you 
committed? If so, I think an addendum commit will be good.

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983299#comment-14983299
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #2494 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2494/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code}
> private String getJobSummary(FileContext fc, Path path) throws IOException {
>   Path qPath = fc.makeQualified(path);
>   FSDataInputStream in = fc.open(qPath);
>   String jobSummaryString = in.readUTF();
>   in.close();
>   return jobSummaryString;
> }
> {code}
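
A minimal sketch of the fix, closing the stream even when readUTF() throws; the actual committed patch may differ in form:

{code}
private String getJobSummary(FileContext fc, Path path) throws IOException {
  Path qPath = fc.makeQualified(path);
  // try-with-resources closes the stream even if readUTF() throws, so the
  // FSDataInputStream no longer leaks on the exception path
  try (FSDataInputStream in = fc.open(qPath)) {
    return in.readUTF();
  }
}
{code}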



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories

2015-10-30 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated MAPREDUCE-6512:

Attachment: MAPREDUCE-6512.2.patch

.2 patch fixes the broken tests, which were caused by a missing parent dir.

> FileOutputCommitter tasks unconditionally create parent directories
> ---
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.patch, 
> MAPREDUCE-6512.patch
>
>
> If the output directory is deleted then subsequent tasks should fail. Instead 
> they blindly create the missing parent directories, leading the job to be 
> "successful" despite potentially missing almost all of its output. Task 
> attempts should fail if the parent app attempt directory is missing when they 
> go to create their task attempt directory.
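
A hedged sketch of the intended check; {{fs}} and {{taskAttemptPath}} are illustrative names, while FileSystem.exists and mkdirs are real APIs:

{code}
// Hedged sketch with illustrative names: fail the task attempt instead of
// silently recreating a deleted output tree.
Path parent = taskAttemptPath.getParent();
if (!fs.exists(parent)) {
  throw new IOException("Parent app attempt directory " + parent
      + " does not exist; the job output directory may have been deleted");
}
// parent verified above, so mkdirs only creates the leaf directory here
fs.mkdirs(taskAttemptPath);
{code}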



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource

2015-10-30 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983230#comment-14983230
 ] 

Jason Lowe commented on MAPREDUCE-6529:
---

bq. For example, in a heterogeneous cluster, a reduce task may prefer a 
container on a powerful machine with higher I/O speed. An MPI job may prefer 
containers on machines with higher CPU frequency. But the RM can't know all 
the resource requirements for different applications. So I am just wondering 
if the RPC protocol between the RM and AM could provide a new interface that 
lets the AM make a second choice about whether it will use the container it 
just got.

If the AM has a preference for something then it needs to specify that in the 
locality request.  Failing to do so just leads to a Monte Carlo situation where 
the AM tosses away containers hoping that the next random container is better 
than the last while the task starves for resources in the interim.  The AM 
doesn't have a cluster-wide view.  It's not going to know that the nodes it 
desires so much are all occupied, nor does it know what other things are 
running on various nodes.  So I still think it's undesirable to have the AM 
toss away a container because it hopes a better one will come along later.  
Instead we should have the AM improve the container request so the RM can 
better know the AM's intentions.

For example, if the cluster is heterogeneous with some nodes being much better 
for reducers than others then we should label those nodes.  Then the AM request 
can ask reducers to use those labeled nodes with the ability to relax that 
locality constraint if the nodes are unavailable.  The RM will make a much 
better decision more efficiently than the AM could ever hope to do on its own.  
Otherwise the AM is going to get a container, see it's not on one of the "good" 
nodes and toss it, request another, get another bad allocation, rinse, repeat.  
If we can teach the MapReduce AM how to recognize a good node vs. a bad node 
then we can also teach it how to request those nodes when it makes the initial 
allocation request.
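
For illustration, a hedged sketch of expressing such a preference through the request itself; AMRMClient.ContainerRequest's node-label constructor exists in YARN 2.6+, while the "fast-io" label and the resource/priority values are assumptions:

{code}
// Hedged sketch: tell the RM about the preference instead of discarding
// containers. Assumes nodes good for reducers carry a "fast-io" label.
Resource cap = Resource.newInstance(4096, 1);
Priority pri = Priority.newInstance(10);
AMRMClient.ContainerRequest req = new AMRMClient.ContainerRequest(
    cap,
    null,        // no specific nodes
    null,        // no specific racks
    pri,
    true,        // relax locality
    "fast-io");  // node-label expression naming the preferred nodes
amrmClient.addContainerRequest(req);
{code}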


> AppMaster will not retry to request resource if AppMaster happens to decide 
> to not use the resource
> ---
>
> Key: MAPREDUCE-6529
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6529
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.6.0
>Reporter: Wei Chen
>
> I am viewing the code in RMContainerAllocator.java. I want to make an 
> improvement so that the AppMaster could give up some containers that may not 
> be optimal when it receives newly assigned containers. But I found that if 
> the AppMaster gives up the containers, it will not retry to request the 
> resources again.
> In RMContainerRequestor.java, the Set ask is used to request 
> resources from the ResourceManager. I found each container could only be 
> requested once: ask can be filled by addResourceRequestToAsk(ResourceRequest 
> remoteRequest[]), but it can only be added once for each container. If we 
> give up an assigned container, it will never be requested again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated MAPREDUCE-6451:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.2
   3.0.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk, branch-2 and branch-2.7. Thanks for working on 
the fix, Kuhu. Thank you gentlemen for the reviews.

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic

2015-10-30 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983156#comment-14983156
 ] 

Kihwal Lee commented on MAPREDUCE-6451:
---

+1

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, 
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and 
> other static variables any time other than for the first job. This is seen 
> when DistCp::run() is used. 
> A single copy succeeds but multiple jobs finish successfully without any real 
> copying. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource

2015-10-30 Thread Wei Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983155#comment-14983155
 ] 

Wei Chen commented on MAPREDUCE-6529:
-

Yes, what I am doing is just to let the AM have a chance to choose the 
containers it needs. I know that delay scheduling is implemented in YARN; it 
is very useful for locality-sensitive jobs like MapReduce. But sometimes the 
delay scheduler cannot guarantee the AM gets optimal resources. For example, 
in a heterogeneous cluster, a reduce task may prefer a container on a powerful 
machine with higher I/O speed. An MPI job may prefer containers on machines 
with higher CPU frequency. But the RM can't know all the resource requirements 
for different applications. So I am just wondering if the RPC protocol between 
the RM and AM could provide a new interface that lets the AM make a second 
choice about whether it will use the container it just got. Different AMs 
could then implement their own preferences for the containers they request.
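
For what it's worth, an AM can already hand a container back today; the gap this issue describes is that the ask is not replenished. A hedged sketch (real AMRMClient calls, illustrative variable names):

{code}
// Hedged sketch: decline an allocated container by releasing it, and
// explicitly re-add the request so the ask is replenished.
amrmClient.releaseAssignedContainer(container.getId());  // give it back
amrmClient.addContainerRequest(originalRequest);         // ask again
{code}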


> AppMaster will not retry to request resource if AppMaster happens to decide 
> to not use the resource
> ---
>
> Key: MAPREDUCE-6529
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6529
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.6.0
>Reporter: Wei Chen
>
> I am viewing the code in RMContainerAllocator.java. I want to make an 
> improvement so that the AppMaster could give up some containers that may not 
> be optimal when it receives newly assigned containers. But I found that if 
> the AppMaster gives up the containers, it will not retry to request the 
> resources again.
> In RMContainerRequestor.java, the Set ask is used to request 
> resources from the ResourceManager. I found each container could only be 
> requested once: ask can be filled by addResourceRequestToAsk(ResourceRequest 
> remoteRequest[]), but it can only be added once for each container. If we 
> give up an assigned container, it will never be requested again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983126#comment-14983126
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #556 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/556/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code}
> private String getJobSummary(FileContext fc, Path path) throws IOException {
>   Path qPath = fc.makeQualified(path);
>   FSDataInputStream in = fc.open(qPath);
>   String jobSummaryString = in.readUTF();
>   in.close();
>   return jobSummaryString;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983110#comment-14983110
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #1343 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1343/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 
6344b6a7694c70f296392b6462dba452ff762109)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983017#comment-14983017
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #620 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/620/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 
6344b6a7694c70f296392b6462dba452ff762109)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983012#comment-14983012
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2550 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2550/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 
6344b6a7694c70f296392b6462dba452ff762109)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-mapreduce-project/CHANGES.txt


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982974#comment-14982974
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #607 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/607/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 
6344b6a7694c70f296392b6462dba452ff762109)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982852#comment-14982852
 ] 

Junping Du commented on MAPREDUCE-6528:
---

Thanks, Jason, for reviewing and committing this!

> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6514) Job hangs as ask is not updated after ramping down of all reducers

2015-10-30 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982766#comment-14982766
 ] 

Varun Saxena commented on MAPREDUCE-6514:
-

Test failure to be handled by YARN-4320

> Job hangs as ask is not updated after ramping down of all reducers
> --
>
> Key: MAPREDUCE-6514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6514
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6514.01.patch
>
>
> In RMContainerAllocator#preemptReducesIfNeeded, we simply clear the scheduled 
> reduces map and move these reducers to pending. This change is not reflected 
> in ask, so the RM keeps assigning containers while the AM is unable to assign 
> them because no reducer is scheduled (check the logs below the code).
> If ask is updated immediately, the RM will be able to schedule mappers 
> immediately, which is anyway the intention when we ramp down reducers, and 
> the scheduler need not allocate for the ramped-down reducers.
> If this is not handled, it can lead to map starvation, as pointed out in 
> MAPREDUCE-6513.
> {code}
>  LOG.info("Ramping down all scheduled reduces:"
> + scheduledRequests.reduces.size());
> for (ContainerRequest req : scheduledRequests.reduces.values()) {
>   pendingReduces.add(req);
> }
> scheduledRequests.reduces.clear();
> {code}
> {noformat}
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not 
> assigned : container_1437451211867_1485_01_000215
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign 
> container Container: [ContainerId: container_1437451211867_1485_01_000216, 
> NodeId: hdszzdcxdat6g06u04p:26009, NodeHttpAddress: 
> hdszzdcxdat6g06u04p:26010, Resource: , Priority: 10, 
> Token: Token { kind: ContainerToken, service: 10.2.33.236:26009 }, ] for a 
> reduce as either  container memory less than required 4096 or no pending 
> reduce tasks - reduces.isEmpty=true
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not 
> assigned : container_1437451211867_1485_01_000216
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign 
> container Container: [ContainerId: container_1437451211867_1485_01_000217, 
> NodeId: hdszzdcxdat6g06u06p:26009, NodeHttpAddress: 
> hdszzdcxdat6g06u06p:26010, Resource: , Priority: 10, 
> Token: Token { kind: ContainerToken, service: 10.2.33.239:26009 }, ] for a 
> reduce as either  container memory less than required 4096 or no pending 
> reduce tasks - reduces.isEmpty=true
> {noformat}
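
For context, here is a sketch of the direction of the fix (paraphrased from 
the description above, not the verbatim attached patch; it assumes 
RMContainerRequestor's decContainerReq() helper is used to update the ask): 
each ramped-down request also has to be removed from the ask so the RM stops 
allocating containers for reducers that are no longer scheduled.

{code}
LOG.info("Ramping down all scheduled reduces:"
    + scheduledRequests.reduces.size());
for (ContainerRequest req : scheduledRequests.reduces.values()) {
  pendingReduces.add(req);
  decContainerReq(req);  // keep the ask consistent with the ramp-down
}
scheduledRequests.reduces.clear();
{code}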



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982729#comment-14982729
 ] 

Hudson commented on MAPREDUCE-6528:
---

FAILURE: Integrated in Hadoop-trunk-Commit #8732 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8732/])
MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 
6344b6a7694c70f296392b6462dba452ff762109)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-mapreduce-project/CHANGES.txt


> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-6528:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.6.3
   2.7.2
   Status: Resolved  (was: Patch Available)

Thanks to Junping for the contribution and to Brahma for additional review!  I 
committed this to trunk, branch-2, branch-2.7, and branch-2.6.

> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982690#comment-14982690
 ] 

Jason Lowe commented on MAPREDUCE-6528:
---

+1, committing this.

> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6273) HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state

2015-10-30 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-6273:
--
Fix Version/s: 2.6.3

I committed this to branch-2.6 as well.

> HistoryFileManager should check whether summaryFile exists to avoid 
> FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state
> 
>
> Key: MAPREDUCE-6273
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6273
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Minor
> Fix For: 2.7.2, 2.6.3
>
> Attachments: MAPREDUCE-6273.000.patch, MAPREDUCE-6273.001.patch
>
>
> HistoryFileManager should check whether the summaryFile exists, to avoid a 
> FileNotFoundException putting the HistoryFileInfo into the MOVE_FAILED state.
> I saw the following error message:
> {code}
> 2015-02-17 19:13:45,198 ERROR 
> org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Error while trying to 
> move a job to done
> java.io.FileNotFoundException: File does not exist: 
> /user/history/done_intermediate/agd_laci-sluice/job_1423740288390_1884.summary
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1878)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1819)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1771)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:527)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:85)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:356)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>   at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1181)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1169)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1159)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:270)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:237)
>   at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:230)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1457)
>   at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:318)
>   at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:59)
>   at 
> org.apache.hadoop.fs.AbstractFileSystem.open(AbstractFileSystem.java:621)
>   at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:789)
>   at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:785)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.open(FileContext.java:785)
>   at 
> org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary(HistoryFileManager.java:953)
>   at 
> o

[jira] [Commented] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side

2015-10-30 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982632#comment-14982632
 ] 

Sunil G commented on MAPREDUCE-5870:


Test case failures look related. I will fix them in the next patch.

> Support for passing Job priority through Application Submission Context in 
> Mapreduce Side
> -
>
> Key: MAPREDUCE-5870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-MAPREDUCE-5870.patch, 0002-MAPREDUCE-5870.patch, 
> 0003-MAPREDUCE-5870.patch, 0004-MAPREDUCE-5870.patch, 
> 0005-MAPREDUCE-5870.patch, 0006-MAPREDUCE-5870.patch, 
> 0007-MAPREDUCE-5870.patch, Yarn-2002.1.patch
>
>
> Job priority can be set from the client side as below [configuration and API]:
>   a. JobConf.getJobPriority() and Job.setPriority(JobPriority priority)
>   b. We can also use the configuration "mapreduce.job.priority".
> Now this job priority can be passed in the Application Submission Context 
> from the client side. Here we can reuse the MRJobConfig.PRIORITY 
> configuration.
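
As a concrete illustration of the two options above (an editorial example 
using only the APIs named in the description; the job setup details are 
hypothetical):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobPriority;

public class SubmitWithPriority {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Option b: via the configuration key (MRJobConfig.PRIORITY).
    conf.set("mapreduce.job.priority", JobPriority.HIGH.name());

    Job job = Job.getInstance(conf, "priority-demo");
    // Option a: via the API, for the same submission.
    job.setPriority(JobPriority.HIGH);

    // ... set mapper, reducer and input/output paths as usual, then:
    // job.waitForCompletion(true);
  }
}
{code}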



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource

2015-10-30 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982611#comment-14982611
 ] 

Jason Lowe commented on MAPREDUCE-6529:
---

I'm a little confused based on the wording of the summary and the description.  
The summary implies there's a bug, but the description sounds like we're trying 
to add a feature where the MapReduce AM will voluntarily give up an assigned 
container because it's not "good enough."  I'm assuming the latter, please 
correct me if I'm wrong.

Could you elaborate more on the use-case for this?  Seems to me in general it 
is not a good idea for the AM to second-guess the RM scheduler when it comes to 
container placement.  The AM already conveyed to the RM where it wants 
containers, and the RM granted some containers.  If the AM doesn't like where 
those containers were placed, how likely is it that giving them up will result 
in a better placement in a reasonable timeframe?  The RM scheduler (at least 
the CapacityScheduler) already implements a delay to try to achieve node 
locality (see YARN-80), so at least in that setup it would seem a detriment to 
the job to give up a usable container now in hopes a better one comes along 
soon.  The RM has already waited around for a better one and decided to give a 
suboptimal one instead.  Unless it is very costly for the task to run off-node 
or off-rack, it will be worse to give up the assigned container than to just 
use it, because it is unlikely that a better container will come along soon.

> AppMaster will not retry to request resource if AppMaster happens to decide 
> to not use the resource
> ---
>
> Key: MAPREDUCE-6529
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6529
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.6.0
>Reporter: Wei Chen
>
> I am viewing code in RMContainerAllocator.java. I want to make an improvement 
> so that the AppMaster can give up containers that may not be optimal when it 
> receives newly assigned containers. But I found that if the AppMaster gives 
> up a container, it will not retry the resource request.
> In RMContainerRequestor.java, the Set<ResourceRequest> ask is used to request 
> resources from the ResourceManager. I found that each container can only be 
> requested once: ask can be filled by addResourceRequestToAsk(ResourceRequest 
> remoteRequest), but an entry is only added once per container. If we give up 
> an assigned container, it will never be requested again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side

2015-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982600#comment-14982600
 ] 

Hadoop QA commented on MAPREDUCE-5870:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
1s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 22 new checkstyle issues in 
hadoop-mapreduce-project/hadoop-mapreduce-client (total was 380, now 397). 
{color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 40s {color} 
| {color:red} hadoop-mapreduce-client-core in the patch failed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 22s 
{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed 
with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s 
{color} | {color:green} hadoop-mapreduce-client-common in the patch passed with 
JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 0s {color} | 
{color:red} hadoop-mapreduce-client-core in the patch failed with JDK 
v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 18s 
{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed 
with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 34s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 239m 46s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.mapreduce.

[jira] [Updated] (MAPREDUCE-6527) Data race on field org.apache.hadoop.mapred.JobConf.credentials

2015-10-30 Thread Edgar Pek (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edgar Pek updated MAPREDUCE-6527:
-
Affects Version/s: 2.7.1

> Data race on field org.apache.hadoop.mapred.JobConf.credentials
> ---
>
> Key: MAPREDUCE-6527
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6527
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Ali Kheradmand
>
> I am running the test suite against a dynamic race detector called 
> RV-Predict. Here is a race report that I got: 
> {noformat}
> Data race on field org.apache.hadoop.mapred.JobConf.credentials: {{{
> Concurrent read in thread T327 (locks held: {})
>  >  at org.apache.hadoop.mapred.JobConf.getCredentials(JobConf.java:505)
> at 
> org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:70)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:524)
> T327 is created by T22
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:218)
> Concurrent write in thread T22 (locks held: {Monitor@496c673a, 
> Monitor@496319b0})
>  >  at org.apache.hadoop.mapred.JobConf.setCredentials(JobConf.java:510)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:787)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:241)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>  locked Monitor@496319b0 at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:n/a)
>  
> at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:245)
>  locked Monitor@496c673a at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:229)
>  
> T22 is created by T1
> at 
> org.apache.hadoop.mapred.jobcontrol.TestJobControl.doJobControlTest(TestJobControl.java:111)
> }}}
> {noformat}
> In the source code of the org.apache.hadoop.mapred.LocalJobRunner.submitJob 
> function (see the write stack above), we have the following lines:
> {code}
> Job job = new Job(JobID.downgrade(jobid), jobSubmitDir);
> job.job.setCredentials(credentials);
> {code}
> It looks a bit suspicious: Job extends Thread, and at the end of its 
> constructor it starts a new thread, which creates a new instance of 
> JobContextImpl that reads the credentials. However, the first thread 
> concurrently sets the credentials after creating the Job instance.
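
One possible shape of a fix (an editorial sketch only; no patch is attached to 
this issue, and the three-argument constructor shown here is hypothetical) is 
to hand the credentials to the Job before its worker thread starts, so that 
the write happens-before the read in JobContextImpl:

{code}
// Hypothetical: pass the credentials into the constructor instead of
// setting them after the worker thread has already been started.
Job job = new Job(JobID.downgrade(jobid), jobSubmitDir, credentials);

// Inside the (hypothetical) constructor the ordering would then be:
//   this.job.setCredentials(credentials);  // write first...
//   this.start();                          // ...then start the thread
{code}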



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API

2015-10-30 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982375#comment-14982375
 ] 

Junping Du commented on MAPREDUCE-5485:
---

Thanks [~bikassaha] for the comments! I agree it makes more sense to move the 
retry logic into committer.commitJob() if the committer supports repeating the 
commit. My original thinking was to combine the retry for 
committer.commitJob() with other AM exceptions in handleJobCommit (outside of 
the committer), like failing to write endCommitSuccessFile, etc. But now I 
think we should separate the committer retry from the AM-specific handling, 
for the reason you mentioned above. For this case, I would prefer to just let 
the AM exit directly instead of failing the job (if the job commit is 
repeatable). This is mostly the same as proposed above by [~nemon], with one 
slight difference: we should fail the AM (not the job) even when 
committer.commitJob() fails after retries, to handle corner cases where 
something related to the committer goes wrong in this AM but could still 
succeed in another AM, given that the commit is repeatable.
I will update a patch soon.
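
A sketch of that direction (editorial; the isCommitJobRepeatable() query and 
the retry budget below are illustrative assumptions, not the committed patch):

{code}
boolean committed = false;
int attempts = 0;
final int maxAttempts = 3;  // illustrative retry budget
while (!committed) {
  try {
    committer.commitJob(jobContext);
    committed = true;
  } catch (Exception e) {
    attempts++;
    // Retry only while the committer declares the commit repeatable;
    // otherwise (or once the budget is exhausted) rethrow so the AM can
    // exit, leaving a later AM attempt free to commit, as discussed above.
    if (!committer.isCommitJobRepeatable(jobContext)
        || attempts >= maxAttempts) {
      throw e;
    }
  }
}
{code}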

> Allow repeating job commit by extending OutputCommitter API
> ---
>
> Key: MAPREDUCE-5485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.1.0-beta
>Reporter: Nemon Lou
>Assignee: Junping Du
> Attachments: MAPREDUCE-5485-demo-2.patch, MAPREDUCE-5485-demo.patch
>
>
> There are chances that the MRAppMaster crashes during job commit, or that a 
> NodeManager restart causes the committing AM to exit due to container expiry. 
> In these cases, the job will fail.
> However, some jobs can redo the commit, so failing the job is unnecessary.
> Letting clients tell the AM whether to allow redoing the commit is a better 
> choice.
> This idea comes from Jason Lowe's comments in MAPREDUCE-4819.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982317#comment-14982317
 ] 

Junping Du commented on MAPREDUCE-6528:
---

Thanks [~brahmareddy]! Can someone commit this patch? It is quite 
straightforward.

> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6532) CLONE - Create Fake Log from Hadoop

2015-10-30 Thread Girmay Desta (JIRA)
Girmay Desta created MAPREDUCE-6532:
---

 Summary: CLONE - Create Fake Log from Hadoop
 Key: MAPREDUCE-6532
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6532
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: contrib/mumak
Reporter: Girmay Desta
 Attachments: FakeLogs.tar.gz

Version 1 of Mumak supports simulation of Map-Reduce jobs from the logs 
generated by an original job run. Our main aim is to run the job even without 
submitting it. So this task concerns generating a fake log file of a 
Map-Reduce task, converting it into JSON with Rumen, and running those files 
in Mumak.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6531) CLONE - Mumak: Map-Reduce Simulator

2015-10-30 Thread Girmay Desta (JIRA)
Girmay Desta created MAPREDUCE-6531:
---

 Summary: CLONE - Mumak: Map-Reduce Simulator
 Key: MAPREDUCE-6531
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6531
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 0.21.0
Reporter: Girmay Desta
Assignee: Hong Tang
 Fix For: 0.21.0
 Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, 
mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png

h3. Vision:

We want to build a Simulator to simulate large-scale Hadoop clusters, 
applications and workloads. This would be invaluable in furthering Hadoop by 
providing a tool for researchers and developers to prototype features (e.g. 
pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
their behaviour and performance with a reasonable amount of confidence, 
thereby aiding rapid innovation.



h3. First Cut: Simulator for the Map-Reduce Scheduler

The Map-Reduce Scheduler is a fertile area of interest with at least four 
schedulers, each with their own set of features, currently in existence: 
Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority Scheduler.

Each scheduler's scheduling decisions are driven by many factors, such as 
fairness, capacity guarantee, resource availability, data-locality etc.

Given that, it is non-trivial to accurately choose a single scheduler or even a 
set of desired features to predict the right scheduler (or features) for a 
given workload. Hence a simulator which can predict how well a particular 
scheduler works for some specific workload by quickly iterating over schedulers 
and/or scheduler features would be quite useful.

So, the first cut is to implement a simulator for the Map-Reduce scheduler 
which takes as input a job trace derived from a production workload and a 
cluster definition, and simulates the execution of the jobs as defined in the 
trace in this virtual cluster. As output, the detailed job execution trace 
(recorded in relation to virtual simulated time) could then be analyzed to 
understand various traits of individual schedulers (individual jobs' 
turnaround time, throughput, fairness, capacity guarantee, etc.). To support 
this, we would need a simulator which could accurately model the conditions of 
the actual system that would affect a scheduler's decisions. These include 
very large-scale clusters (thousands of nodes), the detailed characteristics 
of the workload thrown at the clusters, job or task failures, data locality, 
and cluster hardware (cpu, memory, disk i/o, network i/o, network topology), 
etc.






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982241#comment-14982241
 ] 

Brahma Reddy Battula commented on MAPREDUCE-6528:
-

Agreed. Since it's planned for the 2.6.3 release, the current patch LGTM. +1 
(non-binding).

> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()

2015-10-30 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982214#comment-14982214
 ] 

Junping Du commented on MAPREDUCE-6528:
---

Good point, Vinod! Let's keep the patch as it is now, as try-with-resources is 
not supported in earlier versions of the JDK.
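
For reference, a sketch of the finally-based fix under discussion 
(paraphrased; not the verbatim attached patch):

{code}
private String getJobSummary(FileContext fc, Path path) throws IOException {
  Path qPath = fc.makeQualified(path);
  FSDataInputStream in = null;
  try {
    in = fc.open(qPath);
    // readUTF() can throw; the finally block guarantees the stream is
    // released even in that case, which closes the reported leak.
    return in.readUTF();
  } finally {
    if (in != null) {
      in.close();
    }
  }
}
{code}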

> Memory leak for HistoryFileManager.getJobSummary()
> --
>
> Key: MAPREDUCE-6528
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: MAPREDUCE-6528.patch
>
>
> We hit memory leak issues for the JHS in a large cluster, caused by the code 
> below not releasing the FSDataInputStream in the exception case. 
> MAPREDUCE-6273 should fix most cases where exceptions get thrown. However, we 
> still need to fix the memory leak for the occasional case.
> {code} 
> private String getJobSummary(FileContext fc, Path path) throws IOException {
> Path qPath = fc.makeQualified(path);
> FSDataInputStream in = fc.open(qPath);
> String jobSummaryString = in.readUTF();
> in.close();
> return jobSummaryString;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side

2015-10-30 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5870:
---
Attachment: 0007-MAPREDUCE-5870.patch

Now that MAPREDUCE-6515 is committed, we can get the priority from the AM in 
JobStatus. Updating the patch with this support.

> Support for passing Job priority through Application Submission Context in 
> Mapreduce Side
> -
>
> Key: MAPREDUCE-5870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-MAPREDUCE-5870.patch, 0002-MAPREDUCE-5870.patch, 
> 0003-MAPREDUCE-5870.patch, 0004-MAPREDUCE-5870.patch, 
> 0005-MAPREDUCE-5870.patch, 0006-MAPREDUCE-5870.patch, 
> 0007-MAPREDUCE-5870.patch, Yarn-2002.1.patch
>
>
> Job priority can be set from the client side as below [configuration and API]:
>   a. JobConf.getJobPriority() and Job.setPriority(JobPriority priority)
>   b. We can also use the configuration "mapreduce.job.priority".
> Now this job priority can be passed in the Application Submission Context 
> from the client side. Here we can reuse the MRJobConfig.PRIORITY 
> configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6530) Jobtracker is slow when more JT UI requests

2015-10-30 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982031#comment-14982031
 ] 

Prabhu Joseph commented on MAPREDUCE-6530:
--

[~kasha] It is MR1 on Hadoop 2.5.1. I entered the wrong version.

> Jobtracker is slow when more JT UI requests
> ---
>
> Key: MAPREDUCE-6530
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6530
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Prabhu Joseph
>
> JobTracker is slow when there are a huge number of jobs running and 30
> connections are established to the info port to view job status and counters.
> hadoop job -list took 4m22.412s.
> We took jstack traces and found most of the server threads waiting on the 
> JobTracker object, while the thread that holds the lock on JobTracker waits 
> for the ResourceBundle object.
> "retireJobs" prio=10 tid=0x7f2345200800 nid=0x11c1 waiting for
> monitor entry [0x7f22e3499000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
> - waiting to lock <0x000197cc6218> (a java.lang.Class for
> org.apache.hadoop.mapreduce.util.ResourceBundles)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
> at
> org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
> at 
> org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
> - locked <0x0007f8411608> (a org.apache.hadoop.mapred.Counters)
> at
> org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
> at
> org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
> at
> org.apache.hadoop.mapred.JobTracker$RetireJobs.addToCache(JobTracker.java:657)
> - locked <0x9644ae08> (a
> org.apache.hadoop.mapred.JobTracker$RetireJobs)
> at
> org.apache.hadoop.mapred.JobTracker$RetireJobs.run(JobTracker.java:769)
> - locked <0x964c5550> (a
> org.apache.hadoop.mapred.FairScheduler)
> - locked <0x9644a9d0> (a 
> java.util.Collections$SynchronizedMap)
> - locked <0x962ac660> (a org.apache.hadoop.mapred.JobTracker)
> at java.lang.Thread.run(Thread.java:745)
> The ResourceBundle object is locked most of the time by the JT GUI 
> (jobtracker_jsp), which calls getMapCounters().
> "926410165@qtp-1732070199-56" daemon prio=10 tid=0x7f232c4df000 nid=0x27c0
> runnable [0x7f22db7bf000]
>java.lang.Thread.State: RUNNABLE
> at java.lang.Throwable.fillInStackTrace(Native Method)
> at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
> - locked <0x00061a49ede0> (a java.util.MissingResourceException)
> at java.lang.Throwable.<init>(Throwable.java:287)
> at java.lang.Exception.<init>(Exception.java:84)
> at java.lang.RuntimeException.<init>(RuntimeException.java:80)
> at
> java.util.MissingResourceException.<init>(MissingResourceException.java:85)
> at
> java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1499)
> at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1322)
> at java.util.ResourceBundle.getBundle(ResourceBundle.java:1028)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
> - locked <0x000197cc6218> (a java.lang.Class for
> org.apache.hadoop.mapreduce.util.ResourceBundles)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
> at
> org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
> at 
> org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
> - locked <0x0007ed1024b8> (a org.apache.hadoop.mapred.Counters)
> at
> org.apache.hadoop.mapred.

[jira] [Updated] (MAPREDUCE-6530) Jobtracker is slow when more JT UI requests

2015-10-30 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated MAPREDUCE-6530:
-
Affects Version/s: (was: 1.2.1)
   2.5.1

> Jobtracker is slow when more JT UI requests
> ---
>
> Key: MAPREDUCE-6530
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6530
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Prabhu Joseph
>
> JobTracker is slow when there are a huge number of jobs running and 30
> connections are established to the info port to view job status and counters.
> hadoop job -list took 4m22.412s.
> We took jstack traces and found most of the server threads waiting on the 
> JobTracker object, while the thread that holds the lock on JobTracker waits 
> for the ResourceBundle object.
> "retireJobs" prio=10 tid=0x7f2345200800 nid=0x11c1 waiting for
> monitor entry [0x7f22e3499000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
> - waiting to lock <0x000197cc6218> (a java.lang.Class for
> org.apache.hadoop.mapreduce.util.ResourceBundles)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
> at
> org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
> at 
> org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
> - locked <0x0007f8411608> (a org.apache.hadoop.mapred.Counters)
> at
> org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
> at
> org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
> at
> org.apache.hadoop.mapred.JobTracker$RetireJobs.addToCache(JobTracker.java:657)
> - locked <0x9644ae08> (a
> org.apache.hadoop.mapred.JobTracker$RetireJobs)
> at
> org.apache.hadoop.mapred.JobTracker$RetireJobs.run(JobTracker.java:769)
> - locked <0x964c5550> (a
> org.apache.hadoop.mapred.FairScheduler)
> - locked <0x9644a9d0> (a 
> java.util.Collections$SynchronizedMap)
> - locked <0x962ac660> (a org.apache.hadoop.mapred.JobTracker)
> at java.lang.Thread.run(Thread.java:745)
> The ResourceBundle object is locked most of the time by the JT GUI 
> (jobtracker_jsp), which calls getMapCounters().
> "926410165@qtp-1732070199-56" daemon prio=10 tid=0x7f232c4df000 nid=0x27c0
> runnable [0x7f22db7bf000]
>java.lang.Thread.State: RUNNABLE
> at java.lang.Throwable.fillInStackTrace(Native Method)
> at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
> - locked <0x00061a49ede0> (a java.util.MissingResourceException)
> at java.lang.Throwable.<init>(Throwable.java:287)
> at java.lang.Exception.<init>(Exception.java:84)
> at java.lang.RuntimeException.<init>(RuntimeException.java:80)
> at
> java.util.MissingResourceException.<init>(MissingResourceException.java:85)
> at
> java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1499)
> at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1322)
> at java.util.ResourceBundle.getBundle(ResourceBundle.java:1028)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
> - locked <0x000197cc6218> (a java.lang.Class for
> org.apache.hadoop.mapreduce.util.ResourceBundles)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
> at
> org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
> at 
> org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
> - locked <0x0007ed1024b8> (a org.apache.hadoop.mapred.Counters)
> at
> org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgre

[jira] [Updated] (MAPREDUCE-6530) Jobtracker is slow when more JT UI requests

2015-10-30 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated MAPREDUCE-6530:
-
Target Version/s:   (was: 1.3.0)

> Jobtracker is slow when more JT UI requests
> ---
>
> Key: MAPREDUCE-6530
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6530
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.5.1
>Reporter: Prabhu Joseph
>
> JobTracker is slow when there are a huge number of jobs running and 30
> connections are established to the info port to view job status and counters.
> hadoop job -list took 4m22.412s.
> We took jstack traces and found most of the server threads waiting on the 
> JobTracker object, while the thread that holds the lock on JobTracker waits 
> for the ResourceBundle object.
> "retireJobs" prio=10 tid=0x7f2345200800 nid=0x11c1 waiting for
> monitor entry [0x7f22e3499000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
> - waiting to lock <0x000197cc6218> (a java.lang.Class for
> org.apache.hadoop.mapreduce.util.ResourceBundles)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
> at
> org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
> at 
> org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
> - locked <0x0007f8411608> (a org.apache.hadoop.mapred.Counters)
> at
> org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
> at
> org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
> at
> org.apache.hadoop.mapred.JobTracker$RetireJobs.addToCache(JobTracker.java:657)
> - locked <0x9644ae08> (a
> org.apache.hadoop.mapred.JobTracker$RetireJobs)
> at
> org.apache.hadoop.mapred.JobTracker$RetireJobs.run(JobTracker.java:769)
> - locked <0x964c5550> (a
> org.apache.hadoop.mapred.FairScheduler)
> - locked <0x9644a9d0> (a 
> java.util.Collections$SynchronizedMap)
> - locked <0x962ac660> (a org.apache.hadoop.mapred.JobTracker)
> at java.lang.Thread.run(Thread.java:745)
> The ResourceBundle object is locked most of the time by the JT GUI 
> (jobtracker_jsp), which calls getMapCounters().
> "926410165@qtp-1732070199-56" daemon prio=10 tid=0x7f232c4df000 nid=0x27c0
> runnable [0x7f22db7bf000]
>java.lang.Thread.State: RUNNABLE
> at java.lang.Throwable.fillInStackTrace(Native Method)
> at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
> - locked <0x00061a49ede0> (a java.util.MissingResourceException)
> at java.lang.Throwable.<init>(Throwable.java:287)
> at java.lang.Exception.<init>(Exception.java:84)
> at java.lang.RuntimeException.<init>(RuntimeException.java:80)
> at
> java.util.MissingResourceException.<init>(MissingResourceException.java:85)
> at
> java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1499)
> at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1322)
> at java.util.ResourceBundle.getBundle(ResourceBundle.java:1028)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
> - locked <0x000197cc6218> (a java.lang.Class for
> org.apache.hadoop.mapreduce.util.ResourceBundles)
> at
> org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
> at
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
> at
> org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
> at 
> org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
> - locked <0x0007ed1024b8> (a org.apache.hadoop.mapred.Counters)
> at
> org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
> at
> org