[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983871#comment-14983871 ] Hadoop QA commented on MAPREDUCE-5003:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 8s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| +1 | mvninstall | 3m 9s | trunk passed |
| +1 | compile | 2m 20s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 2m 23s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 21s | trunk passed |
| +1 | mvneclipse | 0m 27s | trunk passed |
| +1 | findbugs | 1m 48s | trunk passed |
| +1 | javadoc | 0m 38s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 0m 41s | trunk passed with JDK v1.7.0_79 |
| -1 | mvninstall | 0m 14s | hadoop-mapreduce-client-app in the patch failed. |
| +1 | compile | 2m 15s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 2m 15s | the patch passed |
| +1 | compile | 2m 18s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 2m 18s | the patch passed |
| -1 | checkstyle | 0m 18s | Patch generated 2 new checkstyle issues in hadoop-mapreduce-project/hadoop-mapreduce-client (total was 604, now 603). |
| +1 | mvneclipse | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 6s | the patch passed |
| +1 | javadoc | 0m 36s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 0m 42s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 8m 38s | hadoop-mapreduce-client-app in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 1m 32s | hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 9m 16s | hadoop-mapreduce-client-app in the patch failed with JDK v1.7.0_79. |
| -1 | unit | 1m 52s | hadoop-mapreduce-client-core in the patch failed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 21s | Patch generated 1 ASF License warnings. |
| | | 44m 7s | |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.mapreduce.filecache.TestClientDistributedCacheManager |
| JDK v1.8.0_60 Timed out junit tests | org.apache.hadoop.mapreduce.v2.app.TestFetchFailure |
| | org.apache.hadoop.mapreduce.v2.app.TestMRApp |
| JDK v1.7.0_79 Failed junit tests | hadoop.mapreduce.filecache.TestClientDistributedCacheManager |
| JDK v1.7.0_79 Timed out junit tests | org.apache.hadoop.mapredu
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003:
    Attachment: MAPREDUCE-5003.10.patch

The .10 patch fixes some checkstyle issues. The broken TestJobHistoryEventHandler test is not related to my change; it may be transient, since it passes on my local machine with the patch applied. The broken TestRecovery test also appears to be transient, because it passes on my local machine with the patch applied. I updated testMultipleCrashes in TestRecovery to improve its stability. testDetermineCacheVisibilities in TestClientDistributedCacheManager is broken even without applying my patch; I will file a JIRA for that broken test.

> AM recovery should recreate records for attempts that were incomplete
> -
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Reporter: Jason Lowe
> Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, MAPREDUCE-5003.9.patch
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task attempt entries for *all* task attempts launched by the prior app attempt, even if those task attempts did not complete. The attempts would have to be marked as killed or something similar to indicate they are no longer running. Having records for the task attempts enables the user to see what nodes were associated with the attempts and potentially access their logs.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
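The recovery behavior described above can be sketched in miniature: when replaying the prior attempt's history, attempts that have no completion event are recreated anyway and marked KILLED, so their node and log information survives. This is a simplified illustration, not the real MR AM recovery code; all names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class RecoverySketch {
    enum State { SUCCEEDED, FAILED, KILLED }

    /**
     * historyEvents maps attemptId -> completion state parsed from the prior
     * app attempt's history; a null value means the attempt never completed.
     * Incomplete attempts are recreated as KILLED instead of being dropped.
     */
    static Map<String, State> recover(Map<String, State> historyEvents) {
        Map<String, State> recovered = new HashMap<>();
        for (Map.Entry<String, State> e : historyEvents.entrySet()) {
            recovered.put(e.getKey(),
                    e.getValue() == null ? State.KILLED : e.getValue());
        }
        return recovered;
    }
}
```

The point of keeping the KILLED record, per the description, is visibility: the user can still see which node ran the attempt and fetch its logs.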
[jira] [Commented] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983790#comment-14983790 ] Hadoop QA commented on MAPREDUCE-6512:
--
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. |
| +1 | mvninstall | 3m 2s | trunk passed |
| +1 | compile | 4m 13s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 4m 5s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 54s | trunk passed |
| +1 | mvneclipse | 0m 40s | trunk passed |
| +1 | findbugs | 3m 3s | trunk passed |
| +1 | javadoc | 1m 25s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 1m 37s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 2m 12s | the patch passed |
| +1 | compile | 4m 12s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 4m 12s | the patch passed |
| +1 | compile | 4m 2s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 4m 2s | the patch passed |
| -1 | checkstyle | 0m 54s | Patch generated 1 new checkstyle issues in root (total was 149, now 149). |
| +1 | mvneclipse | 0m 40s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 30s | the patch passed |
| +1 | javadoc | 1m 24s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 1m 37s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 6m 24s | hadoop-common in the patch passed with JDK v1.8.0_60. |
| -1 | unit | 1m 29s | hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 135m 24s | hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_60. |
| +1 | unit | 7m 5s | hadoop-common in the patch passed with JDK v1.7.0_79. |
| -1 | unit | 2m 2s | hadoop-mapreduce-client-core in the patch failed with JDK v1.7.0_79. |
| -1 | unit | 137m 13s | hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 32s | Patch generated 1 ASF License warnings. |
| | | 329m 14s | |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.mapreduce.filecache.TestClientDistributedCacheManager |
| | hadoop.mapred.TestMerge |
| |
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983777#comment-14983777 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-Mapreduce-trunk #2552 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2552/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.6.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Fix For: 3.0.0, 2.7.2
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
> DistCp, when used with the dynamic strategy, does not update the chunkFilePath and other static variables at any time other than for the first job. This is seen when DistCp::run() is used.
> A single copy succeeds, but multiple jobs finish successfully without any real copying.
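The failure mode in the description — static state initialized by the first job and silently reused by every later job — can be shown with a hedged simplification. The field and method names below are illustrative, not the real DistCp API; the fix committed above moves this state into a per-job context object (DynamicInputChunkContext) instead.

```java
// Simplified illustration of the pre-fix pitfall: chunk state lives in a
// static field, so only the first job ever initializes it and a second job
// inherits the first job's stale path.
class StaticChunkPitfall {
    private static String chunkFilePath;   // shared across ALL jobs (the bug)

    /** Hypothetical stand-in for DistCp's chunk-path initialization. */
    static String chunkPathFor(String jobId) {
        if (chunkFilePath == null) {       // only the first caller wins
            chunkFilePath = "/tmp/distcp/" + jobId + "/chunks";
        }
        return chunkFilePath;              // later jobs get the stale value
    }
}
```

With per-job (instance-scoped) state, each job would construct its own context and compute its own path, which is the direction the addendum patch takes.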
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983762#comment-14983762 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #558 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/558/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983760#comment-14983760 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-Hdfs-trunk #2495 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2495/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983734#comment-14983734 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #610 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/610/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983708#comment-14983708 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-Yarn-trunk #1345 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1345/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java
[jira] [Commented] (MAPREDUCE-6302) Preempt reducers after a configurable timeout irrespective of headroom
[ https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983656#comment-14983656 ] Jooseong Kim commented on MAPREDUCE-6302:
-
I think this usually happens when the RM sends out an overestimated headroom. One thing we could do is to skip scheduleReduces() if we ended up preempting reducers through preemptReducesIfNeeded(). Since the headroom is overestimated, scheduleReduces may schedule more reducers, which will then need to be preempted again.

> Preempt reducers after a configurable timeout irrespective of headroom
> --
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: mai shurong
> Assignee: Karthik Kambatla
> Priority: Critical
> Fix For: 2.8.0
>
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, mr-6302-5.patch, mr-6302-6.patch, mr-6302-7.patch, mr-6302-prelim.patch, mr-6302_branch-2.patch, queue_with_max163cores.png, queue_with_max263cores.png, queue_with_max333cores.png
>
> I submitted a big job, with 500 maps and 350 reduces, to a queue (fair scheduler) with a 300-core maximum. When the job had finished 100% of its maps, 300 reduces occupied all of the queue's cores. Then a map failed and was retried, waiting for a core, while the 300 reduces waited for the failed map to finish, so a deadlock occurred. As a result, the job was blocked, and later jobs in the queue could not run because no cores were available in the queue.
> I think there is a similar issue for a queue's memory.
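The guard suggested in the comment above — skip scheduleReduces() in a heartbeat that already preempted reducers — can be sketched with simulated state. This is an illustrative model, not the real RMContainerAllocator; the class, fields, and the boolean "overestimated headroom" flag are all assumptions made for the example.

```java
// Minimal model of the AM allocator heartbeat with the suggested guard:
// if this heartbeat preempted reducers, don't let a stale (overestimated)
// headroom immediately schedule replacements that would be preempted again.
class AllocatorLoopSketch {
    int scheduledReduces;                 // reducers currently asked for
    int preemptions;                      // how many we have ramped down
    private final boolean headroomOverestimated;

    AllocatorLoopSketch(boolean headroomOverestimated) {
        this.headroomOverestimated = headroomOverestimated;
    }

    /** Returns true if a reducer was preempted this heartbeat. */
    private boolean preemptReducesIfNeeded() {
        if (headroomOverestimated && scheduledReduces > 0) {
            scheduledReduces--;           // give the container back to a starved map
            preemptions++;
            return true;
        }
        return false;
    }

    private void scheduleReduces() {
        scheduledReduces++;               // trusts the (possibly stale) headroom
    }

    void heartbeat() {
        boolean preempted = preemptReducesIfNeeded();
        if (!preempted) {                 // the suggested guard
            scheduleReduces();
        }
    }
}
```

Without the guard, a preempt-then-schedule cycle can occur inside a single heartbeat; with it, a preempting heartbeat ends without re-scheduling.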
[jira] [Updated] (MAPREDUCE-6514) Job hangs as ask is not updated after ramping down of all reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-6514:
---
    Status: Open  (was: Patch Available)

h4. Comment on the current patch
You should look at the {{rampDownReduces()}} API and use it instead of hand-rolling {{decContainerReq}}. I actually think that once we do this, you should remove {{clearAllPendingReduceRequests()}} altogether.
I am looking at branch-2, and I think the current patch is better served on top of MAPREDUCE-6302 (and so only in 2.8+), given the numerous changes made there. The patch obviously doesn't apply to branch-2.7, which you set as the target version (2.7.2). Canceling the patch.

h4. Meta thought
If MAPREDUCE-6513 goes through per my latest proposal there, there is no need to cancel all the reduce asks, and thus no need for this patch, no?

h4. Release
In any case, this has been a long-standing problem (though I'm very surprised nobody caught this till now), so I'd propose we move this out to 2.7.3, or better 2.8+, so I can make progress on the 2.7.2 release. Thoughts?

> Job hangs as ask is not updated after ramping down of all reducers
> --
>
> Key: MAPREDUCE-6514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6514
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster
> Affects Versions: 2.7.1
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Priority: Critical
> Attachments: MAPREDUCE-6514.01.patch
>
> In RMContainerAllocator#preemptReducesIfNeeded, we simply clear the scheduled-reduces map and move these reducers to pending. This is not reflected in the ask, so the RM keeps assigning containers and the AM cannot use them because no reducer is scheduled (see the logs below the code).
> If the ask is updated immediately, the RM will be able to schedule mappers immediately, which is anyway the intention when we ramp down reducers. The scheduler need not allocate for ramped-down reducers.
> If not handled, this can lead to map starvation, as pointed out in MAPREDUCE-6513.
> {code}
>     LOG.info("Ramping down all scheduled reduces:"
>         + scheduledRequests.reduces.size());
>     for (ContainerRequest req : scheduledRequests.reduces.values()) {
>       pendingReduces.add(req);
>     }
>     scheduledRequests.reduces.clear();
> {code}
> {noformat}
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not assigned : container_1437451211867_1485_01_000215
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign container Container: [ContainerId: container_1437451211867_1485_01_000216, NodeId: hdszzdcxdat6g06u04p:26009, NodeHttpAddress: hdszzdcxdat6g06u04p:26010, Resource: , Priority: 10, Token: Token { kind: ContainerToken, service: 10.2.33.236:26009 }, ] for a reduce as either container memory less than required 4096 or no pending reduce tasks - reduces.isEmpty=true
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not assigned : container_1437451211867_1485_01_000216
> 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign container Container: [ContainerId: container_1437451211867_1485_01_000217, NodeId: hdszzdcxdat6g06u06p:26009, NodeHttpAddress: hdszzdcxdat6g06u06p:26010, Resource: , Priority: 10, Token: Token { kind: ContainerToken, service: 10.2.33.239:26009 }, ] for a reduce as either container memory less than required 4096 or no pending reduce tasks - reduces.isEmpty=true
> {noformat}
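The fix direction discussed in the issue — when ramping down scheduled reducers, also decrement the ask that the RM sees — can be sketched with simplified bookkeeping. This is an illustrative model under assumed names (askCount standing in for the real ask table, rampDownAllReduces for the decContainerReq/rampDownReduces path), not the actual Hadoop code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Miniature model: ramping a reducer down must do BOTH things — park the
// request as pending AND remove it from the ask, so the RM stops handing out
// containers that the AM can no longer use.
class RampDownSketch {
    final List<String> scheduledReduces = new ArrayList<>();
    final Queue<String> pendingReduces = new ArrayDeque<>();
    int askCount;                         // what the RM believes we still want

    void rampDownAllReduces() {
        for (String req : scheduledReduces) {
            askCount--;                   // update the ask (the missing step in the bug)
            pendingReduces.add(req);      // keep the request for a later ramp-up
        }
        scheduledReduces.clear();
    }
}
```

In the buggy behavior described above, the equivalent of askCount was left untouched, so the RM kept allocating reduce containers the AM immediately rejected.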
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983575#comment-14983575 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #622 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/622/])
MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba)
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java
* hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983540#comment-14983540 ] Hudson commented on MAPREDUCE-6451:
---
FAILURE: Integrated in Hadoop-trunk-Commit #8736 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8736/])
Addendum to MAPREDUCE-6451 (kihwal: rev b24fe0648348d325d14931f80cee8a170fb3358a)
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunkContext.java
[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time
[ https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983421#comment-14983421 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-6513:
----
Went through the discussion. Here's what we should do, mostly agreeing with what [~chen317] says.
- Node failure should not be counted towards the task-attempt count. So, yes, let's continue to mark such tasks as killed.
- Rescheduling of such a killed task can (and must) take higher priority, independent of whether it is marked as killed or failed. In fact, this is how we originally designed the failed-map-should-have-higher-priority concept. In spirit, fail-fast-map actually meant maps that retroactively failed, as in this case.
[~varun_saxena], I can take a stab at this if you don't have cycles. Let me know either way.
In any case, this has been a long-standing problem (though I'm very surprised nobody caught this till now), so I'd propose we move this out to 2.7.3 so I can make progress on the 2.7.2 release. Thoughts? /cc [~Jobo]

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster, resourcemanager
> Affects Versions: 2.7.0
> Reporter: Bob
> Assignee: Varun Saxena
> Priority: Critical
>
> While a job with many tasks was in progress, one node became unstable due to an OS issue. After the node became unstable, the status of the maps on that node changed to KILLED.
> The maps that were running on the unstable node were rescheduled; they are all in the scheduled state, waiting for the RM to assign containers. Ask requests for these maps were seen until the node became good again (all of those failed); there were no ask requests after that. But the AM keeps preempting the reducers (it's recycling them).
> Finally, the reducers are waiting for the mappers to complete, and the mappers never got containers.
> My question is: why were map requests not sent by the AM after the node recovered?
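The two-part proposal in the comment above — don't charge node-loss against the attempt limit, but do reschedule the lost map at the higher fast-fail priority — can be sketched as a toy model. This is not the real RMContainerAllocator code; the priority values and method names are assumptions for illustration (lower number meaning higher priority in this sketch).

```java
// Toy model of the proposal: a map attempt lost to node failure is recorded
// as KILLED (attempt-failure counter untouched), yet its replacement request
// goes out at the fast-fail priority so it beats ordinary map requests.
class NodeLossReschedule {
    static final int PRIORITY_FAST_FAIL_MAP = 5;   // assumed value
    static final int PRIORITY_MAP = 20;            // assumed value

    int failedAttemptCount;                        // counts toward the retry limit

    /** Handle a map attempt that died because its node was lost. */
    int onNodeLoss() {
        // KILLED, not FAILED: failedAttemptCount deliberately unchanged,
        // but the re-ask uses the higher priority.
        return PRIORITY_FAST_FAIL_MAP;
    }

    /** Handle an ordinary first-time map request. */
    int onNewMap() {
        return PRIORITY_MAP;
    }
}
```

The key invariant is that node loss never consumes one of the task's allowed failures, while still jumping the rescheduling queue.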
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983413#comment-14983413 ] Hudson commented on MAPREDUCE-6451: --- FAILURE: Integrated in Hadoop-Yarn-trunk #1344 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1344/]) MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba) * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java * hadoop-mapreduce-project/CHANGES.txt * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java > DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic > - > > Key: MAPREDUCE-6451 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 2.6.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Fix For: 3.0.0, 2.7.2 > > Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch, > MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch > > > DistCp when used with dynamic strategy does not update the chunkFilePath and > other static variables any time other than for the first job. This is seen > when DistCp::run() is used. > A single copy succeeds but multiple jobs finish successfully without any real > copying. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983385#comment-14983385 ] Hudson commented on MAPREDUCE-6451: --- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #557 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/557/]) MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba) * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java * hadoop-mapreduce-project/CHANGES.txt * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983369#comment-14983369 ] Hudson commented on MAPREDUCE-6451: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #609 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/609/]) MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba) * hadoop-mapreduce-project/CHANGES.txt * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983325#comment-14983325 ] Allen Wittenauer commented on MAPREDUCE-6451: - I was over here trying to figure out why test-patch said it was good when trunk was failing. haha. :D -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983321#comment-14983321 ] Kihwal Lee commented on MAPREDUCE-6451: --- bq. did you forget the DynamicInputChunkContext class when you commit? It is a Friday. :) Fixed it. Thanks for reporting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983309#comment-14983309 ] Hudson commented on MAPREDUCE-6451: --- FAILURE: Integrated in Hadoop-trunk-Commit #8735 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8735/]) MAPREDUCE-6451. DistCp has incorrect chunkFilePath for multiple jobs (kihwal: rev 2868ca0328d908056745223fb38d9a90fd2811ba) * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/StubContext.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputChunk.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicInputFormat.java * hadoop-mapreduce-project/CHANGES.txt * hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/lib/TestDynamicInputFormat.java * hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/lib/DynamicRecordReader.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983302#comment-14983302 ] Mingliang Liu commented on MAPREDUCE-6451: -- Hi [~kihwal], did you forget the {{DynamicInputChunkContext}} class when you commit? If so, I think an addendum commit would be good. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983299#comment-14983299 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-Hdfs-trunk #2494 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2494/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java > Memory leak for HistoryFileManager.getJobSummary() > -- > > Key: MAPREDUCE-6528 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6528 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Fix For: 2.7.2, 2.6.3 > > Attachments: MAPREDUCE-6528.patch > > > We hit memory leak issues in the JHS on a large cluster, caused by the code > below not releasing the FSDataInputStream in the exception case. MAPREDUCE-6273 > should fix most cases where exceptions get thrown. However, we still need to > fix the memory leak for the occasional case. > {code} > private String getJobSummary(FileContext fc, Path path) throws IOException { > Path qPath = fc.makeQualified(path); > FSDataInputStream in = fc.open(qPath); > String jobSummaryString = in.readUTF(); > in.close(); > return jobSummaryString; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
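The leak pattern above can be closed with try-with-resources, which guarantees the stream is released even when readUTF() throws. Below is a minimal, self-contained sketch of that pattern, not the actual patch: the real fix lives in HistoryFileManager against Hadoop's FSDataInputStream, while this analogue uses plain java.io streams, and the JobSummaryReader class and readSummary method are illustrative names only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hedged analogue of the getJobSummary() fix: try-with-resources closes
// the stream on both the success path and the exception path, which is
// exactly what the original code missed when readUTF() threw.
public class JobSummaryReader {

    static String readSummary(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return in.readUTF(); // even if this throws, 'in' is closed
        } catch (IOException e) {
            // By the time we get here the stream has already been closed.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // DataOutput.writeUTF format: 2-byte length prefix, then modified UTF-8.
        byte[] encoded = {0, 2, 'o', 'k'};
        System.out.println(readSummary(encoded)); // prints "ok"
    }
}
```

The same shape applies to the Hadoop code: wrapping `fc.open(qPath)` in a try-with-resources block removes the need for the explicit `in.close()` call entirely.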
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6512: Attachment: MAPREDUCE-6512.2.patch The .2 patch fixes the broken tests caused by the missing parent dir. > FileOutputCommitter tasks unconditionally create parent directories > --- > > Key: MAPREDUCE-6512 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.patch, > MAPREDUCE-6512.patch > > > If the output directory is deleted then subsequent tasks should fail. Instead > they blindly create the missing parent directories, leading the job to be > "successful" despite potentially missing almost all of the output. Task > attempts should fail if the parent app attempt directory is missing when they > go to create their task attempt directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource
[ https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983230#comment-14983230 ] Jason Lowe commented on MAPREDUCE-6529: --- bq. For example, in a heterogeneous cluster, reduce task may prefer a container on powerful machine with higher I/O speed. MPI job may prefer containers on machine with higher cpu frequency. But RM can't know all the resource requirement for different applications. So I am just wandering if the RPC protocol between RM and AM may provide such new interface, let the AM make a second choice about if AM will use the container it just gets. If the AM has a preference for something then it needs to specify that in the locality request. Failing to do so just leads to a Monte Carlo situation where the AM tosses away containers hoping that the next random container is better than the last while the task starves for resources in the interim. The AM doesn't have a cluster-wide view. It's not going to know that the nodes it desires so much are all occupied, nor does it know what other things are running on various nodes. So I still think it's undesirable to have the AM toss away a container because it hopes a better one will come along later. Instead we should have the AM improve the container request so the RM can better know the AM's intentions. For example, if the cluster is heterogeneous with some nodes being much better for reducers than others then we should label those nodes. Then the AM request can ask reducers to use those labeled nodes with the ability to relax that locality constraint if the nodes are unavailable. The RM will make a much better decision more efficiently than the AM could ever hope to do on its own. Otherwise the AM is going to get a container, see it's not on one of the "good" nodes and toss it, request another, get another bad allocation, rinse, repeat. If we can teach the MapReduce AM how to recognize a good node vs. 
a bad node then we can also teach it how to request those nodes when it makes the initial allocation request. > AppMaster will not retry to request resource if AppMaster happens to decide > to not use the resource > --- > > Key: MAPREDUCE-6529 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6529 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Affects Versions: 2.6.0 >Reporter: Wei Chen > > I am viewing the code in RMContainerAllocator.java. I want to make an > improvement so that the AppMaster could give up containers that may not > be optimal when it receives newly assigned containers. But I found that if the > AppMaster gives up the containers, it will not retry to request the resource > again. > In RMContainerRequestor.java, Set ask is used to request > resources from the ResourceManager. I found each container could only be requested > once. It means ask can be filled by addResourceRequestToAsk(ResourceRequest > remoteRequest[]), but it can only be added once for each container. If we > give up an assigned container, it will never be requested again -- This message was sent by Atlassian JIRA (v6.3.4#6332)
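The request-side approach Jason describes, express the node preference up front and let the scheduler relax it rather than have the AM discard containers, can be illustrated with a toy allocator. This is not the YARN API or the actual scheduler: the node names, the "fast" label, and the allocate() helper below are all hypothetical, and the sketch only shows the fallback behavior that relaxed locality buys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of a labeled-node request with relaxed locality: prefer nodes
// carrying the requested label, but fall back to any free node rather than
// starving, instead of the AM rejecting allocations it dislikes.
public class RelaxedLocalityDemo {

    static String allocate(Map<String, Set<String>> nodeLabels,
                           Set<String> freeNodes,
                           String wantedLabel,
                           boolean relaxLocality) {
        // First pass: a free node that carries the wanted label.
        for (String node : freeNodes) {
            if (nodeLabels.getOrDefault(node, Set.of()).contains(wantedLabel)) {
                return node;
            }
        }
        // Fallback: any free node, but only if the request allows relaxing.
        if (relaxLocality && !freeNodes.isEmpty()) {
            return freeNodes.iterator().next();
        }
        return null; // hold the request rather than take an unwanted node
    }

    public static void main(String[] args) {
        Map<String, Set<String>> labels = new HashMap<>();
        labels.put("nodeA", Set.of("fast")); // the desirable node
        labels.put("nodeB", Set.of());       // an ordinary node
        Set<String> free = new TreeSet<>();
        free.add("nodeB"); // the "fast" node is busy
        System.out.println(allocate(labels, free, "fast", true));  // falls back: nodeB
        System.out.println(allocate(labels, free, "fast", false)); // strict: null
    }
}
```

The point of the sketch is who makes the fallback decision: with the preference in the request, the scheduler (which sees the whole cluster) relaxes it once, instead of the AM repeatedly rejecting containers it cannot evaluate globally.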
[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated MAPREDUCE-6451: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.2 3.0.0 Status: Resolved (was: Patch Available) I've committed this to trunk, branch-2 and branch-2.7. Thanks for working on the fix, Kuhu. Thank you gentlemen for the reviews. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983156#comment-14983156 ] Kihwal Lee commented on MAPREDUCE-6451: --- +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource
[ https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983155#comment-14983155 ] Wei Chen commented on MAPREDUCE-6529: - Yes, what I am doing is just to let the AM have a chance to choose the containers it needs. I know that delay scheduling is implemented in YARN. It is very useful for locality-sensitive jobs like MapReduce. But sometimes the delay scheduler cannot guarantee the AM gets optimal resources. For example, in a heterogeneous cluster, a reduce task may prefer a container on a powerful machine with higher I/O speed. An MPI job may prefer containers on a machine with higher CPU frequency. But the RM can't know all the resource requirements for different applications. So I am just wondering if the RPC protocol between the RM and AM could provide such a new interface, letting the AM make a second choice about whether it will use the container it just got. And different AMs may implement their own preferences for the containers they request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983126#comment-14983126 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #556 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/556/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983110#comment-14983110 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-Yarn-trunk #1343 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1343/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983017#comment-14983017 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #620 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/620/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983012#comment-14983012 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2550 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2550/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java * hadoop-mapreduce-project/CHANGES.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982974#comment-14982974 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #607 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/607/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982852#comment-14982852 ] Junping Du commented on MAPREDUCE-6528: --- Thanks Jason for reviewing and committing this! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6514) Job hangs as ask is not updated after ramping down of all reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982766#comment-14982766 ] Varun Saxena commented on MAPREDUCE-6514: - Test failure to be handled by YARN-4320 > Job hangs as ask is not updated after ramping down of all reducers > -- > > Key: MAPREDUCE-6514 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6514 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster >Affects Versions: 2.7.1 >Reporter: Varun Saxena >Assignee: Varun Saxena >Priority: Critical > Attachments: MAPREDUCE-6514.01.patch > > > In RMContainerAllocator#preemptReducesIfNeeded, we simply clear the scheduled > reduces map and move these reducers back to pending. This is not reflected in the ask, so the > RM keeps on assigning and the AM is not able to assign, as no reducer is > scheduled (check the logs below the code). > If this is updated immediately, the RM will be able to schedule mappers > immediately, which anyway is the intention when we ramp down reducers.
> The scheduler need not allocate containers for ramped-down reducers. > If this is not handled, it can lead to map starvation, as pointed out in > MAPREDUCE-6513 > {code} > LOG.info("Ramping down all scheduled reduces:" > + scheduledRequests.reduces.size()); > for (ContainerRequest req : scheduledRequests.reduces.values()) { > pendingReduces.add(req); > } > scheduledRequests.reduces.clear(); > {code} > {noformat} > 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not > assigned : container_1437451211867_1485_01_000215 > 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign > container Container: [ContainerId: container_1437451211867_1485_01_000216, > NodeId: hdszzdcxdat6g06u04p:26009, NodeHttpAddress: > hdszzdcxdat6g06u04p:26010, Resource: , Priority: 10, > Token: Token { kind: ContainerToken, service: 10.2.33.236:26009 }, ] for a > reduce as either container memory less than required 4096 or no pending > reduce tasks - reduces.isEmpty=true > 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container not > assigned : container_1437451211867_1485_01_000216 > 2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign > container Container: [ContainerId: container_1437451211867_1485_01_000217, > NodeId: hdszzdcxdat6g06u06p:26009, NodeHttpAddress: > hdszzdcxdat6g06u06p:26010, Resource: , Priority: 10, > Token: Token { kind: ContainerToken, service: 10.2.33.239:26009 }, ] for a > reduce as either container memory less than required 4096 or no pending > reduce tasks - reduces.isEmpty=true > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
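The gist of the issue can be modeled without any YARN classes. In this simplified sketch (an assumption for self-containment: plain strings stand in for ContainerRequest objects, and a Set stands in for the outstanding ask the AM sends to the RM), the one extra step is removing the ramped-down requests from the ask:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RampDownDemo {
    // Simplified model: when all scheduled reduces are ramped down, they
    // must also be removed from the outstanding ask; otherwise the RM keeps
    // granting reduce containers the AM can no longer assign.
    static void rampDownReduces(Map<String, String> scheduledReduces,
                                List<String> pendingReduces,
                                Set<String> ask) {
        System.out.println("Ramping down all scheduled reduces:"
            + scheduledReduces.size());
        for (String req : scheduledReduces.values()) {
            pendingReduces.add(req);
            ask.remove(req); // the update the original code was missing
        }
        scheduledReduces.clear();
    }

    public static void main(String[] args) {
        Map<String, String> scheduled = new LinkedHashMap<>();
        scheduled.put("attempt_r_000000", "reduceReq0");
        scheduled.put("attempt_r_000001", "reduceReq1");
        Set<String> ask = new HashSet<>(scheduled.values());
        ask.add("mapReq0");
        List<String> pending = new ArrayList<>();

        rampDownReduces(scheduled, pending, ask);
        // Only the map request remains outstanding in the ask.
        System.out.println("pending=" + pending + " ask=" + ask);
    }
}
```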
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982729#comment-14982729 ] Hudson commented on MAPREDUCE-6528: --- FAILURE: Integrated in Hadoop-trunk-Commit #8732 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8732/]) MAPREDUCE-6528. Memory leak for HistoryFileManager.getJobSummary(). (jlowe: rev 6344b6a7694c70f296392b6462dba452ff762109) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java * hadoop-mapreduce-project/CHANGES.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-6528: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.6.3 2.7.2 Status: Resolved (was: Patch Available) Thanks to Junping for the contribution and to Brahma for additional review! I committed this to trunk, branch-2, branch-2.7, and branch-2.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982690#comment-14982690 ] Jason Lowe commented on MAPREDUCE-6528: --- +1, committing this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6273) HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state
[ https://issues.apache.org/jira/browse/MAPREDUCE-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-6273: -- Fix Version/s: 2.6.3 I committed this to branch-2.6 as well. > HistoryFileManager should check whether summaryFile exists to avoid > FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state > > > Key: MAPREDUCE-6273 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6273 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Fix For: 2.7.2, 2.6.3 > > Attachments: MAPREDUCE-6273.000.patch, MAPREDUCE-6273.001.patch > > > HistoryFileManager should check whether the summaryFile exists, so that a > FileNotFoundException does not push the HistoryFileInfo into the MOVE_FAILED state. > I saw the following error message: > {code} > 2015-02-17 19:13:45,198 ERROR > org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Error while trying to > move a job to done > java.io.FileNotFoundException: File does not exist: > /user/history/done_intermediate/agd_laci-sluice/job_1423740288390_1884.summary > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1878) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1819) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1771) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:527) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:85) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:356) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) > at > org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1181) > at > org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1169) > at > org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1159) > at > org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:270) > at > org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:237) > at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:230) > at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1457) > at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:318) > at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:59) > at > 
org.apache.hadoop.fs.AbstractFileSystem.open(AbstractFileSystem.java:621) > at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:789) > at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:785) > at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) > at org.apache.hadoop.fs.FileContext.open(FileContext.java:785) > at > org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary(HistoryFileManager.java:953) > at > o
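The idea in the issue title — check for the summary file before opening it — can be sketched outside HDFS. In this self-contained approximation (an assumption for illustration: java.io.File stands in for the FileContext/Path API), a missing summary file is treated as "nothing to read" rather than an error that fails the whole move-to-done step:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class SummaryCheckDemo {
    // Returns the summary, or null when the summary file is absent,
    // instead of letting a FileNotFoundException abort the caller.
    static String readSummaryIfPresent(File summaryFile) throws IOException {
        if (!summaryFile.exists()) {
            return null; // no summary: the move can still proceed
        }
        BufferedReader in = new BufferedReader(new FileReader(summaryFile));
        try {
            return in.readLine();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File missing = new File("job_0000000000000_0000.summary");
        System.out.println("missing -> " + readSummaryIfPresent(missing));

        File present = File.createTempFile("job", ".summary");
        present.deleteOnExit();
        FileWriter w = new FileWriter(present);
        w.write("jobId=job_1,status=SUCCEEDED\n");
        w.close();
        System.out.println("present -> " + readSummaryIfPresent(present));
    }
}
```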
[jira] [Commented] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side
[ https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982632#comment-14982632 ] Sunil G commented on MAPREDUCE-5870: The test case failures look related. I will fix them in the next patch. > Support for passing Job priority through Application Submission Context in > Mapreduce Side > - > > Key: MAPREDUCE-5870 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5870 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-MAPREDUCE-5870.patch, 0002-MAPREDUCE-5870.patch, > 0003-MAPREDUCE-5870.patch, 0004-MAPREDUCE-5870.patch, > 0005-MAPREDUCE-5870.patch, 0006-MAPREDUCE-5870.patch, > 0007-MAPREDUCE-5870.patch, Yarn-2002.1.patch > > > Job priority can be set from the client side as below [configuration and API]: > a. JobConf.getJobPriority() and > Job.setPriority(JobPriority priority) > b. We can also use the configuration > "mapreduce.job.priority". > Now this job priority can be passed in the Application Submission > Context from the client side. > Here we can reuse the MRJobConfig.PRIORITY configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource
[ https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982611#comment-14982611 ] Jason Lowe commented on MAPREDUCE-6529: --- I'm a little confused based on the wording of the summary and the description. The summary implies there's a bug, but the description sounds like we're trying to add a feature where the MapReduce AM will voluntarily give up an assigned container because it's not "good enough." I'm assuming the latter; please correct me if I'm wrong. Could you elaborate more on the use-case for this? It seems to me that, in general, it is not a good idea for the AM to second-guess the RM scheduler when it comes to container placement. The AM already conveyed to the RM where it wants containers, and the RM granted some containers. If the AM doesn't like where those containers were placed, how likely is it that giving them up will result in a better placement in a reasonable timeframe? The RM scheduler (at least the CapacityScheduler) already implements a delay to try to achieve node locality (see YARN-80), so at least in that setup it would seem a detriment to the job to give up a usable container now in hopes that a better one comes along soon. The RM has already waited around for a better one and decided to give a suboptimal one instead. Unless it is very costly for the task to run off-node or off-rack, it will be worse to give up the assigned container than to just use it, because it is unlikely that a better container will come along soon. > AppMaster will not retry to request resource if AppMaster happens to decide > to not use the resource > --- > > Key: MAPREDUCE-6529 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6529 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Affects Versions: 2.6.0 >Reporter: Wei Chen > > I am viewing the code in RMContainerAllocator.java. 
I want to make an > improvement so that the AppMaster can give up containers that may not > be optimal when it receives newly assigned containers. But I found that if the > AppMaster gives up the containers, it will not retry the resource request. > In RMContainerRequestor.java, the Set ask is used to request > resources from the ResourceManager. I found each container can only be requested > once: ask can be filled by addResourceRequestToAsk(ResourceRequest > remoteRequest), but it is only added once per container. If we > give up an assigned container, it will never be requested again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side
[ https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982600#comment-14982600 ] Hadoop QA commented on MAPREDUCE-5870: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 1s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s {color} | {color:green} the patch 
passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s {color} | {color:red} Patch generated 22 new checkstyle issues in hadoop-mapreduce-project/hadoop-mapreduce-client (total was 380, now 397). {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s {color} | {color:green} hadoop-mapreduce-client-common in the patch passed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 40s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.8.0_60. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 22s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_60. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-mapreduce-client-common in the patch passed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 0s {color} | {color:red} hadoop-mapreduce-client-core in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 18s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 34s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 239m 46s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.mapreduce.
[jira] [Updated] (MAPREDUCE-6527) Data race on field org.apache.hadoop.mapred.JobConf.credentials
[ https://issues.apache.org/jira/browse/MAPREDUCE-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edgar Pek updated MAPREDUCE-6527: - Affects Version/s: 2.7.1 > Data race on field org.apache.hadoop.mapred.JobConf.credentials > --- > > Key: MAPREDUCE-6527 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6527 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Ali Kheradmand > > I am running the test suite against a dynamic race detector called > RV-Predict. Here is a race report that I got: > {noformat} > Data race on field org.apache.hadoop.mapred.JobConf.credentials: {{{ > Concurrent read in thread T327 (locks held: {}) > > at org.apache.hadoop.mapred.JobConf.getCredentials(JobConf.java:505) > at > org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:70) > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:524) > T327 is created by T22 > at > org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:218) > Concurrent write in thread T22 (locks held: {Monitor@496c673a, > Monitor@496319b0}) > > at org.apache.hadoop.mapred.JobConf.setCredentials(JobConf.java:510) > at > org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:787) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:241) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at > org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335) > locked Monitor@496319b0 at > org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:n/a) > > at > org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:245) > locked Monitor@496c673a at > 
org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:229) > > T22 is created by T1 > at > org.apache.hadoop.mapred.jobcontrol.TestJobControl.doJobControlTest(TestJobControl.java:111) > }}} > {noformat} > In the source code of the org.apache.hadoop.mapred.LocalJobRunner.submitJob > function, we have the following lines: > {code} > Job job = new Job(JobID.downgrade(jobid), jobSubmitDir); > job.job.setCredentials(credentials); > {code} > It looks a bit suspicious: Job extends Thread, and at the end of its > constructor it starts a new thread, which creates a new instance of > JobContextImpl that reads credentials. However, the first thread > concurrently sets credentials after creating the Job instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
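The pattern behind this report can be reduced to a few lines. Thread.start() establishes a happens-before edge, so writes made before start() are visible to the new thread without locks; the race arises precisely because the thread is started inside the constructor and setCredentials() runs afterwards. Below is a hedged sketch of the safe ordering (an illustration of the memory-model rule, not the actual Hadoop fix):

```java
public class SafePublishDemo {
    // Minimal stand-in for LocalJobRunner's Job: a thread that reads a
    // field set by the submitting thread.
    static class Job extends Thread {
        String credentials; // plain field: safe only if written before start()
        private final StringBuilder seen = new StringBuilder();

        @Override
        public void run() {
            // The happens-before edge from start() guarantees this read
            // sees the write made before start() was called.
            seen.append(credentials);
        }

        String observed() throws InterruptedException {
            join(); // join() gives a happens-before edge back to the caller
            return seen.toString();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Job job = new Job();
        job.credentials = "token-abc"; // write BEFORE start(): safely published
        job.start();                   // the racy original starts the thread in
                                       // the constructor, before this write
        System.out.println("worker saw: " + job.observed());
    }
}
```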
[jira] [Commented] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982375#comment-14982375 ] Junping Du commented on MAPREDUCE-5485: --- Thanks [~bikassaha] for the comments! I agree it makes more sense to move the retry logic into committer.commitJob() if it supports repeatable commits. My original thinking was to combine the retry for committer.commitJob() with other AM exceptions in handleJobCommit (outside of the committer), like failing to write endCommitSuccessFile, etc. But now I think we should separate committer retry from AM-specific handling, for the reason you mentioned above. For this case, I would prefer we just let the AM exit directly instead of failing the job (if the job commit is repeatable). This is mostly the same as proposed above by [~nemon], with one slight difference: we should fail the AM (not the job) even when committer.commitJob() fails after retries, to handle some corner cases, i.e. something goes wrong with the committer in this AM but could still succeed in another AM, given that job commit is repeatable. I will update a patch soon. > Allow repeating job commit by extending OutputCommitter API > --- > > Key: MAPREDUCE-5485 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 2.1.0-beta >Reporter: Nemon Lou >Assignee: Junping Du > Attachments: MAPREDUCE-5485-demo-2.patch, MAPREDUCE-5485-demo.patch > > > There are cases where the MRAppMaster crashes during job commit, or a NodeManager > restart causes the committing AM to exit due to container expiry. In these cases, the job will fail. > However, some jobs can redo the commit, so failing the job is unnecessary. > Letting clients tell the AM whether or not to allow redoing the commit is a better choice. > This idea comes from Jason Lowe's comments in MAPREDUCE-4819 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
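The proposal being discussed can be modeled in a toy form. In this sketch the interface and method names (isCommitJobRepeatable, commitWithRetry) are illustrative stand-ins, not the final Hadoop API: a committer that declares its commit repeatable may be retried, while a non-repeatable one gets exactly one attempt.

```java
public class RepeatableCommitDemo {
    // Toy model of the idea: if the committer declares its commit
    // repeatable, the caller may retry commitJob() instead of failing
    // the job outright on the first error.
    interface Committer {
        boolean isCommitJobRepeatable();
        void commitJob() throws Exception;
    }

    static boolean commitWithRetry(Committer c, int maxAttempts) {
        int attempts = c.isCommitJobRepeatable() ? maxAttempts : 1;
        for (int i = 1; i <= attempts; i++) {
            try {
                c.commitJob();
                return true;
            } catch (Exception e) {
                System.out.println("commit attempt " + i + " failed: " + e.getMessage());
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A committer whose first attempt fails but whose commit is repeatable:
        Committer flaky = new Committer() {
            int calls = 0;
            public boolean isCommitJobRepeatable() { return true; }
            public void commitJob() throws Exception {
                if (++calls == 1) throw new Exception("transient failure");
            }
        };
        System.out.println("committed=" + commitWithRetry(flaky, 3));
    }
}
```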
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982317#comment-14982317 ] Junping Du commented on MAPREDUCE-6528: --- Thanks [~brahmareddy]! Can someone commit this patch in? It is quite straight-forward. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6532) CLONE - Create Fake Log from Hadoop
Girmay Desta created MAPREDUCE-6532: --- Summary: CLONE - Create Fake Log from Hadoop Key: MAPREDUCE-6532 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6532 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: contrib/mumak Reporter: Girmay Desta Attachments: FakeLogs.tar.gz Version 1 of Mumak supports simulation of Map-Reduce jobs from the logs generated by an original job run. Our main aim is to run the job even without submitting it. So this task concerns generating a fake log file of a Map-Reduce task, converting it into JSON with Rumen, and running those files in Mumak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6531) CLONE - Mumak: Map-Reduce Simulator
Girmay Desta created MAPREDUCE-6531: --- Summary: CLONE - Mumak: Map-Reduce Simulator Key: MAPREDUCE-6531 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6531 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 0.21.0 Reporter: Girmay Desta Assignee: Hong Tang Fix For: 0.21.0 Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png h3. Vision: We want to build a simulator to simulate large-scale Hadoop clusters, applications, and workloads. This would be invaluable in furthering Hadoop by providing a tool for researchers and developers to prototype features (e.g. pluggable block placement for HDFS, Map-Reduce schedulers, etc.) and predict their behaviour and performance with a reasonable amount of confidence, thereby aiding rapid innovation. h3. First Cut: Simulator for the Map-Reduce Scheduler The Map-Reduce scheduler is a fertile area of interest, with at least four schedulers, each with its own set of features, currently in existence: the Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority Scheduler. Each scheduler's scheduling decisions are driven by many factors, such as fairness, capacity guarantees, resource availability, data locality, etc. Given that, it is non-trivial to pick the right scheduler (or set of features) for a given workload. Hence a simulator which can predict how well a particular scheduler works for a specific workload, by quickly iterating over schedulers and/or scheduler features, would be quite useful. 
So, the first cut is to implement a simulator for the Map-Reduce scheduler which takes as input a job trace derived from a production workload and a cluster definition, and simulates the execution of the jobs as defined in the trace on this virtual cluster. As output, the detailed job execution trace (recorded against virtual simulated time) could then be analyzed to understand various traits of individual schedulers (individual jobs' turnaround time, throughput, fairness, capacity guarantees, etc.). To support this, we would need a simulator which can accurately model the conditions of the actual system that affect a scheduler's decisions. These include very large-scale clusters (thousands of nodes), the detailed characteristics of the workload thrown at the clusters, job or task failures, data locality, and cluster hardware (CPU, memory, disk I/O, network I/O, network topology), etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982241#comment-14982241 ] Brahma Reddy Battula commented on MAPREDUCE-6528: - Agreed. Since it's planned for the 2.6.3 release, the current patch LGTM. +1 (non-binding). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6528) Memory leak for HistoryFileManager.getJobSummary()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982214#comment-14982214 ] Junping Du commented on MAPREDUCE-6528: --- Good point, Vinod! Let's keep the patch as it is, since try-with-resources isn't supported on earlier JDK versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
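For reference, on branches that only need to build against JDK 7 or later, the same fix collapses into try-with-resources. This is a sketch, not the committed patch, and java.io.DataInputStream again stands in for FSDataInputStream to keep it self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class TryWithResourcesDemo {
    // JDK 7+ equivalent of the try/finally fix: the stream is closed
    // automatically on both the normal and the exception path.
    static String readSummary(InputStream raw) throws IOException {
        try (DataInputStream in = new DataInputStream(raw)) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeUTF("summary line");
        System.out.println(readSummary(new ByteArrayInputStream(buf.toByteArray())));
    }
}
```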
[jira] [Updated] (MAPREDUCE-5870) Support for passing Job priority through Application Submission Context in Mapreduce Side
[ https://issues.apache.org/jira/browse/MAPREDUCE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated MAPREDUCE-5870:
---
    Attachment: 0007-MAPREDUCE-5870.patch

Now that MAPREDUCE-6515 is committed, we can get the priority from the AM in JobStatus. Updating the patch with this support.

> Support for passing Job priority through Application Submission Context in Mapreduce Side
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5870
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5870
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-MAPREDUCE-5870.patch, 0002-MAPREDUCE-5870.patch,
>                      0003-MAPREDUCE-5870.patch, 0004-MAPREDUCE-5870.patch,
>                      0005-MAPREDUCE-5870.patch, 0006-MAPREDUCE-5870.patch,
>                      0007-MAPREDUCE-5870.patch, Yarn-2002.1.patch
>
> Job priority can be set from the client side as below [configuration and API]:
> a. JobConf.getJobPriority() and Job.setPriority(JobPriority priority)
> b. We can also use the configuration "mapreduce.job.priority".
> This job priority can then be passed in the ApplicationSubmissionContext from
> the client side. Here we can reuse the MRJobConfig.PRIORITY configuration.
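The mechanism the ticket describes, round-tripping the priority through the "mapreduce.job.priority" key, can be sketched without a cluster. This is an illustration only: `java.util.Properties` stands in for Hadoop's `Configuration`, and the enum is a simplified stand-in for the real `JobPriority`.

```java
import java.util.Properties;

// Sketch: the client stores the job priority under "mapreduce.job.priority"
// (MRJobConfig.PRIORITY); the submitter reads it back when filling in the
// ApplicationSubmissionContext. Properties stands in for Configuration.
public class PriorityConfigSketch {

    // Simplified stand-in for org.apache.hadoop.mapred.JobPriority.
    enum JobPriority { VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW }

    static final String PRIORITY_KEY = "mapreduce.job.priority";

    // Client side: Job.setPriority(...) effectively records the enum name.
    static void setPriority(Properties conf, JobPriority p) {
        conf.setProperty(PRIORITY_KEY, p.name());
    }

    // Submission side: recover the enum, falling back to NORMAL when unset.
    static JobPriority getPriority(Properties conf) {
        String v = conf.getProperty(PRIORITY_KEY);
        return v == null ? JobPriority.NORMAL : JobPriority.valueOf(v);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setPriority(conf, JobPriority.HIGH);
        System.out.println(getPriority(conf)); // HIGH
    }
}
```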
[jira] [Commented] (MAPREDUCE-6530) Jobtracker is slow when more JT UI requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982031#comment-14982031 ] Prabhu Joseph commented on MAPREDUCE-6530:
---
[~kasha] It is MR1 on Hadoop 2.5.1; I entered the wrong version.

> Jobtracker is slow when more JT UI requests
> -------------------------------------------
>
>                 Key: MAPREDUCE-6530
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6530
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Prabhu Joseph
>
> JobTracker becomes slow when a huge number of jobs are running and around 30
> connections are established to the info port to view job status and counters.
> hadoop job -list took 4m22.412s.
> We took jstack traces and found most of the server threads waiting on the
> JobTracker object, while the thread holding the JobTracker lock waits on the
> ResourceBundles class lock:
>
> "retireJobs" prio=10 tid=0x7f2345200800 nid=0x11c1 waiting for monitor entry [0x7f22e3499000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
>         - waiting to lock <0x000197cc6218> (a java.lang.Class for org.apache.hadoop.mapreduce.util.ResourceBundles)
>         at org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
>         at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
>         at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
>         at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
>         at org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
>         at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
>         - locked <0x0007f8411608> (a org.apache.hadoop.mapred.Counters)
>         at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
>         at org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
>         at org.apache.hadoop.mapred.JobTracker$RetireJobs.addToCache(JobTracker.java:657)
>         - locked <0x9644ae08> (a org.apache.hadoop.mapred.JobTracker$RetireJobs)
>         at org.apache.hadoop.mapred.JobTracker$RetireJobs.run(JobTracker.java:769)
>         - locked <0x964c5550> (a org.apache.hadoop.mapred.FairScheduler)
>         - locked <0x9644a9d0> (a java.util.Collections$SynchronizedMap)
>         - locked <0x962ac660> (a org.apache.hadoop.mapred.JobTracker)
>         at java.lang.Thread.run(Thread.java:745)
>
> The ResourceBundles class lock is held most of the time by JT GUI threads
> serving jobtracker_jsp, which call getMapCounters():
>
> "926410165@qtp-1732070199-56" daemon prio=10 tid=0x7f232c4df000 nid=0x27c0 runnable [0x7f22db7bf000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.Throwable.fillInStackTrace(Native Method)
>         at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
>         - locked <0x00061a49ede0> (a java.util.MissingResourceException)
>         at java.lang.Throwable.<init>(Throwable.java:287)
>         at java.lang.Exception.<init>(Exception.java:84)
>         at java.lang.RuntimeException.<init>(RuntimeException.java:80)
>         at java.util.MissingResourceException.<init>(MissingResourceException.java:85)
>         at java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1499)
>         at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1322)
>         at java.util.ResourceBundle.getBundle(ResourceBundle.java:1028)
>         at org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
>         at org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
>         - locked <0x000197cc6218> (a java.lang.Class for org.apache.hadoop.mapreduce.util.ResourceBundles)
>         at org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
>         at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
>         at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
>         at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
>         at org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
>         at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
>         - locked <0x0007ed1024b8> (a org.apache.hadoop.mapred.Counters)
>         at org.apache.hadoop.mapred.
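The traces above show every counter display-name lookup funnelling through the class-level lock in ResourceBundles.getValue(), and a MissingResourceException being constructed on each miss. One common mitigation direction (a sketch of the general technique, not the actual patch for this issue) is to memoize lookup results, including misses, so the expensive synchronized path runs at most once per key:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: cache localized counter names in a ConcurrentHashMap so the
// synchronized ResourceBundle lookup is paid once per key instead of on
// every JT UI request. slowLookup() stands in for the real
// ResourceBundles.getValue() and just upper-cases the key here.
public class CounterNameCache {

    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    // Stand-in for the contended, synchronized bundle lookup.
    private static synchronized String slowLookup(String key) {
        return key.toUpperCase();
    }

    static String getDisplayName(String key) {
        // Only the first caller for a given key enters slowLookup();
        // later callers read the cached value without taking the lock.
        return CACHE.computeIfAbsent(key, CounterNameCache::slowLookup);
    }

    public static void main(String[] args) {
        System.out.println(getDisplayName("map_input_records")); // MAP_INPUT_RECORDS
        System.out.println(getDisplayName("map_input_records")); // served from cache
    }
}
```

Caching the miss result matters as much as caching hits: the RUNNABLE thread above is busy filling in the stack trace of a MissingResourceException that will be thrown again on the very next request.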
[jira] [Updated] (MAPREDUCE-6530) Jobtracker is slow when more JT UI requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated MAPREDUCE-6530:
---
    Affects Version/s:     (was: 1.2.1)
                       2.5.1
[jira] [Updated] (MAPREDUCE-6530) Jobtracker is slow when more JT UI requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated MAPREDUCE-6530:
---
    Target Version/s:     (was: 1.3.0)