[jira] [Updated] (MAPREDUCE-6353) Divide by zero error in MR AM when calculating available containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-6353: Labels: (was: BB2015-05-RFC) Divide by zero error in MR AM when calculating available containers --- Key: MAPREDUCE-6353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6353 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: MAPREDUCE-6353.001.patch, MAPREDUCE-6353.002.patch, MAPREDUCE-6353.002.patch
When running a sleep job with zero CPU vcores, I see the following exception:
2015-04-30 06:41:06,954 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.v2.app.rm.ResourceCalculatorUtils.computeAvailableContainers(ResourceCalculatorUtils.java:38)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:947)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:840)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:247)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:282)
    at java.lang.Thread.run(Thread.java:745)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6353) Divide by zero error in MR AM when calculating available containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-6353: Resolution: Fixed Fix Version/s: 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available)
Thanks Anubhav for reporting and fixing this. Just committed this to trunk and branch-2.
Divide by zero error in MR AM when calculating available containers --- Key: MAPREDUCE-6353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6353 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.8.0 Attachments: MAPREDUCE-6353.001.patch, MAPREDUCE-6353.002.patch, MAPREDUCE-6353.002.patch
When running a sleep job with zero CPU vcores, I see the following exception:
2015-04-30 06:41:06,954 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.v2.app.rm.ResourceCalculatorUtils.computeAvailableContainers(ResourceCalculatorUtils.java:38)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:947)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:840)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:247)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:282)
    at java.lang.Thread.run(Thread.java:745)
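For readers following this issue, one way to avoid the `/ by zero` in the trace above is to guard the division. The sketch below is a minimal, self-contained illustration using plain `int` arguments. The class `ContainerMath` and its methods `ratioOrMax` and `availableContainers` are hypothetical names, not the Hadoop `ResourceCalculatorUtils` API, and treating a zero requirement as "no limit" is an assumption, not a description of the committed patch.

```java
/** Hypothetical stand-in for the container arithmetic; not the Hadoop class. */
final class ContainerMath {
    private ContainerMath() {}

    /** Returns numerator / denominator, treating a zero requirement as "no limit" (assumption). */
    static int ratioOrMax(int numerator, int denominator) {
        if (denominator == 0) {
            return Integer.MAX_VALUE; // a resource that is not requested cannot be the bottleneck
        }
        return numerator / denominator;
    }

    /** Number of containers that fit, limited by the scarcer of memory and vcores. */
    static int availableContainers(int availMemMb, int availVcores,
                                   int reqMemMb, int reqVcores) {
        return Math.min(ratioOrMax(availMemMb, reqMemMb),
                        ratioOrMax(availVcores, reqVcores));
    }

    public static void main(String[] args) {
        // A request with zero vcores no longer throws; memory alone decides the result.
        System.out.println(availableContainers(8192, 4, 1024, 0));
    }
}
```

With a zero-vcore request, the memory ratio alone determines the answer, so the allocator heartbeat in this sketch would keep running instead of failing.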
[jira] [Commented] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536871#comment-14536871 ] Hadoop QA commented on MAPREDUCE-5748: -- \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 31s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 31s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 20s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 36s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | mapreduce tests | 0m 19s | Tests passed in hadoop-mapreduce-client-shuffle. 
| | | | 35m 30s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731742/MAPREDUCE-5748.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / df36ad0 | | hadoop-mapreduce-client-shuffle test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5697/artifact/patchprocess/testrun_hadoop-mapreduce-client-shuffle.txt | | Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5697/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5697/console | This message was automatically generated. Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived() Key: MAPREDUCE-5748 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Labels: BB2015-05-RFC Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch, MAPREDUCE-5748.03.patch Starting around line 510: {code} ChannelFuture lastMap = null; for (String mapId : mapIds) { ... } lastMap.addListener(metrics); lastMap.addListener(ChannelFutureListener.CLOSE); {code} If mapIds is empty, lastMap would remain null, leading to NPE in addListener() call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
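The null-pointer risk described in the issue can be illustrated without Netty. The following is a minimal sketch, assuming a hypothetical `RecordingFuture` type that stands in for `ChannelFuture`; the class `LastMapGuard` and the method `sendAll` are illustrative names only, and the guard shown is one possible approach, not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of guarding a "last result" reference; not ShuffleHandler code. */
final class LastMapGuard {
    private LastMapGuard() {}

    /** Stand-in for Netty's ChannelFuture that just records its listeners. */
    static final class RecordingFuture {
        final List<String> listeners = new ArrayList<>();

        void addListener(String listener) {
            listeners.add(listener);
        }
    }

    /** Returns the future that received the listeners, or null if mapIds was empty. */
    static RecordingFuture sendAll(List<String> mapIds) {
        RecordingFuture lastMap = null;
        for (String mapId : mapIds) {
            lastMap = new RecordingFuture(); // stands in for sending one map output
        }
        if (lastMap == null) {
            return null; // nothing was sent, so there is nothing to attach listeners to
        }
        lastMap.addListener("metrics");
        lastMap.addListener("close");
        return lastMap;
    }
}
```

Calling `LastMapGuard.sendAll` with an empty list returns `null` instead of throwing, while a non-empty list yields a future carrying both listeners.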
[jira] [Commented] (MAPREDUCE-6174) Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537006#comment-14537006 ] Hadoop QA commented on MAPREDUCE-6174: -- \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 30s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 17s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 39s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | mapreduce tests | 1m 37s | Tests passed in hadoop-mapreduce-client-core. 
| | | | 38m 30s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731580/MAPREDUCE-6174.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / a60f78e | | hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5698/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt | | Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5698/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5698/console | This message was automatically generated. Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput. --- Key: MAPREDUCE-6174 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6174 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 3.0.0, 2.6.0 Reporter: Eric Payne Assignee: Eric Payne Labels: BB2015-05-RFC Attachments: MAPREDUCE-6174.002.patch, MAPREDUCE-6174.003.patch, MAPREDUCE-6174.004.patch, MAPREDUCE-6174.v1.txt Per MAPREDUCE-6166, both InMemoryMapOutput and OnDiskMapOutput will be doing similar things with regards to IFile streams. In order to make it explicit that InMemoryMapOutput and OnDiskMapOutput are different from 3rd-party implementations, this JIRA will make them subclass a common class (see https://issues.apache.org/jira/browse/MAPREDUCE-6166?focusedCommentId=14223368page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14223368) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6353) Divide by zero error in MR AM when calculating available containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536920#comment-14536920 ] Hudson commented on MAPREDUCE-6353: --- FAILURE: Integrated in Hadoop-trunk-Commit #7786 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7786/]) MAPREDUCE-6353. Divide by zero error in MR AM when calculating available containers. (Anubhav Dhoot via kasha) (kasha: rev 1773aac780585871072960a5863af461e112a030)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/ResourceCalculatorUtils.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestResourceCalculatorUtils.java
* hadoop-mapreduce-project/CHANGES.txt
Divide by zero error in MR AM when calculating available containers --- Key: MAPREDUCE-6353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6353 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.8.0 Attachments: MAPREDUCE-6353.001.patch, MAPREDUCE-6353.002.patch, MAPREDUCE-6353.002.patch
When running a sleep job with zero CPU vcores, I see the following exception:
2015-04-30 06:41:06,954 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.v2.app.rm.ResourceCalculatorUtils.computeAvailableContainers(ResourceCalculatorUtils.java:38)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:947)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:840)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:247)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:282)
    at java.lang.Thread.run(Thread.java:745)
[jira] [Commented] (MAPREDUCE-6353) Divide by zero error in MR AM when calculating available containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536952#comment-14536952 ] Anubhav Dhoot commented on MAPREDUCE-6353: --
Thanks [~kasha] for the review and commit!
Divide by zero error in MR AM when calculating available containers --- Key: MAPREDUCE-6353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6353 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.8.0 Attachments: MAPREDUCE-6353.001.patch, MAPREDUCE-6353.002.patch, MAPREDUCE-6353.002.patch
When running a sleep job with zero CPU vcores, I see the following exception:
2015-04-30 06:41:06,954 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.v2.app.rm.ResourceCalculatorUtils.computeAvailableContainers(ResourceCalculatorUtils.java:38)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:947)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:840)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:247)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:282)
    at java.lang.Thread.run(Thread.java:745)
[jira] [Commented] (MAPREDUCE-6174) Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536996#comment-14536996 ] Gera Shegalov commented on MAPREDUCE-6174: --
Thanks for 004, [~eepayne]! A few suggestions:
- We can move setting conf and merger from the constructors for OnDisk and InMem to IFileWrappedMapOutput via super.
- We can remove the unused parameters {{reduceId}} and {{mapOutputFile}} from the OnDiskMapOutput constructors.
I am not sure the new test contributes much to code coverage; I think it's not necessary. Instead, in that same TestFetcher we can refer to the special classes by the base class. For example, we can replace declarations:
{code}
-InMemoryMapOutput<Text, Text> immo = mock(InMemoryMapOutput.class);
+IFileWrappedMapOutput<Text, Text> immo = mock(InMemoryMapOutput.class);
{code}
and
{code}
-OnDiskMapOutput<Text,Text> odmo = new OnDiskMapOutput<Text,Text>(map1ID,
-id, mm, 100L, job, mof, fetcher, true, fs, onDiskMapOutputPath);
+IFileWrappedMapOutput<Text,Text> odmo =
+new OnDiskMapOutput<Text,Text>(map1ID, mm, 100L, job, fetcher, true, fs,
+onDiskMapOutputPath);
{code}
throughout TestFetcher.
Combine common stream code into parent class for InMemoryMapOutput and OnDiskMapOutput. --- Key: MAPREDUCE-6174 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6174 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 3.0.0, 2.6.0 Reporter: Eric Payne Assignee: Eric Payne Labels: BB2015-05-RFC Attachments: MAPREDUCE-6174.002.patch, MAPREDUCE-6174.003.patch, MAPREDUCE-6174.004.patch, MAPREDUCE-6174.v1.txt
Per MAPREDUCE-6166, both InMemoryMapOutput and OnDiskMapOutput will be doing similar things with regards to IFile streams.
In order to make it explicit that InMemoryMapOutput and OnDiskMapOutput are different from 3rd-party implementations, this JIRA will make them subclass a common class (see https://issues.apache.org/jira/browse/MAPREDUCE-6166?focusedCommentId=14223368page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14223368) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536320#comment-14536320 ] Varun Saxena commented on MAPREDUCE-5748: -
[~kasha], which test case are you talking about? The added test passes locally. I am not sure why Jenkins picked up the old patch. Resubmitting the patch to kick Jenkins.
Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived() Key: MAPREDUCE-5748 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch
Starting around line 510:
{code}
ChannelFuture lastMap = null;
for (String mapId : mapIds) {
  ...
}
lastMap.addListener(metrics);
lastMap.addListener(ChannelFutureListener.CLOSE);
{code}
If mapIds is empty, lastMap would remain null, leading to an NPE in the addListener() call.
[jira] [Commented] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536338#comment-14536338 ] Hadoop QA commented on MAPREDUCE-5748: -- \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 34s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 20s | The applied patch generated 1 new checkstyle issues (total was 60, now 61). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 8 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 36s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | mapreduce tests | 0m 19s | Tests passed in hadoop-mapreduce-client-shuffle. 
| | | | 35m 34s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731626/MAPREDUCE-5748.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 02a4a22 | | checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5695/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-shuffle.txt | | whitespace | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5695/artifact/patchprocess/whitespace.txt | | hadoop-mapreduce-client-shuffle test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5695/artifact/patchprocess/testrun_hadoop-mapreduce-client-shuffle.txt | | Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5695/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5695/console | This message was automatically generated. Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived() Key: MAPREDUCE-5748 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch Starting around line 510: {code} ChannelFuture lastMap = null; for (String mapId : mapIds) { ... } lastMap.addListener(metrics); lastMap.addListener(ChannelFutureListener.CLOSE); {code} If mapIds is empty, lastMap would remain null, leading to NPE in addListener() call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536391#comment-14536391 ] Hudson commented on MAPREDUCE-4750: --- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/]) MAPREDUCE-4750. Enable NNBenchWithoutMR in MapredTestDriver (Liang Xie and Jason Lowe via raviprak) (raviprak: rev 5aab014340b53ebc9363ee244b2cbea7a4c1f573)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java
Enable NNBenchWithoutMR in MapredTestDriver --- Key: MAPREDUCE-4750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: MAPREDUCE-4750.002.patch, MAPREDUCE-4750.txt
Right now, MapredTestDriver only lets us run nnbench; there is no entry for NNBenchWithoutMR. It would be better to enable it explicitly so that we can benchmark the namenode with fewer influencing factors.
[jira] [Commented] (MAPREDUCE-3383) Duplicate job.getOutputValueGroupingComparator() in ReduceTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536416#comment-14536416 ] Hudson commented on MAPREDUCE-3383: --- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/]) MAPREDUCE-3383. Duplicate job.getOutputValueGroupingComparator() in ReduceTask. Contributed by Binglin Chang (jlowe: rev c39012f4a0444f9e4b7d67957d5192127d143d90) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/ReduceTask.java * hadoop-mapreduce-project/CHANGES.txt Duplicate job.getOutputValueGroupingComparator() in ReduceTask -- Key: MAPREDUCE-3383 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3383 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Binglin Chang Assignee: Binglin Chang Fix For: 2.8.0 Attachments: MAPREDUCE-3383-1.patch, MAPREDUCE-3383.patch, MAPREDUCE-3383.patch This is probably just a small error by mistake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536418#comment-14536418 ] Hudson commented on MAPREDUCE-5248: --- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/]) MAPREDUCE-5248. Let NNBenchWithoutMR specify the replication factor for its test. Contributed by Erik Paulson (jlowe: rev 30099a36c6b0f658d25fb505a9f3ce15d19f7ba6) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/hdfs/NNBenchWithoutMR.java * hadoop-mapreduce-project/CHANGES.txt Let NNBenchWithoutMR specify the replication factor for its test Key: MAPREDUCE-5248 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5248 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, test Affects Versions: 3.0.0 Reporter: Erik Paulson Assignee: Erik Paulson Priority: Minor Fix For: 2.8.0 Attachments: MAPREDUCE-5248.patch, MAPREDUCE-5248.txt Original Estimate: 1h Remaining Estimate: 1h The NNBenchWithoutMR test creates files with a replicationFactorPerFile hard-coded to 1. It'd be nice to be able to specify that on the commandline. Also, it'd be great if MAPREDUCE-4750 was merged along with this fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536448#comment-14536448 ] Hudson commented on MAPREDUCE-2632: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/922/]) MAPREDUCE-2632. Avoid calling the partitioner when the numReduceTasks is 1. (Ravi Teja Ch N V and Sunil G via kasha) (kasha: rev bdbd10fde1539920de937404a785e6ed34dd5628)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Partitioner.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestMapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapFileOutputFormat.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Partitioner.java
Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V Assignee: Sunil G Fix For: 3.0.0 Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch
We can avoid the call to the partitioner when the number of reducers is 1. This will avoid the unnecessary computations by the partitioner.
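The idea in this issue, skipping the partitioner when there is a single reducer, can be sketched in isolation. The code below is a minimal illustration under stated assumptions: `PartitionChoice`, its nested `Partitioner` interface, `CountingHashPartitioner`, and `choose` are hypothetical names that only mimic the shape of a MapReduce partitioner, and the sketch does not claim to reproduce the committed change.

```java
/** Hypothetical sketch of bypassing a partitioner for a single reducer; not Hadoop code. */
final class PartitionChoice {
    private PartitionChoice() {}

    /** Stand-in for a MapReduce partitioner interface. */
    interface Partitioner<K> {
        int getPartition(K key, int numPartitions);
    }

    /** Hash-based partitioner that counts how often it is invoked. */
    static final class CountingHashPartitioner<K> implements Partitioner<K> {
        int calls = 0;

        @Override
        public int getPartition(K key, int numPartitions) {
            calls++;
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    /** Uses the configured partitioner only when there is more than one reducer. */
    static <K> Partitioner<K> choose(Partitioner<K> configured, int numReduceTasks) {
        if (numReduceTasks > 1) {
            return configured;
        }
        // With a single reducer every record belongs to partition 0, so no hashing is needed.
        return (key, numPartitions) -> 0;
    }

    public static void main(String[] args) {
        CountingHashPartitioner<String> hash = new CountingHashPartitioner<>();
        Partitioner<String> p = choose(hash, 1);
        System.out.println(p.getPartition("some-key", 1) + " " + hash.calls);
    }
}
```

In the single-reducer case the configured partitioner is never invoked, which is the saving the issue description refers to.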
[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG
[ https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536443#comment-14536443 ] Hudson commented on MAPREDUCE-5981: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/922/]) MAPREDUCE-5981. Log levels of certain MR logs can be changed to DEBUG. (devaraj: rev dc2b2ae31f2eb6dae324c2e14ed7660ce605a89b)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
Log levels of certain MR logs can be changed to DEBUG - Key: MAPREDUCE-5981 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.8.0 Attachments: MAPREDUCE-5981.02.patch, MAPREDUCE-5981.patch
Following map reduce logs can be changed to DEBUG log level as they appear too many times in the log file and are not that important for debugging.
1. In org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost(Fetcher.java : 313), the second log is not required to be at info level. This can be moved to debug as a warn log is anyways printed if verifyReply fails.
SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey);
LOG.info("for url="+msgToEncode+" sent hash and received reply");
2. Thread related info need not be printed in logs at INFO level. Below 2 logs can be moved to DEBUG
a) In org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost(ShuffleSchedulerImpl.java : 381), below log can be changed to DEBUG
LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() + " to " + Thread.currentThread().getName());
b) In org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost(ShuffleSchedulerImpl.java : 411), below log can be changed to DEBUG
LOG.info("assigned " + includedMaps + " of " + totalSize + " to " + host + " to " + Thread.currentThread().getName());
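As an illustration of the kind of change proposed here, the sketch below demotes a frequent message from INFO to a debug-style level and guards the string concatenation. It is a sketch under assumptions: it uses `java.util.logging` (where `FINE` is roughly the DEBUG equivalent) so that it runs without extra libraries, which is a substitution for whatever logging API the Hadoop classes use, and `ShuffleLogDemo`, `assignMessage`, and `logAssignment` are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of demoting a chatty log line; uses java.util.logging as a stand-in. */
final class ShuffleLogDemo {
    private static final Logger LOG = Logger.getLogger(ShuffleLogDemo.class.getName());

    private ShuffleLogDemo() {}

    /** Builds the message text; kept separate so the guard can skip it entirely. */
    static String assignMessage(String host, int knownOutputs, String threadName) {
        return "Assigning " + host + " with " + knownOutputs + " to " + threadName;
    }

    /** Logs at FINE instead of INFO; returns true only if the message was actually built. */
    static boolean logAssignment(String host, int knownOutputs) {
        if (!LOG.isLoggable(Level.FINE)) {
            return false; // debug logging is off, so no string concatenation happens
        }
        LOG.fine(assignMessage(host, knownOutputs, Thread.currentThread().getName()));
        return true;
    }
}
```

With the logger at its usual INFO level the message is neither built nor written, which addresses the concern that these lines appear too many times in the log file.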
[jira] [Commented] (MAPREDUCE-3383) Duplicate job.getOutputValueGroupingComparator() in ReduceTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536468#comment-14536468 ] Hudson commented on MAPREDUCE-3383: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/922/]) MAPREDUCE-3383. Duplicate job.getOutputValueGroupingComparator() in ReduceTask. Contributed by Binglin Chang (jlowe: rev c39012f4a0444f9e4b7d67957d5192127d143d90) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/ReduceTask.java Duplicate job.getOutputValueGroupingComparator() in ReduceTask -- Key: MAPREDUCE-3383 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3383 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Binglin Chang Assignee: Binglin Chang Fix For: 2.8.0 Attachments: MAPREDUCE-3383-1.patch, MAPREDUCE-3383.patch, MAPREDUCE-3383.patch This is probably just a small error by mistake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536440#comment-14536440 ] Hudson commented on MAPREDUCE-4750: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/922/]) MAPREDUCE-4750. Enable NNBenchWithoutMR in MapredTestDriver (Liang Xie and Jason Lowe via raviprak) (raviprak: rev 5aab014340b53ebc9363ee244b2cbea7a4c1f573)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java
* hadoop-mapreduce-project/CHANGES.txt
Enable NNBenchWithoutMR in MapredTestDriver --- Key: MAPREDUCE-4750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: MAPREDUCE-4750.002.patch, MAPREDUCE-4750.txt
Right now, MapredTestDriver only lets us run nnbench; there is no entry for NNBenchWithoutMR. It would be better to enable it explicitly so that we can benchmark the namenode with fewer influencing factors.
[jira] [Commented] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536470#comment-14536470 ] Hudson commented on MAPREDUCE-5248: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/922/]) MAPREDUCE-5248. Let NNBenchWithoutMR specify the replication factor for its test. Contributed by Erik Paulson (jlowe: rev 30099a36c6b0f658d25fb505a9f3ce15d19f7ba6) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/hdfs/NNBenchWithoutMR.java * hadoop-mapreduce-project/CHANGES.txt Let NNBenchWithoutMR specify the replication factor for its test Key: MAPREDUCE-5248 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5248 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, test Affects Versions: 3.0.0 Reporter: Erik Paulson Assignee: Erik Paulson Priority: Minor Fix For: 2.8.0 Attachments: MAPREDUCE-5248.patch, MAPREDUCE-5248.txt Original Estimate: 1h Remaining Estimate: 1h The NNBenchWithoutMR test creates files with a replicationFactorPerFile hard-coded to 1. It'd be nice to be able to specify that on the commandline. Also, it'd be great if MAPREDUCE-4750 was merged along with this fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-2094) LineRecordReader should not seek into non-splittable, compressed streams.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536444#comment-14536444 ]

Hudson commented on MAPREDUCE-2094:

SUCCESS: Integrated in Hadoop-Yarn-trunk #922 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/922/])
MAPREDUCE-2094. LineRecordReader should not seek into non-splittable, compressed streams. (cdouglas: rev 2edcf931d7843cddcf3da5666a73d6ee9a10d00d)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/TestSafeguardSplittingUnsplittableFiles.txt.gz
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/pom.xml
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestLineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestLineRecordReader.java

LineRecordReader should not seek into non-splittable, compressed streams.

Key: MAPREDUCE-2094
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2094
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task
Reporter: Niels Basjes
Assignee: Niels Basjes
Fix For: 2.8.0
Attachments: M2094-1.patch, M2094.patch, MAPREDUCE-2094-2011-05-19.patch, MAPREDUCE-2094-20140727-svn-fixed-spaces.patch, MAPREDUCE-2094-20140727-svn.patch, MAPREDUCE-2094-20140727.patch, MAPREDUCE-2094-2015-05-05-2328.patch, MAPREDUCE-2094-FileInputFormat-docs-v2.patch

When implementing a custom derivative of FileInputFormat we ran into the effect that a large Gzipped input file would be processed several times. A file of nearly 1 GiB would be processed around 36 times in its entirety, producing garbage results and taking up a lot more CPU time than needed.

It took a while to figure out, and what we found is that the default implementation of the isSplitable method in [org.apache.hadoop.mapreduce.lib.input.FileInputFormat | http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java?view=markup ] is simply "return true;". This is a very unsafe default and contradicts the JavaDoc of the method, which states: "Is the given filename splitable? Usually, true, but if the file is stream compressed, it will not be." The actual implementation effectively does: "Is the given filename splitable? Always true, even if the file is stream compressed using an unsplittable compression codec."

For our situation (where we always have Gzipped input) we took the easy way out and simply implemented an isSplitable in our class that does "return false;".

Now there are essentially 3 ways I can think of for fixing this (in order of what I would find preferable):
# Implement something that looks at the used compression of the file (i.e. migrate the implementation from TextInputFormat to FileInputFormat). This would make the method do what the JavaDoc describes.
# Force developers to think about it and make this method abstract.
# Use a safe default (i.e. return false).
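The first option in the list above (decide splittability from the compression in use) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the committed patch: the class name SplitCheckSketch and the suffix table are hypothetical, and a real implementation would consult Hadoop's codec lookup rather than file suffixes. The sketch treats gzip as unsplittable and bzip2 as splittable, and treats a file with no recognised codec as uncompressed.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, self-contained sketch; it does not use the Hadoop API.
class SplitCheckSketch {
    // Assumption for illustration: suffix -> whether that codec supports splitting.
    private static final Map<String, Boolean> CODEC_SPLITTABLE = new HashMap<>();
    static {
        CODEC_SPLITTABLE.put(".gz", false);  // gzip streams cannot be entered mid-file
        CODEC_SPLITTABLE.put(".bz2", true);  // bzip2 is block oriented
    }

    /** Returns false only when the file uses a codec known to be unsplittable. */
    static boolean isSplitable(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Boolean> entry : CODEC_SPLITTABLE.entrySet()) {
            if (lower.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return true; // no codec matched: treat the file as uncompressed
    }
}
```

With a check like this, a Gzipped input is handed to a single reader instead of being split, which is the behaviour the JavaDoc quoted above describes.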
[jira] [Updated] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated MAPREDUCE-5748:

Status: Patch Available (was: Open)

Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()

Key: MAPREDUCE-5748
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch

Starting around line 510:
{code}
ChannelFuture lastMap = null;
for (String mapId : mapIds) {
  ...
}
lastMap.addListener(metrics);
lastMap.addListener(ChannelFutureListener.CLOSE);
{code}
If mapIds is empty, lastMap would remain null, leading to an NPE in the addListener() call.
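One possible guard for the pattern described above is sketched here. It is an assumption about how such a fix could look, not the content of the attached patches: PendingWrite is a hypothetical stand-in for Netty's ChannelFuture, and the listeners are plain strings so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained sketch; PendingWrite stands in for ChannelFuture.
class NullGuardSketch {
    static class PendingWrite {
        final List<String> listeners = new ArrayList<>();

        void addListener(String name) {
            listeners.add(name);
        }
    }

    /** Returns the last write issued, or null when mapIds is empty. */
    static PendingWrite sendAll(List<String> mapIds) {
        PendingWrite lastMap = null;
        for (String mapId : mapIds) {
            lastMap = new PendingWrite(); // one write per requested map output
        }
        // Guard: attach listeners only if at least one write was issued.
        if (lastMap != null) {
            lastMap.addListener("metrics");
            lastMap.addListener("close");
        }
        return lastMap;
    }
}
```

An alternative would be to reject an empty mapIds list before the loop; which of the two is preferable depends on what the handler should reply in that case.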
[jira] [Commented] (MAPREDUCE-2094) LineRecordReader should not seek into non-splittable, compressed streams.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536395#comment-14536395 ] Hudson commented on MAPREDUCE-2094: --- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/]) MAPREDUCE-2094. LineRecordReader should not seek into non-splittable, compressed streams. (cdouglas: rev 2edcf931d7843cddcf3da5666a73d6ee9a10d00d) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/TestSafeguardSplittingUnsplittableFiles.txt.gz * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/pom.xml * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestLineRecordReader.java LineRecordReader should not seek into non-splittable, compressed streams. 
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536399#comment-14536399 ]

Hudson commented on MAPREDUCE-2632:

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #191 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/191/])
MAPREDUCE-2632. Avoid calling the partitioner when the numReduceTasks is 1. (Ravi Teja Ch N V and Sunil G via kasha) (kasha: rev bdbd10fde1539920de937404a785e6ed34dd5628)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Partitioner.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Partitioner.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestMapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapFileOutputFormat.java

Avoid calling the partitioner when the numReduceTasks is 1.

Key: MAPREDUCE-2632
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Ravi Teja Ch N V
Assignee: Sunil G
Fix For: 3.0.0
Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch

We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
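The idea in MAPREDUCE-2632 can be illustrated with a small sketch. The names KeyPartitioner and SingleReducerSketch are hypothetical and are not the Hadoop Partitioner API; the point is only that with a single reducer every record belongs to partition 0, so the partitioner need not be consulted.

```java
// Hypothetical stand-in for a partitioner; not the Hadoop Partitioner class.
interface KeyPartitioner<K> {
    int getPartition(K key, int numPartitions);
}

class SingleReducerSketch {
    /** A simple hash partitioner, used here only as an example of real work. */
    static int hashPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Skips the partitioner entirely when there is exactly one reducer. */
    static <K> int partitionFor(K key, int numReduceTasks, KeyPartitioner<K> partitioner) {
        if (numReduceTasks == 1) {
            return 0; // the only possible partition
        }
        return partitioner.getPartition(key, numReduceTasks);
    }
}
```

A caller with an expensive custom partitioner benefits most, since the shortcut removes one call per emitted record.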
[jira] [Commented] (MAPREDUCE-5876) SequenceFileRecordReader NPE if close() is called before initialize()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536282#comment-14536282 ]

Hadoop QA commented on MAPREDUCE-5876:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 20s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 56s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests | 1m 36s | Tests passed in hadoop-mapreduce-client-core. |
| {color:green}+1{color} | mapreduce tests | 107m 29s | Tests passed in hadoop-mapreduce-client-jobclient. |
| | | 146m 36s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12643448/MAPREDUCE-5876.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6471d18 |
| whitespace | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5693/artifact/patchprocess/whitespace.txt |
| hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5693/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt |
| hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5693/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5693/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5693/console |

This message was automatically generated.

SequenceFileRecordReader NPE if close() is called before initialize()

Key: MAPREDUCE-5876
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5876
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 2.3.0, 2.4.0
Reporter: Reinis Vicups
Assignee: Tsuyoshi Ozawa
Labels: BB2015-05-RFC
Attachments: MAPREDUCE-5876.1.patch

org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader extends org.apache.hadoop.mapreduce.RecordReader, which in turn implements java.io.Closeable. According to the Java spec, java.io.Closeable#close() has to be idempotent (http://docs.oracle.com/javase/7/docs/api/java/io/Closeable.html), which it is not here. An NPE is thrown if close() is invoked without previously calling initialize(). This happens because the SequenceFile.Reader field "in" is null.
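A minimal sketch of a close() that tolerates being called before initialize(), and more than once, is given below. It is an assumption about the general shape of such a fix, not the content of MAPREDUCE-5876.1.patch; LazyReaderSketch and its closeCalls counter are hypothetical names added so that the behaviour can be observed.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical, self-contained sketch; "in" mirrors the field named in the report.
class LazyReaderSketch implements Closeable {
    private Closeable in;   // stays null until initialize() has run
    int closeCalls = 0;     // counts how often the underlying reader was closed

    void initialize(Closeable source) {
        this.in = source;
    }

    @Override
    public void close() throws IOException {
        if (in == null) {
            return; // never initialized or already closed: nothing to do
        }
        try {
            in.close();
            closeCalls++;
        } finally {
            in = null; // makes any further close() a no-op
        }
    }
}
```

Clearing the field in a finally block keeps a second close() harmless even when the first attempt throws.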
[jira] [Commented] (MAPREDUCE-6359) RM HA setup, Cluster tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536709#comment-14536709 ]

Hudson commented on MAPREDUCE-6359:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/])
MAPREDUCE-6359. In RM HA setup, Cluster tab links populated with AM hostname instead of RM. Contributed by zhaoyunjiong. (junping_du: rev df36ad0a08261b03c250b6f745b27e5f83e4286e)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java
* hadoop-mapreduce-project/CHANGES.txt

RM HA setup, Cluster tab links populated with AM hostname instead of RM

Key: MAPREDUCE-6359
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6359
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.4.0
Environment: centOS-6.x
Reporter: Aroop Maliakkal
Assignee: zhaoyunjiong
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3423.patch

In an RM HA setup (e.g. http://rm-1.vip.abc.com:50030/proxy/application_1427789305393_0002/ ), go to the job details and click on the Cluster tab at the top left. Click on any of the links: About, Applications, Scheduler. You can see that the hyperlink points to the AM host ( http://am-1.vip.abc.com:port/cluster ). The port details for secure and unsecure clusters are given below:
8088 ( DEFAULT_RM_WEBAPP_PORT = 8088 )
8090 ( DEFAULT_RM_WEBAPP_HTTPS_PORT = 8090 )
Ideally, the links should point to the resourcemanager hostname instead of the AM hostname.
[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG
[ https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536697#comment-14536697 ]

Hudson commented on MAPREDUCE-5981:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/])
MAPREDUCE-5981. Log levels of certain MR logs can be changed to DEBUG. (devaraj: rev dc2b2ae31f2eb6dae324c2e14ed7660ce605a89b)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* hadoop-mapreduce-project/CHANGES.txt

Log levels of certain MR logs can be changed to DEBUG

Key: MAPREDUCE-5981
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Fix For: 2.8.0
Attachments: MAPREDUCE-5981.02.patch, MAPREDUCE-5981.patch

The following MapReduce logs can be changed to DEBUG level, as they appear too many times in the log file and are not that important for debugging.

1. In org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost(Fetcher.java : 313), the second log is not required to be at INFO level. It can be moved to DEBUG, as a WARN log is printed anyway if verifyReply fails.
SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey);
LOG.info("for url="+msgToEncode+" sent hash and received reply");

2. Thread-related info need not be printed at INFO level. The 2 logs below can be moved to DEBUG.
a) In org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost(ShuffleSchedulerImpl.java : 381), the log below can be changed to DEBUG:
LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() + " to " + Thread.currentThread().getName());
b) In org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost(ShuffleSchedulerImpl.java : 411), the log below can be changed to DEBUG:
LOG.info("assigned " + includedMaps + " of " + totalSize + " to " + host + " to " + Thread.currentThread().getName());
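The kind of change proposed in MAPREDUCE-5981 can be sketched as below. This example uses java.util.logging so that it runs without extra libraries, with Level.FINE playing the role of DEBUG; that substitution, and the names ShuffleLogSketch, assignMessage and logAssignment, are assumptions for illustration and do not reflect the logging API used in the Hadoop sources.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical, self-contained sketch; Level.FINE stands in for DEBUG.
class ShuffleLogSketch {
    static final Logger LOG = Logger.getLogger(ShuffleLogSketch.class.getName());

    /** Builds the per-assignment message quoted in the issue. */
    static String assignMessage(String host, int knownOutputs, String thread) {
        return "Assigning " + host + " with " + knownOutputs + " to " + thread;
    }

    /** Logs at FINE; returns whether the message was built and handed to the logger. */
    static boolean logAssignment(String host, int knownOutputs) {
        if (!LOG.isLoggable(Level.FINE)) {
            return false; // skip the string concatenation when the level is disabled
        }
        LOG.fine(assignMessage(host, knownOutputs, Thread.currentThread().getName()));
        return true;
    }
}
```

Checking the level first means the frequent per-host message costs almost nothing at the default level, while remaining available when debugging the shuffle.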
[jira] [Commented] (MAPREDUCE-2094) LineRecordReader should not seek into non-splittable, compressed streams.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536698#comment-14536698 ] Hudson commented on MAPREDUCE-2094: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/]) MAPREDUCE-2094. LineRecordReader should not seek into non-splittable, compressed streams. (cdouglas: rev 2edcf931d7843cddcf3da5666a73d6ee9a10d00d) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/pom.xml * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/TestSafeguardSplittingUnsplittableFiles.txt.gz * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java LineRecordReader should not seek into non-splittable, compressed streams. 
[jira] [Commented] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536719#comment-14536719 ] Hudson commented on MAPREDUCE-5248: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/]) MAPREDUCE-5248. Let NNBenchWithoutMR specify the replication factor for its test. Contributed by Erik Paulson (jlowe: rev 30099a36c6b0f658d25fb505a9f3ce15d19f7ba6) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/hdfs/NNBenchWithoutMR.java * hadoop-mapreduce-project/CHANGES.txt Let NNBenchWithoutMR specify the replication factor for its test Key: MAPREDUCE-5248 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5248 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, test Affects Versions: 3.0.0 Reporter: Erik Paulson Assignee: Erik Paulson Priority: Minor Fix For: 2.8.0 Attachments: MAPREDUCE-5248.patch, MAPREDUCE-5248.txt Original Estimate: 1h Remaining Estimate: 1h The NNBenchWithoutMR test creates files with a replicationFactorPerFile hard-coded to 1. It'd be nice to be able to specify that on the commandline. Also, it'd be great if MAPREDUCE-4750 was merged along with this fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-3383) Duplicate job.getOutputValueGroupingComparator() in ReduceTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536717#comment-14536717 ] Hudson commented on MAPREDUCE-3383: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/]) MAPREDUCE-3383. Duplicate job.getOutputValueGroupingComparator() in ReduceTask. Contributed by Binglin Chang (jlowe: rev c39012f4a0444f9e4b7d67957d5192127d143d90) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/ReduceTask.java * hadoop-mapreduce-project/CHANGES.txt Duplicate job.getOutputValueGroupingComparator() in ReduceTask -- Key: MAPREDUCE-3383 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3383 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Binglin Chang Assignee: Binglin Chang Fix For: 2.8.0 Attachments: MAPREDUCE-3383-1.patch, MAPREDUCE-3383.patch, MAPREDUCE-3383.patch This is probably just a small error by mistake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536701#comment-14536701 ] Hudson commented on MAPREDUCE-2632: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/]) MAPREDUCE-2632. Avoid calling the partitioner when the numReduceTasks is 1. (Ravi Teja Ch N V and Sunil G via kasha) (kasha: rev bdbd10fde1539920de937404a785e6ed34dd5628) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Partitioner.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestMapFileOutputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapFileOutputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Partitioner.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapFileOutputFormat.java * hadoop-mapreduce-project/CHANGES.txt Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V Assignee: Sunil G Fix For: 3.0.0 Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch We can avoid the call to the partitioner when the number of reducers is 1.This will avoid the unnecessary computations by the partitioner. 
[jira] [Commented] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536694#comment-14536694 ] Hudson commented on MAPREDUCE-4750: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2138 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2138/]) MAPREDUCE-4750. Enable NNBenchWithoutMR in MapredTestDriver (Liang Xie and Jason Lowe via raviprak) (raviprak: rev 5aab014340b53ebc9363ee244b2cbea7a4c1f573) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java Enable NNBenchWithoutMR in MapredTestDriver --- Key: MAPREDUCE-4750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: MAPREDUCE-4750.002.patch, MAPREDUCE-4750.txt Right now, we could run nnbench from MapredTestDriver only, there's no entry for NNBenchWithoutMR, it would be better enable it explicitly, such that we can do namenode benchmark with less influence factors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6353) Divide by zero error in MR AM when calculating available containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anubhav Dhoot updated MAPREDUCE-6353:

Attachment: MAPREDUCE-6353.002.patch

Kick jenkins

Divide by zero error in MR AM when calculating available containers

Key: MAPREDUCE-6353
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6353
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Labels: BB2015-05-RFC
Attachments: MAPREDUCE-6353.001.patch, MAPREDUCE-6353.002.patch, MAPREDUCE-6353.002.patch

When running a sleep job with zero CPU vcores I see the following exception:

2015-04-30 06:41:06,954 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.v2.app.rm.ResourceCalculatorUtils.computeAvailableContainers(ResourceCalculatorUtils.java:38)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:947)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:840)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:247)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:282)
    at java.lang.Thread.run(Thread.java:745)
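The failure above comes from an integer division whose divisor is the number of vcores a task requires. One way to guard it is sketched below; this is an illustration under stated assumptions and is not claimed to match the attached patches. ContainerMathSketch, ratioOrMax and availableContainers are hypothetical names, and plain ints replace the Resource type. The assumption is that a requirement of zero in some dimension means that dimension does not limit the container count.

```java
// Hypothetical, self-contained sketch; plain ints stand in for YARN resources.
class ContainerMathSketch {
    /** Integer ratio, or Integer.MAX_VALUE when nothing is required in this dimension. */
    static int ratioOrMax(int available, int required) {
        if (required == 0) {
            return Integer.MAX_VALUE; // a zero requirement cannot be the bottleneck
        }
        return available / required;
    }

    /** Number of containers that fit, limited by the scarcer of memory and vcores. */
    static int availableContainers(int availMemMb, int availVcores, int reqMemMb, int reqVcores) {
        return Math.min(ratioOrMax(availMemMb, reqMemMb), ratioOrMax(availVcores, reqVcores));
    }
}
```

For a job requesting 1024 MB and zero vcores on a node with 4096 MB free, memory alone then decides the result, and no ArithmeticException is raised.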
[jira] [Commented] (MAPREDUCE-6353) Divide by zero error in MR AM when calculating available containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536753#comment-14536753 ]

Hadoop QA commented on MAPREDUCE-6353:
--------------------------------------

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 45s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 16s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 56s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests | 9m 35s | Tests passed in hadoop-mapreduce-client-app. |
| | | 45m 20s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12731716/MAPREDUCE-6353.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / df36ad0 |
| hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5696/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5696/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5696/console |

This message was automatically generated.

Divide by zero error in MR AM when calculating available containers
-------------------------------------------------------------------

Key: MAPREDUCE-6353
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6353
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Labels: BB2015-05-RFC
Attachments: MAPREDUCE-6353.001.patch, MAPREDUCE-6353.002.patch, MAPREDUCE-6353.002.patch

When running a sleep job with zero CPU vcores, I see the following exception:

2015-04-30 06:41:06,954 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapreduce.v2.app.rm.ResourceCalculatorUtils.computeAvailableContainers(ResourceCalculatorUtils.java:38)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:947)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:840)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:247)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:282)
    at java.lang.Thread.run(Thread.java:745)
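The exception in MAPREDUCE-6353 is an integer division by zero, raised when the requested vcore count is zero. As a rough sketch of one way to guard such a computation (not the committed patch; the class and method names are hypothetical, and plain ints stand in for YARN's resource type), a zero requirement can be treated as imposing no limit:

```java
/** Hypothetical stand-in for illustration; not Hadoop's ResourceCalculatorUtils. */
final class ContainerMath {
    private ContainerMath() {}

    /** Divides only when the requirement is positive; a zero requirement imposes no limit. */
    static int ratioOrUnbounded(int available, int required) {
        return required > 0 ? available / required : Integer.MAX_VALUE;
    }

    /** Number of containers that fit when both memory and vcores are considered. */
    static int availableContainers(int availMemMb, int availVcores,
                                   int reqMemMb, int reqVcores) {
        return Math.min(ratioOrUnbounded(availMemMb, reqMemMb),
                        ratioOrUnbounded(availVcores, reqVcores));
    }

    public static void main(String[] args) {
        // A zero-vcore request falls back to the memory limit instead of throwing.
        System.out.println(availableContainers(8192, 4, 1024, 0)); // prints 8
    }
}
```

Whether "unbounded" is the right answer for a zero requirement is a design choice made here for the sketch; the attached patches are the authority on what was actually committed.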
[jira] [Commented] (MAPREDUCE-6269) improve JobConf to add option to not reference same credentials between jobs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536771#comment-14536771 ]

zhihai xu commented on MAPREDUCE-6269:
--------------------------------------

The checkstyle issue below is that the file JobConf.java is too long. It was already over 2,000 lines before my change.
{code}
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobConf.java:1: File length is 2,164 lines (max allowed is 2,000).
{code}

improve JobConf to add option to not reference same credentials between jobs.
-----------------------------------------------------------------------------

Key: MAPREDUCE-6269
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6269
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: client
Reporter: zhihai xu
Assignee: zhihai xu
Labels: BB2015-05-RFC
Attachments: MAPREDUCE-6269.000.patch, MAPREDUCE-6269.001.patch

Improve JobConf by adding a constructor that avoids sharing Credentials between jobs. By default the Credentials will still be shared, to preserve backward compatibility. We can add a new constructor with a new parameter that decides whether to share Credentials.
Some issues reported in Cascading are due to corrupted credentials; see
https://github.com/Cascading/cascading/commit/45b33bb864172486ac43782a4d13329312d01c0e
If we add this support in JobConf, it will benefit all job clients.
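The constructor idea proposed in MAPREDUCE-6269 can be illustrated with toy classes. Everything below is hypothetical (`ToyCredentials` and `ToyJobConf` are not Hadoop types, and the real patch may differ); the sketch assumes that "not sharing" means taking a private copy of the credentials:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a credentials holder; hypothetical, not Hadoop's Credentials. */
final class ToyCredentials {
    final Map<String, String> tokens = new HashMap<>();

    ToyCredentials() {}

    /** Copy constructor: the new object has its own token map. */
    ToyCredentials(ToyCredentials other) {
        tokens.putAll(other.tokens);
    }
}

/** Toy stand-in for a job configuration; hypothetical, not Hadoop's JobConf. */
final class ToyJobConf {
    final ToyCredentials credentials;

    ToyJobConf() {
        credentials = new ToyCredentials();
    }

    /** Existing behaviour: copies of a configuration share one credentials object. */
    ToyJobConf(ToyJobConf other) {
        this(other, true);
    }

    /** New option: the caller decides whether the credentials object is shared. */
    ToyJobConf(ToyJobConf other, boolean shareCredentials) {
        credentials = shareCredentials
                ? other.credentials
                : new ToyCredentials(other.credentials);
    }
}
```

Keeping the one-argument constructor delegating with `true` is what preserves the old default; only callers that opt in get isolated credentials.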
[jira] [Updated] (MAPREDUCE-6359) RM HA setup, Cluster tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6359:
----------------------------------
Labels: (was: BB2015-05-TBR)

RM HA setup, Cluster tab links populated with AM hostname instead of RM
-----------------------------------------------------------------------

Key: MAPREDUCE-6359
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6359
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.4.0
Environment: centOS-6.x
Reporter: Aroop Maliakkal
Assignee: zhaoyunjiong
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3423.patch

In an RM HA setup (e.g. http://rm-1.vip.abc.com:50030/proxy/application_1427789305393_0002/), go to the job details and click on the Cluster tab at the top left.
Click on any of the links: About, Applications, Scheduler.
You can see that the hyperlink points to http://am-1.vip.abc.com:port/cluster.
The ports for unsecured and secure clusters are:
8088 ( DEFAULT_RM_WEBAPP_PORT = 8088 )
8090 ( DEFAULT_RM_WEBAPP_HTTPS_PORT = 8090 )
Ideally, it should point to the ResourceManager hostname instead of the AM hostname.
[jira] [Commented] (MAPREDUCE-6359) RM HA setup, Cluster tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536500#comment-14536500 ]

Hudson commented on MAPREDUCE-6359:
-----------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7784 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7784/])
MAPREDUCE-6359. In RM HA setup, Cluster tab links populated with AM hostname instead of RM. Contributed by zhaoyunjiong. (junping_du: rev df36ad0a08261b03c250b6f745b27e5f83e4286e)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java
[jira] [Resolved] (MAPREDUCE-3642) Remove hardcoded strings from the JC#displayTasks() call.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kengo Seki resolved MAPREDUCE-3642.
----------------------------------
Resolution: Fixed
Target Version/s: (was: )

Closing because already fixed. Please reopen if I am wrong.

Remove hardcoded strings from the JC#displayTasks() call.
---------------------------------------------------------

Key: MAPREDUCE-3642
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3642
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: client
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Labels: newbie
Attachments: MR-3642.1.patch

This is to address Eli's comments on the parent task:
bq. 1. The error messages should generate the lists of valid states and types from their definitions rather than hard-coding them into the error messages.
bq. 2. Aren't these types and states defined somewhere already? Seems like they're a public API and therefore shouldn't have to duplicate the definition of them in taskTypes and taskStates.
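The first review comment in MAPREDUCE-3642, building the list of valid values from their definition instead of hard-coding it in the message, could look roughly like the sketch below. The `TaskKind` enum and the message wording are hypothetical stand-ins, not the identifiers used in the actual client code:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class TaskFilterMessages {
    private TaskFilterMessages() {}

    /** Hypothetical enum standing in for wherever the valid task types are defined. */
    enum TaskKind { MAP, REDUCE }

    /** Joins the names of the given enum constants, so the list tracks the definition. */
    static String validValues(Enum<?>[] values) {
        return Arrays.stream(values)
                .map(e -> e.name())
                .collect(Collectors.joining(", "));
    }

    /** Builds an error message whose list of valid types is never hard-coded. */
    static String invalidTypeMessage(String given) {
        return "Invalid task-type: " + given
                + ". Valid types are: " + validValues(TaskKind.values());
    }
}
```

Adding a constant to the enum then updates every error message that lists the valid values, which is the point of the review comment.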
[jira] [Commented] (MAPREDUCE-2094) LineRecordReader should not seek into non-splittable, compressed streams.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536539#comment-14536539 ]

Hudson commented on MAPREDUCE-2094:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/])
MAPREDUCE-2094. LineRecordReader should not seek into non-splittable, compressed streams. (cdouglas: rev 2edcf931d7843cddcf3da5666a73d6ee9a10d00d)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/TestSafeguardSplittingUnsplittableFiles.txt.gz
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestLineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestLineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/pom.xml
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java

LineRecordReader should not seek into non-splittable, compressed streams.
-------------------------------------------------------------------------

Key: MAPREDUCE-2094
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2094
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task
Reporter: Niels Basjes
Assignee: Niels Basjes
Fix For: 2.8.0
Attachments: M2094-1.patch, M2094.patch, MAPREDUCE-2094-2011-05-19.patch, MAPREDUCE-2094-20140727-svn-fixed-spaces.patch, MAPREDUCE-2094-20140727-svn.patch, MAPREDUCE-2094-20140727.patch, MAPREDUCE-2094-2015-05-05-2328.patch, MAPREDUCE-2094-FileInputFormat-docs-v2.patch

When implementing a custom derivative of FileInputFormat we ran into the effect that a large Gzipped input file would be processed several times.
A nearly 1 GiB file would be processed around 36 times in its entirety, producing garbage results and taking up a lot more CPU time than needed.
It took a while to figure out, and what we found is that the default implementation of the isSplitable method in [org.apache.hadoop.mapreduce.lib.input.FileInputFormat | http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java?view=markup ] is simply "return true;".
This is a very unsafe default and contradicts the JavaDoc of the method, which states: "Is the given filename splitable? Usually, true, but if the file is stream compressed, it will not be."
The actual implementation effectively does: "Is the given filename splitable? Always true, even if the file is stream compressed using an unsplittable compression codec."
For our situation (where we always have Gzipped input) we took the easy way out and simply implemented an isSplitable in our class that does "return false;".
Now there are essentially 3 ways I can think of for fixing this (in order of what I would find preferable):
# Implement something that looks at the compression used for the file (i.e. migrate the implementation from TextInputFormat to FileInputFormat). This would make the method do what the JavaDoc describes.
# Force developers to think about it and make this method abstract.
# Use a safe default (i.e. return false)
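The first option in the MAPREDUCE-2094 description, deciding splittability from the compression of the file, can be sketched without Hadoop on the classpath. In Hadoop itself this decision would go through the compression codec lookup; the version below swaps that for a hypothetical extension table, and the class name and the table contents are assumptions made purely for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class SplitCheck {
    private SplitCheck() {}

    // Assumption for illustration: extensions of stream-compressed formats
    // that cannot be read from an arbitrary offset.
    private static final Set<String> UNSPLITTABLE_EXTENSIONS =
            new HashSet<>(Arrays.asList(".gz", ".deflate"));

    /** Returns false for files whose extension marks an unsplittable stream codec. */
    static boolean isSplitable(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String ext : UNSPLITTABLE_EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return false;
            }
        }
        return true;
    }
}
```

With a check of this kind, a Gzipped input would be handed to a single reader instead of being split into many ranges that each re-read the whole stream.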
[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG
[ https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536589#comment-14536589 ]

Hudson commented on MAPREDUCE-5981:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/])
MAPREDUCE-5981. Log levels of certain MR logs can be changed to DEBUG. (devaraj: rev dc2b2ae31f2eb6dae324c2e14ed7660ce605a89b)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java

Log levels of certain MR logs can be changed to DEBUG
-----------------------------------------------------

Key: MAPREDUCE-5981
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Fix For: 2.8.0
Attachments: MAPREDUCE-5981.02.patch, MAPREDUCE-5981.patch

The following MapReduce logs can be changed to DEBUG level, as they appear too many times in the log file and are not that important for debugging.
1. In org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost(Fetcher.java : 313), the second log does not need to be at INFO level. It can be moved to DEBUG, since a WARN log is printed anyway if verifyReply fails.
    SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey);
    LOG.info("for url=" + msgToEncode + " sent hash and received reply");
2. Thread-related information need not be printed at INFO level. The 2 logs below can be moved to DEBUG.
a) In org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost(ShuffleSchedulerImpl.java : 381), the log below can be changed to DEBUG:
    LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() + " to " + Thread.currentThread().getName());
b) In org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getMapsForHost(ShuffleSchedulerImpl.java : 411), the log below can be changed to DEBUG:
    LOG.info("assigned " + includedMaps + " of " + totalSize + " to " + host + " to " + Thread.currentThread().getName());
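The change requested in MAPREDUCE-5981 amounts to demoting chatty INFO statements to DEBUG, usually behind a level check so the message string is not built when debugging is off. The sketch below is not the patched Hadoop code: it uses `java.util.logging` from the JDK (where FINE plays roughly the role of DEBUG) so that it runs without extra dependencies, and the class and method names are hypothetical:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ShuffleLogDemo {
    private static final Logger LOG = Logger.getLogger(ShuffleLogDemo.class.getName());

    private ShuffleLogDemo() {}

    /** Builds the same kind of message as the host-assignment log line. */
    static String assignmentMessage(String host, int knownOutputs, String threadName) {
        return "Assigning " + host + " with " + knownOutputs + " to " + threadName;
    }

    /** Logs at FINE, and only builds the message when FINE is enabled. */
    static void logAssignment(String host, int knownOutputs) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(assignmentMessage(host, knownOutputs,
                    Thread.currentThread().getName()));
        }
    }
}
```

The guard matters on hot paths such as the shuffle, where the concatenation would otherwise run once per host assignment even though the line is discarded.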
[jira] [Commented] (MAPREDUCE-3383) Duplicate job.getOutputValueGroupingComparator() in ReduceTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536610#comment-14536610 ]

Hudson commented on MAPREDUCE-3383:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/])
MAPREDUCE-3383. Duplicate job.getOutputValueGroupingComparator() in ReduceTask. Contributed by Binglin Chang (jlowe: rev c39012f4a0444f9e4b7d67957d5192127d143d90)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/ReduceTask.java
* hadoop-mapreduce-project/CHANGES.txt

Duplicate job.getOutputValueGroupingComparator() in ReduceTask
--------------------------------------------------------------

Key: MAPREDUCE-3383
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3383
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.23.1
Reporter: Binglin Chang
Assignee: Binglin Chang
Fix For: 2.8.0
Attachments: MAPREDUCE-3383-1.patch, MAPREDUCE-3383.patch, MAPREDUCE-3383.patch

This is probably just a small mistake.
[jira] [Commented] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536612#comment-14536612 ]

Hudson commented on MAPREDUCE-5248:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/])
MAPREDUCE-5248. Let NNBenchWithoutMR specify the replication factor for its test. Contributed by Erik Paulson (jlowe: rev 30099a36c6b0f658d25fb505a9f3ce15d19f7ba6)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/hdfs/NNBenchWithoutMR.java

Let NNBenchWithoutMR specify the replication factor for its test
----------------------------------------------------------------

Key: MAPREDUCE-5248
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5248
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: client, test
Affects Versions: 3.0.0
Reporter: Erik Paulson
Assignee: Erik Paulson
Priority: Minor
Fix For: 2.8.0
Attachments: MAPREDUCE-5248.patch, MAPREDUCE-5248.txt
Original Estimate: 1h
Remaining Estimate: 1h

The NNBenchWithoutMR test creates files with a replicationFactorPerFile hard-coded to 1. It'd be nice to be able to specify that on the command line.
Also, it'd be great if MAPREDUCE-4750 were merged along with this fix.
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536543#comment-14536543 ]

Hudson commented on MAPREDUCE-2632:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/])
MAPREDUCE-2632. Avoid calling the partitioner when the numReduceTasks is 1. (Ravi Teja Ch N V and Sunil G via kasha) (kasha: rev bdbd10fde1539920de937404a785e6ed34dd5628)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Partitioner.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Partitioner.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestMapFileOutputFormat.java

Avoid calling the partitioner when the numReduceTasks is 1.
-----------------------------------------------------------

Key: MAPREDUCE-2632
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Ravi Teja Ch N V
Assignee: Sunil G
Fix For: 3.0.0
Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch

We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
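The optimisation described in MAPREDUCE-2632 comes down to a single branch: with one reducer every record belongs to partition 0, so the partitioner need not be consulted. A minimal sketch follows, using a hypothetical `SimplePartitioner` interface rather than Hadoop's `Partitioner` class, and not claiming to mirror the committed change:

```java
final class PartitionChooser {
    private PartitionChooser() {}

    /** Hypothetical minimal partitioner interface, used only for this illustration. */
    interface SimplePartitioner<K> {
        int getPartition(K key, int numPartitions);
    }

    /** Consults the partitioner only when there is more than one reduce task. */
    static <K> int choose(K key, int numReduceTasks, SimplePartitioner<K> partitioner) {
        if (numReduceTasks > 1) {
            return partitioner.getPartition(key, numReduceTasks);
        }
        // Zero or one reducer: every record goes to partition 0.
        return 0;
    }
}
```

A caller might pass a hash-based lambda such as `(k, n) -> (k.hashCode() & Integer.MAX_VALUE) % n`; with `numReduceTasks == 1` that lambda is never invoked.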
[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG
[ https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536538#comment-14536538 ]

Hudson commented on MAPREDUCE-5981:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/])
MAPREDUCE-5981. Log levels of certain MR logs can be changed to DEBUG. (devaraj: rev dc2b2ae31f2eb6dae324c2e14ed7660ce605a89b)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java
* hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-3383) Duplicate job.getOutputValueGroupingComparator() in ReduceTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536559#comment-14536559 ]

Hudson commented on MAPREDUCE-3383:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/])
MAPREDUCE-3383. Duplicate job.getOutputValueGroupingComparator() in ReduceTask. Contributed by Binglin Chang (jlowe: rev c39012f4a0444f9e4b7d67957d5192127d143d90)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/ReduceTask.java
* hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536561#comment-14536561 ]

Hudson commented on MAPREDUCE-5248:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/])
MAPREDUCE-5248. Let NNBenchWithoutMR specify the replication factor for its test. Contributed by Erik Paulson (jlowe: rev 30099a36c6b0f658d25fb505a9f3ce15d19f7ba6)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/hdfs/NNBenchWithoutMR.java
* hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536535#comment-14536535 ]

Hudson commented on MAPREDUCE-4750:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2120/])
MAPREDUCE-4750. Enable NNBenchWithoutMR in MapredTestDriver (Liang Xie and Jason Lowe via raviprak) (raviprak: rev 5aab014340b53ebc9363ee244b2cbea7a4c1f573)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java

Enable NNBenchWithoutMR in MapredTestDriver
-------------------------------------------

Key: MAPREDUCE-4750
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, test
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
Attachments: MAPREDUCE-4750.002.patch, MAPREDUCE-4750.txt

Right now, only nnbench can be run from MapredTestDriver; there is no entry for NNBenchWithoutMR. It would be better to enable it explicitly, so that we can benchmark the NameNode with fewer influencing factors.
[jira] [Updated] (MAPREDUCE-6359) RM HA setup, Cluster tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6359:
----------------------------------
Resolution: Fixed
Fix Version/s: 2.8.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

I have committed this patch to trunk and branch-2. Thanks [~zhaoyunjiong] for contributing the patch, and congratulations on your first patch contribution! Thanks also to [~aroop], [~adhoot] and [~kasha] for the review comments!
[jira] [Commented] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536586#comment-14536586 ]

Hudson commented on MAPREDUCE-4750:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/])
MAPREDUCE-4750. Enable NNBenchWithoutMR in MapredTestDriver (Liang Xie and Jason Lowe via raviprak) (raviprak: rev 5aab014340b53ebc9363ee244b2cbea7a4c1f573)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536594#comment-14536594 ]

Hudson commented on MAPREDUCE-2632:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/])
MAPREDUCE-2632. Avoid calling the partitioner when the numReduceTasks is 1. (Ravi Teja Ch N V and Sunil G via kasha) (kasha: rev bdbd10fde1539920de937404a785e6ed34dd5628)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Partitioner.java
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Partitioner.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestMapFileOutputFormat.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapFileOutputFormat.java
[jira] [Commented] (MAPREDUCE-2094) LineRecordReader should not seek into non-splittable, compressed streams.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536590#comment-14536590 ] Hudson commented on MAPREDUCE-2094: --- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #180 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/180/]) MAPREDUCE-2094. LineRecordReader should not seek into non-splittable, compressed streams. (cdouglas: rev 2edcf931d7843cddcf3da5666a73d6ee9a10d00d) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/TestSafeguardSplittingUnsplittableFiles.txt.gz * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/pom.xml * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java LineRecordReader should not seek into non-splittable, compressed streams. 
- Key: MAPREDUCE-2094 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2094 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 2.8.0 Attachments: M2094-1.patch, M2094.patch, MAPREDUCE-2094-2011-05-19.patch, MAPREDUCE-2094-20140727-svn-fixed-spaces.patch, MAPREDUCE-2094-20140727-svn.patch, MAPREDUCE-2094-20140727.patch, MAPREDUCE-2094-2015-05-05-2328.patch, MAPREDUCE-2094-FileInputFormat-docs-v2.patch When implementing a custom derivative of FileInputFormat we ran into the effect that a large Gzipped input file would be processed several times. A near 1GiB file would be processed around 36 times in its entirety. Thus producing garbage results and taking up a lot more CPU time than needed. It took a while to figure out and what we found is that the default implementation of the isSplittable method in [org.apache.hadoop.mapreduce.lib.input.FileInputFormat | http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java?view=markup ] is simply return true;. This is a very unsafe default and is in contradiction with the JavaDoc of the method which states: Is the given filename splitable? Usually, true, but if the file is stream compressed, it will not be. . The actual implementation effectively does Is the given filename splitable? Always true, even if the file is stream compressed using an unsplittable compression codec. For our situation (where we always have Gzipped input) we took the easy way out and simply implemented an isSplittable in our class that does return false; Now there are essentially 3 ways I can think of for fixing this (in order of what I would find preferable): # Implement something that looks at the used compression of the file (i.e. do migrate the implementation from TextInputFormat to FileInputFormat). This would make the method do what the JavaDoc describes. 
# Force developers to think about it and make this method abstract.
# Use a safe default (i.e. return false).
[jira] [Commented] (MAPREDUCE-6359) RM HA setup, Cluster tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536662#comment-14536662 ] Hudson commented on MAPREDUCE-6359: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-6359. In RM HA setup, Cluster tab links populated with AM hostname instead of RM. Contributed by zhaoyunjiong. (junping_du: rev df36ad0a08261b03c250b6f745b27e5f83e4286e) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java RM HA setup, Cluster tab links populated with AM hostname instead of RM -- Key: MAPREDUCE-6359 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6359 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Environment: centOS-6.x Reporter: Aroop Maliakkal Assignee: zhaoyunjiong Priority: Minor Fix For: 2.8.0 Attachments: YARN-3423.patch In RM HA setup ( e.g. http://rm-1.vip.abc.com:50030/proxy/application_1427789305393_0002/ ), go to the job details and click on the Cluster tab on left top side. Click on any of the links , About, Applications , Scheduler. You can see that the hyperlink is pointing to http://am-1.vip.abc.com:port/cluster ). The port details for secure and unsecure cluster is given below :- 8088 ( DEFAULT_RM_WEBAPP_PORT = 8088 ) 8090 ( DEFAULT_RM_WEBAPP_HTTPS_PORT = 8090 ) Ideally, it should have pointed to resourcemanager hostname instead of AM hostname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-4750) Enable NNBenchWithoutMR in MapredTestDriver
[ https://issues.apache.org/jira/browse/MAPREDUCE-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536647#comment-14536647 ] Hudson commented on MAPREDUCE-4750: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-4750. Enable NNBenchWithoutMR in MapredTestDriver (Liang Xie and Jason Lowe via raviprak) (raviprak: rev 5aab014340b53ebc9363ee244b2cbea7a4c1f573) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java * hadoop-mapreduce-project/CHANGES.txt Enable NNBenchWithoutMR in MapredTestDriver --- Key: MAPREDUCE-4750 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4750 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, test Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: MAPREDUCE-4750.002.patch, MAPREDUCE-4750.txt Right now, only nnbench can be run from MapredTestDriver; there is no entry for NNBenchWithoutMR. It would be better to enable it explicitly, so that we can benchmark the namenode with fewer influencing factors.
[jira] [Commented] (MAPREDUCE-2094) LineRecordReader should not seek into non-splittable, compressed streams.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536651#comment-14536651 ] Hudson commented on MAPREDUCE-2094: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-2094. LineRecordReader should not seek into non-splittable, compressed streams. (cdouglas: rev 2edcf931d7843cddcf3da5666a73d6ee9a10d00d) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/pom.xml * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/TestSafeguardSplittingUnsplittableFiles.txt.gz * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestLineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/LineRecordReader.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LineRecordReader.java LineRecordReader should not seek into non-splittable, compressed streams. 
- Key: MAPREDUCE-2094 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2094 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 2.8.0 Attachments: M2094-1.patch, M2094.patch, MAPREDUCE-2094-2011-05-19.patch, MAPREDUCE-2094-20140727-svn-fixed-spaces.patch, MAPREDUCE-2094-20140727-svn.patch, MAPREDUCE-2094-20140727.patch, MAPREDUCE-2094-2015-05-05-2328.patch, MAPREDUCE-2094-FileInputFormat-docs-v2.patch When implementing a custom derivative of FileInputFormat we ran into the effect that a large Gzipped input file would be processed several times. A near 1GiB file would be processed around 36 times in its entirety. Thus producing garbage results and taking up a lot more CPU time than needed. It took a while to figure out and what we found is that the default implementation of the isSplittable method in [org.apache.hadoop.mapreduce.lib.input.FileInputFormat | http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java?view=markup ] is simply return true;. This is a very unsafe default and is in contradiction with the JavaDoc of the method which states: Is the given filename splitable? Usually, true, but if the file is stream compressed, it will not be. . The actual implementation effectively does Is the given filename splitable? Always true, even if the file is stream compressed using an unsplittable compression codec. For our situation (where we always have Gzipped input) we took the easy way out and simply implemented an isSplittable in our class that does return false; Now there are essentially 3 ways I can think of for fixing this (in order of what I would find preferable): # Implement something that looks at the used compression of the file (i.e. do migrate the implementation from TextInputFormat to FileInputFormat). This would make the method do what the JavaDoc describes. 
# Force developers to think about it and make this method abstract.
# Use a safe default (i.e. return false).
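The first option above (look at the compression used by the file) can be sketched as follows. This is a minimal, self-contained illustration and not the committed patch: the class SplittableCheck, its method, and the suffix list are hypothetical stand-ins, and a real implementation would ask Hadoop's codec classes rather than compare file suffixes.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Hadoop: decides splittability from the file suffix. */
final class SplittableCheck {
    // Assumption: suffixes of stream-compressed files that cannot be read
    // starting from an arbitrary offset. Only gzip is named in the report.
    private static final String[] UNSPLITTABLE_SUFFIXES = {".gz"};

    private SplittableCheck() {}

    /** Usually true, but false if the file name indicates an unsplittable codec. */
    static boolean isSplittable(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : UNSPLITTABLE_SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return false;
            }
        }
        return true;
    }
}
```

With a check of this kind the method behaves as its JavaDoc describes: a gzipped input is handed to one task as a whole, and uncompressed input can still be split.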
[jira] [Commented] (MAPREDUCE-3383) Duplicate job.getOutputValueGroupingComparator() in ReduceTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536670#comment-14536670 ] Hudson commented on MAPREDUCE-3383: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-3383. Duplicate job.getOutputValueGroupingComparator() in ReduceTask. Contributed by Binglin Chang (jlowe: rev c39012f4a0444f9e4b7d67957d5192127d143d90) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/ReduceTask.java Duplicate job.getOutputValueGroupingComparator() in ReduceTask -- Key: MAPREDUCE-3383 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3383 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Binglin Chang Assignee: Binglin Chang Fix For: 2.8.0 Attachments: MAPREDUCE-3383-1.patch, MAPREDUCE-3383.patch, MAPREDUCE-3383.patch This is probably just a small error by mistake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5248) Let NNBenchWithoutMR specify the replication factor for its test
[ https://issues.apache.org/jira/browse/MAPREDUCE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536672#comment-14536672 ] Hudson commented on MAPREDUCE-5248: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-5248. Let NNBenchWithoutMR specify the replication factor for its test. Contributed by Erik Paulson (jlowe: rev 30099a36c6b0f658d25fb505a9f3ce15d19f7ba6) * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/hdfs/NNBenchWithoutMR.java Let NNBenchWithoutMR specify the replication factor for its test Key: MAPREDUCE-5248 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5248 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, test Affects Versions: 3.0.0 Reporter: Erik Paulson Assignee: Erik Paulson Priority: Minor Fix For: 2.8.0 Attachments: MAPREDUCE-5248.patch, MAPREDUCE-5248.txt Original Estimate: 1h Remaining Estimate: 1h The NNBenchWithoutMR test creates files with a replicationFactorPerFile hard-coded to 1. It'd be nice to be able to specify that on the commandline. Also, it'd be great if MAPREDUCE-4750 was merged along with this fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5981) Log levels of certain MR logs can be changed to DEBUG
[ https://issues.apache.org/jira/browse/MAPREDUCE-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536650#comment-14536650 ] Hudson commented on MAPREDUCE-5981: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-5981. Log levels of certain MR logs can be changed to DEBUG. (devaraj: rev dc2b2ae31f2eb6dae324c2e14ed7660ce605a89b) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java Log levels of certain MR logs can be changed to DEBUG - Key: MAPREDUCE-5981 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5981 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Fix For: 2.8.0 Attachments: MAPREDUCE-5981.02.patch, MAPREDUCE-5981.patch
Following map reduce logs can be changed to DEBUG log level as they appear too many times in the log file and are not that important for debugging.
1. In org.apache.hadoop.mapreduce.task.reduce.Fetcher#copyFromHost(Fetcher.java : 313), the second log is not required to be at info level. This can be moved to debug as a warn log is anyways printed if verifyReply fails.
SecureShuffleUtils.verifyReply(replyHash, encHash, shuffleSecretKey);
LOG.info("for url="+msgToEncode+" sent hash and received reply");
2. Thread related info need not be printed in logs at INFO level. Below 2 logs can be moved to DEBUG
a) In org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl#getHost(ShuffleSchedulerImpl.java : 381), below log can be changed to DEBUG
LOG.info("Assigning " + host + " with " + host.getNumKnownMapOutputs() + " to " + Thread.currentThread().getName());
b) In org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost(ShuffleSchedulerImpl.java : 411), below log can be changed to DEBUG
LOG.info("assigned " + includedMaps + " of " + totalSize + " to " + host + " to " + Thread.currentThread().getName());
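The kind of change requested above might look roughly like the sketch below. It is only an illustration under assumptions: it uses java.util.logging (where FINE plays the role of DEBUG) so that it runs without Hadoop's logging dependency, and the class ShuffleLogSketch and its method names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the scheduler code that logs each host assignment. */
final class ShuffleLogSketch {
    private static final Logger LOG = Logger.getLogger(ShuffleLogSketch.class.getName());

    private ShuffleLogSketch() {}

    /** Builds the same message shape as the "Assigning ..." log quoted in the issue. */
    static String assignmentMessage(String host, int knownOutputs, String threadName) {
        return "Assigning " + host + " with " + knownOutputs + " to " + threadName;
    }

    /** Logs at debug level (FINE here) instead of INFO, skipping the string building when disabled. */
    static void logAssignment(String host, int knownOutputs) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(assignmentMessage(host, knownOutputs, Thread.currentThread().getName()));
        }
    }
}
```

Guarding the call with the level check means the per-assignment message is not even concatenated unless debug logging is switched on.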
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536654#comment-14536654 ] Hudson commented on MAPREDUCE-2632: --- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/190/]) MAPREDUCE-2632. Avoid calling the partitioner when the numReduceTasks is 1. (Ravi Teja Ch N V and Sunil G via kasha) (kasha: rev bdbd10fde1539920de937404a785e6ed34dd5628) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestMapFileOutputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java * hadoop-mapreduce-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMapFileOutputFormat.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Partitioner.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Partitioner.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapFileOutputFormat.java Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V Assignee: Sunil G Fix For: 3.0.0 Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch We can avoid the call to the partitioner when the number of reducers is 1.This will avoid the unnecessary computations by the partitioner. 
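The idea in MAPREDUCE-2632 can be sketched as below. This is an illustration only, not the committed change: KeyPartitioner and PartitionSketch are hypothetical names standing in for Hadoop's partitioner interface and the collector code that calls it.

```java
/** Hypothetical stand-in for a partitioner interface. */
interface KeyPartitioner<K> {
    int getPartition(K key, int numPartitions);
}

final class PartitionSketch {
    private PartitionSketch() {}

    /** Returns the target partition, bypassing the partitioner when only one reducer exists. */
    static <K> int partitionFor(K key, int numReduceTasks, KeyPartitioner<K> partitioner) {
        if (numReduceTasks == 1) {
            // With a single reducer every record goes to partition 0,
            // so calling the partitioner would be wasted work.
            return 0;
        }
        return partitioner.getPartition(key, numReduceTasks);
    }
}
```

The shortcut matters most when the partitioner is expensive, because it is otherwise invoked once per output record.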
[jira] [Updated] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated MAPREDUCE-5748: Status: Open (was: Patch Available) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived() Key: MAPREDUCE-5748 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch
Starting around line 510:
{code}
ChannelFuture lastMap = null;
for (String mapId : mapIds) {
  ...
}
lastMap.addListener(metrics);
lastMap.addListener(ChannelFutureListener.CLOSE);
{code}
If mapIds is empty, lastMap would remain null, leading to NPE in addListener() call.
[jira] [Updated] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated MAPREDUCE-5748: Attachment: MAPREDUCE-5748.03.patch Fixed checkstyle issue Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived() Key: MAPREDUCE-5748 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch, MAPREDUCE-5748.03.patch
Starting around line 510:
{code}
ChannelFuture lastMap = null;
for (String mapId : mapIds) {
  ...
}
lastMap.addListener(metrics);
lastMap.addListener(ChannelFutureListener.CLOSE);
{code}
If mapIds is empty, lastMap would remain null, leading to NPE in addListener() call.
[jira] [Updated] (MAPREDUCE-5748) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated MAPREDUCE-5748: Labels: BB2015-05-RFC (was: ) Status: Patch Available (was: Open) Potential null pointer deference in ShuffleHandler#Shuffle#messageReceived() Key: MAPREDUCE-5748 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5748 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Labels: BB2015-05-RFC Attachments: 0001-MAPREDUCE-5748-Potential-null-pointer-deference-in-S.patch, MAPREDUCE-5748.02.patch, MAPREDUCE-5748.03.patch
Starting around line 510:
{code}
ChannelFuture lastMap = null;
for (String mapId : mapIds) {
  ...
}
lastMap.addListener(metrics);
lastMap.addListener(ChannelFutureListener.CLOSE);
{code}
If mapIds is empty, lastMap would remain null, leading to NPE in addListener() call.
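One way to guard against the null described in MAPREDUCE-5748 is sketched below. This is not the attached patch: LastMapGuardSketch is a hypothetical class, and java.util.concurrent.CompletableFuture stands in for Netty's ChannelFuture so that the example runs on its own.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for the send loop in the shuffle handler. */
final class LastMapGuardSketch {
    private LastMapGuardSketch() {}

    /** Returns true if a completion callback was attached, false if there was nothing to send. */
    static boolean sendAll(List<String> mapIds, Runnable onComplete) {
        CompletableFuture<Void> lastMap = null;
        for (String mapId : mapIds) {
            // Stand-in for sending one map output; the real code keeps the last send's future.
            lastMap = CompletableFuture.completedFuture(null);
        }
        if (lastMap == null) {
            // mapIds was empty: no future exists, so skip the callback registration
            // instead of dereferencing null.
            return false;
        }
        lastMap.thenRun(onComplete);
        return true;
    }
}
```

What the handler should do in the empty case (for example, reply with an error) is a separate decision that the sketch leaves to the caller.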
[jira] [Updated] (MAPREDUCE-6269) improve JobConf to add option to not reference same credentials between jobs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated MAPREDUCE-6269: - Summary: improve JobConf to add option to not reference same credentials between jobs. (was: improve JobConf to add option to not use same Credentials reference between jobs.) improve JobConf to add option to not reference same credentials between jobs. - Key: MAPREDUCE-6269 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6269 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-RFC Attachments: MAPREDUCE-6269.000.patch, MAPREDUCE-6269.001.patch Improve JobConf to add constructor to avoid sharing Credentials between jobs. By default the Credentials will be shared to keep the backward compatibility. We can add a new constructor with a new parameter to decide whether to share Credentials. Some issues reported in cascading is due to corrupted credentials at https://github.com/Cascading/cascading/commit/45b33bb864172486ac43782a4d13329312d01c0e If we add this support in JobConf, it will benefit all job clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6269) improve JobConf to add option to not use same Credentials reference between jobs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated MAPREDUCE-6269: - Attachment: MAPREDUCE-6269.001.patch improve JobConf to add option to not use same Credentials reference between jobs. - Key: MAPREDUCE-6269 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6269 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-RFC Attachments: MAPREDUCE-6269.000.patch, MAPREDUCE-6269.001.patch Improve JobConf to add constructor to avoid sharing Credentials between jobs. By default the Credentials will be shared to keep the backward compatibility. We can add a new constructor with a new parameter to decide whether to share Credentials. Some issues reported in cascading is due to corrupted credentials at https://github.com/Cascading/cascading/commit/45b33bb864172486ac43782a4d13329312d01c0e If we add this support in JobConf, it will benefit all job clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6269) improve JobConf to add option to not reference same credentials between jobs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536251#comment-14536251 ] zhihai xu commented on MAPREDUCE-6269: -- I uploaded a new patch MAPREDUCE-6269.001.patch based on [~tgraves]'s comments for review. improve JobConf to add option to not reference same credentials between jobs. - Key: MAPREDUCE-6269 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6269 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-RFC Attachments: MAPREDUCE-6269.000.patch, MAPREDUCE-6269.001.patch Improve JobConf to add constructor to avoid sharing Credentials between jobs. By default the Credentials will be shared to keep the backward compatibility. We can add a new constructor with a new parameter to decide whether to share Credentials. Some issues reported in cascading is due to corrupted credentials at https://github.com/Cascading/cascading/commit/45b33bb864172486ac43782a4d13329312d01c0e If we add this support in JobConf, it will benefit all job clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6269) improve JobConf to add option to not reference same credentials between jobs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536266#comment-14536266 ] Hadoop QA commented on MAPREDUCE-6269: --
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 29s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 31s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 0m 47s | The applied patch generated 1 new checkstyle issues (total was 96, now 96). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | mapreduce tests | 1m 37s | Tests passed in hadoop-mapreduce-client-core. |
| | | 37m 47s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12731683/MAPREDUCE-6269.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 02a4a22 |
| checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5694/artifact/patchprocess/diffcheckstylehadoop-mapreduce-client-core.txt |
| hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5694/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5694/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5694/console |
This message was automatically generated.
improve JobConf to add option to not reference same credentials between jobs. - Key: MAPREDUCE-6269 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6269 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: zhihai xu Assignee: zhihai xu Labels: BB2015-05-RFC Attachments: MAPREDUCE-6269.000.patch, MAPREDUCE-6269.001.patch
Improve JobConf to add constructor to avoid sharing Credentials between jobs. By default the Credentials will be shared to keep the backward compatibility. We can add a new constructor with a new parameter to decide whether to share Credentials. Some issues reported in cascading is due to corrupted credentials at https://github.com/Cascading/cascading/commit/45b33bb864172486ac43782a4d13329312d01c0e If we add this support in JobConf, it will benefit all job clients.
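The proposal in MAPREDUCE-6269 (a constructor parameter that decides whether Credentials are shared) can be illustrated with the sketch below. Everything here is hypothetical: CredentialsSketch and JobConfSketch are stand-ins, not Hadoop's classes, and the choice to start the unshared copy with the existing tokens rather than an empty set is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a mutable credentials holder. */
final class CredentialsSketch {
    final Map<String, String> tokens = new HashMap<>();

    CredentialsSketch() {}

    /** Copy constructor: the new object has its own map with the same entries. */
    CredentialsSketch(CredentialsSketch other) {
        tokens.putAll(other.tokens);
    }
}

/** Hypothetical stand-in for a job configuration that carries credentials. */
final class JobConfSketch {
    final CredentialsSketch credentials;

    JobConfSketch() {
        credentials = new CredentialsSketch();
    }

    /** Default copy keeps sharing the same credentials object, for backward compatibility. */
    JobConfSketch(JobConfSketch other) {
        this(other, true);
    }

    /** New constructor: the caller decides whether the copy shares the credentials reference. */
    JobConfSketch(JobConfSketch other, boolean shareCredentials) {
        credentials = shareCredentials
                ? other.credentials
                : new CredentialsSketch(other.credentials);
    }
}
```

With shareCredentials set to false, tokens added for one job no longer show up in, or get corrupted by, another job built from the same base configuration.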