[jira] [Updated] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2627: --- Affects Version/s: (was: 0.21.0) 0.22.0 It's actually 0.22. JobTracker won't start otherwise. Plamen, could you please post the exception here. guava-r09 JAR file needs to be added to mapreduce. -- Key: MAPREDUCE-2627 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.22.0 Reporter: Plamen Jeliazkov Priority: Trivial Original Estimate: 24h Remaining Estimate: 24h Need to add the guava-r09.jar file into the mapreduce/build/ivy/lib/Hadoop/common directory; missing from build. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057015#comment-13057015 ] Nigel Daley commented on MAPREDUCE-279: --- Arun, are you planning to get a Jenkins build running on this branch before merge? Map-Reduce 2.0 -- Key: MAPREDUCE-279 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.23.0 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, yarn-state-machine.task.png Re-factor MapReduce into a generic resource scheduler and a per-job, user-defined component that manages the application execution. Check it out by following [the instructions|http://goo.gl/rSJJC]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057017#comment-13057017 ] Harsh J commented on MAPREDUCE-2627: (On trunk MR, I'd added guava in my patch at MAPREDUCE-1347 -- Not sure of 0.22 having it or not) guava-r09 JAR file needs to be added to mapreduce. -- Key: MAPREDUCE-2627 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.22.0 Reporter: Plamen Jeliazkov Priority: Trivial Original Estimate: 24h Remaining Estimate: 24h Need to add the guava-r09.jar file into the mapreduce/build/ivy/lib/Hadoop/common directory; missing from build. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-1347: --- Attachment: MAPREDUCE-1347.r6.diff Comment's comment issues addressed :) The É thing was due to Mac's compose key, I'd typed a … there. Removed. Missing synchronization in MultipleOutputFormat --- Key: MAPREDUCE-1347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1347 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Fix For: 0.23.0 Attachments: MAPREDUCE-1347.r2.diff, MAPREDUCE-1347.r3.diff, MAPREDUCE-1347.r4.diff, MAPREDUCE-1347.r5.diff, MAPREDUCE-1347.r6.diff, mapreduce.1347.r1.diff MultipleOutputFormat's RecordWriter implementation doesn't use synchronization when accessing the recordWriters member. When using multithreaded mappers or reducers, this can result in problems where two threads will both try to create the same file, causing AlreadyBeingCreatedException. Doing this more fine-grained than just synchronizing the whole method is probably a good idea, so that multithreaded mappers can actually achieve parallelism writing into separate output streams. From what I can tell, the new API's MultipleOutputs seems not to have this issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-2630) MR-279: refreshQueues leads to NPEs when used w/FifoScheduler
MR-279: refreshQueues leads to NPEs when used w/FifoScheduler - Key: MAPREDUCE-2630 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2630 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Environment: All Reporter: Josh Wills Priority: Minor The RM's admin service exposes a method refreshQueues that is used to update the queue configuration when used with the CapacityScheduler, but if it is used with the FifoScheduler, it will set the containerTokenSecretManager/clusterTracker fields on the FifoScheduler to null, which eventually leads to NPE. Since the FifoScheduler only has one queue that cannot be refreshed, the correct behavior is for the refreshQueues call to be a no-op. I will attach a patch that fixes this by splitting the ResourceScheduler's reinitialize method into separate initialize/updateQueues methods. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
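The split described in MAPREDUCE-2630 can be sketched roughly as follows. This is a minimal illustration, not the attached patch: `SketchScheduler`, `SketchFifoScheduler` and `RefreshQueuesSketch` are hypothetical stand-ins, and the real ResourceScheduler signatures are not reproduced here. It only shows that once one-time setup and queue refresh are separate methods, a single-queue scheduler can treat the refresh as a no-op and keep its fields intact.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the scheduler contract; not the real ResourceScheduler API.
interface SketchScheduler {
    // One-time setup: wires in collaborators such as the secret manager.
    void initialize(Map<String, String> conf, Object secretManager);

    // Called on refreshQueues: must only touch queue configuration.
    void updateQueues(Map<String, String> conf);
}

class SketchFifoScheduler implements SketchScheduler {
    private Object secretManager;

    @Override
    public void initialize(Map<String, String> conf, Object secretManager) {
        this.secretManager = secretManager;
    }

    @Override
    public void updateQueues(Map<String, String> conf) {
        // Single fixed queue: nothing to refresh, and no fields are overwritten.
    }

    boolean isUsable() {
        return secretManager != null;
    }
}

class RefreshQueuesSketch {
    // Returns true if the scheduler still has its collaborators after a refresh.
    static boolean refreshKeepsState() {
        SketchFifoScheduler scheduler = new SketchFifoScheduler();
        scheduler.initialize(new HashMap<>(), new Object());
        scheduler.updateQueues(new HashMap<>()); // what a refreshQueues call would trigger
        return scheduler.isUsable();
    }
}
```

With a single combined reinitialize method, the refresh path would have to pass the collaborators again (or null), which is the failure mode the issue describes.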
[jira] [Commented] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057031#comment-13057031 ] Hadoop QA commented on MAPREDUCE-1347: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12484597/MAPREDUCE-1347.r6.diff against trunk revision 1140942. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. -1 javac. The patch appears to cause tar ant target to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: -1 contrib tests. The patch failed contrib unit tests. -1 system test framework. The patch failed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/432//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/432//console This message is automatically generated. Missing synchronization in MultipleOutputFormat --- Key: MAPREDUCE-1347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1347 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Harsh J Fix For: 0.23.0 Attachments: MAPREDUCE-1347.r2.diff, MAPREDUCE-1347.r3.diff, MAPREDUCE-1347.r4.diff, MAPREDUCE-1347.r5.diff, MAPREDUCE-1347.r6.diff, mapreduce.1347.r1.diff MultipleOutputFormat's RecordWriter implementation doesn't use synchronization when accessing the recordWriters member. When using multithreaded mappers or reducers, this can result in problems where two threads will both try to create the same file, causing AlreadyBeingCreatedException. 
Doing this more fine-grained than just synchronizing the whole method is probably a good idea, so that multithreaded mappers can actually achieve parallelism writing into separate output streams. From what I can tell, the new API's MultipleOutputs seems not to have this issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
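One way to get the finer-grained synchronization suggested in MAPREDUCE-1347 is sketched below. It is an assumption-laden illustration rather than the attached diff: `WriterCacheSketch` is a hypothetical class, a `StringBuilder` stands in for a real RecordWriter, and the code assumes Java 8 or later for `computeIfAbsent`. Writers are created at most once per path, and each write locks only the writer it targets, so threads writing to different outputs do not block each other.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class WriterCacheSketch {
    // Counts writer creations; stands in for "create the output file".
    final AtomicInteger creations = new AtomicInteger();

    // Hypothetical replacement for the unsynchronized recordWriters map.
    private final ConcurrentHashMap<String, StringBuilder> writers = new ConcurrentHashMap<>();

    StringBuilder writerFor(String path) {
        // ConcurrentHashMap runs the mapping function at most once per key,
        // so two threads cannot both create the writer for the same path.
        return writers.computeIfAbsent(path, p -> {
            creations.incrementAndGet();
            return new StringBuilder();
        });
    }

    void write(String path, String record) {
        StringBuilder writer = writerFor(path);
        // Lock per writer, not per method: other paths stay writable in parallel.
        synchronized (writer) {
            writer.append(record).append('\n');
        }
    }

    // Four threads write to the same path; returns how many writers were created.
    static int creationsAfterConcurrentWrites() {
        WriterCacheSketch cache = new WriterCacheSketch();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> cache.write("part-a", "rec"));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.creations.get();
    }
}
```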
[jira] [Updated] (MAPREDUCE-2630) MR-279: refreshQueues leads to NPEs when used w/FifoScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Wills updated MAPREDUCE-2630: -- Attachment: MAPREDUCE-2630.patch MR-279: refreshQueues leads to NPEs when used w/FifoScheduler - Key: MAPREDUCE-2630 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2630 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Environment: All Reporter: Josh Wills Priority: Minor Attachments: MAPREDUCE-2630.patch Original Estimate: 1h Remaining Estimate: 1h The RM's admin service exposes a method refreshQueues that is used to update the queue configuration when used with the CapacityScheduler, but if it is used with the FifoScheduler, it will set the containerTokenSecretManager/clusterTracker fields on the FifoScheduler to null, which eventually leads to NPE. Since the FifoScheduler only has one queue that cannot be refreshed, the correct behavior is for the refreshQueues call to be a no-op. I will attach a patch that fixes this by splitting the ResourceScheduler's reinitialize method into separate initialize/updateQueues methods. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2596) Gridmix should notify job failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2596: -- Status: Open (was: Patch Available) Gridmix should notify job failures -- Key: MAPREDUCE-2596 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2596 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, contrib/gridmix Affects Versions: 0.23.0 Reporter: Arun C Murthy Assignee: Amar Kamat Attachments: gridmix-summary-v1.3.patch, gridmix-summary-v1.4.patch, gridmix-summary-v1.5.patch Gridmix doesn't warn the user if any of the jobs in the mix fail... it probably should printout a summary of the jobs and other statistics at the end too. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2596) Gridmix should notify job failures
[ https://issues.apache.org/jira/browse/MAPREDUCE-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2596: -- Attachment: gridmix-summary-v1.6.patch Attaching a new patch that fixes a corner case (only data gen). test-patch passed on my box. Gridmix should notify job failures -- Key: MAPREDUCE-2596 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2596 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, contrib/gridmix Affects Versions: 0.23.0 Reporter: Arun C Murthy Assignee: Amar Kamat Attachments: gridmix-summary-v1.3.patch, gridmix-summary-v1.4.patch, gridmix-summary-v1.5.patch, gridmix-summary-v1.6.patch Gridmix doesn't warn the user if any of the jobs in the mix fail... it probably should printout a summary of the jobs and other statistics at the end too. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2576) Typo in comment in SimulatorLaunchTaskAction.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057256#comment-13057256 ] Hudson commented on MAPREDUCE-2576: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Typo in comment in SimulatorLaunchTaskAction.java - Key: MAPREDUCE-2576 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2576 Project: Hadoop Map/Reduce Issue Type: Bug Environment: trunk Reporter: Sherry Chen Assignee: Tim Sell Priority: Trivial Fix For: 0.23.0 Attachments: MAPREDUCE-2576.patch This JIRA tracks a fix for a trivial typo: "or" is misspelled as "xor" on line 24 of SimulatorLaunchTaskAction.java.
[jira] [Commented] (MAPREDUCE-2104) Rumen TraceBuilder Does Not Emit CPU/Memory Usage Details in Traces
[ https://issues.apache.org/jira/browse/MAPREDUCE-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057252#comment-13057252 ] Hudson commented on MAPREDUCE-2104: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Rumen TraceBuilder Does Not Emit CPU/Memory Usage Details in Traces --- Key: MAPREDUCE-2104 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2104 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Affects Versions: 0.22.0 Reporter: Ranjit Mathew Assignee: Amar Kamat Fix For: 0.23.0 Attachments: mapreduce-2104-v1.1.patch, mapreduce-2104-v1.7.patch, mapreduce-2104-v1.8.1.patch Via MAPREDUCE-220, we now have CPU/Memory usage information in MapReduce JobHistory files. However, Rumen's TraceBuilder does not emit this information in the JSON traces. Without this information, GridMix3 cannot emulate CPU/Memory usage correctly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2573) New findbugs warning after MAPREDUCE-2494
[ https://issues.apache.org/jira/browse/MAPREDUCE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057253#comment-13057253 ] Hudson commented on MAPREDUCE-2573: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) New findbugs warning after MAPREDUCE-2494 - Key: MAPREDUCE-2573 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2573 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Todd Lipcon Assignee: Robert Joseph Evans Fix For: 0.23.0 Attachments: MR-2573-mr-trunk-V1.patch MAPREDUCE-2494 introduced the following findbugs warning in trunk: TrackerDistributedCacheManager.java:739, SIC_INNER_SHOULD_BE_STATIC, Priority: Low Should org.apache.hadoop.mapreduce.filecache.TrackerDistributedCacheManager$CacheDir be a _static_ inner class? This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary. If possible, the class should be made static. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
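The SIC_INNER_SHOULD_BE_STATIC warning quoted in MAPREDUCE-2573 is usually resolved by adding the `static` modifier to the nested class. The sketch below uses a hypothetical `CacheManagerSketch` with a cut-down `CacheDir`; the fields are invented for illustration and do not mirror TrackerDistributedCacheManager.

```java
class CacheManagerSketch {
    // Declared static: instances carry no hidden reference to the enclosing
    // CacheManagerSketch, so they are smaller and do not keep it reachable.
    static class CacheDir {
        final String path; // illustrative field, not taken from the real class
        long size;

        CacheDir(String path) {
            this.path = path;
        }
    }

    static CacheDir newDir(String path) {
        // A static nested class can be constructed without an enclosing instance.
        return new CacheDir(path);
    }
}
```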
[jira] [Commented] (MAPREDUCE-2487) ChainReducer uses MAPPER_BY_VALUE instead of REDUCER_BY_VALUE
[ https://issues.apache.org/jira/browse/MAPREDUCE-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057254#comment-13057254 ] Hudson commented on MAPREDUCE-2487: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) ChainReducer uses MAPPER_BY_VALUE instead of REDUCER_BY_VALUE - Key: MAPREDUCE-2487 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2487 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Forrest Vines Assignee: Devaraj K Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-2487.patch On line 293 of o.a.h.mapred.lib.Chain in setReducer(...): reducerConf.setBoolean(MAPPER_BY_VALUE, byValue); this should be REDUCER_BY_VALUE. http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-737/org/apache/hadoop/mapred/lib/Chain.java#293 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
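The one-line fix reported in MAPREDUCE-2487 amounts to using the reducer key when configuring the reducer. A minimal sketch, assuming a plain `Map` in place of the job configuration and made-up key strings (the real constants live in o.a.h.mapred.lib.Chain and are not reproduced here):

```java
import java.util.HashMap;
import java.util.Map;

class ChainConfSketch {
    // Hypothetical key strings for illustration only.
    static final String MAPPER_BY_VALUE = "sketch.chain.mapper.byValue";
    static final String REDUCER_BY_VALUE = "sketch.chain.reducer.byValue";

    // Corrected behavior: the reducer's by-value flag goes under the reducer key.
    static Map<String, Boolean> setReducer(boolean byValue) {
        Map<String, Boolean> reducerConf = new HashMap<>();
        reducerConf.put(REDUCER_BY_VALUE, byValue); // the report says MAPPER_BY_VALUE was used here
        return reducerConf;
    }
}
```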
[jira] [Commented] (MAPREDUCE-2571) CombineFileInputFormat.getSplits throws a java.lang.ArrayStoreException
[ https://issues.apache.org/jira/browse/MAPREDUCE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057255#comment-13057255 ] Hudson commented on MAPREDUCE-2571: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) CombineFileInputFormat.getSplits throws a java.lang.ArrayStoreException --- Key: MAPREDUCE-2571 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2571 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Bochun Bai Assignee: Bochun Bai Priority: Blocker Fix For: 0.22.0 Attachments: MAPREDUCE-2571.patch, MAPREDUCE-2571.patch, MAPREDUCE-2571.patch The getSplits method of org.apache.hadoop.mapred.lib.CombineFileInputFormat does not work. ...mapred.lib.CombineFileInputFormat (0.20-style) is a proxy for ...mapreduce.lib.input.CombineFileInputFormat (0.21-style). The 0.21-style getSplits returns an ArrayList of ...mapreduce.lib.input.CombineFileSplit, and the 0.20-style delegation calls toArray(...mapred.InputSplit[]). However, ...mapreduce.lib.input.CombineFileSplit is based on ...mapreduce.InputSplit, and ...mapred.InputSplit is an interface, not a superclass of ...mapreduce.InputSplit, so the elements cannot be stored in the target array and toArray throws ArrayStoreException.
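The failure in MAPREDUCE-2571 can be reproduced in isolation. In the sketch below every type is a hypothetical stand-in (`OldSplit` for the 0.20-style interface, `NewSplit` for the 0.21-style base class); `wrapEach` shows one possible way to bridge the two hierarchies and is not claimed to be what the committed patch does.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayStoreSketch {
    interface OldSplit {}                            // stand-in for the 0.20-style interface
    abstract static class NewSplit {}                // stand-in for the 0.21-style base class
    static class NewCombineSplit extends NewSplit {} // not an OldSplit

    // Adapter that makes a new-style split usable where the old type is required.
    static class OldCombineSplit implements OldSplit {
        final NewCombineSplit wrapped;

        OldCombineSplit(NewCombineSplit wrapped) {
            this.wrapped = wrapped;
        }
    }

    // Compiles, because toArray(T[]) is generic, but fails at runtime:
    // the list elements are not assignable to OldSplit.
    static boolean naiveToArrayFails() {
        List<NewSplit> splits = new ArrayList<>();
        splits.add(new NewCombineSplit());
        try {
            splits.toArray(new OldSplit[0]);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Converting element by element avoids the invalid array store.
    static OldSplit[] wrapEach(List<NewCombineSplit> splits) {
        OldSplit[] out = new OldSplit[splits.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = new OldCombineSplit(splits.get(i));
        }
        return out;
    }
}
```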
[jira] [Commented] (MAPREDUCE-2550) bin/mapred no longer works from a source checkout
[ https://issues.apache.org/jira/browse/MAPREDUCE-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057260#comment-13057260 ] Hudson commented on MAPREDUCE-2550: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) bin/mapred no longer works from a source checkout - Key: MAPREDUCE-2550 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2550 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.20.3 Environment: Java 6, Redhat 5.5 Reporter: Eric Yang Assignee: Eric Yang Priority: Blocker Fix For: 0.23.0 Attachments: MAPREDUCE-2550-1.patch, MAPREDUCE-2550-2.patch, MAPREDUCE-2550.patch Developer may want to run hadoop without extracting tarball. It would be nice if existing method to run mapred scripts from source code is preserved for developers. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2554) Gridmix distributed cache emulation system tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057258#comment-13057258 ] Hudson commented on MAPREDUCE-2554: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Gridmix distributed cache emulation system tests. - Key: MAPREDUCE-2554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2554 Project: Hadoop Map/Reduce Issue Type: Task Components: contrib/gridmix Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Fix For: 0.23.0 Attachments: MAPREDUCE-2554-v2.patch, MAPREDUCE-2554.patch 1. Verify the emulation of HDFS and Local FS distributed cache files against the given input trace file. 2. Verify the Gridmix emulation of HDFS distributed cache files of different visibilities. 3. Verify the Gridmix emulation of HDFS distributed cache file which uses different jobs that are submitted with different users. 4. Verify the emulation of local FS distributed cache files. 5. Verify the Gridmix emulation of HDFS private distributed cache file. 6. Verify the Gridmix emulation of HDFS public distributed cache file. 7. Verify the Gridmix emulation of Multiple HDFS private distributed cache files. 8. Verify the Gridmix emulation of Multiple HDFS public distributed cache files.
[jira] [Commented] (MAPREDUCE-2430) Remove mrunit contrib
[ https://issues.apache.org/jira/browse/MAPREDUCE-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057261#comment-13057261 ] Hudson commented on MAPREDUCE-2430: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Remove mrunit contrib - Key: MAPREDUCE-2430 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2430 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Nigel Daley Assignee: Nigel Daley Fix For: 0.23.0 Attachments: MAPREDUCE-2430.patch As per vote on general@ (http://mail-archives.apache.org/mod_mbox/hadoop-general/201102.mbox/%3c77405974-6771-4604-926b-976240743...@mac.com%3E) mrunit contrib can now be removed from contrib since its code has been moved to the incubator: svn remove mapreduce/trunk/src/contrib/mrunit A link to the incubator project should be added to our website's related projects section.
[jira] [Commented] (MAPREDUCE-2531) org.apache.hadoop.mapred.jobcontrol.getAssignedJobID throw class cast exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057259#comment-13057259 ] Hudson commented on MAPREDUCE-2531: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) org.apache.hadoop.mapred.jobcontrol.getAssignedJobID throw class cast exception Key: MAPREDUCE-2531 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2531 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.22.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Fix For: 0.23.0 Attachments: MR-2531-V1-trunk.patch, MR-2531-yarn-v1.patch When using a combination of the mapred and mapreduce APIs (PIG) it is possible to have the following exception Caused by: java.lang.ClassCastException: org.apache.hadoop.mapreduce.JobID cannot be cast to org.apache.hadoop.mapred.JobID at org.apache.hadoop.mapred.jobcontrol.Job.getAssignedJobID(Job.java:71) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:239) at org.apache.pig.PigServer.launchPlan(PigServer.java:1325) ... 29 more This is because the JobID is just downcast. It should be calling JobID.downgrade -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
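The difference between a plain downcast and a `downgrade`-style conversion, as raised in MAPREDUCE-2531, can be shown with stand-in types. `NewJobID` and `OldJobID` below are hypothetical simplifications of the two JobID classes, and this `downgrade` is a sketch of the general pattern rather than Hadoop's implementation.

```java
class JobIdSketch {
    // Stand-in for the newer-API JobID.
    static class NewJobID {
        final String jtIdentifier;
        final int id;

        NewJobID(String jtIdentifier, int id) {
            this.jtIdentifier = jtIdentifier;
            this.id = id;
        }
    }

    // Stand-in for the older-API JobID, a subclass of the newer one.
    static class OldJobID extends NewJobID {
        OldJobID(String jtIdentifier, int id) {
            super(jtIdentifier, id);
        }

        // Reuse the object if it already has the old type, otherwise copy its fields.
        static OldJobID downgrade(NewJobID other) {
            if (other instanceof OldJobID) {
                return (OldJobID) other;
            }
            return new OldJobID(other.jtIdentifier, other.id);
        }
    }

    // A plain cast only works when the runtime object is already an OldJobID.
    static boolean plainCastFails() {
        NewJobID id = new NewJobID("jt", 1);
        try {
            OldJobID ignored = (OldJobID) id;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

Mixed-API callers such as Pig can hand over an instance of the newer type, which is the case where the copy branch matters.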
[jira] [Commented] (MAPREDUCE-2559) ant binary fails due to missing c++ lib dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057263#comment-13057263 ] Hudson commented on MAPREDUCE-2559: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) ant binary fails due to missing c++ lib dir --- Key: MAPREDUCE-2559 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2559 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.20.3 Environment: Redhat 5.5, Java 6 Reporter: Eric Yang Assignee: Eric Yang Fix For: 0.23.0 Attachments: MAPREDUCE-2559-1.patch, MAPREDUCE-2559-2.patch, MAPREDUCE-2559-3.patch, MAPREDUCE-2559.patch, mapreduce-2559-4.patch Post MAPRED-2521 ant binary fails without -Dcompile.c++=true -Dcompile.native=true. The bin-package is trying to copy from the c++ lib dir which doesn't exist yet. The binary target should check for the existence of this dir or would also be reasonable to depend on the compile-c++ (since this is the binary target). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2539) NPE when calling JobClient.getMapTaskReports for retired job
[ https://issues.apache.org/jira/browse/MAPREDUCE-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057264#comment-13057264 ] Hudson commented on MAPREDUCE-2539: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) NPE when calling JobClient.getMapTaskReports for retired job Key: MAPREDUCE-2539 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2539 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.22.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-2539-trunk-v1.patch, MR-2539-yarn-v1.patch When calling JobClient.getMapTaskReports for a retired job this results in a NPE. In the 0.20.* version an empty TaskReport array was returned instead. Caused by: java.lang.NullPointerException at org.apache.hadoop.mapred.JobClient.getMapTaskReports(JobClient.java:588) at org.apache.pig.tools.pigstats.JobStats.addMapReduceStatistics(JobStats.java:388) .. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
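The 0.20.* behavior mentioned in MAPREDUCE-2539 (an empty array instead of an NPE) follows a common defensive pattern, sketched here with hypothetical names. `String` stands in for TaskReport, and the `null` argument models whatever the client receives for a retired job; how the real JobClient obtains its reports is not shown.

```java
class TaskReportSketch {
    private static final String[] NO_REPORTS = new String[0];

    // Returns an empty array when no reports are available, so callers
    // can iterate without a null check.
    static String[] reportsOrEmpty(String[] reportsOrNull) {
        return reportsOrNull == null ? NO_REPORTS : reportsOrNull;
    }
}
```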
[jira] [Commented] (MAPREDUCE-2107) Emulate Memory Usage of Tasks in GridMix3
[ https://issues.apache.org/jira/browse/MAPREDUCE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057266#comment-13057266 ] Hudson commented on MAPREDUCE-2107: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Emulate Memory Usage of Tasks in GridMix3 - Key: MAPREDUCE-2107 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2107 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/gridmix Affects Versions: 0.22.0 Reporter: Ranjit Mathew Assignee: Amar Kamat Fix For: 0.23.0 Attachments: gridmix-memory-emulation-v1.5.patch MAPREDUCE-220 makes CPU/Memory usage of Tasks available in JobHistory files. Use this to emulate the memory usage of Tasks (of course, once MAPREDUCE-2104 is done). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2452) Delegation token cancellation shouldn't hold global JobTracker lock
[ https://issues.apache.org/jira/browse/MAPREDUCE-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057265#comment-13057265 ] Hudson commented on MAPREDUCE-2452: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Delegation token cancellation shouldn't hold global JobTracker lock --- Key: MAPREDUCE-2452 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2452 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.22.0 Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.23.0 Attachments: cancel-delegation-token-fix-trunk.patch, cancel-delegation-token-fix-trunk.patch, cancel-delegation-token-fix.patch, cancel-delegation-token-fix.patch, cancel-delegation-token-fix.patch Currently, when the JobTracker cancels a job's delegation token (at the end of the job), it holds the global lock. This is not desired. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2455) Remove deprecated JobTracker.State in favour of JobTrackerStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057267#comment-13057267 ] Hudson commented on MAPREDUCE-2455: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Remove deprecated JobTracker.State in favour of JobTrackerStatus Key: MAPREDUCE-2455 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2455 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: build, client Reporter: Tom White Assignee: Tom White Fix For: 0.23.0 Attachments: MAPREDUCE-2455.patch MAPREDUCE-2337 deprecated getJobTrackerState() on ClusterStatus, this issue is to remove the getter (in favour of getJobTrackerStatus(), which will remain) so there is no longer a direct dependency of the public API on JobTracker. This is for MAPREDUCE-1638. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2624) Update RAID for HDFS-2107
[ https://issues.apache.org/jira/browse/MAPREDUCE-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057272#comment-13057272 ] Hudson commented on MAPREDUCE-2624: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) MAPREDUCE-2624. Update RAID for HDFS-2107. szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1140942 Files : * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/namenode/TestBlockPlacementPolicyRaid.java * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/namenode/BlockPlacementPolicyRaid.java * /hadoop/common/trunk/mapreduce/CHANGES.txt * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidTestUtil.java * /hadoop/common/trunk/mapreduce/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java Update RAID for HDFS-2107 - Key: MAPREDUCE-2624 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2624 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: m2624_20110627b.patch, m2624_20110628.patch, svn_mv.sh HDFS-2107 is going to move BlockPlacementPolicy to another package.
[jira] [Commented] (MAPREDUCE-2544) Gridmix compression emulation system tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057274#comment-13057274 ] Hudson commented on MAPREDUCE-2544: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Gridmix compression emulation system tests. --- Key: MAPREDUCE-2544 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2544 Project: Hadoop Map/Reduce Issue Type: Task Components: contrib/gridmix Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Fix For: 0.23.0 Attachments: MAPREDUCE-2544-v2.patch, MAPREDUCE-2544.patch Develop system tests for the following cases. 1. Enable compression emulation, generate the data using Gridmix, and verify whether compressed input is generated. 2. Verify a Gridmix job's map input, map output and reduce output compression ratios against the default compression ratios. 3. Verify a Gridmix job's map input, map output and reduce output compression ratios against user-specified compression ratios. 4. Verify a Gridmix job's map input and map output compression ratios against the default compression ratios when the file output compression format is false in the original trace. 5. Verify a Gridmix job's map input and map output compression ratios against user-specified compression ratios when the file output compression format is false in the original trace. 6. Verify a Gridmix job's reduce output against the default compression ratios when the file output compression format is true in the original trace. 7. Verify a Gridmix job's reduce output against user-specified compression ratios when the file output compression format is true in the original trace.
[jira] [Commented] (MAPREDUCE-2469) Task counters should also report the total heap usage of the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057269#comment-13057269 ] Hudson commented on MAPREDUCE-2469: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Task counters should also report the total heap usage of the task - Key: MAPREDUCE-2469 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2469 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: mapreduce Fix For: 0.23.0 Attachments: 2469.v0.1.patch, mapreduce-2469-v1.1.patch, mapreduce-2469-v1.2.patch, mapreduce-2469-v1.3.patch, mapreduce-2469-v1.4.patch Currently, the task counters report the VSS and RSS usage of the task. The task counters should also report the total heap usage of the task. The task might be configured with a max heap size of M, but the task's total heap usage might only be H, where H < M. In such a case, knowing only M doesn't provide a complete picture of the task's memory usage.
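The gap between M and H described in MAPREDUCE-2469 can be observed from inside any JVM through java.lang.Runtime. The following is only a minimal sketch of that idea, assuming "total heap usage" is read as the heap currently committed to the JVM; it is not the attached patch, and the class and method names (HeapUsageSketch, committedHeapBytes, maxHeapBytes) are made up for illustration.

```java
// Hypothetical sketch: compare the configured max heap (M) with the heap
// that the JVM has actually committed so far (H).
class HeapUsageSketch {
  /** Heap currently committed to this JVM, in bytes (H in the description). */
  static long committedHeapBytes() {
    return Runtime.getRuntime().totalMemory();
  }

  /** Most heap the JVM will attempt to use, in bytes (M in the description). */
  static long maxHeapBytes() {
    return Runtime.getRuntime().maxMemory();
  }

  public static void main(String[] args) {
    long h = committedHeapBytes();
    long m = maxHeapBytes();
    System.out.println("committed heap H = " + h + " bytes");
    System.out.println("max heap M       = " + m + " bytes");
  }
}
```

A task launched with a large -Xmx value will typically print a much smaller H than M until it actually allocates that memory, which is the information the issue says is missing when only M is known.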
[jira] [Commented] (MAPREDUCE-2543) [Gridmix] Add support for HighRam jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057273#comment-13057273 ] Hudson commented on MAPREDUCE-2543: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) [Gridmix] Add support for HighRam jobs -- Key: MAPREDUCE-2543 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2543 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.23.0 Attachments: mapreduce-2543-v1.2.patch, mapreduce-2543-v1.4.patch Gridmix currently ignores the high-RAM job configuration of the original job. It would be nice if Gridmix configured the simulated job's high-RAM parameters so that the simulated job has the same effect on the job scheduler and task-tracker as the original job.
[jira] [Commented] (MAPREDUCE-2603) Gridmix system tests are failing due to high ram emulation enable by default for normal mr jobs in the trace which exceeds the solt capacity.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057268#comment-13057268 ] Hudson commented on MAPREDUCE-2603: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Gridmix system tests are failing due to high ram emulation enable by default for normal mr jobs in the trace which exceeds the solt capacity. - Key: MAPREDUCE-2603 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2603 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Fix For: 0.23.0 Attachments: MAPREDUCE-2603-v2.patch, MAPREDUCE-2603.patch In Gridmix, high-RAM emulation is enabled by default. Because of this, some of the Gridmix system tests hang for some time and then fail after a timeout. The failure occurs whenever the reserved slot capacity exceeds the cluster slot capacity. The fix is to disable high-RAM emulation in the tests that use normal MR jobs in their traces.
[jira] [Commented] (MAPREDUCE-2620) Update RAID for HDFS-2087
[ https://issues.apache.org/jira/browse/MAPREDUCE-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057278#comment-13057278 ] Hudson commented on MAPREDUCE-2620: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Update RAID for HDFS-2087 - Key: MAPREDUCE-2620 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2620 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: m2620_20110624.patch DataTransferProtocol was changed by HDFS-2087. Need to update RAID.
[jira] [Commented] (MAPREDUCE-2529) Recognize Jetty bug 1342 and handle it
[ https://issues.apache.org/jira/browse/MAPREDUCE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057276#comment-13057276 ] Hudson commented on MAPREDUCE-2529: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Recognize Jetty bug 1342 and handle it -- Key: MAPREDUCE-2529 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2529 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.20.204.0, 0.23.0 Reporter: Thomas Graves Assignee: Thomas Graves Fix For: 0.20.204.0, 0.23.0 Attachments: M2529-1-20s.patch, M2529-1.patch, jetty1342-20security.patch, mapred2529-trunk.patch We are seeing many instances of JETTY-1342 (http://jira.codehaus.org/browse/JETTY-1342). The bug doesn't cause Jetty to stop responding altogether: some fetches go through, but a lot of them throw exceptions and eventually fail. The only way we have found to get the TT out of this state is to restart the TT. This jira is to catch this particular exception (or perhaps a configurable regex) and handle it in an automated way, either blacklisting or shutting down the TT after seeing the exception a configurable number of times.
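The handling proposed in MAPREDUCE-2529 (match an exception against a configurable regex, act after a configurable number of hits) can be sketched roughly as follows. This is an illustration under assumptions, not the committed patch: the class name FetchFailureWatcher, its constructor parameters, and the sample regex are all hypothetical, and the regex used here is a placeholder rather than the actual signature of the Jetty bug.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: count exceptions whose text matches a configurable
// regex and report when a configurable threshold has been reached.
class FetchFailureWatcher {
  private final Pattern pattern;
  private final int threshold;
  private int matches = 0;

  FetchFailureWatcher(String regex, int threshold) {
    this.pattern = Pattern.compile(regex);
    this.threshold = threshold;
  }

  /**
   * Records one exception. Returns true once the number of matching
   * exceptions seen so far has reached the threshold; a caller could then
   * blacklist or shut down the TaskTracker.
   */
  synchronized boolean record(Throwable t) {
    String text = String.valueOf(t); // class name plus message
    if (pattern.matcher(text).find()) {
      matches++;
    }
    return matches >= threshold;
  }

  public static void main(String[] args) {
    // "EOFException" is only a placeholder pattern for this demo.
    FetchFailureWatcher watcher = new FetchFailureWatcher("EOFException", 2);
    System.out.println(watcher.record(new java.io.EOFException("fetch 1")));
    System.out.println(watcher.record(new RuntimeException("unrelated")));
    System.out.println(watcher.record(new java.io.EOFException("fetch 2")));
  }
}
```

Keeping the counter and the threshold check in one synchronized method means that concurrent fetch threads cannot both miss the crossing of the threshold.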
[jira] [Commented] (MAPREDUCE-2185) Infinite loop at creating splits using CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057279#comment-13057279 ] Hudson commented on MAPREDUCE-2185: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Infinite loop at creating splits using CombineFileInputFormat - Key: MAPREDUCE-2185 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2185 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Hairong Kuang Assignee: Ramkumar Vadali Fix For: 0.23.0 Attachments: MAPREDUCE-2185.patch
This is caused by a missing block in HDFS, so the block's locations are empty. The following code adds the block to the blockToNodes map but not to the rackToBlocks map. Later on, when generating splits, only blocks in rackToBlocks are removed from the blockToNodes map. So the blockToNodes map can never become empty, which causes an infinite loop.
{code}
// add this block to the block --> node locations map
blockToNodes.put(oneblock, oneblock.hosts);

// add this block to the rack --> block map
for (int j = 0; j < oneblock.racks.length; j++) {
  ..
}
{code}
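The failure mode reported in MAPREDUCE-2185 can be reproduced in miniature with two plain maps. The sketch below is a simplified model, not Hadoop code and not necessarily what MAPREDUCE-2185.patch does: blocks are strings, the guard (filing location-less blocks under a placeholder rack) is one possible remedy, and the names SplitLoopSketch, leftoverBlocks and PLACEHOLDER_RACK are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the two maps from the description.
class SplitLoopSketch {
  static final String PLACEHOLDER_RACK = "/placeholder-rack"; // assumption

  /**
   * Registers each block in blockToNodes and, per rack, in rackToBlocks.
   * A block with no racks only reaches blockToNodes unless useGuard is true,
   * in which case it is filed under PLACEHOLDER_RACK.
   * Returns how many blocks remain in blockToNodes after all racks are drained.
   */
  static int leftoverBlocks(Map<String, String[]> blockRacks, boolean useGuard) {
    Map<String, String[]> blockToNodes = new HashMap<>(); // values unused here
    Map<String, List<String>> rackToBlocks = new HashMap<>();

    for (Map.Entry<String, String[]> e : blockRacks.entrySet()) {
      String[] racks = e.getValue();
      if (useGuard && racks.length == 0) {
        racks = new String[] { PLACEHOLDER_RACK };
      }
      blockToNodes.put(e.getKey(), racks);
      for (int j = 0; j < racks.length; j++) {
        rackToBlocks.computeIfAbsent(racks[j], r -> new ArrayList<>())
            .add(e.getKey());
      }
    }

    // Split generation only removes blocks that are reachable through a rack.
    for (List<String> blocks : rackToBlocks.values()) {
      for (String block : blocks) {
        blockToNodes.remove(block);
      }
    }
    return blockToNodes.size();
  }

  public static void main(String[] args) {
    Map<String, String[]> blocks = new HashMap<>();
    blocks.put("blk_1", new String[] { "/rack-a" });
    blocks.put("blk_2", new String[0]); // missing block: no locations
    System.out.println("without guard: " + leftoverBlocks(blocks, false));
    System.out.println("with guard:    " + leftoverBlocks(blocks, true));
  }
}
```

Any loop of the form "repeat until blockToNodes is empty" never terminates when the leftover count is non-zero, which is the hang described in the issue.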
[jira] [Commented] (MAPREDUCE-2581) Spelling errors in log messages (MapTask)
[ https://issues.apache.org/jira/browse/MAPREDUCE-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057275#comment-13057275 ] Hudson commented on MAPREDUCE-2581: --- Integrated in Hadoop-Mapreduce-trunk #722 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/722/]) Spelling errors in log messages (MapTask) - Key: MAPREDUCE-2581 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2581 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Dave Syer Assignee: Tim Sell Priority: Trivial Fix For: 0.23.0 Attachments: MAPREDUCE-2581.2.patch, MAPREDUCE-2581.3.patch, MAPREDUCE-2581.patch Spelling errors in log messages (MapTask) - e.g. search for recieve (should be receive). A decent IDE should detect these errors as well.
[jira] [Commented] (MAPREDUCE-2376) test-task-controller fails if run as a userid < 1000
[ https://issues.apache.org/jira/browse/MAPREDUCE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057493#comment-13057493 ] Wing Yew Poon commented on MAPREDUCE-2376: -- On CentOS 5.5 (and possibly other systems), non-system users have id >= 500, not 1000. The test should check that the user id is between UID_MIN and UID_MAX in /etc/login.defs of the system. We are being bitten by this issue on our Kerberos-secured cluster. test-task-controller fails if run as a userid < 1000 --- Key: MAPREDUCE-2376 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2376 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: mapreduce-2376-20.txt test-task-controller tries to verify that the task-controller won't run on behalf of users with uid < 1000. This makes the test fail when running in some test environments, e.g. our hudson jobs internally run as a system user with uid 101.
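The check suggested in the comment on MAPREDUCE-2376 (compare the uid with UID_MIN and UID_MAX from /etc/login.defs instead of a hard-coded 1000) could look roughly like this. The task-controller itself is native code, so this Java sketch only illustrates the parsing logic; the class name UidRangeSketch, the fallback values, and the sample lines are assumptions for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: derive the allowed uid range from login.defs-style lines.
class UidRangeSketch {
  /** Returns the numeric value for key, or fallback if the key is absent. */
  static int readSetting(List<String> lines, String key, int fallback) {
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue; // skip blank lines and comments
      }
      String[] parts = trimmed.split("\\s+");
      if (parts.length >= 2 && parts[0].equals(key)) {
        return Integer.parseInt(parts[1]);
      }
    }
    return fallback;
  }

  /** True if uid falls inside [UID_MIN, UID_MAX] as configured in lines. */
  static boolean isRegularUser(List<String> lines, int uid) {
    int min = readSetting(lines, "UID_MIN", 1000);  // fallback is an assumption
    int max = readSetting(lines, "UID_MAX", 60000); // fallback is an assumption
    return uid >= min && uid <= max;
  }

  public static void main(String[] args) {
    // Made-up sample resembling a system where regular users start at 500.
    List<String> sample = Arrays.asList("# sample", "UID_MIN   500", "UID_MAX 60000");
    System.out.println(isRegularUser(sample, 500));
    System.out.println(isRegularUser(sample, 101));
  }
}
```

In a real check the lines would come from reading /etc/login.defs on the host; they are passed in as a list here so the logic can be exercised without touching the file system.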
[jira] [Created] (MAPREDUCE-2631) Potential resource leaks in BinaryProtocol$TeeOutputStream.java and TaskLogServlet.java
Potential resource leaks in BinaryProtocol$TeeOutputStream.java and TaskLogServlet.java --- Key: MAPREDUCE-2631 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2631 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V
{code:title=TaskLogServlet.java|borderStyle=solid}
private void printTaskLog(HttpServletResponse response,
                          OutputStream out, TaskAttemptID taskId,
                          long start, long end, boolean plainText,
                          TaskLog.LogName filter, boolean isCleanup)
    throws IOException {
  if (!plainText) {
    out.write(("<br><b><u>" + filter + " logs</u></b><br>\n" +
               "<pre>\n").getBytes());
  }

  try {
    InputStream taskLogReader =
        new TaskLog.Reader(taskId, filter, start, end, isCleanup);
    byte[] b = new byte[65536];
    int result;
    while (true) {
      result = taskLogReader.read(b);
      if (result > 0) {
        if (plainText) {
          out.write(b, 0, result);
        } else {
          HtmlQuoting.quoteHtmlChars(out, b, 0, result);
        }
      } else {
        break;
      }
    }
    taskLogReader.close();
{code}
In the above code, if any exception is thrown while reading (taskLogReader.read(b)), taskLogReader will not be closed.
{code:title=BinaryProtocol$TeeOutputStream.java|borderStyle=solid}
public void close() throws IOException {
  flush();
  file.close();
  out.close();
}
{code}
In the above code, if file.close() throws an exception, out will not be closed.
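Both leaks described in MAPREDUCE-2631 come down to a close() call that is skipped when an earlier statement throws. A minimal sketch of the usual try/finally remedy is shown below; it uses only java.io types, is not taken from any attached patch, and the names CloseSketch, copyAndClose and closeBoth are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of closing resources on every path.
class CloseSketch {
  /** Copies in to out and closes in even if read() or write() throws. */
  static void copyAndClose(InputStream in, OutputStream out) throws IOException {
    try {
      byte[] b = new byte[65536];
      int result;
      while ((result = in.read(b)) > 0) {
        out.write(b, 0, result);
      }
    } finally {
      in.close();
    }
  }

  /** Closes first, then second; second is closed even if first.close() throws. */
  static void closeBoth(Closeable first, Closeable second) throws IOException {
    try {
      first.close();
    } finally {
      second.close();
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    copyAndClose(new ByteArrayInputStream("task log".getBytes()), out);
    System.out.println(out.toString());
    closeBoth(out, new ByteArrayOutputStream());
  }
}
```

The first method corresponds to the TaskLogServlet read loop and the second to the TeeOutputStream close() method; in both, the finally block guarantees that the remaining resource is released when the earlier call fails.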
[jira] [Created] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057626#comment-13057626 ] Harsh J commented on MAPREDUCE-2632: This went in as an Incompatible Change in 0.21.0. Has it since been reverted, so that it needs to go in again? I think MAPREDUCE-1287 made this change earlier. Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
[jira] [Updated] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-2632: --- Hadoop Flags: [Incompatible change] This would be an incompatible change, since API users may expect the partitioner logic to always be called regardless of the number of reducers, and with this change the partitioner may not even be initialized. Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057630#comment-13057630 ] Ravi Teja Ch N V commented on MAPREDUCE-2632: - The fix was missed in this case.
{code:title=MapFileOutputFormat.java|borderStyle=solid}
public static <K extends WritableComparable<?>, V extends Writable>
    Writable getEntry(MapFile.Reader[] readers, Partitioner<K, V> partitioner,
                      K key, V value) throws IOException {
  int part = partitioner.getPartition(key, value, readers.length);
  return readers[part].get(key, value);
}
{code}
Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
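The shortcut proposed in MAPREDUCE-2632 amounts to a guard in front of the getPartition call. The sketch below shows that guard in isolation, assuming nothing about how a patch would be written: SketchPartitioner is a hypothetical stand-in for the Hadoop Partitioner type so that the example compiles on its own, and SinglePartitionSketch and choosePartition are made-up names.

```java
// Hypothetical stand-in for the Hadoop Partitioner API, so the sketch
// compiles without Hadoop on the classpath.
interface SketchPartitioner<K, V> {
  int getPartition(K key, V value, int numPartitions);
}

class SinglePartitionSketch {
  /** Picks the partition index, skipping the partitioner when there is only one. */
  static <K, V> int choosePartition(SketchPartitioner<K, V> partitioner,
                                    K key, V value, int numPartitions) {
    if (numPartitions == 1) {
      return 0; // every key lands in the only partition
    }
    return partitioner.getPartition(key, value, numPartitions);
  }

  public static void main(String[] args) {
    // Simple hash-based partitioner used only for this demo.
    SketchPartitioner<String, String> hash =
        (k, v, n) -> (k.hashCode() & Integer.MAX_VALUE) % n;
    System.out.println(choosePartition(hash, "key", "value", 1));
    System.out.println(choosePartition(hash, "key", "value", 4));
  }
}
```

With a single partition the partitioner is never invoked, which saves the computation but also means that any side effects a user-supplied partitioner relies on no longer happen; that is the compatibility concern raised earlier in this thread.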
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057633#comment-13057633 ] Harsh J commented on MAPREDUCE-2632: Ravi, sorry for all the noise! I'm not sure I understand your comment above… Do you intend to apply fixes only to some of the remaining Partitioner implementations, or is this change going to be deeper than that (as I'd thought)? Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.
[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057634#comment-13057634 ] Ravi Teja Ch N V commented on MAPREDUCE-2632: - Yes. MAPREDUCE-1287 handled this for all the partitioner implementations, but in this case we can avoid the call to the partitioner itself. Avoid calling the partitioner when the numReduceTasks is 1. --- Key: MAPREDUCE-2632 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.23.0 Reporter: Ravi Teja Ch N V We can avoid the call to the partitioner when the number of reducers is 1. This avoids unnecessary computation by the partitioner.