[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-2407: Status: Open (was: Patch Available) Make Gridmix emulate usage of Distributed Cache files - Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.patch Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-2407: Attachment: 2407.v1.1.patch Attaching a new patch that addresses Amar's minor offline comments. Make Gridmix emulate usage of Distributed Cache files - Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.1.patch, 2407.v1.patch Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace.
[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-2407: Status: Patch Available (was: Open) Make Gridmix emulate usage of Distributed Cache files - Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.1.patch, 2407.v1.patch Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037792#comment-13037792 ] Amar Kamat commented on MAPREDUCE-2407: Patch looks good to me. +1 Make Gridmix emulate usage of Distributed Cache files - Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.1.patch, 2407.v1.patch Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace.
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2492: -- Attachment: MAPREDUCE-2492-v1.6.patch Updated the testcase such that reduce() is called multiple times. test-patch and the modified testcases pass on my local box. [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2492: -- Status: Patch Available (was: Open) Running the latest patch (with NLineInputFormat) through Hudson. [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037892#comment-13037892 ]

Hadoop QA commented on MAPREDUCE-2407:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480093/2407.v1.1.patch against trunk revision 1125599.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 10 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/291//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/291//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/291//console

This message is automatically generated.

Make Gridmix emulate usage of Distributed Cache files
Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.1.patch, 2407.v1.patch

Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace.
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2492: -- Status: Open (was: Patch Available) [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2492: Attachment: MAPREDUCE-2492-v1.7.patch Found an issue with stale leftover files in the shared directory. Modified the patch to use a unique folder for each test case. [MAPREDUCE] The new MapReduce API should make available task's progress to the task - Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it.
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2492: -- Status: Patch Available (was: Open) [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2470) Receiving NPE occasionally on RunningJob.getCounters() call
[ https://issues.apache.org/jira/browse/MAPREDUCE-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037940#comment-13037940 ]

Robert Joseph Evans commented on MAPREDUCE-2470:

I am very sorry about that. I ran the tests after V1, but not V2 of the patch. I will investigate the failures.

Receiving NPE occasionally on RunningJob.getCounters() call
Key: MAPREDUCE-2470 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2470 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.21.0 Environment: FreeBSD, Java6, Hadoop r0.21.0 Reporter: Aaron Baff Assignee: Robert Joseph Evans Fix For: 0.23.0 Attachments: MAPREDUCE-2470-v1.patch, MAPREDUCE-2470-v2.patch, counters_null_data.pcap

This is running in a Java daemon that is used as an interface (Thrift) to get information and data from MR Jobs. Using JobClient.getJob(JobID) I successfully get a RunningJob object (I'm checking for NULL), and then rarely I get an NPE when I do RunningJob.getCounters(). This seems to occur after the daemon has been up and running for a while, and in the event of an Exception, I close the JobClient, set it to NULL, and a new one should then be created on the next request for data. Yet, I still seem to be unable to fetch the Counters. Below is the stack trace.

java.lang.NullPointerException
    at org.apache.hadoop.mapred.Counters.downgrade(Counters.java:77)
    at org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:381)
    at com.telescope.HadoopThrift.service.ServiceImpl.getReportResults(ServiceImpl.java:350)
    at com.telescope.HadoopThrift.gen.HadoopThrift$Processor$getReportResults.process(HadoopThrift.java:545)
    at com.telescope.HadoopThrift.gen.HadoopThrift$Processor.process(HadoopThrift.java:421)
    at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:697)
    at org.apache.thrift.server.THsHaServer$Invocation.run(THsHaServer.java:317)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-2407: Resolution: Fixed Release Note: Makes Gridmix emulate HDFS based distributed cache files and local file system based distributed cache files. Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I just committed this to trunk. Make Gridmix emulate usage of Distributed Cache files - Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.1.patch, 2407.v1.patch Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2528) NullPointerException in the job tracker UI, when we perform kill or change the priority of jobs without selecting the any job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-2528:

Fix Version/s: 0.23.0
Status: Patch Available (was: Open)

The patch checks that at least one job is selected before killing the selected jobs or changing their priority.

NullPointerException in the job tracker UI, when we perform kill or change the priority of jobs without selecting the any job.
Key: MAPREDUCE-2528 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2528 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.23.0 Reporter: Devaraj K Assignee: Devaraj K Fix For: 0.23.0 Attachments: MAPREDUCE-2528.patch

Clicking the Kill Selected Jobs or Change button without selecting any job produces the exception below in the UI.

{code}
java.lang.NullPointerException
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter$RequestQuoter.getParameterValues(HttpServer.java:798)
    at org.apache.hadoop.mapred.JSPUtil.processButtons(JSPUtil.java:209)
    at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:146)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:871)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:741)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:213)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
{code}
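The kind of guard described for MAPREDUCE-2528 (do nothing unless at least one job is selected) might be sketched as follows. This is not the code from MAPREDUCE-2528.patch: the class `JobSelection`, the method `selectedJobs`, and the parameter name `jobCheckBox` are hypothetical, and the request is modelled as a plain map in which an unselected checkbox group simply has no entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the patch.
class JobSelection {
    // Returns the selected job IDs, or an empty list when the parameter
    // is absent (modelled here as a missing map entry) or empty.
    static List<String> selectedJobs(Map<String, String[]> params, String name) {
        String[] values = params.get(name);
        if (values == null || values.length == 0) {
            return Collections.emptyList();
        }
        return Arrays.asList(values);
    }

    public static void main(String[] args) {
        Map<String, String[]> params = new HashMap<>();
        // No job checked: the caller sees an empty list instead of null.
        System.out.println(selectedJobs(params, "jobCheckBox").isEmpty());
        params.put("jobCheckBox", new String[] {"job_1", "job_2"});
        System.out.println(selectedJobs(params, "jobCheckBox"));
    }
}
```

Returning an empty list rather than null lets the kill and change-priority code paths iterate unconditionally, which is one common way to avoid this class of NullPointerException.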
[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files
[ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037978#comment-13037978 ]

Hudson commented on MAPREDUCE-2407:

Integrated in Hadoop-Mapreduce-trunk-Commit #695 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/695/])
MAPREDUCE-2407. Make GridMix emulate usage of distributed cache files in simulated jobs.

ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1126499
Files :
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/PseudoLocalFs.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/DistributedCacheEmulator.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestPseudoLocalFs.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateData.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/JobCreator.java
* /hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/DebugJobProducer.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestDistCacheEmulation.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestGridmixSubmission.java

Make Gridmix emulate usage of Distributed Cache files
Key: MAPREDUCE-2407 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.23.0 Attachments: 2407.patch, 2407.v1.1.patch, 2407.v1.patch

Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace.
[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038004#comment-13038004 ]

Hadoop QA commented on MAPREDUCE-2492:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480106/MAPREDUCE-2492-v1.7.patch against trunk revision 1125599.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 15 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/293//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/293//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/293//console

This message is automatically generated.

[MAPREDUCE] The new MapReduce API should make available task's progress to the task
Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch

There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it.
[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038008#comment-13038008 ]

jirapos...@reviews.apache.org commented on MAPREDUCE-2489:

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/776/

Review request for hadoop-mapreduce.

Summary
-------
We saw an issue where a custom InputSplit was returning invalid hostnames (non-repeating) for the splits that were then causing the JobTracker to attempt to excessively resolve host names. This caused a major slowdown for the JobTracker. We should prevent invalid InputSplit hostnames from affecting everyone else. I propose we implement some verification for the hostnames to try to ensure that we only do DNS lookups on valid hostnames (and fail otherwise). We could also fail the job after a certain number of failures in the resolve.

NOTE: This requires the changes in HADOOP-7314

This addresses bug MAPREDUCE-2489. https://issues.apache.org/jira/browse/MAPREDUCE-2489

Diffs
-----
trunk/ivy.xml 1125074
trunk/ivy/libraries.properties 1125074
trunk/src/contrib/mumak/src/java/org/apache/hadoop/mapred/SimulatorJobTracker.java 1125074
trunk/src/java/org/apache/hadoop/mapred/JobInProgress.java 1125074
trunk/src/java/org/apache/hadoop/mapred/JobTracker.java 1125074

Diff: https://reviews.apache.org/r/776/diff

Testing
-------

Thanks,
Jeffrey

Jobsplits with random hostnames can make the queue unusable
Key: MAPREDUCE-2489 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Jeffrey Naisbitt Assignee: Jeffrey Naisbitt Attachments: MAPREDUCE-2489-mapred.patch

We saw an issue where a custom InputSplit was returning invalid hostnames for the splits that were then causing the JobTracker to attempt to excessively resolve host names. This caused a major slowdown for the JobTracker. We should prevent invalid InputSplit hostnames from affecting everyone else. I propose we implement some verification for the hostnames to try to ensure that we only do DNS lookups on valid hostnames (and fail otherwise). We could also fail the job after a certain number of failures in the resolve.
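The hostname verification proposed in MAPREDUCE-2489 could take the form of a purely syntactic check that runs before any DNS lookup. The sketch below is an assumption about what such a check might look like, not the contents of MAPREDUCE-2489-mapred.patch; the class `HostnameCheck` and method `looksValid` are hypothetical, and the rules follow the usual letter/digit/hyphen label syntax with a 63-character label limit and a 253-character overall limit.

```java
import java.util.regex.Pattern;

// Hypothetical pre-check for illustration; not taken from the patch.
class HostnameCheck {
    // One label: 1 to 63 characters, letters, digits or hyphens,
    // neither starting nor ending with a hyphen.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    // Returns true only if the name is syntactically plausible, so that
    // a caller can skip the DNS lookup for obviously invalid names.
    static boolean looksValid(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // The -1 limit keeps empty trailing labels, so "a..b" and "a."
        // are rejected (this sketch does not accept a trailing dot).
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksValid("node-17.example.com"));
        System.out.println(looksValid("not a host!"));
    }
}
```

A check like this only filters malformed names; a well-formed but nonexistent hostname would still reach the resolver, which is where the proposed per-job failure limit would apply.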
[jira] [Updated] (MAPREDUCE-2470) Receiving NPE occasionally on RunningJob.getCounters() call
[ https://issues.apache.org/jira/browse/MAPREDUCE-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-2470:

Attachment: MAPREDUCE-2470-v3.patch

Fixed issues with fault injection build.

Receiving NPE occasionally on RunningJob.getCounters() call
Key: MAPREDUCE-2470 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2470 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.21.0 Environment: FreeBSD, Java6, Hadoop r0.21.0 Reporter: Aaron Baff Assignee: Robert Joseph Evans Fix For: 0.23.0 Attachments: MAPREDUCE-2470-v1.patch, MAPREDUCE-2470-v2.patch, MAPREDUCE-2470-v3.patch, counters_null_data.pcap

This is running in a Java daemon that is used as an interface (Thrift) to get information and data from MR Jobs. Using JobClient.getJob(JobID) I successfully get a RunningJob object (I'm checking for NULL), and then rarely I get an NPE when I do RunningJob.getCounters(). This seems to occur after the daemon has been up and running for a while, and in the event of an Exception, I close the JobClient, set it to NULL, and a new one should then be created on the next request for data. Yet, I still seem to be unable to fetch the Counters. Below is the stack trace.

java.lang.NullPointerException
    at org.apache.hadoop.mapred.Counters.downgrade(Counters.java:77)
    at org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:381)
    at com.telescope.HadoopThrift.service.ServiceImpl.getReportResults(ServiceImpl.java:350)
    at com.telescope.HadoopThrift.gen.HadoopThrift$Processor$getReportResults.process(HadoopThrift.java:545)
    at com.telescope.HadoopThrift.gen.HadoopThrift$Processor.process(HadoopThrift.java:421)
    at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:697)
    at org.apache.thrift.server.THsHaServer$Invocation.run(THsHaServer.java:317)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-2492: -- Resolution: Fixed Fix Version/s: 0.23.0 Release Note: Map and Reduce task can access the attempt's overall progress via TaskAttemptContext. Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I just committed this. [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.23.0 Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-2511) Progress reported by map tasks of a map-only job is incorrect
[ https://issues.apache.org/jira/browse/MAPREDUCE-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat resolved MAPREDUCE-2511. Resolution: Duplicate Fixed as part of MAPREDUCE-2492. Progress reported by map tasks of a map-only job is incorrect - Key: MAPREDUCE-2511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2511 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat For a map task of a map-reduce job, the progress bar is (logically) split into 2 distinct phases: 1. Map Phase 2. Sort Phase. The map phase manages 66% of the overall task's progress while the sort phase governs the rest, i.e. 33%. For a map task of a map-only job, there is no sort phase. Hence the entire map phase should govern 100% of the task's progress. Currently, the progress bar is divided 66%-33% irrespective of whether the job has reducers or not (i.e. whether there is a sort phase or not).
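The progress rule described in MAPREDUCE-2511 can be written out as a small calculation. This is a sketch of the intended behaviour only, not Hadoop's implementation; the class `MapProgress`, the method `overall`, and the constants are hypothetical, with the weights taken from the 66%/33% split quoted in the issue.

```java
// Hypothetical illustration of the rule in MAPREDUCE-2511; not Hadoop code.
class MapProgress {
    // Assumed weights, taken from the 66%/33% split in the issue text.
    static final float MAP_WEIGHT = 0.66f;
    static final float SORT_WEIGHT = 0.33f;

    // mapPhase and sortPhase are per-phase fractions in [0, 1].
    static float overall(float mapPhase, float sortPhase, boolean hasReducers) {
        if (!hasReducers) {
            // Map-only job: there is no sort phase, so the map phase
            // alone accounts for the whole task.
            return mapPhase;
        }
        return MAP_WEIGHT * mapPhase + SORT_WEIGHT * sortPhase;
    }

    public static void main(String[] args) {
        // Map phase finished, sort not started, job has reducers.
        System.out.println(overall(1.0f, 0.0f, true));
        // Same map phase state in a map-only job.
        System.out.println(overall(1.0f, 0.0f, false));
    }
}
```

With the fixed split, a map-only task whose map phase is complete would report roughly two thirds progress; branching on the presence of reducers lets it report full progress instead.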
[jira] [Resolved] (MAPREDUCE-2523) TestTaskContext should cleanup its temporary files/folders on completion
[ https://issues.apache.org/jira/browse/MAPREDUCE-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat resolved MAPREDUCE-2523. --- Resolution: Duplicate Fixed as part of MAPREDUCE-2492. TestTaskContext should cleanup its temporary files/folders on completion Key: MAPREDUCE-2523 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2523 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Amar Kamat Assignee: Amar Kamat Labels: test Fix For: 0.23.0 TestTaskContext creates in and out folders in the current working directory. Ideally these files should go under test.build.data or /tmp. Also the testcase should delete these files on completion. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-2519) Progress reported by a reduce task executed via LocalJobRunner is incorrect
[ https://issues.apache.org/jira/browse/MAPREDUCE-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat resolved MAPREDUCE-2519. Resolution: Duplicate Fixed as part of MAPREDUCE-2492. Progress reported by a reduce task executed via LocalJobRunner is incorrect - Key: MAPREDUCE-2519 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2519 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Labels: localjobrunner, progress, reduce ReduceTask splits its progress reporting into 3 phases: 1. Copy 2. Shuffle 3. Reduce. When the reduce task is run using a LocalJobRunner, the copy phase is ignored (skipped) but the progress is not updated. This results in a mismatch in the Reduce task's progress.
[jira] [Commented] (MAPREDUCE-2528) NullPointerException in the job tracker UI, when we perform kill or change the priority of jobs without selecting the any job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038068#comment-13038068 ]

Hadoop QA commented on MAPREDUCE-2528:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480113/MAPREDUCE-2528.patch against trunk revision 1126499.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/294//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/294//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/294//console

This message is automatically generated.

NullPointerException in the job tracker UI, when we perform kill or change the priority of jobs without selecting the any job.
Key: MAPREDUCE-2528 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2528 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.23.0 Reporter: Devaraj K Assignee: Devaraj K Fix For: 0.23.0 Attachments: MAPREDUCE-2528.patch

Clicking the Kill Selected Jobs or Change button without selecting any job produces the exception below in the UI.

{code}
java.lang.NullPointerException
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter$RequestQuoter.getParameterValues(HttpServer.java:798)
    at org.apache.hadoop.mapred.JSPUtil.processButtons(JSPUtil.java:209)
    at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:146)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:871)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:741)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:213)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
{code}
[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038084#comment-13038084 ] Tom White commented on MAPREDUCE-2492: -- This is an incompatible change, since it adds a method to the public @Stable mapred.Reporter interface. Is it possible to rework this to only change the new API, as the title suggests? This would then be a compatible change since the classes that have been changed in the new API are private or @Evolving. Also, it looks like this wasn't reviewed before being committed. [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.23.0 Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2186) DistributedRaidFileSystem should implement getFileBlockLocations()
[ https://issues.apache.org/jira/browse/MAPREDUCE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038086#comment-13038086 ] Ramkumar Vadali commented on MAPREDUCE-2186: The main motivation to open this jira was to allow CombineFileInputFormat to work when there are missing blocks. CombineFileInputFormat figures out the host/rack information for input blocks and uses that information to create input splits. It does not handle the case where a block does not have any host/rack information. The proposed fix, returning the location of parity blocks when source blocks are missing, is not good because it fixes the problem in the wrong place. It also causes us to get false locality. Instead of changing RAID FS to handle this case, it is better to fix CombineFileInputFormat to handle the case when there are missing blocks (MAPREDUCE-2185). DistributedRaidFileSystem should implement getFileBlockLocations() -- Key: MAPREDUCE-2186 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2186 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Ramkumar Vadali Assignee: Ramkumar Vadali If a RAIDed file has missing blocks, DistributedRaidFileSystem.getFileBlockLocations() would return no block locations. This could lead a client to believe that the file is not readable. But if parity data is available, the file actually is readable. It would be better to implement getFileBlockLocations() and return the location of the parity blocks that would be needed to reconstruct the missing block. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-2186) DistributedRaidFileSystem should implement getFileBlockLocations()
[ https://issues.apache.org/jira/browse/MAPREDUCE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali resolved MAPREDUCE-2186. Resolution: Won't Fix Better to fix MAPREDUCE-2185 DistributedRaidFileSystem should implement getFileBlockLocations() -- Key: MAPREDUCE-2186 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2186 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Ramkumar Vadali Assignee: Ramkumar Vadali If a RAIDed file has missing blocks, DistributedRaidFileSystem.getFileBlockLocations() would return no block locations. This could lead a client to believe that the file is not readable. But if parity data is available, the file actually is readable. It would be better to implement getFileBlockLocations() and return the location of the parity blocks that would be needed to reconstruct the missing block. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-2498) TestRaidShellFsck failing on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali reassigned MAPREDUCE-2498: -- Assignee: Ramkumar Vadali (was: Todd Lipcon) TestRaidShellFsck failing on trunk -- Key: MAPREDUCE-2498 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2498 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Ramkumar Vadali Priority: Critical Fix For: 0.23.0 Attachments: mapreduce-2498.txt TestRaidShellFsck.testFileBlockAndParityBlockMissingHar2 has been failing the last several builds: Error Message: parity file not HARed after 40s java.io.IOException: parity file not HARed after 40s at org.apache.hadoop.raid.TestRaidShellFsck.raidTestFiles(TestRaidShellFsck.java:281) at org.apache.hadoop.raid.TestRaidShellFsck.setUp(TestRaidShellFsck.java:181) at org.apache.hadoop.raid.TestRaidShellFsck.testFileBlockAndParityBlockMissingHar2(TestRaidShellFsck.java:666) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038134#comment-13038134 ] Tom White commented on MAPREDUCE-2492: -- I think it makes sense to have Reporter provide the task's progress to the map/reduce task attempts. I would prefer marking this change as incompatible. It's certainly an improvement, but does it warrant an incompatible change to an interface marked as stable? There may be an argument that it does, but I would have expected to see some discussion about this before it was committed. Why not only change the new API and tell users that they need to use that one if they want to use this feature? [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.23.0 Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2470) Receiving NPE occasionally on RunningJob.getCounters() call
[ https://issues.apache.org/jira/browse/MAPREDUCE-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038143#comment-13038143 ] Hadoop QA commented on MAPREDUCE-2470: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480124/MAPREDUCE-2470-v3.patch against trunk revision 1126499. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/295//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/295//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/295//console This message is automatically generated. Receiving NPE occasionally on RunningJob.getCounters() call --- Key: MAPREDUCE-2470 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2470 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.21.0 Environment: FreeBSD, Java6, Hadoop r0.21.0 Reporter: Aaron Baff Assignee: Robert Joseph Evans Fix For: 0.23.0 Attachments: MAPREDUCE-2470-v1.patch, MAPREDUCE-2470-v2.patch, MAPREDUCE-2470-v3.patch, counters_null_data.pcap This is running in a Java daemon that is used as an interface (Thrift) to get information and data from MR Jobs. 
Using JobClient.getJob(JobID) I successfully get a RunningJob object (I'm checking for NULL), and then rarely I get an NPE when I do RunningJob.getCounters(). This seems to occur after the daemon has been up and running for a while, and in the event of an Exception, I close the JobClient, set it to NULL, and a new one should then be created on the next request for data. Yet, I still seem to be unable to fetch the Counters. Below is the stack trace. java.lang.NullPointerException at org.apache.hadoop.mapred.Counters.downgrade(Counters.java:77) at org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:381) at com.telescope.HadoopThrift.service.ServiceImpl.getReportResults(ServiceImpl.java:350) at com.telescope.HadoopThrift.gen.HadoopThrift$Processor$getReportResults.process(HadoopThrift.java:545) at com.telescope.HadoopThrift.gen.HadoopThrift$Processor.process(HadoopThrift.java:421) at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:697) at org.apache.thrift.server.THsHaServer$Invocation.run(THsHaServer.java:317) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2495) The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-2495: --- Status: Open (was: Patch Available) Chris indicated as a side comment in a different conversation that the sleeps in the tests are not very good, so I am reworking the tests to avoid using sleep. The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason - Key: MAPREDUCE-2495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2495 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distributed-cache Affects Versions: 0.21.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Minor Attachments: MAPREDUCE-2495-20.20X-V1.patch, MAPREDUCE-2495-20.20X-V2.patch, MAPREDUCE-2495-20.20X-V3.patch, MAPREDUCE-2495-v1.patch, MAPREDUCE-2495-v2.patch, MAPREDUCE-2495-v3.patch The cleanup thread in the distributed cache handles IOExceptions and the like correctly, but just to be a bit more defensive it would be good to monitor the thread, and check that it is still alive regularly, so that the distributed cache does not fill up the entire disk on the node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2495) The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-2495: --- Status: Patch Available (was: Open) The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason - Key: MAPREDUCE-2495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2495 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distributed-cache Affects Versions: 0.21.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Minor Attachments: MAPREDUCE-2495-20.20X-V1.patch, MAPREDUCE-2495-20.20X-V2.patch, MAPREDUCE-2495-20.20X-V3.patch, MAPREDUCE-2495-20.20X-V4.patch, MAPREDUCE-2495-v1.patch, MAPREDUCE-2495-v2.patch, MAPREDUCE-2495-v3.patch, MAPREDUCE-2495-v4.patch The cleanup thread in the distributed cache handles IOExceptions and the like correctly, but just to be a bit more defensive it would be good to monitor the thread, and check that it is still alive regularly, so that the distributed cache does not fill up the entire disk on the node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2495) The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-2495: --- Attachment: MAPREDUCE-2495-v4.patch MAPREDUCE-2495-20.20X-V4.patch Tests no longer sleep The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason - Key: MAPREDUCE-2495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2495 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distributed-cache Affects Versions: 0.21.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Minor Attachments: MAPREDUCE-2495-20.20X-V1.patch, MAPREDUCE-2495-20.20X-V2.patch, MAPREDUCE-2495-20.20X-V3.patch, MAPREDUCE-2495-20.20X-V4.patch, MAPREDUCE-2495-v1.patch, MAPREDUCE-2495-v2.patch, MAPREDUCE-2495-v3.patch, MAPREDUCE-2495-v4.patch The cleanup thread in the distributed cache handles IOExceptions and the like correctly, but just to be a bit more defensive it would be good to monitor the thread, and check that it is still alive regularly, so that the distributed cache does not fill up the entire disk on the node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
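As a rough illustration of the two ideas in this thread (checking that the cleanup thread is still alive, and testing that check without `sleep`), here is a minimal sketch. `CleanupThreadMonitor`, `isCleanupAlive` and `runScenario` are hypothetical names, not code from the attached patches; the sketch assumes a `CountDownLatch` is an acceptable way to control when the worker exits.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical illustration only; not taken from the MAPREDUCE-2495 patches. */
class CleanupThreadMonitor {
    private final Thread cleanupThread;

    CleanupThreadMonitor(Thread cleanupThread) {
        this.cleanupThread = cleanupThread;
    }

    /** A periodic health check could call this and raise an alarm on false. */
    boolean isCleanupAlive() {
        return cleanupThread.isAlive();
    }

    /** Returns {aliveWhileRunning, aliveAfterExit} without using sleep. */
    static boolean[] runScenario() {
        final CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await(); // stand-in for the cleanup loop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "cleanup-sketch");
        CleanupThreadMonitor monitor = new CleanupThreadMonitor(worker);

        worker.start();
        boolean aliveWhileRunning = monitor.isCleanupAlive();

        release.countDown(); // let the worker finish deterministically
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean aliveAfterExit = monitor.isCleanupAlive();
        return new boolean[] { aliveWhileRunning, aliveAfterExit };
    }

    public static void main(String[] args) {
        boolean[] result = runScenario();
        System.out.println("alive while running: " + result[0]);
        System.out.println("alive after exit: " + result[1]);
    }
}
```

The latch makes the worker's lifetime explicit, so the test does not depend on timing.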
[jira] [Updated] (MAPREDUCE-2524) Backport trunk heuristics for failing maps when we get fetch failures retrieving map output during shuffle
[ https://issues.apache.org/jira/browse/MAPREDUCE-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-2524: - Attachment: MAPREDUCE2524-patch-20security.txt patch for the branch-0.20-security. Backport trunk heuristics for failing maps when we get fetch failures retrieving map output during shuffle -- Key: MAPREDUCE-2524 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2524 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.20.204.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Minor Fix For: 0.20.205.0 Attachments: MAPREDUCE2524-patch-20security.txt The heuristics for failing maps when we get map output fetch failures during the shuffle is pretty conservative in 20. Backport the heuristics from trunk which are more aggressive, simpler, and configurable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2524) Backport trunk heuristics for failing maps when we get fetch failures retrieving map output during shuffle
[ https://issues.apache.org/jira/browse/MAPREDUCE-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038174#comment-13038174 ] Thomas Graves commented on MAPREDUCE-2524: -- The javadoc and eclipse failures existed before/without these changes. -1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories. Backport trunk heuristics for failing maps when we get fetch failures retrieving map output during shuffle -- Key: MAPREDUCE-2524 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2524 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.20.204.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Minor Fix For: 0.20.205.0 Attachments: MAPREDUCE2524-patch-20security.txt The heuristics for failing maps when we get map output fetch failures during the shuffle is pretty conservative in 20. Backport the heuristics from trunk which are more aggressive, simpler, and configurable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2185) Infinite loop at creating splits using CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali updated MAPREDUCE-2185: --- Attachment: MAPREDUCE-2185.patch For blocks that do not have hosts associated with them, use NetworkTopology.DEFAULT_RACK as the rack location. This avoids the infinite loop later on in getMoreSplits(). Infinite loop at creating splits using CombineFileInputFormat - Key: MAPREDUCE-2185 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2185 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Hairong Kuang Assignee: Hairong Kuang Attachments: MAPREDUCE-2185.patch This is caused by a missing block in HDFS. So the block's locations are empty. The following code adds the block to blockToNodes map but not to rackToBlocks map. Later on when generating splits, only blocks in rackToBlocks are removed from blockToNodes map. So blockToNodes map can never become empty, therefore causing an infinite loop. {code} // add this block to the block --> node locations map blockToNodes.put(oneblock, oneblock.hosts); // add this block to the rack --> block map for (int j = 0; j < oneblock.racks.length; j++) { .. } {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
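The failure mode and the proposed fix can be sketched in isolation. The following is a simplified model, not the CombineFileInputFormat code: `MissingBlockSketch` and `leftoverBlocks` are hypothetical names, blocks are plain strings, and the constant stands in for `NetworkTopology.DEFAULT_RACK`. A block with no racks never enters the rack map, so draining by rack leaves it behind; a loop that waits for the pending set to empty would then never terminate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the split-generation bookkeeping. */
class MissingBlockSketch {
    // Stand-in for NetworkTopology.DEFAULT_RACK.
    static final String DEFAULT_RACK = "/default-rack";

    /**
     * Builds a rack-to-blocks map and removes every block reachable through
     * some rack. Returns the blocks that could never be removed.
     */
    static Set<String> leftoverBlocks(Map<String, String[]> racksOfBlock,
                                      boolean useDefaultRack) {
        Set<String> pending = new HashSet<>(racksOfBlock.keySet());
        Map<String, List<String>> rackToBlocks = new HashMap<>();

        for (Map.Entry<String, String[]> e : racksOfBlock.entrySet()) {
            String[] racks = e.getValue();
            if (racks.length == 0 && useDefaultRack) {
                // The fix: give location-less blocks a fallback rack.
                racks = new String[] { DEFAULT_RACK };
            }
            for (String rack : racks) {
                rackToBlocks.computeIfAbsent(rack, r -> new ArrayList<>())
                            .add(e.getKey());
            }
        }
        for (List<String> blocks : rackToBlocks.values()) {
            pending.removeAll(blocks);
        }
        return pending;
    }

    public static void main(String[] args) {
        Map<String, String[]> blocks = new HashMap<>();
        blocks.put("b1", new String[] { "/rack1" });
        blocks.put("b2", new String[0]); // missing block: no locations
        System.out.println("without fallback: " + leftoverBlocks(blocks, false));
        System.out.println("with fallback: " + leftoverBlocks(blocks, true));
    }
}
```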
[jira] [Updated] (MAPREDUCE-2185) Infinite loop at creating splits using CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali updated MAPREDUCE-2185: --- Assignee: Ramkumar Vadali (was: Hairong Kuang) Status: Patch Available (was: Open) Infinite loop at creating splits using CombineFileInputFormat - Key: MAPREDUCE-2185 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2185 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Hairong Kuang Assignee: Ramkumar Vadali Attachments: MAPREDUCE-2185.patch This is caused by a missing block in HDFS. So the block's locations are empty. The following code adds the block to blockToNodes map but not to rackToBlocks map. Later on when generating splits, only blocks in rackToBlocks are removed from blockToNodes map. So blockToNodes map can never become empty, therefore causing an infinite loop. {code} // add this block to the block --> node locations map blockToNodes.put(oneblock, oneblock.hosts); // add this block to the rack --> block map for (int j = 0; j < oneblock.racks.length; j++) { .. } {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2494) Make the distributed cache delete entires using LRU priority
[ https://issues.apache.org/jira/browse/MAPREDUCE-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-2494: --- Status: Open (was: Patch Available) In a different conversation with Chris he mentioned that sleeps in the tests are bad, and that if they have to be there then they should be tied together with some constant values. I am reworking the tests to deal with constant values. Make the distributed cache delete entires using LRU priority Key: MAPREDUCE-2494 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2494 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distributed-cache Affects Versions: 0.21.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MAPREDUCE-2494-V1.patch Currently the distributed cache will wait until a cache directory is above a preconfigured threshold. At which point it will delete all entries that are not currently being used. It seems like we would get far fewer cache misses if we kept some of them around, even when they are not being used. We should add in a configurable percentage for a goal of how much of the cache should remain clear when not in use, and select objects to delete based off of how recently they were used, and possibly also how large they are/how difficult is it to download them again. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2494) Make the distributed cache delete entires using LRU priority
[ https://issues.apache.org/jira/browse/MAPREDUCE-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-2494: --- Status: Patch Available (was: Open) Make the distributed cache delete entires using LRU priority Key: MAPREDUCE-2494 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2494 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distributed-cache Affects Versions: 0.21.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MAPREDUCE-2494-V1.patch, MAPREDUCE-2494-V2.patch Currently the distributed cache will wait until a cache directory is above a preconfigured threshold. At which point it will delete all entries that are not currently being used. It seems like we would get far fewer cache misses if we kept some of them around, even when they are not being used. We should add in a configurable percentage for a goal of how much of the cache should remain clear when not in use, and select objects to delete based off of how recently they were used, and possibly also how large they are/how difficult is it to download them again. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2494) Make the distributed cache delete entires using LRU priority
[ https://issues.apache.org/jira/browse/MAPREDUCE-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-2494: --- Attachment: MAPREDUCE-2494-V2.patch Make the distributed cache delete entires using LRU priority Key: MAPREDUCE-2494 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2494 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distributed-cache Affects Versions: 0.21.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MAPREDUCE-2494-V1.patch, MAPREDUCE-2494-V2.patch Currently the distributed cache will wait until a cache directory is above a preconfigured threshold. At which point it will delete all entries that are not currently being used. It seems like we would get far fewer cache misses if we kept some of them around, even when they are not being used. We should add in a configurable percentage for a goal of how much of the cache should remain clear when not in use, and select objects to delete based off of how recently they were used, and possibly also how large they are/how difficult is it to download them again. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
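The selection policy proposed in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic of the attached patches: `LruEvictionSketch`, `CacheEntry` and `selectForDeletion` are hypothetical names, the "keep clear" goal is modeled as an integer percentage of capacity, and only recency is used for ordering (entry size and download cost, which the description mentions as possible factors, are ignored).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; not taken from the MAPREDUCE-2494 patches. */
class LruEvictionSketch {
    static final class CacheEntry {
        final String name;
        final long sizeBytes;
        final long lastUsed;   // larger value means more recently used
        final boolean inUse;   // entries in use are never deleted

        CacheEntry(String name, long sizeBytes, long lastUsed, boolean inUse) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.lastUsed = lastUsed;
            this.inUse = inUse;
        }
    }

    /**
     * Picks idle entries, least recently used first, until usage drops to
     * the target implied by keepClearPercent (assumed to be 0..100).
     */
    static List<String> selectForDeletion(List<CacheEntry> entries,
                                          long capacityBytes,
                                          int keepClearPercent) {
        long used = 0;
        List<CacheEntry> idle = new ArrayList<>();
        for (CacheEntry e : entries) {
            used += e.sizeBytes;
            if (!e.inUse) {
                idle.add(e);
            }
        }
        idle.sort(Comparator.comparingLong((CacheEntry e) -> e.lastUsed));

        long target = capacityBytes * (100 - keepClearPercent) / 100;
        List<String> toDelete = new ArrayList<>();
        for (CacheEntry e : idle) {
            if (used <= target) {
                break; // enough space is clear; keep the remaining entries
            }
            toDelete.add(e.name);
            used -= e.sizeBytes;
        }
        return toDelete;
    }
}
```

Compared with deleting every unused entry once the threshold is crossed, this keeps the most recently used idle entries around, which is the source of the expected reduction in cache misses.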
[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038214#comment-13038214 ] Chris Douglas commented on MAPREDUCE-2492: -- This is partially my fault. I only had time for a quick pass over the patch, and had missed the mapred API change among the updates to {{mapreduce.lib}} classes. I agree with Tom. It's a useful feature, but changing only the new API is probably the better course. Any breakage is unlikely- it's adding, not removing a method from a framework type almost never implemented by users- but I'd lean away from any modifications to the old APIs unless they affect correctness. [MAPREDUCE] The new MapReduce API should make available task's progress to the task --- Key: MAPREDUCE-2492 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.23.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.23.0 Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch, MAPREDUCE-2492-v1.5.patch, MAPREDUCE-2492-v1.6.patch, MAPREDUCE-2492-v1.7.patch There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2521) Mapreduce RPM integration project
[ https://issues.apache.org/jira/browse/MAPREDUCE-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated MAPREDUCE-2521: - Attachment: MAPREDUCE-2521-4.patch Change configuration directory from $PREFIX/conf to $PREFIX/etc/hadoop per Owen's recommendation. For RPM/deb, it will use /etc/hadoop as default, and create symlink for $PREFIX/etc/hadoop point to /etc/hadoop. Mapreduce RPM integration project - Key: MAPREDUCE-2521 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2521 Project: Hadoop Map/Reduce Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: MAPREDUCE-2521-1.patch, MAPREDUCE-2521-2.patch, MAPREDUCE-2521-3.patch, MAPREDUCE-2521-4.patch, MAPREDUCE-2521.patch This jira is corresponding to HADOOP-6255 and associated directory layout change. The patch for creating Mapreduce rpm packaging should be posted here for patch test build to verify against mapreduce svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2185) Infinite loop at creating splits using CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038301#comment-13038301 ] Hadoop QA commented on MAPREDUCE-2185: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480153/MAPREDUCE-2185.patch against trunk revision 1126591. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/297//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/297//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/297//console This message is automatically generated. Infinite loop at creating splits using CombineFileInputFormat - Key: MAPREDUCE-2185 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2185 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Reporter: Hairong Kuang Assignee: Ramkumar Vadali Attachments: MAPREDUCE-2185.patch This is caused by a missing block in HDFS. So the block's locations are empty. The following code adds the block to blockToNodes map but not to rackToBlocks map. Later on when generating splits, only blocks in rackToBlocks are removed from blockToNodes map. 
So the blockToNodes map can never become empty, which causes an infinite loop.
{code}
// add this block to the block --> node locations map
blockToNodes.put(oneblock, oneblock.hosts);
// add this block to the rack --> block map
for (int j = 0; j < oneblock.racks.length; j++) {
  ..
}
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
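The failure mode reported in MAPREDUCE-2185 can be reproduced in a small standalone model. The sketch below is not the CombineFileInputFormat source: the class `SplitLoopSketch`, its methods, and the placeholder rack name are all hypothetical, and the guard shown is only one possible mitigation, not necessarily what the attached patch does. It shows that a block with no racks is added to blockToNodes but never to rackToBlocks, so a rack-driven drain stops making progress.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the two maps named in the report.
class SplitLoopSketch {
    final Map<String, String[]> blockToNodes = new HashMap<>();
    final Map<String, List<String>> rackToBlocks = new HashMap<>();

    // Registers a block the way the quoted snippet does: always in
    // blockToNodes, and in rackToBlocks once per rack it lives on.
    void addBlock(String block, String[] hosts, String[] racks) {
        blockToNodes.put(block, hosts);
        for (int j = 0; j < racks.length; j++) {
            rackToBlocks.computeIfAbsent(racks[j], r -> new ArrayList<>()).add(block);
        }
    }

    // Assumed guard (illustration only): a block without any rack is
    // filed under a placeholder rack so the drain can still reach it.
    void addBlockGuarded(String block, String[] hosts, String[] racks) {
        String[] effective = racks.length == 0 ? new String[] {"/unknown-rack"} : racks;
        addBlock(block, hosts, effective);
    }

    // One rack-driven pass: only blocks reachable through rackToBlocks
    // are removed from blockToNodes. Returns the number removed.
    int drainOnePass() {
        int removed = 0;
        for (List<String> blocks : rackToBlocks.values()) {
            for (String b : blocks) {
                if (blockToNodes.containsKey(b)) {
                    blockToNodes.remove(b);
                    removed++;
                }
            }
        }
        return removed;
    }

    // Returns true if draining would never finish, i.e. a pass removed
    // nothing while blockToNodes still holds blocks.
    boolean wouldLoopForever() {
        while (!blockToNodes.isEmpty()) {
            if (drainOnePass() == 0) {
                return true; // no progress: the real loop would spin here
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SplitLoopSketch s = new SplitLoopSketch();
        s.addBlock("b1", new String[] {"h1"}, new String[] {"/r1"});
        s.addBlock("missing", new String[0], new String[0]);
        System.out.println("stuck without guard: " + s.wouldLoopForever());
    }
}
```

The progress check in `wouldLoopForever` stands in for the unbounded `while` loop so that the sketch terminates; with `addBlockGuarded` the same input drains completely.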
[jira] [Commented] (MAPREDUCE-2494) Make the distributed cache delete entires using LRU priority
[ https://issues.apache.org/jira/browse/MAPREDUCE-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038309#comment-13038309 ]

Hadoop QA commented on MAPREDUCE-2494:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480159/MAPREDUCE-2494-V2.patch against trunk revision 1126591.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/298//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/298//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/298//console

This message is automatically generated.

Make the distributed cache delete entires using LRU priority

Key: MAPREDUCE-2494
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2494
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: distributed-cache
Affects Versions: 0.21.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: MAPREDUCE-2494-V1.patch, MAPREDUCE-2494-V2.patch

Currently the distributed cache will wait until a cache directory is above a preconfigured threshold, at which point it will delete all entries that are not currently being used.
It seems like we would get far fewer cache misses if we kept some of them around, even when they are not being used. We should add a configurable percentage as a goal for how much of the cache should remain clear when not in use, and select objects to delete based on how recently they were used, and possibly also on how large they are or how difficult it is to download them again. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
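The selection policy proposed in MAPREDUCE-2494 can be sketched roughly as follows. This is an illustration only, not the code in the attached patches: the class `LruCacheSketch`, the `Entry` fields, and the parameter `goalFreeFraction` are hypothetical names. It deletes only entries that are not in use, least recently used first, and stops as soon as the configured fraction of the cache is clear. Size and download cost are not weighed here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of cache entries and an LRU deletion pass.
class LruCacheSketch {
    static final class Entry {
        final String name;
        final long sizeBytes;
        final long lastUsedMillis;
        final boolean inUse;

        Entry(String name, long sizeBytes, long lastUsedMillis, boolean inUse) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.lastUsedMillis = lastUsedMillis;
            this.inUse = inUse;
        }
    }

    // Returns the names of entries to delete, oldest first, so that the
    // bytes left in the cache do not exceed capacity * (1 - goalFreeFraction).
    // Entries that are in use are never selected, so the goal may be missed.
    static List<String> selectForDeletion(List<Entry> entries,
                                          long capacityBytes,
                                          double goalFreeFraction) {
        long used = 0;
        List<Entry> candidates = new ArrayList<>();
        for (Entry e : entries) {
            used += e.sizeBytes;
            if (!e.inUse) {
                candidates.add(e);
            }
        }
        // Least recently used entries come first.
        candidates.sort(Comparator.comparingLong((Entry e) -> e.lastUsedMillis));

        long target = (long) (capacityBytes * (1.0 - goalFreeFraction));
        List<String> victims = new ArrayList<>();
        for (Entry e : candidates) {
            if (used <= target) {
                break; // enough of the cache is clear, keep the rest
            }
            victims.add(e.name);
            used -= e.sizeBytes;
        }
        return victims;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("a", 40, 1L, false));
        entries.add(new Entry("b", 30, 2L, false));
        entries.add(new Entry("c", 30, 3L, true));
        entries.add(new Entry("d", 10, 4L, false));
        System.out.println(selectForDeletion(entries, 100, 0.25));
    }
}
```

Compared with deleting every unused entry once the threshold is crossed, this keeps the most recently used entries around, which is the source of the expected reduction in cache misses.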
[jira] [Commented] (MAPREDUCE-2485) reinitialize CLASSPATH variable when executing Mapper/Reducer code
[ https://issues.apache.org/jira/browse/MAPREDUCE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038312#comment-13038312 ]

Tom Melendez commented on MAPREDUCE-2485:

OK, so false alarm, it does work if I specify mapred.child.env. I did so like this in my config file and I'm good:

<property>
  <name>mapred.child.env</name>
  <value>CLASSPATH=/usr/lib/hadoop/:/usr/lib/hadoop/lib/:/usr/lib/hadoop/conf:/usr/lib/hadoop/hadoop-0.20.2-cdh3u0-core.jar:/usr/share/java/commons-logging.jar</value>
</property>

I previously had quotes around classpath var and that barfed.

reinitialize CLASSPATH variable when executing Mapper/Reducer code

Key: MAPREDUCE-2485
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2485
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: pipes
Affects Versions: 0.20.2
Environment: Ubuntu 10.04 LTS
Reporter: Tom Melendez

We're using pipes, and using libhdfs inside our mapper and reducer code. We've determined that we need to execute a putenv call in order for libhdfs to actually have access to the CLASSPATH. Ideally, it should just use the CLASSPATH we set when the job was executed.

For some more context, see these threads:
https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/ae9808d80fb132fb?tvc=2
http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25830

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-2485) reinitialize CLASSPATH variable when executing Mapper/Reducer code
[ https://issues.apache.org/jira/browse/MAPREDUCE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom Melendez resolved MAPREDUCE-2485.

Resolution: Invalid

False alarm, works OK.

reinitialize CLASSPATH variable when executing Mapper/Reducer code

Key: MAPREDUCE-2485
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2485
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: pipes
Affects Versions: 0.20.2
Environment: Ubuntu 10.04 LTS
Reporter: Tom Melendez

We're using pipes, and using libhdfs inside our mapper and reducer code. We've determined that we need to execute a putenv call in order for libhdfs to actually have access to the CLASSPATH. Ideally, it should just use the CLASSPATH we set when the job was executed.

For some more context, see these threads:
https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/ae9808d80fb132fb?tvc=2
http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25830

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2521) Mapreduce RPM integration project
[ https://issues.apache.org/jira/browse/MAPREDUCE-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038350#comment-13038350 ]

Hadoop QA commented on MAPREDUCE-2521:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480179/MAPREDUCE-2521-4.patch against trunk revision 1126801.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 16 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
-1 release audit. The applied patch generated 4 release audit warnings (more than the trunk's current 2 warnings).
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/299//testReport/
Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/299//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/299//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/299//console

This message is automatically generated.
Mapreduce RPM integration project

Key: MAPREDUCE-2521
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2521
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang
Attachments: MAPREDUCE-2521-1.patch, MAPREDUCE-2521-2.patch, MAPREDUCE-2521-3.patch, MAPREDUCE-2521-4.patch, MAPREDUCE-2521.patch

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating the Mapreduce rpm packaging should be posted here so that the patch test build can verify it against mapreduce svn trunk.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira