[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated MAPREDUCE-2492:
--
Attachment: MAPREDUCE-2492-v1.4.patch

Attaching a new patch incorporating the comments by Chris. test-patch and the modified test-cases passed on my local box. Changes are:
1. Using JUnit4 for the testcases and moved the code inside the instance initializer block into the {{@BeforeClass}} and {{@AfterClass}} methods.
2. Each test now has a parent directory under the test-root folder. This directory is recreated on startup and deleted in cleanup.
3. Incorporated MAPREDUCE-2523.

[MAPREDUCE] The new MapReduce API should make available task's progress to the task
---
Key: MAPREDUCE-2492
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: task
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Amar Kamat
Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch

There is no way to get the task's current progress in the new MapReduce API. It would be nice to make it available so that the task (map/reduce) can use it.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
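Change 2 above (a per-test directory that is recreated on startup and deleted in cleanup) could be sketched roughly as follows. This is not code from the patch: the class name `TestDirs` and its methods are hypothetical, and plain `java.nio.file` stands in for whatever file utilities the real tests use. The `main` method plays the role of the {{@BeforeClass}} and {{@AfterClass}} methods.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not taken from the patch: one directory per test under a shared root.
class TestDirs {

  /** Deletes dir and everything below it; does nothing if dir does not exist. */
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order puts children before their parents, so directories are empty when deleted.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  /** Startup step: remove any leftover directory for this test, then create it fresh. */
  static Path recreate(Path testRoot, String testName) throws IOException {
    Path dir = testRoot.resolve(testName);
    deleteRecursively(dir);
    return Files.createDirectories(dir);
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("test-root");
    Path dir = recreate(root, "TestExample");      // what a @BeforeClass method would do
    Files.createFile(dir.resolve("scratch.txt"));  // the test writes only inside its own directory
    deleteRecursively(dir);                        // what an @AfterClass method would do
    deleteRecursively(root);
  }
}
```

Recreating the directory on startup, rather than relying only on cleanup, keeps a test from seeing files left behind by an earlier run that crashed.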
[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated MAPREDUCE-2492:
--
Status: Patch Available (was: Open)

Running through Hudson.
[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037298#comment-13037298 ]

Hadoop QA commented on MAPREDUCE-2492:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479987/MAPREDUCE-2492-v1.4.patch
against trunk revision 1125599.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 15 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests: org.apache.hadoop.mapred.TestReporter
-1 contrib tests. The patch failed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/288//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/288//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/288//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task
[ https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037311#comment-13037311 ]

Amar Kamat commented on MAPREDUCE-2492:
---
I downloaded the latest patch and ran the failed test. The testcase that failed on Hudson passed on my local box. The run log is at http://pastebin.com/qaKZ57TE. See lines 13, 37 and 53 for the progress values in map/reduce task cleanup.
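To illustrate what this issue asks for (a task reading its own progress, for example during cleanup), here is a minimal self-contained model. Every type in it is a hypothetical stand-in: `TaskContext`, `RecordProgress` and `cleanup` are illustration names, not Hadoop classes, and deriving progress from records read is an assumption made only for this sketch.

```java
import java.util.Locale;

// Hypothetical stand-ins; none of these types are Hadoop classes.
class ProgressSketch {

  /** What the task sees: a read-only view of its own progress, between 0.0 and 1.0. */
  interface TaskContext {
    float getProgress();
  }

  /** Framework-side tracker that derives progress from the number of records consumed. */
  static final class RecordProgress implements TaskContext {
    private final long totalRecords;
    private long recordsRead;

    RecordProgress(long totalRecords) {
      this.totalRecords = totalRecords;
    }

    void recordRead() {
      recordsRead++;
    }

    @Override
    public float getProgress() {
      if (totalRecords <= 0) {
        return 0.0f; // no total to measure against
      }
      // Clamp so that over-counting never reports more than 100%.
      return Math.min(1.0f, (float) recordsRead / totalRecords);
    }
  }

  /** Task-side code: formats the progress value the task observes in its cleanup step. */
  static String cleanup(TaskContext context) {
    return String.format(Locale.ROOT, "progress at cleanup: %.2f", context.getProgress());
  }

  public static void main(String[] args) {
    RecordProgress progress = new RecordProgress(4);
    for (int i = 0; i < 4; i++) {
      progress.recordRead();
    }
    System.out.println(cleanup(progress));
  }
}
```

The task only receives the read-only `TaskContext` view, so it can observe progress without being able to change how the framework computes it.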
[jira] [Commented] (MAPREDUCE-2459) Cache HAR filesystem metadata
[ https://issues.apache.org/jira/browse/MAPREDUCE-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037407#comment-13037407 ]

Hudson commented on MAPREDUCE-2459:
---
Integrated in Hadoop-Mapreduce-trunk #686 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
MAPREDUCE-2459. Cache HAR filesystem metadata. (Mac Yang via mahadev)

mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125428
Files :
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/tools/org/apache/hadoop/fs/HarFileSystem.java

Cache HAR filesystem metadata
-
Key: MAPREDUCE-2459
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2459
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: harchive
Reporter: Mac Yang
Assignee: Mac Yang
Fix For: 0.23.0
Attachments: MAPREDUCE-2459.1.patch, MAPREDUCE-2459.2.patch

Each HAR file system has two index files that contain information on how files are stored in the part files. During the block location calculation, these indexes are reread for every file in the archive. Caching the indexes and the status of the part files will greatly reduce the number of name node operations during job setup.
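The caching idea described in this issue can be sketched as a small memoizing map. This is an illustration only and does not reproduce the code in HarFileSystem.java: `IndexCache`, its `loader` function and the archive path are hypothetical, and the loader stands in for the expensive read of an archive's index files.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; it does not reproduce HarFileSystem's actual code.
class IndexCache<V> {
  private final Map<String, V> cache = new ConcurrentHashMap<>();
  private final Function<String, V> loader;                // expensive read of an archive's index files
  private final AtomicInteger loads = new AtomicInteger(); // counts how often the loader really ran

  IndexCache(Function<String, V> loader) {
    this.loader = loader;
  }

  /** Returns the parsed index for an archive, invoking the loader only on the first request. */
  V get(String archivePath) {
    return cache.computeIfAbsent(archivePath, path -> {
      loads.incrementAndGet();
      return loader.apply(path);
    });
  }

  int loadCount() {
    return loads.get();
  }

  public static void main(String[] args) {
    IndexCache<String> cache = new IndexCache<>(path -> "index of " + path);
    for (int file = 0; file < 1000; file++) {
      cache.get("/archives/logs.har"); // one lookup per file in the archive
    }
    System.out.println("index reads: " + cache.loadCount()); // prints "index reads: 1"
  }
}
```

With the cache in place, the number of index reads depends on the number of archives rather than the number of files inside them, which is the reduction in name node operations the description is after.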
[jira] [Commented] (MAPREDUCE-2470) Receiving NPE occasionally on RunningJob.getCounters() call
[ https://issues.apache.org/jira/browse/MAPREDUCE-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037406#comment-13037406 ]

Hudson commented on MAPREDUCE-2470:
---
Integrated in Hadoop-Mapreduce-trunk #686 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
Revert MAPREDUCE-2470
MAPREDUCE-2470. Fix NPE in RunningJobs::getCounters. Contributed by Robert Joseph Evans

cdouglas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125599
Files :
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/test/mapred/org/apache/hadoop/mapred/TestNetworkedJob.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/protocol/ClientProtocol.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/RunningJob.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobClient.java

cdouglas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125578
Files :
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/test/mapred/org/apache/hadoop/mapred/TestNetworkedJob.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/protocol/ClientProtocol.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/RunningJob.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobClient.java

Receiving NPE occasionally on RunningJob.getCounters() call
---
Key: MAPREDUCE-2470
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2470
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 0.21.0
Environment: FreeBSD, Java6, Hadoop r0.21.0
Reporter: Aaron Baff
Assignee: Robert Joseph Evans
Fix For: 0.23.0
Attachments: MAPREDUCE-2470-v1.patch, MAPREDUCE-2470-v2.patch, counters_null_data.pcap

This is running in a Java daemon that is used as an interface (Thrift) to get information and data from MR Jobs. Using JobClient.getJob(JobID) I successfully get a RunningJob object (I'm checking for NULL), and then rarely I get an NPE when I do RunningJob.getCounters(). This seems to occur after the daemon has been up and running for a while, and in the event of an Exception, I close the JobClient, set it to NULL, and a new one should then be created on the next request for data. Yet, I still seem to be unable to fetch the Counters. Below is the stack trace.

java.lang.NullPointerException
    at org.apache.hadoop.mapred.Counters.downgrade(Counters.java:77)
    at org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:381)
    at com.telescope.HadoopThrift.service.ServiceImpl.getReportResults(ServiceImpl.java:350)
    at com.telescope.HadoopThrift.gen.HadoopThrift$Processor$getReportResults.process(HadoopThrift.java:545)
    at com.telescope.HadoopThrift.gen.HadoopThrift$Processor.process(HadoopThrift.java:421)
    at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:697)
    at org.apache.thrift.server.THsHaServer$Invocation.run(THsHaServer.java:317)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
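The stack trace above shows the exception being thrown inside Counters.downgrade, called from getCounters, which suggests that a missing counters object reached the conversion step. The sketch below shows one possible way to guard such a conversion and its caller. It is not the committed fix: `NewCounters`, `OldCounters`, `downgrade` and `readCounter` here are hypothetical stand-ins written for illustration, not the Hadoop classes of similar name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the Hadoop Counters classes or the committed fix.
class CountersSketch {

  /** Stand-in for the counters object that a remote call may fail to return. */
  static final class NewCounters {
    final Map<String, Long> values = new HashMap<>();
  }

  /** Stand-in for the older counters representation handed to client code. */
  static final class OldCounters {
    final Map<String, Long> values;

    OldCounters(Map<String, Long> values) {
      this.values = new HashMap<>(values);
    }
  }

  /** Guarded conversion: passes "no counters" through as null instead of dereferencing it. */
  static OldCounters downgrade(NewCounters newCounters) {
    if (newCounters == null) {
      return null;
    }
    return new OldCounters(newCounters.values);
  }

  /** Caller-side handling: treats missing counters as "not available" and returns the fallback. */
  static long readCounter(OldCounters counters, String name, long fallback) {
    if (counters == null) {
      return fallback;
    }
    return counters.values.getOrDefault(name, fallback);
  }

  public static void main(String[] args) {
    NewCounters fetched = new NewCounters();
    fetched.values.put("records", 7L);
    System.out.println(readCounter(downgrade(fetched), "records", -1L)); // counters present
    System.out.println(readCounter(downgrade(null), "records", -1L));    // counters missing
  }
}
```

Whether a null result or an exception with a clear message is the better contract is a design choice; the point of the sketch is only that the conversion should not dereference a value that may be absent.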
[jira] [Commented] (MAPREDUCE-2490) Log blacklist debug count
[ https://issues.apache.org/jira/browse/MAPREDUCE-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037408#comment-13037408 ]

Hudson commented on MAPREDUCE-2490:
---
Integrated in Hadoop-Mapreduce-trunk #686 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
MAPREDUCE-2490. Add logging to graylist and blacklist activity to aid diagnosis of related issues. Contributed by Jonathan Eagles

cdouglas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125588
Files :
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobTracker.java

Log blacklist debug count
-
Key: MAPREDUCE-2490
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2490
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: jobtracker
Affects Versions: 0.20.204.0, 0.22.0
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Trivial
Fix For: 0.20.205.0, 0.23.0
Attachments: MAPREDUCE-2490-branch-0.20-security-v2.patch, MAPREDUCE-2490-branch-0.20-security.patch, MAPREDUCE-2490-trunk-v2.patch, MAPREDUCE-2490-trunk.patch

Gain some insight into blacklist increments/decrements by enhancing the debug logging
[jira] [Commented] (MAPREDUCE-2514) ReinitTrackerAction class name misspelled RenitTrackerAction in task tracker log
[ https://issues.apache.org/jira/browse/MAPREDUCE-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037409#comment-13037409 ]

Hudson commented on MAPREDUCE-2514:
---
Integrated in Hadoop-Mapreduce-trunk #686 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
MAPREDUCE-2514. Fix typo in TaskTracker ReinitTrackerAction log message. Contributed by Jonathan Eagles.

cdouglas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125585
Files :
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/TaskTracker.java

ReinitTrackerAction class name misspelled RenitTrackerAction in task tracker log
Key: MAPREDUCE-2514
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2514
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Trivial
Fix For: 0.20.205.0, 0.23.0
Attachments: MAPREDUCE-2514-branch-0.20-security.patch, MAPREDUCE-2514-trunk.patch
[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2384:
-
Attachment: MAPREDUCE-2384.r1.diff

Can MR make error response Immediately?
---
Key: MAPREDUCE-2384
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: job submission
Affects Versions: 0.21.0
Reporter: Denny Ye
Assignee: Harsh J Chouraria
Attachments: MAPREDUCE-2384.r1.diff

While reading the MapReduce source code in Hadoop 0.21.0, I was sometimes confused by how errors are reported. For example:
1. JobSubmitter checking the output of each job. MapReduce requires that a job's output must not already exist, to avoid accidental overwrites. In my opinion, MR should verify the output at the point of client submission. Instead, it first copies the related files to the specified target and only then does the verification.
2. JobTracker. After a job has been submitted to the JobTracker, the JT first creates the JIT object, which is very large. Only in the next step does the JT verify job queue authority and memory requirements.

Normally, one would verify the client input first and respond immediately if anything is wrong, running the regular logic only after all inputs have passed. As written, this code is hard for me to understand. Is this only my personal opinion? I hope someone can help explain the details. Thanks!
[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated MAPREDUCE-2384:
-
Fix Version/s: 0.23.0
Release Note: Submitter should fail on errors early, before transferring files.
Status: Patch Available (was: Open)

As before, I do not think refactoring (2) is a good idea maintenance-wise. Here's a patch for just the reordering of (1). Some simple jobsubs pass with the change -- I believe existing test cases cover this change already; but let me know if not.
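The reordering of (1) described in this update, where the submitter checks the output before transferring any files, can be modeled in a few lines. This is a sketch under stated assumptions and not the contents of MAPREDUCE-2384.r1.diff: `SubmitOrderSketch`, `checkOutputSpec`, `copyJobFiles` and the in-memory `existingPaths` set are hypothetical names used only to show the ordering.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the submission order; not the actual JobSubmitter code or the patch.
class SubmitOrderSketch {
  final Set<String> existingPaths = new HashSet<>(); // stands in for the file system
  final List<String> actions = new ArrayList<>();    // records what the submitter actually did

  /** Cheap validation: fails if the job's output location already exists. */
  void checkOutputSpec(String outputPath) {
    actions.add("check output");
    if (existingPaths.contains(outputPath)) {
      throw new IllegalStateException("Output directory " + outputPath + " already exists");
    }
  }

  /** Expensive step: stands in for transferring the job's files. */
  void copyJobFiles() {
    actions.add("copy files");
  }

  /** Reordered submission: validate first, transfer files only if validation passed. */
  void submit(String outputPath) {
    checkOutputSpec(outputPath);
    copyJobFiles();
  }

  public static void main(String[] args) {
    SubmitOrderSketch submitter = new SubmitOrderSketch();
    submitter.existingPaths.add("/user/demo/out");
    try {
      submitter.submit("/user/demo/out");
    } catch (IllegalStateException e) {
      System.out.println("rejected before any copy: " + submitter.actions);
    }
  }
}
```

Because the check runs first, a job with an invalid output location is rejected without any files having been copied, which is the behavior the release note asks for.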
[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?
[ https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037456#comment-13037456 ]

Hadoop QA commented on MAPREDUCE-2384:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12480002/MAPREDUCE-2384.r1.diff
against trunk revision 1125599.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//console

This message is automatically generated.