[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task

2011-05-21 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-2492:
--

Attachment: MAPREDUCE-2492-v1.4.patch

Attaching a new patch that incorporates Chris's comments. test-patch and the 
modified test cases passed on my local box. The changes are:
1. Switched the test cases to JUnit 4 and moved the code from the instance 
initializer block into the {{@BeforeClass}} and {{@AfterClass}} methods.
2. Each test now has a parent directory under the test-root folder. This 
directory is recreated on startup and deleted on cleanup.
3. Incorporated MAPREDUCE-2523.
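
The directory handling in item 2 can be sketched roughly as follows. This is 
only an illustration, not code from the patch: the class and method names are 
hypothetical, and the JUnit 4 annotations are mentioned in comments only so 
that the sketch compiles with the JDK alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper: gives each test its own directory under a shared test root. */
final class TestDirLifecycle {
    private TestDirLifecycle() {}

    /** Would be called from a JUnit 4 @BeforeClass method: recreate the directory from scratch. */
    static Path recreate(Path testRoot, String testName) {
        Path dir = testRoot.resolve(testName);
        deleteRecursively(dir); // drop anything left over from an earlier run
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Would be called from a JUnit 4 @AfterClass method: remove the directory and its contents. */
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Recreating the directory on startup, rather than only deleting it on cleanup, 
keeps a run that was killed midway from leaking state into the next one.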

 [MAPREDUCE] The new MapReduce API should make available task's progress to 
 the task
 ---

 Key: MAPREDUCE-2492
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2492
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Amar Kamat
 Attachments: MAPREDUCE-2492-v1.3.patch, MAPREDUCE-2492-v1.4.patch


 There is no way to get the task's current progress in the new MapReduce API. 
 It would be nice to make it available so that the task (map/reduce) can use 
 it. 
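
 As an illustration of why a task might want its own progress, here is a 
 minimal sketch. The ProgressContext interface and the MilestoneReporter 
 class are hypothetical stand-ins and are not the Hadoop API; the sketch only 
 assumes that progress is exposed as a float between 0.0 and 1.0.

```java
/** Hypothetical stand-in for a task context that exposes progress; not the Hadoop API. */
interface ProgressContext {
    /** Fraction of the task completed so far, assumed to be in the range [0.0, 1.0]. */
    float getProgress();
}

/** Hypothetical task-side helper that reports each 25% milestone once. */
final class MilestoneReporter {
    private int lastMilestone = 0;

    /** Returns a message when a new 25% milestone has been crossed, otherwise null. */
    String check(ProgressContext context) {
        int milestone = (int) (context.getProgress() * 4); // 0..4
        if (milestone > lastMilestone) {
            lastMilestone = milestone;
            return "task is at least " + (milestone * 25) + "% complete";
        }
        return null;
    }
}
```

 A map or reduce task could call check() from its per-record method and log 
 the returned message whenever it is not null.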

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task

2011-05-21 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-2492:
--

Status: Patch Available  (was: Open)

Running through Hudson.



[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task

2011-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037298#comment-13037298
 ] 

Hadoop QA commented on MAPREDUCE-2492:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12479987/MAPREDUCE-2492-v1.4.patch
  against trunk revision 1125599.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.mapred.TestReporter

-1 contrib tests.  The patch failed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/288//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/288//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/288//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-2492) [MAPREDUCE] The new MapReduce API should make available task's progress to the task

2011-05-21 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037311#comment-13037311
 ] 

Amar Kamat commented on MAPREDUCE-2492:
---

I downloaded the latest patch and ran the failed test. The test case that 
failed on Hudson passed on my local box. The run log is at 
http://pastebin.com/qaKZ57TE. See lines 13, 37, and 53 for the progress 
values in map/reduce task cleanup.



[jira] [Commented] (MAPREDUCE-2459) Cache HAR filesystem metadata

2011-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037407#comment-13037407
 ] 

Hudson commented on MAPREDUCE-2459:
---

Integrated in Hadoop-Mapreduce-trunk #686 (See 
[https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
MAPREDUCE-2459. Cache HAR filesystem metadata. (Mac Yang via mahadev)

mahadev : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125428
Files : 
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/tools/org/apache/hadoop/fs/HarFileSystem.java


 Cache HAR filesystem metadata
 -

 Key: MAPREDUCE-2459
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2459
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: harchive
Reporter: Mac Yang
Assignee: Mac Yang
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2459.1.patch, MAPREDUCE-2459.2.patch


 Each HAR file system has two index files that contain information on how 
 files are stored in the part files. During the block location calculation, 
 these indexes are reread for every file in the archive. Caching the indexes 
 and the status of the part files will greatly reduce the number of name node 
 operations during the job setup time.
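
 The caching idea can be sketched as a load-once map keyed by archive path. 
 This is a minimal illustration under assumptions, not the HarFileSystem 
 change itself: ArchiveIndexCache and its loader function are hypothetical 
 names, and the loader stands in for whatever reads and parses the index 
 files.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache: parses an archive's index once and reuses it for later lookups. */
final class ArchiveIndexCache<I> {
    private final Map<String, I> cache = new ConcurrentHashMap<>();
    private final Function<String, I> loader;

    /** The loader is assumed to read and parse the index files for one archive. */
    ArchiveIndexCache(Function<String, I> loader) {
        this.loader = loader;
    }

    /** Returns the cached index for the archive, invoking the loader on first use only. */
    I get(String archivePath) {
        return cache.computeIfAbsent(archivePath, loader);
    }
}
```

 With such a cache, computing block locations for every file in one archive 
 would trigger a single index read instead of one read per file.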



[jira] [Commented] (MAPREDUCE-2470) Receiving NPE occasionally on RunningJob.getCounters() call

2011-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037406#comment-13037406
 ] 

Hudson commented on MAPREDUCE-2470:
---

Integrated in Hadoop-Mapreduce-trunk #686 (See 
[https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
Revert MAPREDUCE-2470
MAPREDUCE-2470. Fix NPE in RunningJobs::getCounters.
Contributed by Robert Joseph Evans

cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125599
Files : 
* /hadoop/mapreduce/trunk/CHANGES.txt
* 
/hadoop/mapreduce/trunk/src/test/mapred/org/apache/hadoop/mapred/TestNetworkedJob.java
* 
/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/protocol/ClientProtocol.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/RunningJob.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobClient.java

cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125578
Files : 
* /hadoop/mapreduce/trunk/CHANGES.txt
* 
/hadoop/mapreduce/trunk/src/test/mapred/org/apache/hadoop/mapred/TestNetworkedJob.java
* 
/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/protocol/ClientProtocol.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/RunningJob.java
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobClient.java


 Receiving NPE occasionally on RunningJob.getCounters() call
 ---

 Key: MAPREDUCE-2470
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2470
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 0.21.0
 Environment: FreeBSD, Java6, Hadoop r0.21.0
Reporter: Aaron Baff
Assignee: Robert Joseph Evans
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2470-v1.patch, MAPREDUCE-2470-v2.patch, 
 counters_null_data.pcap


 This is running in a Java daemon that is used as an interface (Thrift) to get 
 information and data from MR Jobs. Using JobClient.getJob(JobID) I 
 successfully get a RunningJob object (I'm checking for NULL), and then rarely 
 I get an NPE when I do RunningJob.getCounters(). This seems to occur after 
 the daemon has been up and running for a while, and in the event of an 
 Exception, I close the JobClient, set it to NULL, and a new one should then 
 be created on the next request for data. Yet, I still seem to be unable to 
 fetch the Counters. Below is the stack trace.
 java.lang.NullPointerException
 at org.apache.hadoop.mapred.Counters.downgrade(Counters.java:77)
 at 
 org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:381)
 at 
 com.telescope.HadoopThrift.service.ServiceImpl.getReportResults(ServiceImpl.java:350)
 at 
 com.telescope.HadoopThrift.gen.HadoopThrift$Processor$getReportResults.process(HadoopThrift.java:545)
 at 
 com.telescope.HadoopThrift.gen.HadoopThrift$Processor.process(HadoopThrift.java:421)
 at 
 org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:697)
 at 
 org.apache.thrift.server.THsHaServer$Invocation.run(THsHaServer.java:317)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
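
 The trace suggests that a null value reached the conversion step in 
 Counters.downgrade. A caller-side guard for that situation might look like 
 the sketch below. It is only an assumption about the failure mode, and 
 CounterGuard and its map-based types are hypothetical, not Hadoop classes.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

/** Hypothetical guard; the types here are illustrative and are not Hadoop classes. */
final class CounterGuard {
    private CounterGuard() {}

    /** Converts raw counters only when present, so a missing value never reaches the converter. */
    static Optional<Map<String, Long>> convert(Map<String, Long> rawCounters) {
        if (rawCounters == null) {
            return Optional.empty(); // counters were not available for this job
        }
        return Optional.of(Collections.unmodifiableMap(rawCounters));
    }
}
```

 Returning an empty Optional makes the "no counters" case explicit to the 
 caller instead of surfacing it as a NullPointerException.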



[jira] [Commented] (MAPREDUCE-2490) Log blacklist debug count

2011-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037408#comment-13037408
 ] 

Hudson commented on MAPREDUCE-2490:
---

Integrated in Hadoop-Mapreduce-trunk #686 (See 
[https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
MAPREDUCE-2490. Add logging to graylist and blacklist activity to aid
diagnosis of related issues. Contributed by Jonathan Eagles

cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125588
Files : 
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobTracker.java


 Log blacklist debug count
 -

 Key: MAPREDUCE-2490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2490
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.20.204.0, 0.22.0
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Trivial
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2490-branch-0.20-security-v2.patch, 
 MAPREDUCE-2490-branch-0.20-security.patch, MAPREDUCE-2490-trunk-v2.patch, 
 MAPREDUCE-2490-trunk.patch


 Gain some insight into blacklist increments/decrements by enhancing the debug 
 logging



[jira] [Commented] (MAPREDUCE-2514) ReinitTrackerAction class name misspelled RenitTrackerAction in task tracker log

2011-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037409#comment-13037409
 ] 

Hudson commented on MAPREDUCE-2514:
---

Integrated in Hadoop-Mapreduce-trunk #686 (See 
[https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/686/])
MAPREDUCE-2514. Fix typo in TaskTracker ReinitTrackerAction log message.
Contributed by Jonathan Eagles.

cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125585
Files : 
* /hadoop/mapreduce/trunk/CHANGES.txt
* /hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/TaskTracker.java


 ReinitTrackerAction class name misspelled RenitTrackerAction in task tracker 
 log
 

 Key: MAPREDUCE-2514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Trivial
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2514-branch-0.20-security.patch, 
 MAPREDUCE-2514-trunk.patch






[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

2011-05-21 Thread Harsh J Chouraria (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J Chouraria updated MAPREDUCE-2384:
-

Attachment: MAPREDUCE-2384.r1.diff

 Can MR make error response Immediately?
 ---

 Key: MAPREDUCE-2384
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2384
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: job submission
Affects Versions: 0.21.0
Reporter: Denny Ye
Assignee: Harsh J Chouraria
 Attachments: MAPREDUCE-2384.r1.diff


 While reading the MapReduce source code in Hadoop 0.21.0, I was sometimes 
 confused about when errors are reported. For example:
 1. JobSubmitter output check. MapReduce requires that a job's output 
 path must not already exist, to avoid overwriting it by mistake. In my 
 opinion, MR should verify the output at the point of client submission. 
 Instead, it first copies the related files to the specified target and only 
 then performs the verification. 
 2. JobTracker. After a job has been submitted to the JobTracker, the JT 
 first creates the JIT object, which is very large. Only then does the JT 
 verify job queue authority and memory requirements.
  
 Normally, client input is verified first and an error is returned 
 immediately if anything is wrong. The regular logic runs only once all the 
 inputs have passed.  
 This ordering in the code does not make sense to me. Is that only my 
 personal opinion? I hope someone can explain the details. 
 Thanks!



[jira] [Updated] (MAPREDUCE-2384) Can MR make error response Immediately?

2011-05-21 Thread Harsh J Chouraria (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J Chouraria updated MAPREDUCE-2384:
-

Fix Version/s: 0.23.0
 Release Note: Submitter should fail on errors early, before transferring 
files.
   Status: Patch Available  (was: Open)

As before, I do not think refactoring (2) is a good idea maintenance-wise. 
Here's a patch for just the reordering of (1). Some simple job submissions 
pass with the change. I believe existing test cases already cover it, but let 
me know if not.
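
The reordering of (1) amounts to validating before doing any expensive work. 
A minimal sketch of that ordering follows. It is not the patch itself: 
EarlySubmitter, its in-memory set of existing paths, and the recorded action 
strings are all hypothetical and stand in for the real file system and 
submission steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical submitter showing check-then-copy ordering; not the Hadoop JobSubmitter. */
final class EarlySubmitter {
    private final Set<String> existingPaths;
    private final List<String> actions = new ArrayList<>();

    /** existingPaths stands in for a file system lookup of paths that already exist. */
    EarlySubmitter(Set<String> existingPaths) {
        this.existingPaths = existingPaths;
    }

    /** Validates the output path first, so a bad job fails before any files are transferred. */
    void submit(String outputPath) {
        if (existingPaths.contains(outputPath)) {
            throw new IllegalStateException("Output path already exists: " + outputPath);
        }
        actions.add("copy job files"); // expensive step, reached only after validation
        actions.add("submit job");
    }

    /** Steps performed so far, in order. */
    List<String> actions() {
        return actions;
    }
}
```

With this ordering, a job whose output path already exists is rejected 
without any files having been copied.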



[jira] [Commented] (MAPREDUCE-2384) Can MR make error response Immediately?

2011-05-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037456#comment-13037456
 ] 

Hadoop QA commented on MAPREDUCE-2384:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12480002/MAPREDUCE-2384.r1.diff
  against trunk revision 1125599.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/289//console

This message is automatically generated.
