[jira] [Updated] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2012-12-05 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated MAPREDUCE-4843:


Attachment: MAPREDUCE-4843-branch-1.1.patch

Update patch.

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch


 In our cluster, jobs sometimes fail with the exception below:
 2012-12-03 23:11:54,811 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201212031626_1115_r_23_0:
 org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/$username/jobcache/job_201212031626_1115/job.xml in any of the configured local directories
   at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:424)
   at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
   at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1175)
   at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1058)
   at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2213)
 The root cause is that JobLocalizer is not thread safe.
 The DefaultTaskController.initializeJob method creates the localizer like this:
  JobLocalizer localizer = new JobLocalizer((JobConf)getConf(), user, jobid);
 but JobLocalizer simply keeps a reference to that conf.
 When two TaskLauncher threads (mapLauncher and reduceLauncher) call 
 initializeJob at the same time, there are two JobLocalizer instances but only 
 one conf instance.
 So ttConf.setStrings(JOB_LOCAL_CTXT, localDirs) sometimes overwrites the value 
 that was set for the previous job.
 As a result, the previous job's job.xml is stored in another user's directory.
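
The shared-conf problem described above can be illustrated with a minimal, Hadoop-free sketch. SketchConf and SketchLocalizer are hypothetical stand-ins for JobConf and JobLocalizer, and the key name is illustrative only. Copying the conf per localizer is one possible remedy under these assumptions; the attached patch may take a different approach.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobConf: a mutable key/value bag with a copy constructor.
class SketchConf {
    private final Map<String, String> values = new HashMap<>();

    SketchConf() {}

    // Copy constructor: the new object gets its own map, so later writes do not leak.
    SketchConf(SketchConf other) {
        this.values.putAll(other.values);
    }

    void set(String key, String value) { values.put(key, value); }

    String get(String key) { return values.get(key); }
}

// Hypothetical stand-in for JobLocalizer.
class SketchLocalizer {
    static final String JOB_LOCAL_CTXT = "job.local.ctxt"; // illustrative key name
    private final SketchConf conf;

    // copyConf=false mimics the reported behaviour (keep the caller's reference);
    // copyConf=true gives every localizer a private copy of the conf.
    SketchLocalizer(SketchConf shared, String user, boolean copyConf) {
        this.conf = copyConf ? new SketchConf(shared) : shared;
        this.conf.set(JOB_LOCAL_CTXT, "taskTracker/" + user);
    }

    String localDirs() { return conf.get(JOB_LOCAL_CTXT); }
}
```

With copyConf=false, constructing a second localizer for another user changes what the first localizer reads back, which mirrors the job.xml ending up under the wrong user's directory. The sketch is sequential on purpose: the overwrite does not need real threads to be visible, only two localizers sharing one conf.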

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2012-12-05 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated MAPREDUCE-4843:


Status: Patch Available  (was: Open)

Testing patch

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch




[jira] [Created] (MAPREDUCE-4849) TaskSelector not used in FairScheduler

2012-12-05 Thread Vincent Behar (JIRA)
Vincent Behar created MAPREDUCE-4849:


 Summary: TaskSelector not used in FairScheduler
 Key: MAPREDUCE-4849
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4849
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 1.1.1, 1.0.4
Reporter: Vincent Behar


The documentation (http://hadoop.apache.org/docs/r1.0.4/fair_scheduler.html) 
describes the mapred.fairscheduler.taskselector parameter as an extension 
point, but while the FairScheduler does instantiate the custom TaskSelector 
provided this way, it does not call any of its methods (obtainNewMapTask, 
obtainNewReduceTask, neededSpeculativeMaps or neededSpeculativeReduces).

We should either update the FairScheduler to use the TaskSelector when 
scheduling a task, or completely remove the TaskSelector and update the 
documentation.



[jira] [Updated] (MAPREDUCE-4732) testcase testJobRetire fails using IBM JAVA

2012-12-05 Thread Amir Sanjar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amir Sanjar updated MAPREDUCE-4732:
---

Summary: testcase testJobRetire fails using IBM JAVA   (was: testcase 
testJobRetire fails using IBM JAVA 7)

 testcase testJobRetire fails using IBM JAVA 
 

 Key: MAPREDUCE-4732
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4732
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 1.0.3
 Environment: RHEL 6.2 with IBM JAVA 7 on a x86_64 system
Reporter: Amir Sanjar

 Testcase: testJobRetire took 53.352 sec
 Testcase: testJobRetireWithUnreportedTasks took 41.173 sec
   FAILED
 Job did not retire
 junit.framework.AssertionFailedError: Job did not retire
   at org.apache.hadoop.mapred.TestJobRetire.waitTillRetire(TestJobRetire.java:130)
   at org.apache.hadoop.mapred.TestJobRetire.testJobRetireWithUnreportedTasks(TestJobRetire.java:229)
 Testcase: testJobRemoval took 1.073 sec



[jira] [Commented] (MAPREDUCE-4732) testcase testJobRetire fails using IBM JAVA

2012-12-05 Thread Amir Sanjar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510486#comment-13510486
 ] 

Amir Sanjar commented on MAPREDUCE-4732:


Was able to reproduce on IBM JAVA 6. Updating the summary.

 testcase testJobRetire fails using IBM JAVA 
 

 Key: MAPREDUCE-4732
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4732
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 1.0.3
 Environment: RHEL 6.2 with IBM JAVA 7 on a x86_64 system
Reporter: Amir Sanjar



[jira] [Updated] (MAPREDUCE-4842) Shuffle race can hang reducer

2012-12-05 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4842:
-

Status: Open  (was: Patch Available)

 Shuffle race can hang reducer
 -

 Key: MAPREDUCE-4842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.5, 2.0.2-alpha
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch


 Saw an instance where the shuffle caused multiple reducers in a job to hang.  
 It looked similar to the problem described in MAPREDUCE-3721, where the 
 fetchers were all being told to WAIT by the MergeManager but no merge was 
 taking place.



[jira] [Updated] (MAPREDUCE-4842) Shuffle race can hang reducer

2012-12-05 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4842:
-

Attachment: MAPREDUCE-4842.patch

Jason, nice unit test! Thanks!

I've modified it a little to have 2 barriers (mergeStart and mergeComplete) 
rather than reusing the same barrier 4 times (that confused me a lot when I 
was reviewing it).

Other than that, it looks great. +1

Also, if you don't mind, I'll assign the jira to you - since you've done all 
the heavy lifting and deserve way more credit than I do. Thanks again!
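
For readers unfamiliar with the two-barrier pattern mentioned in this comment, here is a minimal sketch using java.util.concurrent.CyclicBarrier. It is not the unit test from the patch: the class name and the AtomicBoolean standing in for the merge work are assumptions made for illustration, and only the barrier names mergeStart and mergeComplete come from the comment.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicBoolean;

class TwoBarrierSketch {
    // Returns true if the calling thread sees the merge as finished
    // once it has passed the mergeComplete barrier.
    static boolean run() {
        CyclicBarrier mergeStart = new CyclicBarrier(2);    // rendezvous before the merge
        CyclicBarrier mergeComplete = new CyclicBarrier(2); // rendezvous after the merge
        AtomicBoolean merged = new AtomicBoolean(false);

        Thread merger = new Thread(() -> {
            try {
                mergeStart.await();     // wait until the test thread is ready
                merged.set(true);       // stand-in for the real merge work
                mergeComplete.await();  // signal that the merge has finished
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        });
        merger.start();

        try {
            mergeStart.await();
            // Between the two barriers a test could inspect in-progress state.
            mergeComplete.await();
            boolean observed = merged.get();
            merger.join();
            return observed;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Giving each rendezvous point its own named barrier makes it obvious which phase a thread is waiting on, which is the readability gain the comment refers to.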

 Shuffle race can hang reducer
 -

 Key: MAPREDUCE-4842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch, 
 MAPREDUCE-4842.patch




[jira] [Assigned] (MAPREDUCE-4842) Shuffle race can hang reducer

2012-12-05 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reassigned MAPREDUCE-4842:


Assignee: Jason Lowe  (was: Arun C Murthy)

 Shuffle race can hang reducer
 -

 Key: MAPREDUCE-4842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch, 
 MAPREDUCE-4842.patch




[jira] [Created] (MAPREDUCE-4850) Job recovery may fail if staging directory has been deleted

2012-12-05 Thread Tom White (JIRA)
Tom White created MAPREDUCE-4850:


 Summary: Job recovery may fail if staging directory has been 
deleted
 Key: MAPREDUCE-4850
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4850
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Tom White
Assignee: Tom White


The job staging directory is deleted in the job cleanup task, which happens 
before the job-info file is deleted from the system directory (by the 
JobInProgress garbageCollect() method). If the JT shuts down between these two 
operations, then when the JT restarts and tries to recover the job, it fails 
since the job.xml and splits are no longer available.
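
The failure window described above can be modelled with a small sketch. This is a toy model, not JobTracker code: the class and method names are hypothetical, and it assumes recovery is attempted only when the job-info file still exists and fails only when the staging directory is already gone.

```java
class CleanupOrderSketch {
    // Simulates a shutdown after `stepsCompleted` of the two cleanup steps.
    // stagingFirst=true deletes the staging directory before the job-info file;
    // stagingFirst=false uses the opposite order.
    // Returns true if recovery on restart would be attempted and fail.
    static boolean recoveryFailsAfterCrash(boolean stagingFirst, int stepsCompleted) {
        boolean stagingPresent = true;
        boolean jobInfoPresent = true;
        for (int step = 0; step < stepsCompleted; step++) {
            // Step 0 deletes staging when stagingFirst is set, otherwise job-info;
            // step 1 deletes whichever one is left.
            boolean deleteStaging = (step == 0) == stagingFirst;
            if (deleteStaging) {
                stagingPresent = false;
            } else {
                jobInfoPresent = false;
            }
        }
        // Recovery is triggered by the job-info file but needs job.xml and
        // the splits from the staging directory.
        return jobInfoPresent && !stagingPresent;
    }
}
```

In this model the only failing state is "staging deleted, job-info still present", which can be reached only when the staging directory is deleted first.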



[jira] [Updated] (MAPREDUCE-4850) Job recovery may fail if staging directory has been deleted

2012-12-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4850:
-

Attachment: MAPREDUCE-4850.patch

A patch that deletes the staging directory after the system directory.

Manual testing showed that with this patch I couldn't get a recovery failure in 
the scenario in the description. It would be nice to add a unit test, but I'm 
still trying to figure out how to write one for this.


 Job recovery may fail if staging directory has been deleted
 ---

 Key: MAPREDUCE-4850
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4850
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-4850.patch




[jira] [Commented] (MAPREDUCE-4842) Shuffle race can hang reducer

2012-12-05 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510664#comment-13510664
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4842:
---

One minor nit: the scope of the exceptionReporter instance variable has been 
changed from private to protected for testing purposes. It should be package 
private instead. Preferably, we should instead add a package-private getter 
method (it could be annotated with Guava's @VisibleForTesting annotation). 
Other than that, looks good to me.
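
A sketch of the suggested shape, assuming nothing about the real class beyond the field name exceptionReporter: SketchMergeManager and SketchReporter are hypothetical names, and the locally declared annotation is only a stand-in so the example compiles without Guava (the real one is com.google.common.annotations.VisibleForTesting).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for Guava's @VisibleForTesting, declared here only to keep
// the sketch free of external dependencies.
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.METHOD, ElementType.FIELD})
@interface VisibleForTesting {}

// Hypothetical reporter type; the real interface is not shown in this thread.
interface SketchReporter {
    void reportException(Throwable t);
}

// Hypothetical owner of the exceptionReporter field.
class SketchMergeManager {
    // The field itself stays private.
    private final SketchReporter exceptionReporter;

    SketchMergeManager(SketchReporter exceptionReporter) {
        this.exceptionReporter = exceptionReporter;
    }

    // No access modifier: package-private, so tests in the same package can
    // read the reporter while subclasses in other packages cannot.
    @VisibleForTesting
    SketchReporter getExceptionReporter() {
        return exceptionReporter;
    }
}
```

Compared with a protected field, this keeps the state encapsulated and limits the extra visibility to the package where the tests live.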

 Shuffle race can hang reducer
 -

 Key: MAPREDUCE-4842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch, 
 MAPREDUCE-4842.patch




[jira] [Updated] (MAPREDUCE-4842) Shuffle race can hang reducer

2012-12-05 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4842:
--

Attachment: MAPREDUCE-4842.patch

Thanks for the reviews, Alejandro and Arun.  I updated the patch to address 
Alejandro's comment and also added a comment clarifying why the merge callback 
occurs outside of the lock and after inProgress is cleared per a side 
discussion with Arun.

 Shuffle race can hang reducer
 -

 Key: MAPREDUCE-4842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch, 
 MAPREDUCE-4842.patch, MAPREDUCE-4842.patch




[jira] [Commented] (MAPREDUCE-4696) TestMRServerPorts throws NullReferenceException

2012-12-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510832#comment-13510832
 ] 

Siddharth Seth commented on MAPREDUCE-4696:
---

+1. Simple enough patch. Will commit this shortly.

 TestMRServerPorts throws NullReferenceException
 ---

 Key: MAPREDUCE-4696
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4696
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: mapreduce-4696-2.patch, mapreduce-4696.patch


 TestMRServerPorts throws 
 {code}
 java.lang.NullPointerException
   at org.apache.hadoop.mapred.TestMRServerPorts.canStartJobTracker(TestMRServerPorts.java:99)
   at org.apache.hadoop.mapred.TestMRServerPorts.testJobTrackerPorts(TestMRServerPorts.java:152)
 {code}



[jira] [Commented] (MAPREDUCE-4697) TestMapredHeartbeat fails assertion on HeartbeatInterval

2012-12-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510833#comment-13510833
 ] 

Siddharth Seth commented on MAPREDUCE-4697:
---

+1. Will commit shortly.

 TestMapredHeartbeat fails assertion on HeartbeatInterval
 

 Key: MAPREDUCE-4697
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4697
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: mapreduce-4697.patch


 TestMapredHeartbeat fails test on heart beat interval
 {code}
 FAILED
 expected:<300> but was:<500>
 junit.framework.AssertionFailedError: expected:<300> but was:<500>
   at org.apache.hadoop.mapred.TestMapredHeartbeat.testJobDirCleanup(TestMapredHeartbeat.java:68)
 {code}



[jira] [Updated] (MAPREDUCE-4699) TestFairScheduler TestCapacityScheduler fails due to JobHistory exception

2012-12-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4699:
--

Attachment: MAPREDUCE4699.txt

The current patch looks good for the CapacityScheduler test. Updating the patch 
with similar changes for TestFairScheduler - and committing.

 TestFairScheduler  TestCapacityScheduler fails due to JobHistory exception
 ---

 Key: MAPREDUCE-4699
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4699
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: mapreduce-4699.patch, MAPREDUCE4699.txt


 TestFairScheduler fails due to exception from mapred.JobHistory
 {code}
 null
 java.lang.NullPointerException
   at org.apache.hadoop.mapred.JobHistory$JobInfo.logJobPriority(JobHistory.java:1975)
   at org.apache.hadoop.mapred.JobInProgress.setPriority(JobInProgress.java:895)
   at org.apache.hadoop.mapred.TestFairScheduler.testFifoPool(TestFairScheduler.java:2617)
 {code}
 TestCapacityScheduler fails due to
 {code}
 java.lang.NullPointerException
   at org.apache.hadoop.mapred.JobHistory$JobInfo.logJobPriority(JobHistory.java:1976)
   at org.apache.hadoop.mapred.JobInProgress.setPriority(JobInProgress.java:895)
   at org.apache.hadoop.mapred.TestCapacityScheduler$FakeTaskTrackerManager.setPriority(TestCapacityScheduler.java:653)
   at org.apache.hadoop.mapred.TestCapacityScheduler.testHighPriorityJobInitialization(TestCapacityScheduler.java:2666)
 {code}



[jira] [Updated] (MAPREDUCE-4696) TestMRServerPorts throws NullReferenceException

2012-12-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4696:
--

   Resolution: Fixed
Fix Version/s: 1.1.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Gopal!

 TestMRServerPorts throws NullReferenceException
 ---

 Key: MAPREDUCE-4696
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4696
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.1.2

 Attachments: mapreduce-4696-2.patch, mapreduce-4696.patch




[jira] [Updated] (MAPREDUCE-4697) TestMapredHeartbeat fails assertion on HeartbeatInterval

2012-12-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4697:
--

   Resolution: Fixed
Fix Version/s: 1.1.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Gopal!

 TestMapredHeartbeat fails assertion on HeartbeatInterval
 

 Key: MAPREDUCE-4697
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4697
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.1.2

 Attachments: mapreduce-4697.patch




[jira] [Resolved] (MAPREDUCE-4699) TestFairScheduler TestCapacityScheduler fails due to JobHistory exception

2012-12-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved MAPREDUCE-4699.
---

   Resolution: Fixed
Fix Version/s: 1.1.2
 Hadoop Flags: Reviewed

Committed. Thanks Gopal!

 TestFairScheduler  TestCapacityScheduler fails due to JobHistory exception
 ---

 Key: MAPREDUCE-4699
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4699
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.1.2

 Attachments: mapreduce-4699.patch, MAPREDUCE4699.txt




[jira] [Commented] (MAPREDUCE-4845) ClusterStatus.getMaxMemory() and getUsedMemory() exist in MR1 but not MR2

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510896#comment-13510896
 ] 

Hadoop QA commented on MAPREDUCE-4845:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12556024/MAPREDUCE-4845-branch-1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3094//console

This message is automatically generated.

 ClusterStatus.getMaxMemory() and getUsedMemory() exist in MR1 but not MR2 
 --

 Key: MAPREDUCE-4845
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4845
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 1.1.1, 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-4845-branch-1.patch, MAPREDUCE-4845.patch


 For backwards compatibility, these methods should exist in both MR1 and MR2.
 Confusingly, these methods return the max memory and used memory of the 
 jobtracker, not the entire cluster.
 I'd propose to add them to MR2 and return -1, and deprecate them in both MR1 
 and MR2.  Alternatively, I could add plumbing to get the resource manager 
 memory stats.
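
The first option in the proposal (keep the methods for compatibility, return -1, and deprecate them) might look roughly like the sketch below. SketchClusterStatus and the UNAVAILABLE constant are hypothetical; this is not the MR2 ClusterStatus class, and the attached patches may differ.

```java
// Hypothetical stand-in for the MR2 ClusterStatus; only the two method names
// come from the issue title.
class SketchClusterStatus {
    // Sentinel meaning "this value is not tracked here".
    static final long UNAVAILABLE = -1L;

    /**
     * @deprecated kept only for source compatibility with MR1 callers;
     * always returns {@link #UNAVAILABLE} in this sketch.
     */
    @Deprecated
    long getMaxMemory() {
        return UNAVAILABLE;
    }

    /**
     * @deprecated kept only for source compatibility with MR1 callers;
     * always returns {@link #UNAVAILABLE} in this sketch.
     */
    @Deprecated
    long getUsedMemory() {
        return UNAVAILABLE;
    }
}
```

Callers compiled against MR1 keep building, while the deprecation warning and the sentinel value make it clear that the numbers are no longer meaningful.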



[jira] [Commented] (MAPREDUCE-4839) TextPartioner for hashing Text with good hashing function to get better distribution

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510901#comment-13510901
 ] 

Hadoop QA commented on MAPREDUCE-4839:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12555646/textpartitioner1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3096//console

This message is automatically generated.

 TextPartioner for hashing Text with good hashing function to get better 
 distribution
 

 Key: MAPREDUCE-4839
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4839
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Radim Kolar
 Attachments: textpartitioner1.txt


 A partitioner for Text keys that uses the util.Hash framework as its hashing 
 function.



[jira] [Commented] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510899#comment-13510899
 ] 

Hadoop QA commented on MAPREDUCE-4843:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12556081/MAPREDUCE-4843-branch-1.1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3095//console

This message is automatically generated.

 When using DefaultTaskController, JobLocalizer not thread safe
 --

 Key: MAPREDUCE-4843
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4843
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.1
Reporter: zhaoyunjiong
Priority: Critical
 Attachments: MAPREDUCE-4843-branch-1.1.patch




[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510911#comment-13510911
 ] 

Hadoop QA commented on MAPREDUCE-4827:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12555191/betterhash1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3097//console

This message is automatically generated.

 Increase hash quality of HashPartitioner
 

 Key: MAPREDUCE-4827
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Radim Kolar
 Attachments: betterhash1.txt


 The hash partitioner uses object.hashCode() to split keys into 
 partitions. This results in bad distributions because hashCode() quality is 
 poor. 
 These hashCode() functions are sometimes written by hand (very poor quality) 
 and sometimes generated by commons lang code (poor quality). Applying 
 some transformation on top of hashCode() provides better distribution.
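
As a rough illustration of "some transformation on top of hashCode()", the Java sketch below compares partitioning on the raw hash with partitioning on a bit-mixed hash. The mixer is the 32-bit finalizer from MurmurHash3, picked here only as a well-known example; the transformation used in the attached patch is not described in this message, so this should not be read as the patch's method. The class and method names are made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

class MixedHashSketch {
    /** Partition on the raw hashCode(), masking the sign bit before the modulo. */
    static int plainPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Bit mixer: the 32-bit finalizer from MurmurHash3 (illustrative choice). */
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Partition on the mixed hash instead of the raw one. */
    static int mixedPartition(Object key, int numPartitions) {
        return (mix(key.hashCode()) & Integer.MAX_VALUE) % numPartitions;
    }

    /** Counts how many distinct partitions the keys 0, step, 2*step, ... land in. */
    static int distinctPartitions(boolean mixed, int numPartitions, int step, int count) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < count; i++) {
            Integer key = i * step; // Integer.hashCode() is the value itself
            seen.add(mixed ? mixedPartition(key, numPartitions)
                           : plainPartition(key, numPartitions));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Keys that are all multiples of the partition count collapse into one
        // partition when the raw hash is used.
        System.out.println("plain: " + distinctPartitions(false, 16, 16, 1000));
        System.out.println("mixed: " + distinctPartitions(true, 16, 16, 1000));
    }
}
```

Integer keys that are multiples of the partition count are a simple worst case for the raw hash: every key lands in partition 0, while the mixed hash spreads them over several partitions.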



[jira] [Commented] (MAPREDUCE-4594) Add init/shutdown methods to mapreduce Partitioner

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510922#comment-13510922
 ] 

Hadoop QA commented on MAPREDUCE-4594:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12556006/partitioner1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3098//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3098//console

This message is automatically generated.

 Add init/shutdown methods to mapreduce Partitioner
 --

 Key: MAPREDUCE-4594
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4594
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: trunk
Reporter: Radim Kolar
 Attachments: partitioner1.txt


 The Partitioner supports only the Configurable API, which can be used for 
 basic init in setConf(). The problem is that there is no shutdown function.
 I propose using the standard setup() and cleanup() functions, as in the 
 mapper and reducer.
 My use case is that I need to start and stop a Spring context and a datagrid 
 client.



[jira] [Updated] (MAPREDUCE-4839) TextPartioner for hashing Text with good hashing function to get better distribution

2012-12-05 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4839:
---

Attachment: textpartitioner2.txt

 TextPartioner for hashing Text with good hashing function to get better 
 distribution
 

 Key: MAPREDUCE-4839
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4839
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Radim Kolar
 Attachments: textpartitioner1.txt, textpartitioner2.txt


 partitioner for Text keys using util.Hash framework for hashing function



[jira] [Updated] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-12-05 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4827:
---

Attachment: betterhash2.txt

change it for old mapred api as well

 Increase hash quality of HashPartitioner
 

 Key: MAPREDUCE-4827
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Radim Kolar
 Attachments: betterhash1.txt, betterhash2.txt


 The hash partitioner uses object.hashCode() to split keys into 
 partitions. This results in bad distributions because hashCode() quality is 
 poor. 
 These hashCode() functions are sometimes written by hand (very poor quality) 
 and sometimes generated by commons lang code (poor quality). Applying 
 some transformation on top of hashCode() provides better distribution.



[jira] [Commented] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2012-12-05 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511003#comment-13511003
 ] 

Karthik Kambatla commented on MAPREDUCE-4843:
-

[~zhaoyunjiong] The patch looks good. Can you post a patch against trunk so 
that QA can apply it? Also, would it be possible to add a test?



[jira] [Commented] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2012-12-05 Thread zhaoyunjiong (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511019#comment-13511019
 ] 

zhaoyunjiong commented on MAPREDUCE-4843:
-

There is no need for a trunk patch: the problem doesn't exist in Hadoop 2.0.
It is very difficult to test for a thread-safety problem. Even when the code is 
not thread safe, a test will pass in most cases.



[jira] [Commented] (MAPREDUCE-4594) Add init/shutdown methods to mapreduce Partitioner

2012-12-05 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511028#comment-13511028
 ] 

Harsh J commented on MAPREDUCE-4594:


I notice that no objects (such as an attempt context object) are passed into the 
setup and cleanup methods you wish to introduce here. Without that, how is this 
helpful?

I had been viewing your proposal as a step beyond writing "extends 
Configurable" for new API partitioner implementations, where one needs at least 
the Configuration object instance to pull values out of.

Plus, the ordering of these calls matters, so tests are absolutely necessary if 
we do not want to regress by accident in the future.
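
To make the ordering concern concrete, here is a minimal Java sketch of a partitioner with lifecycle hooks that receive a context object, plus a driver that records the call order so a test can check it. Everything in it is hypothetical: LifecyclePartitioner is not the Hadoop Partitioner API, and a plain Map stands in for whatever context or Configuration object would really be passed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PartitionerLifecycleSketch {
    /** Hypothetical interface; not the Hadoop Partitioner API. */
    interface LifecyclePartitioner<K> {
        void setup(Map<String, String> context);
        int getPartition(K key, int numPartitions);
        void cleanup(Map<String, String> context);
    }

    /** Records each call so that the ordering can be asserted in a test. */
    static final class RecordingPartitioner implements LifecyclePartitioner<String> {
        final List<String> calls = new ArrayList<>();

        @Override
        public void setup(Map<String, String> context) {
            calls.add("setup"); // e.g. start external clients here
        }

        @Override
        public int getPartition(String key, int numPartitions) {
            calls.add("getPartition");
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }

        @Override
        public void cleanup(Map<String, String> context) {
            calls.add("cleanup"); // e.g. stop external clients here
        }
    }

    /** Drives the partitioner: setup once, partition every key, always clean up. */
    static List<String> run(List<String> keys, int numPartitions) {
        RecordingPartitioner partitioner = new RecordingPartitioner();
        Map<String, String> context = new HashMap<>();
        context.put("example.setting", "value"); // invented setting
        partitioner.setup(context);
        try {
            for (String key : keys) {
                partitioner.getPartition(key, numPartitions);
            }
        } finally {
            partitioner.cleanup(context); // runs even if partitioning throws
        }
        return partitioner.calls;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        keys.add("a");
        keys.add("b");
        System.out.println(run(keys, 4));
    }
}
```

A regression test along these lines would assert that setup comes first, cleanup comes last, and cleanup still runs when partitioning fails.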



[jira] [Commented] (MAPREDUCE-4843) When using DefaultTaskController, JobLocalizer not thread safe

2012-12-05 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511029#comment-13511029
 ] 

Karthik Kambatla commented on MAPREDUCE-4843:
-

My bad, I misread the branch name. I applied the patch locally and verified 
that the tests that directly use {{DefaultTaskController}} pass: 
TestTaskTrackerLocalization, TestJvmManager, TestTaskEnvironment.

+1



[jira] [Commented] (MAPREDUCE-4847) Command Parsing in Hadoop Streaming

2012-12-05 Thread Peng Lei (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511031#comment-13511031
 ] 

Peng Lei commented on MAPREDUCE-4847:
-

Thank you for your comment!

As a workaround I have put the command in a script file, and it works. But in 
this case the command is not complex enough to justify a dedicated script file, 
and generating a script on the fly is a bit tricky (at least for the maintainer).

It seems Hadoop can't run on Windows without Cygwin. Another solution may be to 
add a new option that instructs streaming to use an alternative command invoker, 
such as:

  -command_invoker sh -c

This could solve the issue without breaking existing hadoop-streaming 
applications.

-Peng


 Command Parsing in Hadoop Streaming
 ---

 Key: MAPREDUCE-4847
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4847
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/streaming
Reporter: Peng Lei
  Labels: features
   Original Estimate: 4h
  Remaining Estimate: 4h

 Hadoop streaming parses the mapper and reducer commands itself. This is not 
 a good choice: when I write a complex mapper/reducer script inline, such as 
 'perl -ne ...', it doesn't work.
 An alternative is to send the command to the shell by simply creating a new 
 process (sh -c command_and_args). This not only simplifies the streaming code 
 but also improves its capability!
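
The "sh -c command_and_args" idea can be sketched in plain Java with ProcessBuilder. This is only an illustration of handing the whole command line to the shell so that quoting and pipes are interpreted by sh; it is not the streaming code, the class and method names are invented, and it assumes a POSIX sh is on the PATH.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ShellInvokerSketch {
    /** Runs the whole command line through "sh -c" and returns its output. */
    static String runViaShell(String commandLine) {
        try {
            // The command line is passed as ONE argument; sh does the parsing.
            ProcessBuilder builder = new ProcessBuilder("sh", "-c", commandLine);
            builder.redirectErrorStream(true); // merge stderr into stdout
            Process process = builder.start();
            process.getOutputStream().close(); // this sketch sends no stdin

            // Drain the output before waiting so the child cannot block on a full pipe.
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            try (InputStream in = process.getInputStream()) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    output.write(buffer, 0, n);
                }
            }
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException("command exited with status " + exitCode);
            }
            return new String(output.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Pipes and quoting are handled by the shell, not by the caller.
        System.out.print(runViaShell("echo 'hello world' | tr a-z A-Z"));
    }
}
```

Because the shell does the parsing, an inline script with quotes or pipes needs no special handling by the caller.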



[jira] [Commented] (MAPREDUCE-4842) Shuffle race can hang reducer

2012-12-05 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511039#comment-13511039
 ] 

Mariappan Asokan commented on MAPREDUCE-4842:
-

Hi Jason, Arun, and Alejandro,
  I came up with a simpler solution to this nasty problem.  Instead of a 
single list {{inputs}} in {{MergeThread}}, we can keep a FIFO list of these 
lists.  This will make sure that more than one merge can be pending.  The 
{{run()}} method in {{MergeThread}} will keep pulling the map output lists 
from the FIFO list to merge them (this is a typical producer-consumer scenario).

I will outline the changes below:

In {{MergeThread}},

* A member of type {{LinkedList<List<T>>}} ({{pendingToBeMerged}}) is added and 
the member {{inputs}} is removed.

* The {{isInProgress()}} method is removed.

* The {{startMerge()}} method will no longer be {{synchronized}}.  It will add 
the passed list to the tail of {{pendingToBeMerged}} and it will 
{{notifyAll()}} on the monitor of {{pendingToBeMerged}}.

* The {{run()}} method will sit in a tight loop.  So long as there is an 
item (a list of map outputs) to be consumed, it will consume (merge) the item 
and remove it from {{pendingToBeMerged}}.  If {{pendingToBeMerged}} has no more 
items, it will {{notifyAll()}} on the object's monitor after setting 
{{inProgress}} to {{false}}.

In {{MergeManager}},

* All calls to {{isInProgress()}} are removed.

* Unnecessary {{synchronized}} clauses on merge thread objects are removed 
since the methods that contain them are themselves {{synchronized}}.
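
The producer-consumer shape outlined above might look roughly like the following Java sketch. It is a simplified illustration, not the patch: the class is a stand-alone stand-in for MergeThread, "merging" just appends to a result list, the consumer waits on the queue's monitor instead of spinning, and the close() method is an addition so that the example can terminate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class MergeQueueSketch {
    /** Hypothetical simplification of a merge thread with a FIFO of pending batches. */
    static final class MergeThread<T> extends Thread {
        private final LinkedList<List<T>> pendingToBeMerged = new LinkedList<>();
        private final List<T> merged = new ArrayList<>(); // stands in for merge output
        private boolean closed = false; // guarded by pendingToBeMerged

        /** Producer side: queue a batch at the tail and wake the consumer. */
        void startMerge(List<T> inputs) {
            synchronized (pendingToBeMerged) {
                pendingToBeMerged.addLast(inputs);
                pendingToBeMerged.notifyAll();
            }
        }

        /** Asks the thread to exit once the queue is drained (added for the sketch). */
        void close() {
            synchronized (pendingToBeMerged) {
                closed = true;
                pendingToBeMerged.notifyAll();
            }
        }

        @Override
        public void run() {
            while (true) {
                List<T> batch;
                synchronized (pendingToBeMerged) {
                    while (pendingToBeMerged.isEmpty() && !closed) {
                        try {
                            pendingToBeMerged.wait();
                        } catch (InterruptedException e) {
                            return; // stop consuming if interrupted
                        }
                    }
                    if (pendingToBeMerged.isEmpty()) {
                        return; // closed and fully drained
                    }
                    batch = pendingToBeMerged.removeFirst();
                }
                merge(batch); // outside the lock so producers are not blocked
            }
        }

        private void merge(List<T> batch) {
            synchronized (merged) {
                merged.addAll(batch);
            }
        }

        List<T> result() {
            synchronized (merged) {
                return new ArrayList<>(merged);
            }
        }
    }

    /** Queues three batches back to back; none is lost while another is pending. */
    static List<Integer> demo() {
        MergeThread<Integer> thread = new MergeThread<>();
        thread.start();
        thread.startMerge(Arrays.asList(1, 2));
        thread.startMerge(Arrays.asList(3));
        thread.startMerge(Arrays.asList(4, 5));
        thread.close();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return thread.result();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the queue is that a second or third startMerge() call never has to wait for, or overwrite, a merge that is already pending.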

I created a patch with the above changes and tested it on my laptop.  The 
mapreduce tests seem to run without any problem.  However, I do not claim that 
it is completely tested.  It has to go through the rigorous testing that Jason 
did.

If you are interested in taking a look at the patch, I will post it to this 
Jira.  I welcome your questions and suggestions on the idea of the patch.

-- Asokan


 Shuffle race can hang reducer
 -

 Key: MAPREDUCE-4842
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4842
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: MAPREDUCE-4842.patch, MAPREDUCE-4842.patch, 
 MAPREDUCE-4842.patch, MAPREDUCE-4842.patch


 Saw an instance where the shuffle caused multiple reducers in a job to hang.  
 It looked similar to the problem described in MAPREDUCE-3721, where the 
 fetchers were all being told to WAIT by the MergeManager but no merge was 
 taking place.



[jira] [Commented] (MAPREDUCE-4839) TextPartioner for hashing Text with good hashing function to get better distribution

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511041#comment-13511041
 ] 

Hadoop QA commented on MAPREDUCE-4839:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12556180/textpartitioner2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 2014 javac 
compiler warnings (more than the trunk's current 2013 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3099//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3099//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3099//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-4812) Create reduce input merger plugin in ReduceTask.java and pass it to Shuffle

2012-12-05 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511045#comment-13511045
 ] 

Mariappan Asokan commented on MAPREDUCE-4812:
-

Hi Arun,
  I have some ideas to fix the problem in MAPREDUCE-4842.  I posted my comments 
there.  Please take a look.

Thanks.

-- Asokan

 Create reduce input merger plugin in ReduceTask.java and pass it to Shuffle
 ---

 Key: MAPREDUCE-4812
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4812
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.0.2-alpha
Reporter: Mariappan Asokan
Assignee: Mariappan Asokan
 Fix For: 2.0.3-alpha

 Attachments: COMBO-mapreduce-4809-4812.patch, 
 COMBO-mapreduce-4809-4812.patch, mapreduce-4812.patch, mapreduce-4812.patch, 
 mapreduce-4812.patch, mapreduce-4812.patch, mapreduce-4812.patch


 This is part of MAPREDUCE-2454.  This further breaks down MAPREDUCE-4808



[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner

2012-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511046#comment-13511046
 ] 

Hadoop QA commented on MAPREDUCE-4827:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12556183/betterhash2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3100//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3100//console

This message is automatically generated.
