[jira] [Commented] (MAPREDUCE-2413) TaskTracker should handle disk failures at both startup and runtime

2011-07-18 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066837#comment-13066837
 ] 

Ravi Gummadi commented on MAPREDUCE-2413:
-

I am working on porting this patch to trunk.

 TaskTracker should handle disk failures at both startup and runtime
 ---

 Key: MAPREDUCE-2413
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2413
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task-controller, tasktracker
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Ravi Gummadi
 Fix For: 0.20.204.0

 Attachments: MR-2413.v0.1.patch, MR-2413.v0.2.patch, 
 MR-2413.v0.3.patch, MR-2413.v0.patch


 At present, TaskTracker doesn't handle disk failures properly, either at 
 startup or at runtime.
 (1) Currently TaskTracker doesn't come up if any of the mapred-local-dirs is 
 on a bad disk. TaskTracker should ignore that particular mapred-local-dir, 
 start up, and use only the remaining good mapred-local-dirs.
 (2) If a disk goes bad while TaskTracker is running, TaskTracker currently 
 doesn't do anything special. This results in either
   (a) TaskTracker continuing to try to use the bad disk, which causes many 
 task failures and possibly job failures (because multiple TTs have bad 
 disks), and eventually these TTs getting graylisted for all jobs. This 
 requires a manual restart of the TT with a modified mapred-local-dirs 
 configuration that avoids the bad disk. OR
   (b) The health check script identifying the disk as bad and the TT getting 
 blacklisted. This also requires a manual restart of the TT with a modified 
 mapred-local-dirs configuration that avoids the bad disk.
 This JIRA is to make TaskTracker more tolerant of disk failures by solving 
 (1) and (2), i.e. the TT should start as long as at least one of the 
 mapred-local-dirs is on a good disk, and the TT should adjust its in-memory 
 list of mapred-local-dirs to avoid using bad mapred-local-dirs.
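
 The startup behaviour requested in (1) could be sketched roughly as follows. 
 This is only an illustration under assumptions, not the code in the attached 
 patches: the class LocalDirChecker and its methods are hypothetical names, and 
 a real check would probably also need to exercise the disk (for example by 
 writing a small file) rather than only test permissions.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not taken from the TaskTracker source. */
final class LocalDirChecker {
    private LocalDirChecker() {}

    /** Returns true if the directory exists (or can be created) and is fully accessible. */
    static boolean isUsable(File dir) {
        if (!dir.exists() && !dir.mkdirs()) {
            return false;
        }
        return dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute();
    }

    /**
     * Keeps only the usable directories instead of aborting on the first bad one.
     * Startup fails only when no configured directory is usable at all.
     */
    static List<String> filterGoodDirs(List<String> configuredDirs) {
        List<String> good = new ArrayList<>();
        for (String path : configuredDirs) {
            if (isUsable(new File(path))) {
                good.add(path);
            }
        }
        if (good.isEmpty()) {
            throw new IllegalStateException(
                "No usable mapred-local-dirs among " + configuredDirs);
        }
        return good;
    }
}
```

 For the runtime case in (2), the same filter could in principle be re-run 
 periodically against the in-memory list, but how the actual patch does this is 
 not stated in this thread.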

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-2701) MR-279: app/Job.java needs UGI for the user that launched it

2011-07-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-2701:
---

Status: Patch Available  (was: Open)

 MR-279: app/Job.java needs UGI for the user that launched it
 

 Key: MAPREDUCE-2701
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2701
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.0

 Attachments: MR-2701-v1.patch


 ./mr-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
  is missing some data that is needed by the Job History GUI.  It needs the 
 UGI for the user that launched it.





[jira] [Updated] (MAPREDUCE-2701) MR-279: app/Job.java needs UGI for the user that launched it

2011-07-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-2701:
---

Fix Version/s: 0.23.0





[jira] [Updated] (MAPREDUCE-2701) MR-279: app/Job.java needs UGI for the user that launched it

2011-07-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-2701:
---

Attachment: MR-2701-v1.patch

This patch adds UGI information to Job for the user that launched the job, 
in preparation for the GUI displaying this information.





[jira] [Commented] (MAPREDUCE-2701) MR-279: app/Job.java needs UGI for the user that launched it

2011-07-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067080#comment-13067080
 ] 

Hadoop QA commented on MAPREDUCE-2701:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12486879/MR-2701-v1.patch
  against trunk revision 1146517.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/476//console

This message is automatically generated.





[jira] [Updated] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Naisbitt updated MAPREDUCE-2489:


Status: Patch Available  (was: Open)

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit returned invalid hostnames for its 
 splits, which caused the JobTracker to make an excessive number of hostname 
 resolution attempts.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of resolution failures.
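
 One possible shape for the proposed verification is a purely syntactic check 
 that runs before any DNS lookup. The sketch below is an assumption about what 
 such a check might look like, not the logic in the attached patches; the class 
 HostnameValidator is a hypothetical name, and the rules follow the usual RFC 
 1123 hostname syntax (labels of 1 to 63 letters, digits, or hyphens, with no 
 leading or trailing hyphen).

```java
import java.util.regex.Pattern;

/** Hypothetical syntactic hostname check; not taken from the JobTracker source. */
final class HostnameValidator {
    // One DNS label: 1-63 chars, alphanumeric at both ends, hyphens allowed inside.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");
    private static final int MAX_HOSTNAME_LENGTH = 253;

    private HostnameValidator() {}

    /** Returns true if the string is syntactically plausible as a hostname. */
    static boolean isPlausibleHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > MAX_HOSTNAME_LENGTH) {
            return false;
        }
        // Limit -1 keeps trailing empty labels, so "a." and "a..b" are rejected.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

 A check like this only rejects names that could never resolve; a well-formed 
 but nonexistent hostname would still reach DNS, which is where the suggested 
 limit on resolution failures would come in.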





[jira] [Updated] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Naisbitt updated MAPREDUCE-2489:


Status: Open  (was: Patch Available)

Resubmitting patch to run through Hudson.





[jira] [Commented] (MAPREDUCE-2701) MR-279: app/Job.java needs UGI for the user that launched it

2011-07-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067103#comment-13067103
 ] 

Robert Joseph Evans commented on MAPREDUCE-2701:


This patch is intended for the MR-279 branch, not trunk.





[jira] [Commented] (MAPREDUCE-2413) TaskTracker should handle disk failures at both startup and runtime

2011-07-18 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067119#comment-13067119
 ] 

Eli Collins commented on MAPREDUCE-2413:


@Ravi - trunk's task tracker or as a feature for MR2?





[jira] [Updated] (MAPREDUCE-2623) Update ClusterMapReduceTestCase to use MiniDFSCluster.Builder

2011-07-18 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated MAPREDUCE-2623:
---

  Issue Type: Improvement  (was: Task)
Hadoop Flags: [Reviewed]

+1  looks good

 Update ClusterMapReduceTestCase to use MiniDFSCluster.Builder
 -

 Key: MAPREDUCE-2623
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2623
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Affects Versions: 0.23.0
Reporter: Jim Plush
Assignee: Harsh J
Priority: Minor
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2623.r1.diff, MAPREDUCE-2623.r2.diff


 The test class ClusterMapReduceTestCase triggers a deprecation warning on the 
 line dfsCluster = new MiniDFSCluster(conf, 2, reformatDFS, null); and says 
 that MiniDFSCluster.Builder should be used instead. The warning notes that 
 the current API will be phased out in version 24. I propose to update the 
 test class to the most up-to-date code, as it is referenced in several places 
 on the internet as an example of how to write a Hadoop unit test.





[jira] [Updated] (MAPREDUCE-2623) Update ClusterMapReduceTestCase to use MiniDFSCluster.Builder

2011-07-18 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated MAPREDUCE-2623:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I've committed this. Thanks Harsh!





[jira] [Commented] (MAPREDUCE-2669) Some new examples and test cases for them.

2011-07-18 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067168#comment-13067168
 ] 

Plamen Jeliazkov commented on MAPREDUCE-2669:
-

Thank you, Devaraj! Yes, I have been filing it on the review board; I have been 
uploading the patches here as well as on the review board. I will incorporate 
your comments into the patch and upload it again soon.

 Some new examples and test cases for them.
 --

 Key: MAPREDUCE-2669
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2669
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: examples
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Minor
 Attachments: MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
 MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, mapreduce-new-examples-0.22.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Looking to add some more examples such as Mean, Median, and Standard 
 Deviation to the examples.
 I have some generic JUnit testcases as well.





[jira] [Commented] (MAPREDUCE-2623) Update ClusterMapReduceTestCase to use MiniDFSCluster.Builder

2011-07-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067173#comment-13067173
 ] 

Hudson commented on MAPREDUCE-2623:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #747 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/747/])
MAPREDUCE-2623. Update ClusterMapReduceTestCase to use 
MiniDFSCluster.Builder. Contributed by Harsh J Chouraria

eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1147981
Files : 
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/ClusterMapReduceTestCase.java






[jira] [Commented] (MAPREDUCE-2701) MR-279: app/Job.java needs UGI for the user that launched it

2011-07-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067182#comment-13067182
 ] 

Robert Joseph Evans commented on MAPREDUCE-2701:


I am requesting that someone please review this patch.





[jira] [Commented] (MAPREDUCE-2652) MR-279: Cannot run multiple NMs on a single node

2011-07-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067183#comment-13067183
 ] 

Robert Joseph Evans commented on MAPREDUCE-2652:


I am requesting that someone please review this patch. Thanks.

 MR-279: Cannot run multiple NMs on a single node 
 -

 Key: MAPREDUCE-2652
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2652
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.0

 Attachments: MR-2652-v1.txt, MR-2652-v2.txt


 Currently in MR-279 the auxiliary services, like ShuffleHandler, have no way 
 to communicate information back to the applications.  Because of this, the 
 MapReduce Application Master has hardcoded port 8080 for shuffle.  This 
 prevents the configuration mapreduce.shuffle.port from ever being set to 
 anything but 8080.  The code should be updated to allow this information to 
 be returned to the application master.  The data also needs to be persisted 
 to the task log so that it is not lost on restart.





[jira] [Commented] (MAPREDUCE-2494) Make the distributed cache delete entries using LRU priority

2011-07-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067184#comment-13067184
 ] 

Robert Joseph Evans commented on MAPREDUCE-2494:


I am requesting that someone please review the patch for the 0.20 security 
line.  The changes are almost identical to what went into trunk.

Thanks

 Make the distributed cache delete entries using LRU priority
 

 Key: MAPREDUCE-2494
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2494
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distributed-cache
Affects Versions: 0.20.205.0, 0.21.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2494-20.20X-V1.patch, MAPREDUCE-2494-V1.patch, 
 MAPREDUCE-2494-V2.patch


 Currently the distributed cache waits until a cache directory is above a 
 preconfigured threshold, at which point it deletes all entries that are not 
 currently being used.  It seems like we would get far fewer cache misses if 
 we kept some of them around, even when they are not being used.  We should 
 add a configurable percentage as a goal for how much of the cache should 
 remain clear when not in use, and select objects to delete based on how 
 recently they were used, and possibly also on how large they are or how 
 difficult it is to download them again.
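
 The proposed policy could be sketched as below. This is a minimal 
 illustration under assumptions, not the patch itself: CacheEntry, 
 LruCachePlanner, and the freeFraction parameter are hypothetical names, and 
 the sketch orders candidates by last use only, ignoring the size and 
 download-cost factors mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical description of one cached object. */
final class CacheEntry {
    final String name;
    final long sizeBytes;
    final long lastUsedMillis;
    final boolean inUse;

    CacheEntry(String name, long sizeBytes, long lastUsedMillis, boolean inUse) {
        this.name = name;
        this.sizeBytes = sizeBytes;
        this.lastUsedMillis = lastUsedMillis;
        this.inUse = inUse;
    }
}

/** Hypothetical planner that picks deletions in least-recently-used order. */
final class LruCachePlanner {
    private LruCachePlanner() {}

    /**
     * Returns the unused entries to delete, oldest first, until the cache
     * occupies at most capacityBytes * (1 - freeFraction). Entries that are
     * in use are never selected, so the goal may not always be reachable.
     */
    static List<CacheEntry> selectForDeletion(
            List<CacheEntry> entries, long capacityBytes, double freeFraction) {
        long targetBytes = (long) (capacityBytes * (1.0 - freeFraction));
        long totalBytes = 0;
        List<CacheEntry> candidates = new ArrayList<>();
        for (CacheEntry entry : entries) {
            totalBytes += entry.sizeBytes;
            if (!entry.inUse) {
                candidates.add(entry);
            }
        }
        candidates.sort(Comparator.comparingLong((CacheEntry e) -> e.lastUsedMillis));

        List<CacheEntry> toDelete = new ArrayList<>();
        for (CacheEntry entry : candidates) {
            if (totalBytes <= targetBytes) {
                break;
            }
            toDelete.add(entry);
            totalBytes -= entry.sizeBytes;
        }
        return toDelete;
    }
}
```

 Compared with deleting every unused entry once the threshold is crossed, this 
 keeps the most recently used entries around, which is the source of the 
 expected reduction in cache misses.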





[jira] [Commented] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067193#comment-13067193
 ] 

Robert Joseph Evans commented on MAPREDUCE-2324:


I uploaded a patch a while ago and the conversation has kind of died off.  Can 
someone please review the patch and give me some feedback on it?  If it is 
something that you don't want to put into a sustaining release at this time, 
then please give me some feedback, possibly with a -1 depending on how adamant 
you are about it, so I can address those issues, perhaps by fixing it just in 
0.23 instead.

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Created] (MAPREDUCE-2705) tasks localized and launched serially by TaskLauncher - causing other tasks to be delayed

2011-07-18 Thread Thomas Graves (JIRA)
tasks localized and launched serially by TaskLauncher - causing other tasks to 
be delayed
-

 Key: MAPREDUCE-2705
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2705
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.20.205.0
Reporter: Thomas Graves
Assignee: Thomas Graves


The current TaskLauncher launches new tasks serially, one at a time. During the 
launch it does the localization and then starts the map/reduce task.  This can 
cause any other tasks to be blocked waiting for the current task to be 
localized and started. In some instances we have seen a task that has a large 
file to localize (1.2MB) block another task for about 40 minutes. The task 
being blocked was a cleanup task, so the job's completion was delayed by 
those 40 minutes.
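
One way to picture the difference between serial and concurrent launching is 
the toy sketch below. It is not TaskLauncher code and not a proposed fix; 
ParallelLauncherSketch and the task names are hypothetical. The "slow" job 
stands in for a long localization and simply waits (up to two seconds) for the 
quick job to finish, so with one thread it blocks the quick job, while with two 
threads the quick job completes first.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical demonstration of launch ordering; not TaskTracker code. */
final class ParallelLauncherSketch {
    private ParallelLauncherSketch() {}

    /** Submits a slow launch, then a quick one, and returns their completion order. */
    static List<String> launchBoth(int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> finished = new CopyOnWriteArrayList<>();
        CountDownLatch quickDone = new CountDownLatch(1);

        // Stand-in for a task with a large file to localize.
        pool.execute(() -> {
            try {
                quickDone.await(2, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            finished.add("slow-localization");
        });

        // Stand-in for the cleanup task that should not have to wait.
        pool.execute(() -> {
            finished.add("cleanup-task");
            quickDone.countDown();
        });

        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return finished;
    }
}
```

Calling launchBoth(1) models the current serial behaviour, where the cleanup 
task finishes only after the slow localization; launchBoth(2) lets the cleanup 
task finish first.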






[jira] [Commented] (MAPREDUCE-2658) Problem running full map reduce jobs on mrv2

2011-07-18 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067236#comment-13067236
 ] 

Ahmed Radwan commented on MAPREDUCE-2658:
-

Thanks Arun, I'll take a look. I think this will require considering the 
MAPREDUCE-2400 recent changes. Any other issues I should also consider?

 Problem running full map reduce jobs on mrv2
 --

 Key: MAPREDUCE-2658
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2658
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-2658.patch


 Following the installation instructions at: 
 https://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL
 the randomwriter example runs successfully. However, other full map reduce 
 jobs (e.g. wordcount) fail with the error:
 java.lang.UnsupportedOperationException: Incompatible with LocalRunner
   at org.apache.hadoop.mapred.YarnOutputFiles.getInputFile(YarnOutputFiles.java:200)
   at org.apache.hadoop.mapred.ReduceTask.getMapFiles(ReduceTask.java:223)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:412)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:148)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:143)
 The ReduceTask evaluates the isLocal flag based on the property 
 mapreduce.jobtracker.address. The default value for this property in 
 mapred-default.xml is 'local', and this is the cause of the problem.
 Setting mapreduce.jobtracker.address in mapred-site.xml to something 
 other than local seems to solve the problem.





[jira] [Commented] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067240#comment-13067240
 ] 

Todd Lipcon commented on MAPREDUCE-2324:


Hey Bobby. Sorry, was on vacation last week so only partially keeping up with 
JIRA traffic.

My worry mostly has to do with this feature kicking in on a false positive. 
In general, false positives here are very expensive, whereas false negatives 
are not nearly as drastic.

For example, imagine a cluster with 10 nodes and a couple of jobs submitted. 
One of the nodes is out of disk space. The first job, when submitted, takes up 
all the reduce slots on the first 9 nodes, but the 10th node is left empty 
since it's out of space. When the second job is submitted, all of the free 
reduce slots on the cluster are located on this remaining node. Every time the 
node heartbeats, the counter will get incremented for the queued up job. After 
10 heartbeats, the job will fail, even though it was just a single problematic 
node.

So, I think we do need to wait for a scheduling opportunity on at least some 
number of unique nodes before failing the job. It seems we could do this with a 
single HashSet per job: whenever any reduce task is successfully scheduled, the 
set is cleared. Whenever a job is given an opportunity to schedule reduces on a 
node, but can't due to resource constraints, the node is added to the set. Once 
the size of the set eclipses some percentage of the nodes on the cluster, the 
job is failed. This memory usage would be O(nodes*jobs) rather than 
O(nodes*tasks), and thus not too bad.
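
The per-job HashSet idea described above could be sketched as follows. The 
class and method names (ReduceSchedulingTracker, onReduceScheduled, 
onSchedulingFailed) and the failFraction parameter are assumptions made for 
illustration; they are not taken from any patch on this issue.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical per-job tracker of nodes that could not host a reduce. */
final class ReduceSchedulingTracker {
    private final Set<String> nodesThatCouldNotSchedule = new HashSet<>();
    private final double failFraction;

    /** failFraction is the share of cluster nodes that must fail before the job does. */
    ReduceSchedulingTracker(double failFraction) {
        this.failFraction = failFraction;
    }

    /** A reduce was scheduled somewhere, so the job is making progress: reset. */
    void onReduceScheduled() {
        nodesThatCouldNotSchedule.clear();
    }

    /**
     * Records that the job had a scheduling opportunity on this node but could
     * not use it because of resource constraints. Returns true once the number
     * of distinct such nodes exceeds failFraction of the cluster.
     */
    boolean onSchedulingFailed(String node, int clusterSize) {
        nodesThatCouldNotSchedule.add(node);
        return nodesThatCouldNotSchedule.size() > failFraction * clusterSize;
    }
}
```

Because the set holds distinct node names, repeated heartbeats from the single 
out-of-space node in the 10-node example only ever contribute one entry, so 
that scenario no longer fails the job.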





[jira] [Commented] (MAPREDUCE-2705) tasks localized and launched serially by TaskLauncher - causing other tasks to be delayed

2011-07-18 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067250#comment-13067250
 ] 

Thomas Graves commented on MAPREDUCE-2705:
--

Note 1.2MB should be 1.2GB. 

 tasks localized and launched serially by TaskLauncher - causing other tasks 
 to be delayed
 -

 Key: MAPREDUCE-2705
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2705
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.20.205.0
Reporter: Thomas Graves
Assignee: Thomas Graves

 The current TaskLauncher serially launches new tasks one at a time. During 
 the launch it does the localization and then starts the map/reduce task.  
 This can cause any other tasks to be blocked waiting for the current task to 
 be localized and started. In some instances we have seen a task that has a 
 large file to localize (1.2MB) block another task for about 40 minutes. This 
 particular task being blocked was a cleanup task which caused the job to be 
 delayed finishing for the 40 minutes.





[jira] [Commented] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067264#comment-13067264
 ] 

Robert Joseph Evans commented on MAPREDUCE-2324:


That is a very good point and I really like the solution.  I will incorporate 
your comments and upload a new patch.

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Created] (MAPREDUCE-2706) MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged
-

 Key: MAPREDUCE-2706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2706
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt


Submitting jobs over the queue limits used to print log messages such as these:
hadoop-mapred-jobtracker-HOSTNAME.log. ... INFO
org.apache.hadoop.mapred.CapacityTaskScheduler: default has 10 active tasks for 
user MYUSER, cannot initialize
job_XXX with 10 tasks since it will exceed limit of 15 active tasks per user 
for this queue
and
hadoop-mapred-jobtracker-HOSTNAME.log ... INFO 
org.apache.hadoop.mapred.CapacityTaskScheduler: default already has 2 running 
jobs and 0 initializing jobs; cannot initialize job_XXX since it will exceeed 
limit of 2 initialized jobs for this queue

These log messages are useful - especially for QA and testing.  
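
As a rough illustration of the kind of check-and-log step that produces the first message above, here is a self-contained sketch. The class and method names are hypothetical, java.util.logging stands in for whatever logger the scheduler uses, and the "active + new > limit" comparison is an assumption about how the limit is applied, not something read from the scheduler source.

```java
import java.util.logging.Logger;

/** Hypothetical sketch; not the CapacityTaskScheduler implementation. */
class QueueLimitLogging {
    private static final Logger LOG =
        Logger.getLogger(QueueLimitLogging.class.getName());

    /** Builds a message in the same shape as the first log line quoted above. */
    static String userLimitMessage(String queue, int activeTasks, String user,
                                   String jobId, int jobTasks, int limit) {
        return queue + " has " + activeTasks + " active tasks for user " + user
            + ", cannot initialize " + jobId + " with " + jobTasks
            + " tasks since it will exceed limit of " + limit
            + " active tasks per user for this queue";
    }

    /** Returns false, and logs at INFO, when the job would exceed the limit. */
    static boolean canInitialize(String queue, int activeTasks, String user,
                                 String jobId, int jobTasks, int limit) {
        if (activeTasks + jobTasks > limit) { // assumed comparison
            LOG.info(userLimitMessage(queue, activeTasks, user, jobId, jobTasks, limit));
            return false;
        }
        return true;
    }
}
```

With the numbers from the example (10 active tasks, a 10-task job, a limit of 15), the job is refused and the message is logged.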





[jira] [Updated] (MAPREDUCE-2706) MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Naisbitt updated MAPREDUCE-2706:


Attachment: MAPREDUCE-2706.patch

 MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged
 -

 Key: MAPREDUCE-2706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2706
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2706.patch


 Submitting jobs over the queue limits used to print log messages such as 
 these:
 hadoop-mapred-jobtracker-HOSTNAME.log. ... INFO
 org.apache.hadoop.mapred.CapacityTaskScheduler: default has 10 active tasks 
 for user MYUSER, cannot initialize
 job_XXX with 10 tasks since it will exceed limit of 15 active tasks per user 
 for this queue
 and
 hadoop-mapred-jobtracker-HOSTNAME.log ... INFO 
 org.apache.hadoop.mapred.CapacityTaskScheduler: default already has 2 running 
 jobs and 0 initializing jobs; cannot initialize job_XXX since it will exceeed 
 limit of 2 initialized jobs for this queue
 These log messages are useful - especially for QA and testing.  





[jira] [Updated] (MAPREDUCE-2706) MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Naisbitt updated MAPREDUCE-2706:


Status: Patch Available  (was: Open)

 MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged
 -

 Key: MAPREDUCE-2706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2706
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2706.patch


 Submitting jobs over the queue limits used to print log messages such as 
 these:
 hadoop-mapred-jobtracker-HOSTNAME.log. ... INFO
 org.apache.hadoop.mapred.CapacityTaskScheduler: default has 10 active tasks 
 for user MYUSER, cannot initialize
 job_XXX with 10 tasks since it will exceed limit of 15 active tasks per user 
 for this queue
 and
 hadoop-mapred-jobtracker-HOSTNAME.log ... INFO 
 org.apache.hadoop.mapred.CapacityTaskScheduler: default already has 2 running 
 jobs and 0 initializing jobs; cannot initialize job_XXX since it will exceeed 
 limit of 2 initialized jobs for this queue
 These log messages are useful - especially for QA and testing.  





[jira] [Updated] (MAPREDUCE-2706) MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Naisbitt updated MAPREDUCE-2706:


Attachment: MAPREDUCE-2706.patch

 MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged
 -

 Key: MAPREDUCE-2706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2706
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2706.patch


 Submitting jobs over the queue limits used to print log messages such as 
 these:
 hadoop-mapred-jobtracker-HOSTNAME.log. ... INFO
 org.apache.hadoop.mapred.CapacityTaskScheduler: default has 10 active tasks 
 for user MYUSER, cannot initialize
 job_XXX with 10 tasks since it will exceed limit of 15 active tasks per user 
 for this queue
 and
 hadoop-mapred-jobtracker-HOSTNAME.log ... INFO 
 org.apache.hadoop.mapred.CapacityTaskScheduler: default already has 2 running 
 jobs and 0 initializing jobs; cannot initialize job_XXX since it will exceeed 
 limit of 2 initialized jobs for this queue
 These log messages are useful - especially for QA and testing.  





[jira] [Commented] (MAPREDUCE-2706) MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged

2011-07-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067275#comment-13067275
 ] 

Hadoop QA commented on MAPREDUCE-2706:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12486917/MAPREDUCE-2706.patch
  against trunk revision 1147981.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/477//console

This message is automatically generated.

 MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged
 -

 Key: MAPREDUCE-2706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2706
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2706.patch


 Submitting jobs over the queue limits used to print log messages such as 
 these:
 hadoop-mapred-jobtracker-HOSTNAME.log. ... INFO
 org.apache.hadoop.mapred.CapacityTaskScheduler: default has 10 active tasks 
 for user MYUSER, cannot initialize
 job_XXX with 10 tasks since it will exceed limit of 15 active tasks per user 
 for this queue
 and
 hadoop-mapred-jobtracker-HOSTNAME.log ... INFO 
 org.apache.hadoop.mapred.CapacityTaskScheduler: default already has 2 running 
 jobs and 0 initializing jobs; cannot initialize job_XXX since it will exceeed 
 limit of 2 initialized jobs for this queue
 These log messages are useful - especially for QA and testing.  





[jira] [Commented] (MAPREDUCE-2706) MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged

2011-07-18 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067276#comment-13067276
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2706:
-

This patch is for the MR-279 branch, so the above test-patch results are not 
applicable.

 MR-279: Submit jobs beyond the max jobs per queue limit no longer gets logged
 -

 Key: MAPREDUCE-2706
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2706
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2706.patch


 Submitting jobs over the queue limits used to print log messages such as 
 these:
 hadoop-mapred-jobtracker-HOSTNAME.log. ... INFO
 org.apache.hadoop.mapred.CapacityTaskScheduler: default has 10 active tasks 
 for user MYUSER, cannot initialize
 job_XXX with 10 tasks since it will exceed limit of 15 active tasks per user 
 for this queue
 and
 hadoop-mapred-jobtracker-HOSTNAME.log ... INFO 
 org.apache.hadoop.mapred.CapacityTaskScheduler: default already has 2 running 
 jobs and 0 initializing jobs; cannot initialize job_XXX since it will exceeed 
 limit of 2 initialized jobs for this queue
 These log messages are useful - especially for QA and testing.  





[jira] [Commented] (MAPREDUCE-2638) Create a simple stress test for the fair scheduler

2011-07-18 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067291#comment-13067291
 ] 

Tom White commented on MAPREDUCE-2638:
--

Thanks Matei. The preemption intervals are indeed very low - they are set like 
this in order to trigger preemption in a pseudo-distributed cluster and so 
stress the scheduler. For larger clusters the settings you suggest are entirely 
appropriate, as well as increasing the sleep time in the jobs by setting 
{{test.fairscheduler.sleepTime}} to a higher value.

 Create a simple stress test for the fair scheduler
 --

 Key: MAPREDUCE-2638
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2638
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: contrib/fair-share
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-2638.patch, MAPREDUCE-2638.patch


 This would be a test that runs against a cluster, typically with settings 
 that allow preemption to be exercised.





[jira] [Updated] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-2324:
---

Status: Open  (was: Patch Available)

Uploading new patch. 

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Updated] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-2324:
---

Attachment: MR-2324-security-v2.txt

 [exec] 
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.


I also did some math.  On our largest cluster here at Yahoo! we have 5000 
machines and at most about 200 jobs running concurrently.  That comes out to 
about 8-16 MB in extra heap usage on the JT, if the HashMap is half full and 
all of those 200 jobs are about to fail because of reduce scheduling issues.
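
For anyone who wants to redo the arithmetic, a small sketch follows. The 16 and 32 bytes per entry are assumed values picked so that the result lands in the quoted range; they are not measured costs of a Java hash entry, and the method name is made up.

```java
/** Back-of-the-envelope heap estimate; per-entry cost is an assumption. */
class HeapEstimate {
    /**
     * nodes * jobs is the worst-case number of tracked (job, node) pairs;
     * fillFraction scales it down (0.5 for "half full").
     */
    static long bytes(int nodes, int jobs, double fillFraction, int bytesPerEntry) {
        long entries = Math.round((double) nodes * jobs * fillFraction);
        return entries * bytesPerEntry;
    }
}
```

With 5000 nodes, 200 jobs and a fill fraction of 0.5 there are 500,000 entries, so 16 bytes each gives 8,000,000 bytes and 32 bytes each gives 16,000,000 bytes, i.e. roughly 8-16 MB.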

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt, MR-2324-security-v2.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Updated] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-2324:
---

Status: Patch Available  (was: Open)

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt, MR-2324-security-v2.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Commented] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067297#comment-13067297
 ] 

Hadoop QA commented on MAPREDUCE-2324:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12486922/MR-2324-security-v2.txt
  against trunk revision 1147981.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/478//console

This message is automatically generated.

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt, MR-2324-security-v2.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Commented] (MAPREDUCE-2638) Create a simple stress test for the fair scheduler

2011-07-18 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067311#comment-13067311
 ] 

Matei Zaharia commented on MAPREDUCE-2638:
--

OK, that makes sense. +1 to commit this then.

 Create a simple stress test for the fair scheduler
 --

 Key: MAPREDUCE-2638
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2638
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: contrib/fair-share
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-2638.patch, MAPREDUCE-2638.patch


 This would be a test that runs against a cluster, typically with settings 
 that allow preemption to be exercised.





[jira] [Created] (MAPREDUCE-2707) ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc

2011-07-18 Thread Jitendra Nath Pandey (JIRA)

ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc
--

 Key: MAPREDUCE-2707
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2707
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


ProtoOverHadoopRpcEngine is introduced in MR-279, which uses TunnelProtocol 
over WritableRpcEngine. This jira removes the tunnel protocol and lets 
ProtoOverHadoopRpcEngine directly interact with ipc.Client and ipc.Server.





[jira] [Commented] (MAPREDUCE-2707) ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc

2011-07-18 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067319#comment-13067319
 ] 

Jitendra Nath Pandey commented on MAPREDUCE-2707:
-

 This jira doesn't intend to remove writable from ipc.Client/Server. That is 
proposed in a different jira (HADOOP-7399). This will just remove 
TunnelProtocol but the protocol buffer messages will still be wrapped in a 
generic Writable and passed to ipc Client. 
  When HADOOP-7399 is ready to go, ProtoOverHadoopRpcEngine will be modified 
not to wrap request/response into Writable.
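
To illustrate what "wrapped in a generic Writable" can look like, here is a minimal sketch. The Writable interface below is a local stand-in that mirrors the two methods of org.apache.hadoop.io.Writable so the example compiles on its own; BytesEnvelope and roundTrip are hypothetical names and are not the classes in the patch. The payload is assumed to be an already-serialized message, for example the bytes from a protocol buffer message's toByteArray().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Local stand-in with the same two methods as org.apache.hadoop.io.Writable. */
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

/** Hypothetical wrapper: carries an already-serialized message as raw bytes. */
class BytesEnvelope implements Writable {
    private byte[] payload = new byte[0];

    BytesEnvelope() {}

    BytesEnvelope(byte[] payload) {
        this.payload = payload.clone();
    }

    byte[] payload() {
        return payload.clone();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(payload.length); // length prefix, then the raw bytes
        out.write(payload);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        payload = new byte[in.readInt()];
        in.readFully(payload);
    }

    /** Serializes and deserializes in memory, as an RPC layer would over the wire. */
    static BytesEnvelope roundTrip(BytesEnvelope sent) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            sent.write(new DataOutputStream(buffer));
            BytesEnvelope received = new BytesEnvelope();
            received.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the wrapper is that the RPC layer only has to know about one Writable type, whatever protocol buffer message is inside it.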

 ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc
 --

 Key: MAPREDUCE-2707
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2707
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey

 ProtoOverHadoopRpcEngine is introduced in MR-279, which uses TunnelProtocol 
 over WritableRpcEngine. This jira removes the tunnel protocol and lets 
 ProtoOverHadoopRpcEngine directly interact with ipc.Client and ipc.Server.





[jira] [Commented] (MAPREDUCE-2324) Job should fail if a reduce task can't be scheduled anywhere

2011-07-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067336#comment-13067336
 ] 

Todd Lipcon commented on MAPREDUCE-2324:


Not sure if I'll have time to review this in the next couple days. Anyone over 
there who could review for you? Otherwise I'll try to look by the end of the 
week.

 Job should fail if a reduce task can't be scheduled anywhere
 

 Key: MAPREDUCE-2324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Todd Lipcon
Assignee: Robert Joseph Evans
 Attachments: MR-2324-security-v1.txt, MR-2324-security-v2.txt


 If there's a reduce task that needs more disk space than is available on any 
 mapred.local.dir in the cluster, that task will stay pending forever. For 
 example, we produced this in a QA cluster by accidentally running terasort 
 with one reducer - since no mapred.local.dir had 1T free, the job remained in 
 pending state for several days. The reason for the stuck task wasn't clear 
 from a user perspective until we looked at the JT logs.
 Probably better to just fail the job if a reduce task goes through all TTs 
 and finds that there isn't enough space.





[jira] [Commented] (MAPREDUCE-2650) back-port MAPREDUCE-2238 to 0.20-security

2011-07-18 Thread Sherry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067368#comment-13067368
 ] 

Sherry Chen commented on MAPREDUCE-2650:


Todd,
I did not make this clear in my previous comment.
The throw-an-exception (when makedirs fails) semantics are used in trunk and CDH3.
It's good to put them in 0.20-security as well.
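
For reference, the general shape of "throw when the directory cannot be created" looks something like the sketch below. It uses plain java.io.File rather than the Hadoop file system classes, and the class and method names are hypothetical; it is not the code from MAPREDUCE-2238.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper; illustrates the semantics only. */
class Dirs {
    /**
     * Creates the directory and any missing parents, and throws instead of
     * silently returning false when the directory cannot be created.
     */
    static void mkdirsOrThrow(File dir) throws IOException {
        // mkdirs() also returns false when the directory already exists,
        // so only treat it as a failure if there is still no directory.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Failed to create directory: " + dir);
        }
    }
}
```

The caller then sees the failure at the point where the directory was needed, rather than later when something tries to use it.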

 back-port MAPREDUCE-2238 to 0.20-security
 -

 Key: MAPREDUCE-2650
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2650
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Sherry Chen
Assignee: Sherry Chen
 Attachments: MAPREDUCE-2650.patch


 Developers have seen the attempt directory permissions being set to 000 or 111 
 in CI builds and in tests run on dev desktops with 0.20-security.
 MAPREDUCE-2238 reported and fixed the issue for 0.22.0; a back-port to 
 0.20-security is needed.





[jira] [Updated] (MAPREDUCE-2669) Some new examples and test cases for them.

2011-07-18 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated MAPREDUCE-2669:


Attachment: MAPREDUCE-2669.patch

The reason for the 5MB patch is that it includes a sample text file for the 
JUnit tests to use.

I have applied the patch myself and it appears to be working correctly.

I don't know why the core and contrib tests are failing, but after looking 
them over twice I am fairly sure the failures were present prior to my 
patch.

In any case, here is the latest patch!

 Some new examples and test cases for them.
 --

 Key: MAPREDUCE-2669
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2669
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: examples
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Minor
 Attachments: MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
 MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
 mapreduce-new-examples-0.22.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Looking to add some more examples such as Mean, Median, and Standard 
 Deviation to the examples.
 I have some generic JUnit testcases as well.





[jira] [Updated] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.

2011-07-18 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated MAPREDUCE-2627:


Status: Patch Available  (was: Open)

 guava-r09 JAR file needs to be added to mapreduce.
 --

 Key: MAPREDUCE-2627
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Blocker
 Attachments: patch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 Need to add the guava-r09.jar file into the 
 mapreduce/build/ivy/lib/Hadoop/common directory; missing from build.





[jira] [Updated] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.

2011-07-18 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated MAPREDUCE-2627:


Fix Version/s: 0.22.0

 guava-r09 JAR file needs to be added to mapreduce.
 --

 Key: MAPREDUCE-2627
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Blocker
 Fix For: 0.22.0

 Attachments: patch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 Need to add the guava-r09.jar file into the 
 mapreduce/build/ivy/lib/Hadoop/common directory; missing from build.





[jira] [Commented] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.

2011-07-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067402#comment-13067402
 ] 

Hadoop QA commented on MAPREDUCE-2627:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12485346/patch.txt
  against trunk revision 1147981.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/480//console

This message is automatically generated.

 guava-r09 JAR file needs to be added to mapreduce.
 --

 Key: MAPREDUCE-2627
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Blocker
 Fix For: 0.22.0

 Attachments: patch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 Need to add the guava-r09.jar file into the 
 mapreduce/build/ivy/lib/Hadoop/common directory; missing from build.





[jira] [Commented] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.

2011-07-18 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067406#comment-13067406
 ] 

Plamen Jeliazkov commented on MAPREDUCE-2627:
-

The QA bot ran against a trunk revision, but this patch is intended for branch 0.22.0.

 guava-r09 JAR file needs to be added to mapreduce.
 --

 Key: MAPREDUCE-2627
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Blocker
 Fix For: 0.22.0

 Attachments: patch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 Need to add the guava-r09.jar file into the 
 mapreduce/build/ivy/lib/Hadoop/common directory; missing from build.





[jira] [Updated] (MAPREDUCE-2707) ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc

2011-07-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated MAPREDUCE-2707:


Attachment: MAPREDUCE-2707.2.patch

 ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc
 --

 Key: MAPREDUCE-2707
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2707
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: MAPREDUCE-2707.2.patch


 ProtoOverHadoopRpcEngine is introduced in MR-279, which uses TunnelProtocol 
 over WritableRpcEngine. This jira removes the tunnel protocol and lets 
 ProtoOverHadoopRpcEngine directly interact with ipc.Client and ipc.Server.





[jira] [Commented] (MAPREDUCE-2707) ProtoOverHadoopRpcEngine without using TunnelProtocol over WritableRpc

2011-07-18 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067410#comment-13067410
 ] 

Jitendra Nath Pandey commented on MAPREDUCE-2707:
-

The patch uploaded is for MR-279 branch only.






[jira] [Commented] (MAPREDUCE-2669) Some new examples and test cases for them.

2011-07-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067451#comment-13067451
 ] 

Hadoop QA commented on MAPREDUCE-2669:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12486940/MAPREDUCE-2669.patch
  against trunk revision 1147981.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 12 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

-1 release audit.  The applied patch generated 3 release audit warnings 
(more than the trunk's current 2 warnings).

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestMRCLI
  org.apache.hadoop.fs.TestFileSystem

-1 contrib tests.  The patch failed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/479//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/479//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/479//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/479//console

This message is automatically generated.

 Some new examples and test cases for them.
 --

 Key: MAPREDUCE-2669
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2669
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: examples
Affects Versions: 0.22.0
Reporter: Plamen Jeliazkov
Priority: Minor
 Attachments: MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
 MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
 mapreduce-new-examples-0.22.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Looking to add some more examples such as Mean, Median, and Standard 
 Deviation to the examples.
 I have some generic JUnit testcases as well.





[jira] [Commented] (MAPREDUCE-2339) optimize JobInProgress.getTaskInProgress(taskid)

2011-07-18 Thread Liyin Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067452#comment-13067452
 ] 

Liyin Liang commented on MAPREDUCE-2339:


Nice patch!
A user submitted a job with more than 680,000 map tasks to our cluster. The 
JobTracker then became slow at processing heartbeats: many threads were blocked 
and lots of requests were queued. From a jstack of the JobTracker process, we 
found that most of the time was spent in JIP.getTaskInProgress().
This patch is a good way to improve the performance of JIP.getTaskInProgress() 
and it fixes our problem.

 optimize JobInProgress.getTaskInProgress(taskid)
 

 Key: MAPREDUCE-2339
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2339
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.20.2, 0.21.0
Reporter: Kang Xiao
 Attachments: MAPREDUCE-2339.patch, MAPREDUCE-2339.patch


 JobInProgress.getTaskInProgress(taskid) uses a linear search to get the 
 TaskInProgress object by taskid. In fact, it can be replaced by a much more 
 efficient array index operation.
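
The difference between the two lookups can be sketched as follows. This is a minimal illustration only: `TipLookupSketch`, `Tip`, `findLinear`, `findIndexed` and `makeTips` are hypothetical names, not the real JobInProgress/TaskInProgress code or the attached patch, and it assumes that the task at array position i has task index i.

```java
/** Hypothetical stand-ins; not the real JobInProgress/TaskInProgress classes. */
final class TipLookupSketch {

    /** Minimal task record holding only its numeric index within the job. */
    static final class Tip {
        final int index;
        Tip(int index) { this.index = index; }
    }

    /** Linear scan: every lookup walks the array until the index matches, O(n). */
    static Tip findLinear(Tip[] tips, int index) {
        for (Tip t : tips) {
            if (t.index == index) {
                return t;
            }
        }
        return null;
    }

    /** Direct array access, O(1). Assumes tips[i].index == i for every i. */
    static Tip findIndexed(Tip[] tips, int index) {
        if (index < 0 || index >= tips.length) {
            return null;
        }
        return tips[index];
    }

    /** Builds an array where position i holds the task with index i. */
    static Tip[] makeTips(int n) {
        Tip[] tips = new Tip[n];
        for (int i = 0; i < n; i++) {
            tips[i] = new Tip(i);
        }
        return tips;
    }

    public static void main(String[] args) {
        // Job size taken from the comment above (more than 680,000 map tasks).
        Tip[] tips = makeTips(680_000);
        Tip viaScan = findLinear(tips, 679_999);   // walks the whole array
        Tip viaIndex = findIndexed(tips, 679_999); // single array access
        System.out.println(viaScan == viaIndex);
    }
}
```

With a scan, the cost of each lookup grows with the number of tasks in the job, which is consistent with the heartbeat slowdown reported in the comment for a very large job.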





[jira] [Reopened] (MAPREDUCE-2694) AM releases too many containers due to the protocol

2011-07-18 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-2694:


  Assignee: (was: Arun C Murthy)

Reopening the issue as the discussion is still happening.

 AM releases too many containers due to the protocol
 ---

 Key: MAPREDUCE-2694
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2694
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Arun C Murthy

 - AM sends a request asking for 4 containers on host H1.
 - Asynchronously, host H1 reaches RM and gets assigned 4 containers. At this 
 point, RM sets the value against H1 to
 zero in its aggregate request-table for all apps.
 - In the meanwhile, AM comes to need 3 more containers, so a total of 7 
 including the 4 from the previous request.
 - Today, AM sends the absolute number of 7 against H1 to RM as part of its 
 request table.
 - RM seems to override its earlier value of zero against H1 with 7, and thus 
 allocates 7 more
 containers.
 - AM already gets 4 in this scheduling iteration, but gets 7 more, a total of 
 11 instead of the required 7.
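
The arithmetic in the steps above can be replayed with a toy model. This is a sketch under stated assumptions: `RequestTableSketch` and all of its methods are hypothetical and do not correspond to the real AM/RM protocol classes. The delta-based ask is shown only for contrast; the issue does not say which fix is adopted.

```java
/** Toy model of the RM's request-table entry for a single host (hypothetical). */
final class RequestTableSketch {
    private int outstanding = 0; // containers the RM still believes are wanted on H1
    private int allocated = 0;   // containers handed to the AM so far

    /** AM sends an absolute number; the RM overwrites its table entry. */
    void askAbsolute(int total) {
        outstanding = total;
    }

    /** Hypothetical alternative: AM sends only the additional containers it needs. */
    void askDelta(int additional) {
        outstanding += additional;
    }

    /** Host heartbeat: the RM assigns everything outstanding and zeroes the entry. */
    int heartbeat() {
        int granted = outstanding;
        allocated += granted;
        outstanding = 0;
        return granted;
    }

    int totalAllocated() {
        return allocated;
    }

    /** Replays the sequence from the issue description with absolute asks. */
    static int runAbsolute() {
        RequestTableSketch rm = new RequestTableSketch();
        rm.askAbsolute(4);
        rm.heartbeat();    // H1 gets 4, entry reset to zero
        rm.askAbsolute(7); // AM repeats the 4 it has not yet seen, plus 3 new
        rm.heartbeat();    // entry was overwritten to 7, so 7 more are granted
        return rm.totalAllocated();
    }

    /** Same sequence, but the second ask carries only the 3 new containers. */
    static int runDelta() {
        RequestTableSketch rm = new RequestTableSketch();
        rm.askDelta(4);
        rm.heartbeat();
        rm.askDelta(3);
        rm.heartbeat();
        return rm.totalAllocated();
    }

    public static void main(String[] args) {
        System.out.println("absolute asks: " + runAbsolute()); // prints "absolute asks: 11"
        System.out.println("delta asks: " + runDelta());       // prints "delta asks: 7"
    }
}
```

The over-allocation comes from the RM resetting its entry to zero on assignment while the AM keeps counting the already-assigned containers in its next absolute ask.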





[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

2011-07-18 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-2589:
-

Fix Version/s: 0.20.205.0

The patch looks good.

One minor nit:

 I think the variable name below:

{quote}
  long logRetainiMillSec = DEFAULT_USER_LOG_RETAIN_MAX_HOURS * 60 * 60 * 1000;
{quote}
was supposed to be logRetainMilliSec? (spelling mistake?)

Also, can you please post the ant test results on the jira? 

The patch lacks unit tests. Have you already verified the fix on a small 
cluster?

 TaskTracker not purging userlog directories
 ---

 Key: MAPREDUCE-2589
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2589
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 0.20.205.0
 Environment: 0.20.205
Reporter: Sherry Chen
Assignee: Sherry Chen
Priority: Minor
 Fix For: 0.20.205.0

 Attachments: MAPREDUCE-2589.patch, cleanup_userlogs.py


 UserLogCleaner is not robust. Leftover userlogs after a restart sometimes 
 have to be manually
 cleaned. Things can accumulate over a period of time.





[jira] [Updated] (MAPREDUCE-2691) Finish up the cleanup of distributed cache file resources and related tests.

2011-07-18 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-2691:
---

Summary: Finish up the cleanup of distributed cache file resources and 
related tests.  (was: Implement cleanup of distributed cache file resources)

bq. Vinod, I think you had a patch right?
Nope, not me. But [~chris.douglas] already pushed a patch to MR-279 branch.

But let's leave this open so that I can verify the fix and if possible look at 
the tests.

Changing the title to reflect the same.

 Finish up the cleanup of distributed cache file resources and related tests.
 

 Key: MAPREDUCE-2691
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2691
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Amol Kekre
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.0


 Implement cleanup of distributed cache file resources





[jira] [Updated] (MAPREDUCE-2708) Design and implement MR Application Master recovery

2011-07-18 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-2708:
--

Component/s: mrv2

 Design and implement MR Application Master recovery
 ---

 Key: MAPREDUCE-2708
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2708
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Reporter: Sharad Agarwal
Assignee: Sharad Agarwal

 Design recovery of MR AM from crashes/node failures. The running job should 
 recover from the state where it left off.





[jira] [Created] (MAPREDUCE-2708) Design and implement MR Application Master recovery

2011-07-18 Thread Sharad Agarwal (JIRA)
Design and implement MR Application Master recovery
---

 Key: MAPREDUCE-2708
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2708
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Sharad Agarwal
Assignee: Sharad Agarwal


Design recovery of MR AM from crashes/node failures. The running job should 
recover from the state where it left off.





[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

2011-07-18 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067516#comment-13067516
 ] 

Devaraj K commented on MAPREDUCE-2589:
--

One improvement can be made to the patch. Currently, for every file in the 
user log directory, it fetches the list of jobs that are yet to complete and 
checks against it. Instead, it can fetch the jobs list once and then check, for 
each file in the user log directory, whether it belongs to a running job or not.
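
The suggestion amounts to hoisting the job-list lookup out of the per-file loop. A minimal sketch, assuming log directories are named after job ids: `UserLogScanSketch`, `fetchRunningJobs` and the job names are hypothetical and are not the UserLogCleaner code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of fetching the running-jobs list once per scan. */
final class UserLogScanSketch {
    /** Counts how often the (assumed expensive) job list is fetched. */
    static int lookups = 0;

    /** Stand-in for asking the tracker which jobs have not completed yet. */
    static Set<String> fetchRunningJobs() {
        lookups++;
        return new HashSet<>(Arrays.asList("job_1", "job_3"));
    }

    /** Fetches the job list again for every log directory. */
    static List<String> staleDirsPerFile(List<String> logDirs) {
        List<String> stale = new ArrayList<>();
        for (String dir : logDirs) {
            if (!fetchRunningJobs().contains(dir)) {
                stale.add(dir);
            }
        }
        return stale;
    }

    /** Fetches the job list once and reuses it for all log directories. */
    static List<String> staleDirsFetchOnce(List<String> logDirs) {
        Set<String> running = fetchRunningJobs();
        List<String> stale = new ArrayList<>();
        for (String dir : logDirs) {
            if (!running.contains(dir)) {
                stale.add(dir);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("job_1", "job_2", "job_3", "job_4");

        lookups = 0;
        List<String> perFile = staleDirsPerFile(dirs);
        System.out.println(perFile + " after " + lookups + " lookups");

        lookups = 0;
        List<String> once = staleDirsFetchOnce(dirs);
        System.out.println(once + " after " + lookups + " lookups");
    }
}
```

Both variants select the same directories for cleanup. One trade-off of the second variant is that the job list is a snapshot taken at the start of the scan, so a job that starts during the scan would not be seen.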

