[jira] [Commented] (MAPREDUCE-3473) A single task tracker failure shouldn't result in Job failure

2011-12-22 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174685#comment-13174685
 ] 

Eli Collins commented on MAPREDUCE-3473:


bq. a single machine failure doesn't cause the job to fail; that's the whole 
point of Hadoop. smile.

It can, that's the point of this bug!

 A single task tracker failure shouldn't result in Job failure 
 --

 Key: MAPREDUCE-3473
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3473
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: tasktracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Eli Collins

 Currently, some task failures may result in job failures. For example, a 
 local TT disk failure seen in TaskLauncher#run, TaskRunner#run, or 
 MapTask#run is visible to, and can hang, the JobClient, causing the job to 
 fail. Job execution should always be able to survive a task failure if there 
 are sufficient resources. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-3473) A single task tracker failure shouldn't result in Job failure

2011-12-22 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174704#comment-13174704
 ] 

Eli Collins commented on MAPREDUCE-3473:


MAPREDUCE-2960 contains details for a specific example. I think what's going on 
is that the ability to tolerate disk failures now means you can get a set of 
task attempt failures on a single TT that would have just been one (because the 
TT used to stop itself when it saw a disk failure).





[jira] [Updated] (MAPREDUCE-3582) Move successfully passing MR1 tests to MR2 maven tree.

2011-12-22 Thread Ahmed Radwan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-3582:


Attachment: mv_script_MR-3582_rev2.sh
MAPREDUCE-3582_rev2.patch

Updated patch containing the successful tests in the fs, hdfs, and examples 
packages. The patch also contains a few updates to the pom.xml dependencies in 
the examples mvn module that are required for running the tests.
I am still examining the rest of the mr1 test classes and will update the 
patch.

 Move successfully passing MR1 tests to MR2 maven tree.
 --

 Key: MAPREDUCE-3582
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3582
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-3582.patch, MAPREDUCE-3582_rev2.patch, 
 mv_script_MR-3582.sh, mv_script_MR-3582_rev2.sh


 This ticket will track moving the mr1 tests that pass successfully to the mr2 
 maven tree.





[jira] [Commented] (MAPREDUCE-3586) Lots of AMs hanging around in PIG testing

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174788#comment-13174788
 ] 

Hudson commented on MAPREDUCE-3586:
---

Integrated in Hadoop-Hdfs-0.23-Build #115 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/115/])
MAPREDUCE-3586. Modified CompositeService to avoid duplicate stop 
operations thereby solving race conditions in MR AM shutdown. (vinodkv)
svn merge -c 1221950 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221951
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/service/CompositeService.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestCompositeService.java
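
The commit message above mentions avoiding duplicate stop operations. The 
following is a minimal sketch of that general idea (an idempotent stop guard), 
not the actual CompositeService change. It is written in Python for brevity, 
and every class, method, and field name in it is hypothetical.

```python
import threading


class Service:
    """Hypothetical service whose stop() runs its cleanup at most once."""

    def __init__(self, name):
        self.name = name
        self.stop_calls = 0            # how many times the real cleanup ran
        self._stopped = False
        self._lock = threading.Lock()

    def stop(self):
        # Only the first caller flips the flag; later callers return early.
        with self._lock:
            if self._stopped:
                return
            self._stopped = True
        self.stop_calls += 1           # stands in for the real cleanup work


class CompositeService(Service):
    """Hypothetical composite that stops its children, then itself."""

    def __init__(self, name):
        super().__init__(name)
        self.children = []

    def add(self, child):
        self.children.append(child)

    def stop(self):
        # Children are stopped in reverse order of addition. Each child's own
        # guard makes a second stop() call (for example, from a shutdown hook
        # racing with a normal shutdown) harmless.
        for child in reversed(self.children):
            child.stop()
        super().stop()
```

With this guard, calling stop() on the composite twice, or from two threads at 
once, still runs each service's cleanup only once.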


 Lots of AMs hanging around in PIG testing
 -

 Key: MAPREDUCE-3586
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3586
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3586-20111220.txt


 [~daijy] found this. Here's what he says:
 bq. I see hundreds of MRAppMaster processes on my machine, and lots of tests 
 fail with "Too many open files".





[jira] [Commented] (MAPREDUCE-3349) No rack-name logged in JobHistory for unsuccessful tasks

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174786#comment-13174786
 ] 

Hudson commented on MAPREDUCE-3349:
---

Integrated in Hadoop-Hdfs-0.23-Build #115 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/115/])
MAPREDUCE-3349. Log rack-name in JobHistory for unsuccessful tasks. 
(Contributed by Amar Kamat and Devaraj K)

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221939
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/MapAttemptFinishedEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/ReduceAttemptFinishedEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptFinishedEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptUnsuccessfulCompletionEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java/org/apache/hadoop/mapred/JobInProgress.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEvents.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/tools/rumen/TestRumenJobTraces.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/tools/data/rumen/small-trace-test/counters-test-trace.json.gz
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/tools/data/rumen/small-trace-test/dispatch-trace-output.json.gz
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/tools/data/rumen/small-trace-test/job-tracker-logs-topology-output
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/JobBuilder.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/ParsedHost.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/TaskAttempt20LineEventEmitter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/TopologyBuilder.java


 No rack-name logged in JobHistory for unsuccessful tasks
 

 Key: MAPREDUCE-3349
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3349
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Amar Kamat
Priority: Blocker
  Labels: hostname, rackname, rumen, unsuccessful
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3349-v1.11-branch-0.23.patch, 
 MAPREDUCE-3349-v1.11.patch, MAPREDUCE-3349-v1.4.patch, 
 MAPREDUCE-3349-v1.6.patch, MAPREDUCE-3349.patch


 Found this while running jobs on a cluster with [~Karams].
 This is because the TaskAttemptUnsuccessfulCompletionEvent history record 
 doesn't have a rack field.





[jira] [Commented] (MAPREDUCE-3586) Lots of AMs hanging around in PIG testing

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174790#comment-13174790
 ] 

Hudson commented on MAPREDUCE-3586:
---

Integrated in Hadoop-Hdfs-trunk #902 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/902/])
MAPREDUCE-3586. Modified CompositeService to avoid duplicate stop 
operations thereby solving race conditions in MR AM shutdown. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221950
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/service/CompositeService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestCompositeService.java






[jira] [Commented] (MAPREDUCE-3349) No rack-name logged in JobHistory for unsuccessful tasks

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174793#comment-13174793
 ] 

Hudson commented on MAPREDUCE-3349:
---

Integrated in Hadoop-Mapreduce-0.23-Build #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/136/])
MAPREDUCE-3349. Log rack-name in JobHistory for unsuccessful tasks. 
(Contributed by Amar Kamat and Devaraj K)

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221939
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/MapAttemptFinishedEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/ReduceAttemptFinishedEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptFinishedEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptUnsuccessfulCompletionEvent.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java/org/apache/hadoop/mapred/JobInProgress.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEvents.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/tools/rumen/TestRumenJobTraces.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/tools/data/rumen/small-trace-test/counters-test-trace.json.gz
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/tools/data/rumen/small-trace-test/dispatch-trace-output.json.gz
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/tools/data/rumen/small-trace-test/job-tracker-logs-topology-output
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/HadoopLogsAnalyzer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/JobBuilder.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/ParsedHost.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/TaskAttempt20LineEventEmitter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/tools/org/apache/hadoop/tools/rumen/TopologyBuilder.java






[jira] [Commented] (MAPREDUCE-3586) Lots of AMs hanging around in PIG testing

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174795#comment-13174795
 ] 

Hudson commented on MAPREDUCE-3586:
---

Integrated in Hadoop-Mapreduce-0.23-Build #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/136/])
MAPREDUCE-3586. Modified CompositeService to avoid duplicate stop 
operations thereby solving race conditions in MR AM shutdown. (vinodkv)
svn merge -c 1221950 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221951
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/service/CompositeService.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestCompositeService.java






[jira] [Commented] (MAPREDUCE-3490) RMContainerAllocator counts failed maps towards Reduce ramp up

2011-12-22 Thread Sharad Agarwal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174797#comment-13174797
 ] 

Sharad Agarwal commented on MAPREDUCE-3490:
---

Hi Arun - I just had a brief look at the patch. It seems you don't need new 
Container Allocator types. RMContainerAllocator already receives the 
CONTAINER_FAILED event, and completedMaps includes both succeeded and failed 
maps.

Instead of using completedMaps in the calculation, it can use succeededMaps:

succeededMaps = completedMaps - failedMaps 
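
The suggestion above can be sketched as follows. This is only an illustration 
in Python, not the contents of any attached patch; the function names and the 
threshold parameter are hypothetical.

```python
def map_progress(completed_maps, failed_maps, total_maps):
    """Fraction of maps that have succeeded.

    completed_maps is assumed to count both succeeded and failed maps,
    as described in the comment above.
    """
    if total_maps <= 0:
        return 0.0
    succeeded_maps = completed_maps - failed_maps
    return succeeded_maps / total_maps


def reduces_may_start(completed_maps, failed_maps, total_maps, threshold):
    """True once the succeeded fraction reaches a (hypothetical) threshold."""
    return map_progress(completed_maps, failed_maps, total_maps) >= threshold
```

With 10 total maps of which 4 have completed and all 4 failed, the progress is 
0, so failed maps no longer push reduces toward launch.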

 RMContainerAllocator counts failed maps towards Reduce ramp up
 --

 Key: MAPREDUCE-3490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, 
 MAPREDUCE-3490.patch, MAPREDUCE-3490.patch


 The RMContainerAllocator does not differentiate between failed and successful 
 maps while calculating whether reduce tasks are ready to launch. Failed tasks 
 are also counted towards total completed tasks. 
 Example. 4 failed maps, 10 total maps. Map%complete = 4/14 * 100 instead of 
 being 0.





[jira] [Updated] (MAPREDUCE-3490) RMContainerAllocator counts failed maps towards Reduce ramp up

2011-12-22 Thread Sharad Agarwal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-3490:
--

Attachment: MR-3490-alternate.patch

Here is a quick patch for illustration.

 RMContainerAllocator counts failed maps towards Reduce ramp up
 --

 Key: MAPREDUCE-3490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, 
 MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, MR-3490-alternate.patch


 The RMContainerAllocator does not differentiate between failed and successful 
 maps while calculating whether reduce tasks are ready to launch. Failed tasks 
 are also counted towards total completed tasks. 
 Example. 4 failed maps, 10 total maps. Map%complete = 4/14 * 100 instead of 
 being 0.





[jira] [Commented] (MAPREDUCE-3586) Lots of AMs hanging around in PIG testing

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174807#comment-13174807
 ] 

Hudson commented on MAPREDUCE-3586:
---

Integrated in Hadoop-Mapreduce-trunk #935 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/935/])
MAPREDUCE-3586. Modified CompositeService to avoid duplicate stop 
operations thereby solving race conditions in MR AM shutdown. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1221950
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/service/CompositeService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestCompositeService.java






[jira] [Commented] (MAPREDUCE-3490) RMContainerAllocator counts failed maps towards Reduce ramp up

2011-12-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174816#comment-13174816
 ] 

Hadoop QA commented on MAPREDUCE-3490:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12508387/MR-3490-alternate.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1494//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1494//console

This message is automatically generated.





[jira] [Created] (MAPREDUCE-3594) Contrib/Streaming - Test org.apache.hadoop.streaming.TestUlimit fails on VM

2011-12-22 Thread Benoy Antony (Created) (JIRA)
Contrib/Streaming - Test org.apache.hadoop.streaming.TestUlimit fails on VM
---

 Key: MAPREDUCE-3594
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3594
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.22.0
 Environment: Red Hat Enterprise Linux Server release 6.1 (Santiago)
Reporter: Benoy Antony
Priority: Minor



The TestUlimit test works as follows: 

The testcase sets the upper limit for virtual memory to 768 MB in the jobconf. 
It starts a map-only job. 
The task gets the applicable ulimit from the shell and writes it as the 
output. 
The testcase waits for the job to complete and compares the job output with 
the ulimit originally set in the jobconf. 

But this testcase fails because all the task attempts fail with the following 
exception: 

java.lang.Throwable: Child Error 
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:225) 
Caused by: java.io.IOException: Task process exit with nonzero status of 134. 
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:212) 

So there is no job output. 

The test passes on my developer machine, but fails on the build machine, which 
is a VM. The build machine OS is Red Hat Enterprise Linux Server release 6.1 
(Santiago).
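
As a debugging aid, exit status 134 in the trace above is consistent with the 
common Unix convention of reporting 128 + N for a process terminated by signal 
N, which would make it signal 6 (SIGABRT on Linux). The report itself does not 
state this, so treat it as an assumption. A minimal Python sketch of the 
decoding, with a hypothetical helper name:

```python
import signal


def decode_exit_status(status):
    """Interpret an exit status under the 128 + N signal convention.

    Returns (signal_number, signal_name) when the status is above 128,
    otherwise None. The name lookup is platform dependent, so an unknown
    number is reported as "unknown" rather than raising.
    """
    if status <= 128:
        return None
    signum = status - 128
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = "unknown"
    return signum, name
```

Calling decode_exit_status(134) yields signal number 6, which may help narrow 
down why the child process aborts under the configured ulimit.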





[jira] [Commented] (MAPREDUCE-3360) Provide information about lost nodes in the UI.

2011-12-22 Thread Bh V S Kamesh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174947#comment-13174947
 ] 

Bh V S Kamesh commented on MAPREDUCE-3360:
--

Hi Jason,

Thanks for the comments. I will incorporate them in my next patch, but before 
submitting it I would like to clarify the following.

When the RM does not receive a node heartbeat from an NM for the *node expiry* 
interval, the RM removes the NM from its RM nodes map on the node *EXPIRE* 
event. Before removing the NM, the corresponding cluster metrics are updated 
(in this case, the *lost* node count is incremented).

If the same NM sends a heartbeat after the above operation, the RM checks 
whether there is any node corresponding to this NodeId. If the RM does not find 
one, it simply returns *reboot* as its heartbeat response.
Before sending its heartbeat response, the RM again updates the cluster metrics 
(this time incrementing the *reboot* node count).

Is it necessary to update different metrics for the same node's unavailability?
IMO, it shows incorrect information. I *think* we need to update either the 
*lost* node count or the *reboot* node count, but not both, in such a 
circumstance.

Any comments?
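
The sequence described above can be modelled with a small sketch. It is a toy 
illustration in Python, not the ResourceManager code; all class, method, and 
field names are hypothetical.

```python
class ClusterMetrics:
    """Hypothetical counters mirroring the *lost* and *reboot* node counts."""

    def __init__(self):
        self.lost = 0
        self.rebooted = 0


class ResourceManager:
    """Toy model of the expiry and heartbeat handling described above."""

    def __init__(self):
        self.nodes = {}                  # node id -> state
        self.metrics = ClusterMetrics()

    def register(self, node_id):
        self.nodes[node_id] = "RUNNING"

    def expire(self, node_id):
        # No heartbeat within the expiry interval: count the node as lost,
        # then drop it from the nodes map.
        if node_id in self.nodes:
            self.metrics.lost += 1
            del self.nodes[node_id]

    def heartbeat(self, node_id):
        # A heartbeat from an unknown node is answered with "reboot",
        # and the reboot counter is incremented as well.
        if node_id not in self.nodes:
            self.metrics.rebooted += 1
            return "reboot"
        return "ok"
```

Registering a node, expiring it, and then receiving one late heartbeat leaves 
both counters at 1 for a single unavailable node, which is the double counting 
being questioned.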

 Provide information about lost nodes in the UI.
 ---

 Key: MAPREDUCE-3360
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3360
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
 Environment: NA
Reporter: Bh V S Kamesh
 Attachments: LostNodes.png, MAPREDUCE-3360-1.patch, 
 MAPREDUCE-3360.patch, lostNodes.png


 Currently there is no information provided about *lost nodes*. Provide 
 information in the UI. 





[jira] [Updated] (MAPREDUCE-3594) Contrib/Streaming - Test org.apache.hadoop.streaming.TestUlimit fails on VM

2011-12-22 Thread Konstantin Shvachko (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-3594:
---

 Description: 
The TestUlimit test is as follows : 

The testcase sets the upper limit for virtual memory to 768 MB in the jobconf 
Start a maponly job. 
Let the task get the applicable ulimit from the shell and write it as the 
output. 
The testcase will wait for the completion of the job and compare the joboutput 
with the ulimit originally set in the jobconf 

But this testcase fails because all the task attempts fail with the following 
exception 

java.lang.Throwable: Child Error 
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:225) 
Caused by: java.io.IOException: Task process exit with nonzero status of 134. 
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:212) 

So there is no job output . 

The Test passes on my developer machine, but fails on the build machine which 
is a VM. The build machine OS is Red Hat Enterprise Linux Server release 6.1 
(Santiago)


Target Version/s: 0.22.1

 Contrib/Streaming - Test org.apache.hadoop.streaming.TestUlimit fails on VM
 ---

 Key: MAPREDUCE-3594
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3594
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.22.0
 Environment: Red Hat Enterprise Linux Server release 6.1 (Santiago)
Reporter: Benoy Antony
Priority: Minor

 The TestUlimit test is as follows : 
 The testcase sets the upper limit for virtual memory to 768 MB in the jobconf 
 Start a maponly job. 
 Let the task get the applicable ulimit from the shell and write it as the 
 output. 
 The testcase will wait for the completion of the job and compare the 
 joboutput with the ulimit originally set in the jobconf 
 But this testcase fails because all the task attempts fail with the following 
 exception 
 java.lang.Throwable: Child Error 
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:225) 
 Caused by: java.io.IOException: Task process exit with nonzero status of 134. 
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:212) 
 So there is no job output . 
 The Test passes on my developer machine, but fails on the build machine which 
 is a VM. The build machine OS is Red Hat Enterprise Linux Server release 6.1 
 (Santiago)
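
 A note on the exit status in the trace above: by common shell convention, an
 exit status of 128 + n means the process was terminated by signal n, so 134
 may correspond to signal 6 (SIGABRT on Linux). The helper below is only a
 sketch of that arithmetic; the class and method names are hypothetical, and
 whether the task JVM actually aborted is not established by this report.

```java
public class ExitStatusSketch {
    /**
     * Returns the signal number implied by a shell-style exit status,
     * or -1 if the status does not follow the 128 + n convention.
     */
    public static int impliedSignal(int exitStatus) {
        // By common shell convention, 128 + n means "terminated by signal n".
        return (exitStatus > 128 && exitStatus < 256) ? exitStatus - 128 : -1;
    }

    public static void main(String[] args) {
        int status = 134; // value reported in the stack trace above
        System.out.println("exit status " + status + " -> signal " + impliedSignal(status));
    }
}
```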





[jira] [Resolved] (MAPREDUCE-3467) Mavenizing har

2011-12-22 Thread Tom White (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White resolved MAPREDUCE-3467.
--

Resolution: Duplicate

This was fixed in HADOOP-7810.

 Mavenizing har
 --

 Key: MAPREDUCE-3467
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3467
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.1
Reporter: John George
Priority: Critical

 As part of mapreduce mavenization, har should also be mavenized and added to 
 maven repo





[jira] [Updated] (MAPREDUCE-3568) Optimize Job's progress calculations in MR AM

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3568:
---

Status: Open  (was: Patch Available)

 Optimize Job's progress calculations in MR AM
 -

 Key: MAPREDUCE-3568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3568
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3568-20111215.1.txt, 
 MAPREDUCE-3568-20111220.txt, MAPREDUCE-3568-20111222.txt


 Besides catering to client requests, Job progress is calculated in every 
 heartbeat to the RM so as to print the MR AM's progress. Today the map and 
 reduce progresses are calculated by looking up each task in a big map, 
 while we could simply make do with a scan and aggregate. With a large number 
 of tasks, this can make a difference.
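
 The difference between the two approaches can be sketched roughly as follows.
 This is not the MR AM code or the attached patch: the Task class, the
 TaskType enum, and both method names are hypothetical stand-ins, and progress
 is assumed to be a plain average of per-task fractions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProgressSketch {
    public enum TaskType { MAP, REDUCE }

    /** Hypothetical stand-in for a task; not the MR AM's actual Task interface. */
    public static final class Task {
        final TaskType type;
        final float progress; // fraction complete, 0.0 to 1.0

        public Task(TaskType type, float progress) {
            this.type = type;
            this.progress = progress;
        }
    }

    /** Lookup style: one map lookup per task id, repeated on every call. */
    public static float mapProgressByLookup(Map<Integer, Task> tasks, List<Integer> mapTaskIds) {
        float sum = 0f;
        for (Integer id : mapTaskIds) {
            sum += tasks.get(id).progress;
        }
        return mapTaskIds.isEmpty() ? 0f : sum / mapTaskIds.size();
    }

    /** Scan style: one pass over all tasks, aggregating map and reduce progress together. */
    public static float[] progressByScan(Iterable<Task> tasks) {
        float mapSum = 0f, reduceSum = 0f;
        int maps = 0, reduces = 0;
        for (Task t : tasks) {
            if (t.type == TaskType.MAP) {
                mapSum += t.progress;
                maps++;
            } else {
                reduceSum += t.progress;
                reduces++;
            }
        }
        return new float[] {
            maps == 0 ? 0f : mapSum / maps,
            reduces == 0 ? 0f : reduceSum / reduces
        };
    }

    public static void main(String[] args) {
        Map<Integer, Task> tasks = new LinkedHashMap<>();
        tasks.put(0, new Task(TaskType.MAP, 0.5f));
        tasks.put(1, new Task(TaskType.MAP, 1.0f));
        tasks.put(2, new Task(TaskType.REDUCE, 0.25f));

        List<Integer> mapIds = new ArrayList<>();
        mapIds.add(0);
        mapIds.add(1);

        float[] scanned = progressByScan(tasks.values());
        System.out.println("lookup map progress = " + mapProgressByLookup(tasks, mapIds));
        System.out.println("scan map progress = " + scanned[0] + ", reduce progress = " + scanned[1]);
    }
}
```

 Both styles return the same numbers; the scan simply avoids a hash lookup per
 task on every heartbeat.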





[jira] [Updated] (MAPREDUCE-3568) Optimize Job's progress calculations in MR AM

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3568:
---

Attachment: MAPREDUCE-3568-20111222.txt

Patch fixing tests. Took longer than I anticipated.

 Optimize Job's progress calculations in MR AM
 -

 Key: MAPREDUCE-3568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3568
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3568-20111215.1.txt, 
 MAPREDUCE-3568-20111220.txt, MAPREDUCE-3568-20111222.txt


 Besides catering to client requests, Job progress is calculated in every 
 heartbeat to the RM so as to print the MR AM's progress. Today the map and 
 reduce progresses are calculated by looking up each task in a big map, 
 while we could simply make do with a scan and aggregate. With a large number 
 of tasks, this can make a difference.





[jira] [Updated] (MAPREDUCE-3568) Optimize Job's progress calculations in MR AM

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3568:
---

Status: Patch Available  (was: Open)

 Optimize Job's progress calculations in MR AM
 -

 Key: MAPREDUCE-3568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3568
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3568-20111215.1.txt, 
 MAPREDUCE-3568-20111220.txt, MAPREDUCE-3568-20111222.txt


 Besides catering to client requests, Job progress is calculated in every 
 heartbeat to the RM so as to print the MR AM's progress. Today the map and 
 reduce progresses are calculated by looking up each task in a big map, 
 while we could simply make do with a scan and aggregate. With a large number 
 of tasks, this can make a difference.





[jira] [Resolved] (MAPREDUCE-1447) job level hook in OutputCommitter is not working in local mode

2011-12-22 Thread Arun C Murthy (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved MAPREDUCE-1447.
--

Resolution: Duplicate

Resolved via MAPREDUCE-3563.

 job level hook in OutputCommitter is not working in local mode
 --

 Key: MAPREDUCE-1447
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1447
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Daniel Dai

 OutputCommitter is not totally working in local mode. Only task level hooks 
 are called, which are setupTask, needsTaskCommit, commitTask, abortTask. Job 
 level hooks are not working, which are: setupJob, cleanupJob.





[jira] [Updated] (MAPREDUCE-3354) JobHistoryServer should be started by bin/mapred and not by bin/yarn

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3354:
-

Target Version/s: 0.23.1, 0.24.0  (was: 0.24.0, 0.23.1)
  Status: Open  (was: Patch Available)

Jon,
 While you are at it, can you please clean up bin/mapred as well? It still has 
help statements for jobtracker/tasktracker and commands for acting upon 
jobtracker/tasktracker.

 JobHistoryServer should be started by bin/mapred and not by bin/yarn
 

 Key: MAPREDUCE-3354
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3354
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Jonathan Eagles
Priority: Blocker
 Attachments: MAPREDUCE-3354.patch, MAPREDUCE-3354.patch, 
 MAPREDUCE-3354.patch, MAPREDUCE-3354.patch


 JobHistoryServer belongs to mapreduce land.





[jira] [Updated] (MAPREDUCE-3399) ContainerLocalizer should request new resources after completing the current one

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3399:
--

Attachment: MR3399_v1.txt

Simple patch. Tested on a local cluster - to verify localization is not waiting 
after a download completes.
Don't think this requires explicit unit tests - functionality is being verified 
by existing ones.

 ContainerLocalizer should request new resources after completing the current 
 one
 

 Key: MAPREDUCE-3399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3399
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Blocker
 Attachments: MR3399_v1.txt


 Currently, the ContainerLocalizer heartbeats to the NM every second. 
 Not very significant, but this causes a ~4 second delay in jobs (job jar, 
 splits, etc). Instead, it should heartbeat to ask for additional resources to 
 localize as soon as the previous one is localized. There's already a TODO in 
 the ContainerLocalizer for this.





[jira] [Updated] (MAPREDUCE-3399) ContainerLocalizer should request new resources after completing the current one

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3399:
--

Status: Patch Available  (was: Open)

 ContainerLocalizer should request new resources after completing the current 
 one
 

 Key: MAPREDUCE-3399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3399
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Blocker
 Attachments: MR3399_v1.txt


 Currently, the ContainerLocalizer heartbeats to the NM every second. 
 Not very significant, but this causes a ~4 second delay in jobs (job jar, 
 splits, etc). Instead, it should heartbeat to ask for additional resources to 
 localize as soon as the previous one is localized. There's already a TODO in 
 the ContainerLocalizer for this.





[jira] [Updated] (MAPREDUCE-3399) ContainerLocalizer should request new resources after completing the current one

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3399:
--

Status: Open  (was: Patch Available)

Canceling - there are some test failures which need to be fixed.

 ContainerLocalizer should request new resources after completing the current 
 one
 

 Key: MAPREDUCE-3399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3399
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Blocker
 Attachments: MR3399_v1.txt


 Currently, the ContainerLocalizer heartbeats to the NM every second. 
 Not very significant, but this causes a ~4 second delay in jobs (job jar, 
 splits, etc). Instead, it should heartbeat to ask for additional resources to 
 localize as soon as the previous one is localized. There's already a TODO in 
 the ContainerLocalizer for this.





[jira] [Commented] (MAPREDUCE-3490) RMContainerAllocator counts failed maps towards Reduce ramp up

2011-12-22 Thread Arun C Murthy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175002#comment-13175002
 ] 

Arun C Murthy commented on MAPREDUCE-3490:
--

Sharad, eventually I think we need to stop tracking this in 
RMContainerAllocator and rather rely on Job. For now, my patch seems the 
closest approximation to that (being conservative). Does that make sense? I'll 
file a separate jira to stop tracking these in RMContainerAllocator.

 RMContainerAllocator counts failed maps towards Reduce ramp up
 --

 Key: MAPREDUCE-3490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, 
 MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, MR-3490-alternate.patch


 The RMContainerAllocator does not differentiate between failed and successful 
 maps while calculating whether reduce tasks are ready to launch. Failed tasks 
 are also counted towards total completed tasks. 
 Example. 4 failed maps, 10 total maps. Map%complete = 4/14 * 100 instead of 
 being 0.
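
 The arithmetic in the example can be sketched as below. This is an
 illustration only, not the RMContainerAllocator code: both method names are
 hypothetical, and the first method is just one reading that reproduces the
 4/14 figure (it assumes each failed map also adds a retry to the total).

```java
public class RampUpSketch {
    /** Counts failed maps as completed; one reading that reproduces the 4/14 figure. */
    public static float percentCountingFailures(int succeeded, int failed, int totalMaps) {
        int completed = succeeded + failed;
        int denominator = totalMaps + failed; // assumes each failure adds a retry to the total
        return denominator == 0 ? 0f : 100f * completed / denominator;
    }

    /** Counts only successful maps against the job's map count. */
    public static float percentSuccessfulOnly(int succeeded, int totalMaps) {
        return totalMaps == 0 ? 0f : 100f * succeeded / totalMaps;
    }

    public static void main(String[] args) {
        int succeeded = 0, failed = 4, totalMaps = 10;
        System.out.println("counting failures: "
            + percentCountingFailures(succeeded, failed, totalMaps));
        System.out.println("successful only:   "
            + percentSuccessfulOnly(succeeded, totalMaps));
    }
}
```

 With no successful maps, the second calculation stays at 0, so reduces would
 not be ramped up on the strength of failed attempts.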





[jira] [Created] (MAPREDUCE-3595) Add missing TestCounters#testCounterValue test from branch 1 to 0.23

2011-12-22 Thread Tom White (Created) (JIRA)
Add missing TestCounters#testCounterValue test from branch 1 to 0.23


 Key: MAPREDUCE-3595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3595
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-3595.patch







[jira] [Updated] (MAPREDUCE-3595) Add missing TestCounters#testCounterValue test from branch 1 to 0.23

2011-12-22 Thread Tom White (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-3595:
-

Attachment: MAPREDUCE-3595.patch

 Add missing TestCounters#testCounterValue test from branch 1 to 0.23
 

 Key: MAPREDUCE-3595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3595
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-3595.patch








[jira] [Updated] (MAPREDUCE-3595) Add missing TestCounters#testCounterValue test from branch 1 to 0.23

2011-12-22 Thread Tom White (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-3595:
-

Status: Patch Available  (was: Open)

 Add missing TestCounters#testCounterValue test from branch 1 to 0.23
 

 Key: MAPREDUCE-3595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3595
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-3595.patch








[jira] [Updated] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-3462:


Attachment: MAPREDUCE-3462.branch-0.23.patch

Inside Configuration.getStrings(), 
getProps().getProperty(mapreduce.job.hdfs-servers) returns 
$fs.default.name. 
System.getProperty(fs.default.name) returns the literal string 
${fs.default.name}, which is then expanded over and over until the maximum 
substitution depth is exceeded, causing the exception.

This patch simply sets mapreduce.job.hdfs-servers to an empty string so that it 
doesn't return $fs.default.name (as defined in yarn-default.xml)
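
The failure mode described above can be reproduced with a small stand-alone 
sketch. This is not Hadoop's Configuration code: the class, the regular 
expression, and the substitute method are hypothetical, and only the depth 
limit of 20 and the exception message are taken from the reported error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstitutionSketch {
    // Matches ${name}; a simplified pattern assumed for illustration.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$\\s]+)\\}");
    private static final int MAX_DEPTH = 20; // limit quoted in the reported error

    /** Expands ${name} references using props, giving up after MAX_DEPTH rounds. */
    public static String substitute(String expr, Map<String, String> props) {
        String eval = expr;
        for (int depth = 0; depth < MAX_DEPTH; depth++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval; // nothing left to expand
            }
            String value = props.get(m.group(1));
            if (value == null) {
                return eval; // unknown variable: leave it unexpanded
            }
            eval = eval.substring(0, m.start()) + value + eval.substring(m.end());
        }
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_DEPTH + " " + expr);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // A property whose value refers to itself never finishes expanding.
        props.put("fs.default.name", "${fs.default.name}");
        try {
            substitute("${fs.default.name}", props);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A value that resolves to a plain string returns after one round; only the 
self-referencing value exhausts the depth limit.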

 Job submission failing in JUnit tests
 -

 Key: MAPREDUCE-3462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.24.0
Reporter: Amar Kamat
Assignee: Ravi Prakash
Priority: Blocker
  Labels: junit, test
 Fix For: 0.24.0

 Attachments: MAPREDUCE-3462.branch-0.23.patch


 When I run JUnit tests (e.g. TestGridmixSubmission), I see job submission 
 failing with the following error:
 {noformat}
 java.lang.IllegalStateException: Variable substitution depth too large: 20 
 ${fs.default.name}
 at 
 org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
 at 
 org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
 at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
 at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
 {noformat}





[jira] [Updated] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-3462:


  Description: 
When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and 
TestCompressionEmulationUtils), I see job submission failing with the following 
error:
{noformat}
java.lang.IllegalStateException: Variable substitution depth too large: 20 
${fs.default.name}
at 
org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
at 
org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
at 
org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
at 
org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
at 
org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
{noformat}

  was:
When I run JUnit tests (e.g. TestGridmixSubmission), I see job submission 
failing with the following error:
{noformat}
java.lang.IllegalStateException: Variable substitution depth too large: 20 
${fs.default.name}
at 
org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
at 
org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
at 
org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
at 
org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
at 
org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
{noformat}

 Target Version/s: 0.23.1  (was: 0.24.0)
Affects Version/s: 0.23.0  (was: 0.24.0)
    Fix Version/s: (was: 0.24.0)

 Job submission failing in JUnit tests
 -

 Key: MAPREDUCE-3462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Ravi Prakash
Priority: Blocker
  Labels: junit, test
 Attachments: 

[jira] [Updated] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-3462:


Status: Patch Available  (was: Open)

 Job submission failing in JUnit tests
 -

 Key: MAPREDUCE-3462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Ravi Prakash
Priority: Blocker
  Labels: junit, test
 Attachments: MAPREDUCE-3462.branch-0.23.patch


 When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and 
 TestCompressionEmulationUtils), I see job submission failing with the 
 following error:
 {noformat}
 java.lang.IllegalStateException: Variable substitution depth too large: 20 
 ${fs.default.name}
 at 
 org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
 at 
 org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
 at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
 at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
 {noformat}





[jira] [Updated] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-3462:


Component/s: mrv2

 Job submission failing in JUnit tests
 -

 Key: MAPREDUCE-3462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Ravi Prakash
Priority: Blocker
  Labels: junit, test
 Attachments: MAPREDUCE-3462.branch-0.23.patch


 When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and 
 TestCompressionEmulationUtils), I see job submission failing with the 
 following error:
 {noformat}
 java.lang.IllegalStateException: Variable substitution depth too large: 20 
 ${fs.default.name}
 at 
 org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
 at 
 org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
 at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
 at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
 {noformat}
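
The message in the trace above has the shape of a bounded variable-expansion loop that never reaches a fixed point. The following is a minimal sketch of that mechanism, not the Hadoop Configuration code itself: the class SubstitutionDepthDemo, the substitute helper and the self-referencing property value are all hypothetical, and the limit of 20 is taken only from the number printed in the error. The report does not say what actually caused the loop in these tests.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstitutionDepthDemo {
    // Assumed limit, taken from the "20" in the error message.
    static final int MAX_SUBST = 20;
    // Matches ${name} where name contains neither '}' nor '$'.
    static final Pattern VAR = Pattern.compile("\\$\\{([^}$]+)\\}");

    /** Expands ${name} references from props, giving up after MAX_SUBST rounds. */
    static String substitute(String expr, Map<String, String> props) {
        String eval = expr;
        for (int i = 0; i < MAX_SUBST; i++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval; // nothing left to expand
            }
            String value = props.get(m.group(1));
            if (value == null) {
                return eval; // unknown variable: leave it unexpanded
            }
            // Replace only the first reference, then rescan from the start.
            eval = eval.substring(0, m.start()) + value + eval.substring(m.end());
        }
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_SUBST + " " + expr);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<String, String>();
        // Hypothetical misconfiguration: the value refers back to its own key.
        props.put("fs.default.name", "${fs.default.name}");
        try {
            substitute("${fs.default.name}", props);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model any cycle between property values, whether direct or through a second key, exhausts the round budget in the same way.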





[jira] [Updated] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3567:
---

Status: Open  (was: Patch Available)

 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.
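
The heap effect described above can be illustrated with a small sketch. It does not use the real JobConf, JobImpl or TaskImpl classes: Conf and Task below are hypothetical stand-ins, and the instance counter exists only to make the difference visible. It is a model of the general pattern (one configuration copy per task versus one shared reference), not of the actual patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedConfDemo {
    /** Hypothetical stand-in for a heavyweight configuration object such as JobConf. */
    static final class Conf {
        static int instances = 0;              // counts how many Conf objects were built
        final Map<String, String> props;

        Conf(Map<String, String> props) {
            this.props = new HashMap<String, String>(props); // each instance owns a copy
            instances++;
        }

        Conf(Conf other) {
            this(other.props);
        }
    }

    /** Hypothetical task that only reads its configuration. */
    static final class Task {
        final Conf conf;

        Task(Conf conf) {
            this.conf = conf;
        }
    }

    /** The wasteful pattern: one private Conf copy per task. */
    static List<Task> createTasksCopying(Conf jobConf, int n) {
        List<Task> tasks = new ArrayList<Task>();
        for (int i = 0; i < n; i++) {
            tasks.add(new Task(new Conf(jobConf)));
        }
        return tasks;
    }

    /** The cheaper pattern: every task holds a reference to the same Conf. */
    static List<Task> createTasksSharing(Conf jobConf, int n) {
        List<Task> tasks = new ArrayList<Task>();
        for (int i = 0; i < n; i++) {
            tasks.add(new Task(jobConf));
        }
        return tasks;
    }

    public static void main(String[] args) {
        Conf jobConf = new Conf(new HashMap<String, String>());
        int before = Conf.instances;
        createTasksCopying(jobConf, 1000);
        System.out.println("extra Conf objects when copying: " + (Conf.instances - before));
        before = Conf.instances;
        createTasksSharing(jobConf, 1000);
        System.out.println("extra Conf objects when sharing: " + (Conf.instances - before));
    }
}
```

Sharing is only safe when the tasks treat the configuration as read-only, which is an assumption of this sketch.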





[jira] [Updated] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3567:
---

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Updated] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3567:
---

Attachment: MAPREDUCE-3567-20111222.txt

Some of the javac warnings are bogus: they are related to JobConf deprecation, 
and I have already suppressed them. This patch addresses the couple of 
warnings that are valid.

 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Commented] (MAPREDUCE-3568) Optimize Job's progress calculations in MR AM

2011-12-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175028#comment-13175028
 ] 

Hadoop QA commented on MAPREDUCE-3568:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12508419/MAPREDUCE-3568-20111222.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1495//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1495//console

This message is automatically generated.

 Optimize Job's progress calculations in MR AM
 -

 Key: MAPREDUCE-3568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3568
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3568-20111215.1.txt, 
 MAPREDUCE-3568-20111220.txt, MAPREDUCE-3568-20111222.txt


 Besides being computed for client requests, job progress is calculated on 
 every heartbeat to the RM so that the MR AM's progress can be reported. 
 Today the map and reduce progress values are calculated by looking up each 
 task in a big map, whereas a single scan that aggregates as it goes would 
 do. With a large number of tasks, this can make a difference.
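
The scan-and-aggregate idea can be sketched as follows. This is an illustration only, not the patch attached to this issue: ProgressScanDemo, Task and TaskType are hypothetical names, and averaging per-task progress within each type is an assumption about how the two values are defined.

```java
import java.util.ArrayList;
import java.util.List;

public class ProgressScanDemo {
    enum TaskType { MAP, REDUCE }

    /** Hypothetical task record: a type plus a progress value in [0, 1]. */
    static final class Task {
        final TaskType type;
        final float progress;

        Task(TaskType type, float progress) {
            this.type = type;
            this.progress = progress;
        }
    }

    /**
     * Returns {mapProgress, reduceProgress}, each the mean progress of the tasks
     * of that type, computed in one pass with no per-task lookups by id.
     */
    static float[] computeProgress(List<Task> tasks) {
        float mapSum = 0f;
        float reduceSum = 0f;
        int maps = 0;
        int reduces = 0;
        for (Task t : tasks) {
            if (t.type == TaskType.MAP) {
                mapSum += t.progress;
                maps++;
            } else {
                reduceSum += t.progress;
                reduces++;
            }
        }
        return new float[] {
            maps == 0 ? 0f : mapSum / maps,
            reduces == 0 ? 0f : reduceSum / reduces
        };
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<Task>();
        tasks.add(new Task(TaskType.MAP, 0.5f));
        tasks.add(new Task(TaskType.MAP, 1.0f));
        tasks.add(new Task(TaskType.REDUCE, 0.25f));
        float[] p = computeProgress(tasks);
        System.out.println("map=" + p[0] + " reduce=" + p[1]);
    }
}
```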





[jira] [Commented] (MAPREDUCE-3595) Add missing TestCounters#testCounterValue test from branch 1 to 0.23

2011-12-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175046#comment-13175046
 ] 

Hadoop QA commented on MAPREDUCE-3595:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508426/MAPREDUCE-3595.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1496//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1496//console

This message is automatically generated.

 Add missing TestCounters#testCounterValue test from branch 1 to 0.23
 

 Key: MAPREDUCE-3595
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3595
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-3595.patch








[jira] [Commented] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175056#comment-13175056
 ] 

Hadoop QA commented on MAPREDUCE-3462:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12508428/MAPREDUCE-3462.branch-0.23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1498//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1498//console

This message is automatically generated.

 Job submission failing in JUnit tests
 -

 Key: MAPREDUCE-3462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Ravi Prakash
Priority: Blocker
  Labels: junit, test
 Attachments: MAPREDUCE-3462.branch-0.23.patch


 When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and 
 TestCompressionEmulationUtils), I see job submission failing with the 
 following error:
 {noformat}
 java.lang.IllegalStateException: Variable substitution depth too large: 20 
 ${fs.default.name}
 at 
 org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
 at 
 org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
 at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
 at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
 at 
 org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
 {noformat}





[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175053#comment-13175053
 ] 

Hadoop QA commented on MAPREDUCE-3567:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12508430/MAPREDUCE-3567-20111222.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1497//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1497//console

This message is automatically generated.

 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Updated] (MAPREDUCE-3399) ContainerLocalizer should request new resources after completing the current one

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3399:
--

Attachment: MR3399_v2.txt

Fixes the failing TestContainerLocalizer test.

 ContainerLocalizer should request new resources after completing the current 
 one
 

 Key: MAPREDUCE-3399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3399
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Blocker
 Attachments: MR3399_v1.txt, MR3399_v2.txt


 Currently, the ContainerLocalizer heartbeats to the NM every second. 
 This is not very significant on its own, but it causes a ~4 second delay in 
 jobs (job jar, splits, etc.). Instead, it should heartbeat to ask for 
 additional resources to localize as soon as the previous one is localized. 
 There's already a TODO in the ContainerLocalizer for this.
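
The delay described above can be approximated with a toy model on a virtual clock. It is not the ContainerLocalizer code: LocalizerHeartbeatDemo and totalMillis are hypothetical, the model assumes one resource is handed out per heartbeat, and the four resources of 50 ms each are made-up numbers chosen to match the "job jar, splits, etc." example.

```java
public class LocalizerHeartbeatDemo {
    /**
     * Virtual-clock model of localizing resources one at a time.
     *
     * @param downloadMillis          time to fetch each resource
     * @param heartbeatIntervalMillis fixed wait between heartbeats
     * @param heartbeatWhenDone       if true, heartbeat right after each download
     *                                instead of waiting for the fixed interval
     * @return total elapsed time in milliseconds
     */
    static long totalMillis(long[] downloadMillis,
                            long heartbeatIntervalMillis,
                            boolean heartbeatWhenDone) {
        long clock = 0;
        for (long d : downloadMillis) {
            clock += d; // fetch the resource handed out by the last heartbeat
            if (!heartbeatWhenDone) {
                clock += heartbeatIntervalMillis; // idle until the next scheduled heartbeat
            }
        }
        return clock;
    }

    public static void main(String[] args) {
        long[] downloads = {50, 50, 50, 50}; // assumed: four small resources
        long fixed = totalMillis(downloads, 1000, false);
        long eager = totalMillis(downloads, 1000, true);
        System.out.println("fixed interval: " + fixed + " ms");
        System.out.println("heartbeat when done: " + eager + " ms");
        System.out.println("saved: " + (fixed - eager) + " ms");
    }
}
```

Under these assumptions the fixed interval adds one second per resource, which is where a delay of roughly four seconds for four resources would come from.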





[jira] [Updated] (MAPREDUCE-3399) ContainerLocalizer should request new resources after completing the current one

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3399:
--

Status: Patch Available  (was: Open)

 ContainerLocalizer should request new resources after completing the current 
 one
 

 Key: MAPREDUCE-3399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3399
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Blocker
 Attachments: MR3399_v1.txt, MR3399_v2.txt


 Currently, the ContainerLocalizer heartbeats to the NM every second. 
 This is not very significant on its own, but it causes a ~4 second delay in 
 jobs (job jar, splits, etc.). Instead, it should heartbeat to ask for 
 additional resources to localize as soon as the previous one is localized. 
 There's already a TODO in the ContainerLocalizer for this.





[jira] [Commented] (MAPREDUCE-3354) JobHistoryServer should be started by bin/mapred and not by bin/yarn

2011-12-22 Thread Jonathan Eagles (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175086#comment-13175086
 ] 

Jonathan Eagles commented on MAPREDUCE-3354:


Absolutely. Every little bit helps.

 JobHistoryServer should be started by bin/mapred and not by bin/yarn
 

 Key: MAPREDUCE-3354
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3354
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Jonathan Eagles
Priority: Blocker
 Attachments: MAPREDUCE-3354.patch, MAPREDUCE-3354.patch, 
 MAPREDUCE-3354.patch, MAPREDUCE-3354.patch


 JobHistoryServer belongs to mapreduce land.





[jira] [Updated] (MAPREDUCE-3553) Add support for data returned when exceptions thrown from web service apis to be in either xml or in JSON

2011-12-22 Thread Thomas Graves (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-3553:
-

Status: Open  (was: Patch Available)

This is going to conflict with MAPREDUCE-3547 in the tests, so I am cancelling 
the patch for now and will update it after MAPREDUCE-3547 is committed.

 Add support for data returned when exceptions thrown from web service apis to 
 be in either xml or in JSON
 -

 Key: MAPREDUCE-3553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3553
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Minor
 Attachments: MAPREDUCE-3553.patch


 When the web service APIs for the RM, NM, app master, and job history server 
 throw an exception (such as bad request or not found), they always return 
 the data in JSON format. It would be nice to return it in the format the 
 client requested: XML or JSON.
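
The requested behaviour amounts to content negotiation on the error path. The sketch below is not the web services code: ErrorFormatDemo and renderError are hypothetical, the RemoteException, exception and message field names are illustrative, and it assumes the client states its preference in the HTTP Accept header. It deliberately omits q-value parsing and XML/JSON escaping.

```java
public class ErrorFormatDemo {
    /**
     * Renders an error body as XML when the Accept header asks for it,
     * and as JSON otherwise (JSON stays the default, as it is today).
     */
    static String renderError(String acceptHeader, String exception, String message) {
        boolean wantsXml = acceptHeader != null
            && acceptHeader.contains("application/xml");
        if (wantsXml) {
            return "<RemoteException>"
                + "<exception>" + exception + "</exception>"
                + "<message>" + message + "</message>"
                + "</RemoteException>";
        }
        return "{\"RemoteException\":{"
            + "\"exception\":\"" + exception + "\","
            + "\"message\":\"" + message + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(renderError("application/xml", "NotFoundException", "job not found"));
        System.out.println(renderError("application/json", "NotFoundException", "job not found"));
    }
}
```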





[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Siddharth Seth (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175118#comment-13175118
 ] 

Siddharth Seth commented on MAPREDUCE-3567:
---

+1. Committed to trunk and branch-0.23. Thanks Vinod.

 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Updated] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3567:
--

  Resolution: Fixed
Target Version/s: 0.23.1
  Status: Resolved  (was: Patch Available)

 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175164#comment-13175164
 ] 

Hudson commented on MAPREDUCE-3567:
---

Integrated in Hadoop-Common-trunk-Commit #1467 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1467/])
MAPREDUCE-3567. Extraneous JobConf objects in AM heap. (Contributed by Vinod 
Kumar Vavilapalli)

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222498
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapTaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/ReduceTaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/MapTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/ReduceTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java


 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175165#comment-13175165
 ] 

Hudson commented on MAPREDUCE-3567:
---

Integrated in Hadoop-Hdfs-trunk-Commit #1539 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1539/])
MAPREDUCE-3567. Extraneous JobConf objects in AM heap. (Contributed by Vinod 
Kumar Vavilapalli)

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222498
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapTaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/ReduceTaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/MapTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/ReduceTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java


 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM unnecessarily creates new JobConf objects in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 With a 64-bit JVM using uncompressed pointers and jobs of 100K maps, 
 removing those extraneous objects helped address OOMs with a 2GB AM heap 
 size.





[jira] [Commented] (MAPREDUCE-3586) Lots of AMs hanging around in PIG testing

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175167#comment-13175167
 ] 

Hudson commented on MAPREDUCE-3586:
---

Integrated in Hadoop-Common-0.23-Commit #320 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/320/])


 Lots of AMs hanging around in PIG testing
 -

 Key: MAPREDUCE-3586
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3586
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3586-20111220.txt


 [~daijy] found this. Here's what he says:
 bq. I see hundreds of MRAppMaster process on my machine, and lots of tests 
 fail for Too many open files.





[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175166#comment-13175166
 ] 

Hudson commented on MAPREDUCE-3567:
---

Integrated in Hadoop-Common-0.23-Commit #320 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/320/])
Merge MAPREDUCE-3567 from trunk

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222499
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapTaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/ReduceTaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/MapTaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/ReduceTaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java


 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM creates new JobConf objects unnecessarily in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 While working with a 64-bit JVM on jobs with 100K maps, with uncompressed 
 pointers, removing those extraneous objects helped address OOMs with a 2GB 
 AM heap size.





[jira] [Created] (MAPREDUCE-3596) Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 RE build

2011-12-22 Thread Ravi Prakash (Created) (JIRA)
Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 RE 
build
-

 Key: MAPREDUCE-3596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Priority: Critical


Courtesy [~vinaythota]
{quote}
Ran sort benchmark couple of times and every time the job got hang after 
completion 99% map phase. There are some map tasks failed. Also it's not 
scheduled some of the pending map tasks.
Cluster size is 350 nodes.

Build Details:
==
Version:0.23.1.1112091615, 1212592
Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
branches/branch-0.23/hadoop-common-project/hadoop-common 

ResourceManager version:0.23.1.1112091615 from 1212681 by someone 
source checksum
6e54430abdc912c91c05b9208a3361de on Fri Dec 9 16:52:07 PST 2011
Hadoop version: 0.23.1.1112091615 from 1212592 by someone source 
checksum 999b78e0eadace831529ee78ed29c8e1 on
Fri Dec 9 16:25:27 PST 2011
{quote}








[jira] [Updated] (MAPREDUCE-3585) RM unable to detect NMs restart

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3585:
---

Priority: Minor  (was: Major)

Downgrading priority till we get more details.

 RM unable to detect NMs restart
 ---

 Key: MAPREDUCE-3585
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3585
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Bh V S Kamesh
Priority: Minor

 Suppose multiple NMs are configured on a single host. In this case, 
 there should be a mechanism to detect when the NMs come back after a restart.





[jira] [Updated] (MAPREDUCE-3596) Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 RE build

2011-12-22 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-3596:


Description: 
Courtesy [~vinaythota]
{quote}
Ran sort benchmark couple of times and every time the job got hang after 
completion 99% map phase. There are some map tasks failed. Also it's not 
scheduled some of the pending map tasks.
Cluster size is 350 nodes.

Build Details:
==

Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
branches/branch-0.23/hadoop-common-project/hadoop-common 
ResourceManager version:revision 1212681 by someone source checksum
6e54430abdc912c91c05b9208a3361de on Fri Dec 9 16:52:07 PST 2011
Hadoop version: revision 1212592 by someone source checksum 
999b78e0eadace831529ee78ed29c8e1 on
Fri Dec 9 16:25:27 PST 2011
{quote}




  was:
Courtesy [~vinaythota]
{quote}
Ran sort benchmark couple of times and every time the job got hang after 
completion 99% map phase. There are some map tasks failed. Also it's not 
scheduled some of the pending map tasks.
Cluster size is 350 nodes.

Build Details:
==
Version:0.23.1.1112091615, 1212592
Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
branches/branch-0.23/hadoop-common-project/hadoop-common 

ResourceManager version:0.23.1.1112091615 from 1212681 by someone 
source checksum
6e54430abdc912c91c05b9208a3361de on Fri Dec 9 16:52:07 PST 2011
Hadoop version: 0.23.1.1112091615 from 1212592 by someone source 
checksum 999b78e0eadace831529ee78ed29c8e1 on
Fri Dec 9 16:25:27 PST 2011
{quote}





 Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 
 RE build
 -

 Key: MAPREDUCE-3596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Priority: Critical

 Courtesy [~vinaythota]
 {quote}
 Ran sort benchmark couple of times and every time the job got hang after 
 completion 99% map phase. There are some map tasks failed. Also it's not 
 scheduled some of the pending map tasks.
 Cluster size is 350 nodes.
 Build Details:
 ==
 Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
 branches/branch-0.23/hadoop-common-project/hadoop-common 
 ResourceManager version:revision 1212681 by someone source checksum
 6e54430abdc912c91c05b9208a3361de on Fri Dec 9 16:52:07 PST 2011
 Hadoop version: revision 1212592 by someone source checksum 
 999b78e0eadace831529ee78ed29c8e1 on
 Fri Dec 9 16:25:27 PST 2011
 {quote}





[jira] [Updated] (MAPREDUCE-3596) Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 RE build

2011-12-22 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated MAPREDUCE-3596:


Description: 
Courtesy [~vinaythota]
{quote}
Ran sort benchmark couple of times and every time the job got hang after 
completion 99% map phase. There are some map tasks failed. Also it's not 
scheduled some of the pending map tasks.
Cluster size is 350 nodes.

Build Details:
==

Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
branches/branch-0.23/hadoop-common-project/hadoop-common 
ResourceManager version:revision 1212681 by someone source checksum on 
Fri Dec 9 16:52:07 PST 2011
Hadoop version: revision 1212592 by someone Fri Dec 9 16:25:27 PST 2011
{quote}




  was:
Courtesy [~vinaythota]
{quote}
Ran sort benchmark couple of times and every time the job got hang after 
completion 99% map phase. There are some map tasks failed. Also it's not 
scheduled some of the pending map tasks.
Cluster size is 350 nodes.

Build Details:
==

Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
branches/branch-0.23/hadoop-common-project/hadoop-common 
ResourceManager version:revision 1212681 by someone source checksum
6e54430abdc912c91c05b9208a3361de on Fri Dec 9 16:52:07 PST 2011
Hadoop version: revision 1212592 by someone source checksum 
999b78e0eadace831529ee78ed29c8e1 on
Fri Dec 9 16:25:27 PST 2011
{quote}





 Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 
 RE build
 -

 Key: MAPREDUCE-3596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Priority: Critical

 Courtesy [~vinaythota]
 {quote}
 Ran sort benchmark couple of times and every time the job got hang after 
 completion 99% map phase. There are some map tasks failed. Also it's not 
 scheduled some of the pending map tasks.
 Cluster size is 350 nodes.
 Build Details:
 ==
 Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
 branches/branch-0.23/hadoop-common-project/hadoop-common 
 ResourceManager version:revision 1212681 by someone source checksum 
 on Fri Dec 9 16:52:07 PST 2011
 Hadoop version: revision 1212592 by someone Fri Dec 9 16:25:27 PST 
 2011
 {quote}





[jira] [Updated] (MAPREDUCE-3591) webapps always return html on non-existent URL

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3591:
---

Priority: Minor  (was: Major)

 webapps always return html on non-existent URL
 --

 Key: MAPREDUCE-3591
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3591
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Priority: Minor

 If the user tries to go to a non-existent URL, say rm:8088/cluster/foo, via 
 the web UI or the web service REST API, it returns 404 and it always returns 
 HTML content.  With the addition of the web service REST API it would be nice 
 if it returned what was requested: XML or JSON.
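
The behaviour requested above can be illustrated with a minimal sketch. It is
not the Hadoop webapp code: the function name not_found_body and the message
format are hypothetical, and the Accept header is matched naively (no q-value
handling). It only shows the idea of picking the 404 body format from what the
client asked for, with HTML as the fallback.

```python
import json
from xml.sax.saxutils import escape


def not_found_body(accept_header, path):
    """Return (content_type, body) for a 404, chosen from the Accept header.

    Hypothetical helper for illustration; matching is a plain substring test.
    """
    message = "No resource at " + path
    accept = (accept_header or "").lower()
    if "application/json" in accept:
        return "application/json", json.dumps({"error": message})
    if "application/xml" in accept or "text/xml" in accept:
        return "application/xml", "<error>" + escape(message) + "</error>"
    # Fallback: browsers and clients without a usable Accept header get HTML.
    html = "<html><body><h1>404</h1><p>" + escape(message) + "</p></body></html>"
    return "text/html", html
```

A REST client sending "Accept: application/json" would then get a JSON error
document, while a browser request would still get the HTML page.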





[jira] [Commented] (MAPREDUCE-3596) Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 RE build

2011-12-22 Thread Ravi Prakash (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175188#comment-13175188
 ] 

Ravi Prakash commented on MAPREDUCE-3596:
-

Ok. Here's how far I've got

{noformat}
$ grep attempt_1324018664143_0002_m -r container_1324018664143_0002_01_01/ \
| grep "Created attempt" | awk '{print $10}' | sort | uniq | grep '_1$'
attempt_1324018664143_0002_m_009775_1
attempt_1324018664143_0002_m_012988_1
attempt_1324018664143_0002_m_013199_1
{noformat}

That is, three maps had to be retried. The first succeeded on being 
retried:
{noformat}
2011-12-16 07:09:11,013 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with 
attempt attempt_1324018664143_0002_m_009775_1
{noformat}

The other two failed. They failed for different reasons, which do not seem to 
me to be related to this investigation. In any case, after the failure:
{noformat}
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Processing 
attempt_1324018664143_0002_m_012988_0 of type TA_CONTAINER_LAUNCH_FAILED
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1324018664143_0002_m_012988_0 TaskAttempt Transitioned from ASSIGNED to 
FAILED
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Processing 
task_1324018664143_0002_m_012988 of type T_ATTEMPT_FAILED
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Created attempt 
attempt_1324018664143_0002_m_012988_1
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_FAILED
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node 
someNode
2011-12-16 07:09:15,870 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Processing 
attempt_1324018664143_0002_m_012988_1 of type TA_RESCHEDULE
2011-12-16 07:09:15,870 INFO [Thread-31] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In 
HistoryEventHandler TASK_FINISHED
2011-12-16 07:09:15,871 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1324018664143_0002_m_012988_1 TaskAttempt Transitioned from NEW to 
UNASSIGNED
2011-12-16 07:09:15,871 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_REQ
2011-12-16 07:09:15,871 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added 
attempt_1324018664143_0002_m_012988_1 to list of failed maps
2011-12-16 07:09:15,871 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Added 
priority=priority: 5, 
2011-12-16 07:09:15,871 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: 
applicationId=2 priority=5 resourceName=* numContainers=1 #asks=1
{noformat}
And then that attempt is never heard from again in the AM logs. The same is 
true of the other attempt.

I could not find the resource request in the RM logs.
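
The grep pipeline used above can also be sketched in Python, which may be
easier to extend when chasing attempts through the AM log. This is only an
illustration: the function name retried_attempts is made up here, each log
record is assumed to sit on one line (as in the raw log, not as wrapped in
this mail), and any attempt number of 1 or higher is treated as a retry,
which is slightly broader than the "_1$" filter above.

```python
import re

# Matches map attempt IDs such as attempt_1324018664143_0002_m_012988_1;
# group 1 is the trailing attempt number.
ATTEMPT_RE = re.compile(r"attempt_\d+_\d+_m_\d+_(\d+)")


def retried_attempts(lines):
    """Return the sorted, de-duplicated IDs of map attempts created as retries."""
    found = set()
    for line in lines:
        if "Created attempt" not in line:
            continue
        match = ATTEMPT_RE.search(line)
        # Attempt number 0 is the first try; 1 and above are retries.
        if match and int(match.group(1)) >= 1:
            found.add(match.group(0))
    return sorted(found)
```

Feeding it the lines of the AM log (for example from open(path)) should give
the same kind of list as the pipeline, one ID per retried map.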


 Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 
 RE build
 -

 Key: MAPREDUCE-3596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Priority: Critical

 Courtesy [~vinaythota]
 {quote}
 Ran sort benchmark couple of times and every time the job got hang after 
 completion 99% map phase. There are some map tasks failed. Also it's not 
 scheduled some of the pending map tasks.
 Cluster size is 350 nodes.
 Build Details:
 ==
 Compiled:   Fri Dec 9 16:25:27 PST 2011 by someone from 
 branches/branch-0.23/hadoop-common-project/hadoop-common 
 ResourceManager version:revision 1212681 by someone source checksum 
 on Fri Dec 9 16:52:07 PST 2011
 Hadoop version: revision 1212592 by someone Fri Dec 9 16:25:27 PST 
 2011
 {quote}


[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175190#comment-13175190
 ] 

Hudson commented on MAPREDUCE-3567:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #1488 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1488/])
MAPREDUCE-3567. Extraneous JobConf objects in AM heap. (Contributed by Vinod 
Kumar Vavilapalli)

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222498
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapTaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/ReduceTaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/MapTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/ReduceTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java


 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM creates new JobConf objects unnecessarily in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 While working with a 64-bit JVM on jobs with 100K maps, with uncompressed 
 pointers, removing those extraneous objects helped address OOMs with a 2GB 
 AM heap size.





[jira] [Updated] (MAPREDUCE-3420) [Umbrella ticket] Make uber jobs functional

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3420:
---

Summary: [Umbrella ticket] Make uber jobs functional  (was: Make uber jobs 
functional)

 [Umbrella ticket] Make uber jobs functional
 ---

 Key: MAPREDUCE-3420
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3420
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0, 0.24.0
Reporter: Hitesh Shah
 Fix For: 0.23.1


 Umbrella jira for getting uber jobs to work correctly with YARN/MRv2





[jira] [Commented] (MAPREDUCE-3399) ContainerLocalizer should request new resources after completing the current one

2011-12-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175194#comment-13175194
 ] 

Hadoop QA commented on MAPREDUCE-3399:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508449/MR3399_v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1500//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1500//console

This message is automatically generated.

 ContainerLocalizer should request new resources after completing the current 
 one
 

 Key: MAPREDUCE-3399
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3399
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Blocker
 Attachments: MR3399_v1.txt, MR3399_v2.txt


 Currently, the ContainerLocalizer heartbeats to the NM every second. 
 Not very significant, but this causes a ~4 second delay in jobs (job jar, 
 splits, etc.). Instead, it should heartbeat to ask for additional resources to 
 localize as soon as the previous one is localized. There's already a TODO in 
 the ContainerLocalizer for this.
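
The effect described above can be shown with a toy timing model. It is a
sketch under stated assumptions, not the ContainerLocalizer logic: both
function names are hypothetical, and the fixed-interval variant simply
assumes the localizer idles for one full heartbeat interval before each
resource is handed out. The issue does not say how the ~4 seconds break down.

```python
def localize_fixed_interval(download_secs, heartbeat_secs=1.0):
    """Toy model: wait one heartbeat interval before each resource request."""
    elapsed = 0.0
    for duration in download_secs:
        elapsed += heartbeat_secs  # idle until the next heartbeat
        elapsed += duration        # time spent fetching the resource
    return elapsed


def localize_eager(download_secs):
    """Toy model: ask for the next resource as soon as the previous one is done."""
    return sum(download_secs)
```

Under this model the idle time grows with the number of resources, so a job
with a handful of small resources pays mostly for waiting rather than for
downloading.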





[jira] [Commented] (MAPREDUCE-3567) Extraneous JobConf objects in AM heap

2011-12-22 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175196#comment-13175196
 ] 

Hudson commented on MAPREDUCE-3567:
---

Integrated in Hadoop-Mapreduce-0.23-Commit #331 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/331/])
Merge MAPREDUCE-3567 from trunk

sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222499
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapTaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/ReduceTaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/MapTaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/ReduceTaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java


 Extraneous JobConf objects in AM heap
 -

 Key: MAPREDUCE-3567
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3567
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3567-20111215.1.txt, 
 MAPREDUCE-3567-20111222.txt


 The MR AM creates new JobConf objects unnecessarily in a couple of places in 
 JobImpl and TaskImpl, and these occupy a non-trivial amount of heap.
 While working with a 64-bit JVM on jobs with 100K maps, with uncompressed 
 pointers, removing those extraneous objects helped address OOMs with a 2GB 
 AM heap size.





[jira] [Updated] (MAPREDUCE-2745) [MR-279] NM UI should get a read-only view instead of the actual NMContext

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-2745:
---

Priority: Trivial  (was: Major)

 [MR-279] NM UI should get a read-only view instead of the actual NMContext 
 ---

 Key: MAPREDUCE-2745
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2745
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Anupam Seth
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.1

 Attachments: MAPREDUCE-2745-branch-0_23.patch, 
 MAPREDUCE-2745-branch-0_23_v2.patch


 NMContext is modifiable, but the UI should only get read-only access, just like 
 the AM web UI.
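
The general idea of handing a UI a read-only view can be sketched as follows.
This is an illustration in Python, not the NodeManager code: the NodeContext
class and its methods are hypothetical stand-ins for a mutable context, and
the standard library's MappingProxyType plays the role of the read-only view.

```python
from types import MappingProxyType


class NodeContext:
    """Hypothetical mutable context; not the real NMContext."""

    def __init__(self):
        self._containers = {}

    def add_container(self, container_id, state):
        # Only the owner of the context mutates the underlying dict.
        self._containers[container_id] = state

    def read_only_view(self):
        # A live view: reads reflect later updates, writes raise TypeError.
        return MappingProxyType(self._containers)
```

The UI layer would receive only the result of read_only_view(), so it can
display the current state but cannot change it by accident.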





[jira] [Updated] (MAPREDUCE-3360) Provide information about lost nodes in the UI.

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3360:
---

Priority: Critical  (was: Major)

 Provide information about lost nodes in the UI.
 ---

 Key: MAPREDUCE-3360
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3360
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
 Environment: NA
Reporter: Bh V S Kamesh
Priority: Critical
 Attachments: LostNodes.png, MAPREDUCE-3360-1.patch, 
 MAPREDUCE-3360.patch, lostNodes.png


 Currently there is no information provided about *lost nodes*. Provide 
 information in the UI. 





[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3353:
---

Priority: Critical  (was: Major)

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Hitesh Shah
Priority: Critical
 Fix For: 0.23.1


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.
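
As a rough illustration of the action described above, assuming the AM keeps a map from completed map-task id to the node holding its intermediate output (a hypothetical structure, not the actual AM data model), the tasks to re-execute after a node loss can be found like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the real AM bookkeeping is not shown in this thread.
class LostNodeSketch {
    /**
     * completedMaps maps a completed map-task id to the node that holds its
     * intermediate output. Returns the tasks whose output lived on lostNode.
     */
    static List<String> mapsToRerun(Map<String, String> completedMaps, String lostNode) {
        List<String> rerun = new ArrayList<>();
        for (Map.Entry<String, String> entry : completedMaps.entrySet()) {
            if (entry.getValue().equals(lostNode)) {
                rerun.add(entry.getKey());
            }
        }
        return rerun;
    }
}
```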





[jira] [Updated] (MAPREDUCE-2744) [MR-279] In unsecure mode, AM can fake resource requirements

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-2744:
---

Priority: Minor  (was: Major)

 [MR-279] In unsecure mode, AM can fake resource requirements 
 --

 Key: MAPREDUCE-2744
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2744
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, security
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Priority: Minor
 Fix For: 0.23.1


 Without security, it is impossible to completely prevent AMs from faking resources. 
 We can at least make it as difficult as possible by using the same 
 container tokens and the RM-NM shared key mechanism over the unauthenticated 
 RM-NM channel.
 At a minimum, this will guard against accidental bugs in AMs in unsecure mode.





[jira] [Updated] (MAPREDUCE-3553) Add support for data returned when exceptions thrown from web service apis to be in either xml or in JSON

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3553:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-2863

 Add support for data returned when exceptions thrown from web service apis to 
 be in either xml or in JSON
 -

 Key: MAPREDUCE-3553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3553
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Minor
 Attachments: MAPREDUCE-3553.patch


 When the web services APIs for the RM, NM, app master, and job history server 
 throw an exception (such as bad request or not found), they always return the data 
 in JSON format. It would be nice to return it in the format the client requested: 
 XML or JSON.
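
A small sketch of the requested behaviour, assuming the format is taken from the HTTP Accept header. The class, method names, and error body layout are hypothetical, not the actual Hadoop web services code; the header parsing is deliberately naive (it ignores q-values) and the rendering does no escaping.

```java
import java.util.Locale;

// Hypothetical helper names; not the actual Hadoop web services code.
class ErrorFormatSketch {
    /** Picks XML only when the Accept header lists it before JSON; defaults to JSON. */
    static String chooseMediaType(String acceptHeader) {
        if (acceptHeader == null) {
            return "application/json";
        }
        String accept = acceptHeader.toLowerCase(Locale.ROOT);
        int xml = accept.indexOf("application/xml");
        int json = accept.indexOf("application/json");
        boolean preferXml = xml >= 0 && (json < 0 || xml < json);
        return preferXml ? "application/xml" : "application/json";
    }

    /** Renders an error body; values are assumed to need no escaping in this sketch. */
    static String renderError(String mediaType, String exception, String message) {
        if ("application/xml".equals(mediaType)) {
            return "<error><exception>" + exception + "</exception><message>"
                + message + "</message></error>";
        }
        return "{\"error\":{\"exception\":\"" + exception + "\",\"message\":\""
            + message + "\"}}";
    }
}
```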





[jira] [Updated] (MAPREDUCE-3547) finish unit tests for web services for RM and NM

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3547:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-2863

 finish unit tests for web services for RM and NM
 

 Key: MAPREDUCE-3547
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3547
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
 Attachments: MAPREDUCE-3547.patch, MAPREDUCE-3547.patch


 Write more unit tests for the web services added for rm and nm.





[jira] [Updated] (MAPREDUCE-3548) write unit tests for web services for mapreduce app master and job history server

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3548:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-2863

 write unit tests for web services for mapreduce app master and job history 
 server
 -

 Key: MAPREDUCE-3548
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3548
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical

 write more unit tests for mapreduce application master and job history server 
 web services added in MAPREDUCE-2863





[jira] [Updated] (MAPREDUCE-3554) add job history/am hostname to web services info output

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3554:
---

Issue Type: Sub-task  (was: Improvement)
Parent: MAPREDUCE-2863

 add job history/am hostname to web services info output  
 -

 Key: MAPREDUCE-3554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3554
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves

 It would be useful to add the job history or AM hostname to the web services info 
 output.
 The history server URI is like host:19888/ws/v1/history/info
 The MapReduce app master URI is something like 
 host:8088/proxy/application_1323191000473_0002/ws/v1/mapreduce/info





[jira] [Updated] (MAPREDUCE-3552) add ability to specify the format type (xml|json) of web services when requesting it via url query param

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3552:
---

Issue Type: Sub-task  (was: Improvement)
Parent: MAPREDUCE-2863

 add ability to specify the format type (xml|json) of web services when 
 requesting it via url query param
 

 Key: MAPREDUCE-3552
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3552
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves

 Add the ability to specify the format type (xml|json) of web services when 
 requesting it via a URL query param, perhaps ?format=xml or similar.
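
One way the suggested query parameter could be read, as a sketch only. The parameter name "format" comes from the suggestion above; the class and method are hypothetical, and URL decoding is omitted for brevity.

```java
import java.util.Locale;

// Hypothetical helper; illustrates the suggested ?format=xml parameter only.
class FormatParamSketch {
    /** Returns "xml" or "json" from a raw query string, or defaultFormat if absent or unknown. */
    static String formatFromQuery(String query, String defaultFormat) {
        if (query == null) {
            return defaultFormat;
        }
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2 && kv[0].equals("format")) {
                String value = kv[1].toLowerCase(Locale.ROOT);
                if (value.equals("xml") || value.equals("json")) {
                    return value;
                }
            }
        }
        return defaultFormat;
    }
}
```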





[jira] [Updated] (MAPREDUCE-3440) Add tests for testing other NM components with disk failures

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3440:
---

Issue Type: Sub-task  (was: Test)
Parent: MAPREDUCE-3121

 Add tests for testing other NM components with disk failures
 

 Key: MAPREDUCE-3440
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3440
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 0.23.0
Reporter: Ravi Gummadi

 Add more tests to test other components when disks fail.





[jira] [Updated] (MAPREDUCE-3121) DFIP aka 'NodeManager should handle Disk-Failures In Place'

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3121:
---

Summary: DFIP aka 'NodeManager should handle Disk-Failures In Place'  (was: 
NodeManager should handle disk-failures)

 DFIP aka 'NodeManager should handle Disk-Failures In Place'
 ---

 Key: MAPREDUCE-3121
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3121
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Ravi Gummadi
Priority: Blocker
 Fix For: 0.23.1

 Attachments: 3121.patch, 3121.v1.1.patch, 3121.v1.patch, 
 3121.v2.patch, 3121.v3.patch


 This is akin to MAPREDUCE-2413 but for YARN's NodeManager. We want to 
 minimize the impact of transient/permanent disk failures on containers. With 
 a larger number of disks per node, the ability to continue to run containers on 
 other disks is crucial.





[jira] [Updated] (MAPREDUCE-3519) Deadlock in LocalDirsHandlerService and ShuffleHandler

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3519:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-3121

 Deadlock in LocalDirsHandlerService and ShuffleHandler
 --

 Key: MAPREDUCE-3519
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3519
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.1, 0.24.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Priority: Blocker
 Fix For: 0.23.1

 Attachments: 3519.patch, 3519.v1.patch, deadlock.txt


 MAPREDUCE-3121 cloned the Configuration object in LocalDirsHandlerService.init() 
 to prevent others from accessing that configuration object. But since it is used in 
 local FileSystem object creation in LocalDirAllocator.AllocatorPerContext and 
 the same FileSystem object is used in 
 ShuffleHandler.Shuffle.localDirAllocator, this causes a deadlock when 
 this configuration object is accessed from LocalDirsHandlerService and 
 ShuffleHandler along with the AllocatorPerContext object.





[jira] [Updated] (MAPREDUCE-3441) NodeManager should identify failed disks becoming good back again

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3441:
---

Issue Type: Sub-task  (was: Improvement)
Parent: MAPREDUCE-3121

 NodeManager should identify failed disks becoming good back again
 -

 Key: MAPREDUCE-3441
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3441
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Ravi Gummadi

 MAPREDUCE-3121 makes the NodeManager identify disk failures. But once a disk goes 
 down, it is marked as failed forever. To reuse that disk (after it becomes 
 good), the NodeManager needs a restart. This JIRA is to improve the NodeManager to 
 reuse good disks (which may have been bad some time back).
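
A sketch of the improvement, assuming a periodic check that re-tests every directory, including ones previously marked failed, so a recovered disk returns to the good set without a restart. The class and the pluggable health check are hypothetical; this is not the LocalDirsHandlerService code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical tracker; not the actual NodeManager implementation.
class DiskRecheckSketch {
    private final Set<String> goodDirs = new LinkedHashSet<>();
    private final Set<String> failedDirs = new LinkedHashSet<>();
    private final Predicate<String> healthCheck;

    DiskRecheckSketch(Collection<String> dirs, Predicate<String> healthCheck) {
        this.goodDirs.addAll(dirs);
        this.healthCheck = healthCheck;
    }

    /** Re-tests good and failed dirs alike, so failure is not permanent. */
    synchronized void checkAll() {
        List<String> all = new ArrayList<>(goodDirs);
        all.addAll(failedDirs);
        goodDirs.clear();
        failedDirs.clear();
        for (String dir : all) {
            if (healthCheck.test(dir)) {
                goodDirs.add(dir);
            } else {
                failedDirs.add(dir);
            }
        }
    }

    synchronized Set<String> getGoodDirs() {
        return new LinkedHashSet<>(goodDirs);
    }

    synchronized Set<String> getFailedDirs() {
        return new LinkedHashSet<>(failedDirs);
    }
}
```

In practice checkAll() would be driven by a timer; the interval is a tuning choice not discussed here.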





[jira] [Updated] (MAPREDUCE-3474) NM disk failure detection only covers local dirs

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3474:
---

Issue Type: Sub-task  (was: Improvement)
Parent: MAPREDUCE-3121

 NM disk failure detection only covers local dirs 
 -

 Key: MAPREDUCE-3474
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3474
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: nodemanager, tasktracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Eli Collins

 This is the MR counterpart to HDFS-1848. Like HDFS volume failure detection, 
 NM disk failure detection checks only a subset of the disks and a subset of the 
 directories. E.g. the TT and the NM do not check the root disk for errors 
 unless a local dir resides on it. Even if a local dir resides on the root 
 disk, the disk checking code only checks the local dirs, so a failure seen only 
 when accessing a part of the disk not hosting the local dirs will not be 
 noticed. The disk that hosts the logs, pid, tmp dirs etc. is critical, so it 
 needs to be checked as well, and the NM should shut down if a critical disk is 
 not available (to prevent MR issues similar to HDFS-1848 and HDFS-2095). 
 Typically people currently work around this limitation (aside from 
 ignoring it) by using RAID-1 for the root disk or a health script that checks 
 the root disk health.
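
To make the proposal concrete, here is a sketch of a check that covers an arbitrary list of critical directories (logs, pid, tmp) rather than only the local dirs. The class is hypothetical, and it only tests existence and permissions via java.io.File; it would not catch I/O errors that show up only on actual reads or writes.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical check; a real health check would likely also attempt I/O.
class CriticalDirCheckSketch {
    static boolean isUsable(File dir) {
        return dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute();
    }

    /** Returns the subset of dirs that are missing or lack permissions. */
    static List<String> unusableDirs(List<String> dirs) {
        List<String> bad = new ArrayList<>();
        for (String path : dirs) {
            if (!isUsable(new File(path))) {
                bad.add(path);
            }
        }
        return bad;
    }
}
```

A caller could shut the daemon down whenever unusableDirs returns a non-empty list for the critical set.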





[jira] [Updated] (MAPREDUCE-3566) MR AM slows down due to repeatedly constructing ContainerLaunchContext

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3566:
-

Priority: Critical  (was: Major)

 MR AM slows down due to repeatedly constructing ContainerLaunchContext
 --

 Key: MAPREDUCE-3566
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3566
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3566-20111215.txt, MAPREDUCE-3566-20111220.txt


 The construction of the context is expensive: it includes per-task trips to the 
 NameNode to obtain information about job.jar, job splits etc., which is 
 redundant across all tasks.
 We should have a common job-level context and a task-specific context 
 inheriting from the common job-level context.
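
The split proposed above might look roughly like the following, where the expensive lookups happen once per job and each task only copies the shared entries. Plain maps stand in for the real ContainerLaunchContext, and all names and keys are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; the real ContainerLaunchContext API is not shown here.
class LaunchContextSketch {
    private int remoteLookups = 0; // counts simulated NameNode round trips

    /** Expensive part, done once per job. */
    Map<String, String> buildJobLevelContext(String jobId) {
        remoteLookups++; // stands in for fetching job.jar and split metadata
        Map<String, String> common = new HashMap<>();
        common.put("job.id", jobId);
        common.put("job.jar", jobId + "/job.jar");
        common.put("job.splits", jobId + "/job.split");
        return Collections.unmodifiableMap(common);
    }

    /** Cheap part, done per task: copy the shared entries and add task-specific ones. */
    Map<String, String> buildTaskContext(Map<String, String> common, String taskId) {
        Map<String, String> context = new HashMap<>(common);
        context.put("task.id", taskId);
        return context;
    }

    int getRemoteLookups() {
        return remoteLookups;
    }
}
```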





[jira] [Updated] (MAPREDUCE-3569) TaskAttemptListener holds a global lock for all task-updates

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3569:
-

Priority: Critical  (was: Major)

 TaskAttemptListener holds a global lock for all task-updates
 

 Key: MAPREDUCE-3569
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3569
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.1
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3569-2015.1.txt


 This got added via MAPREDUCE-3274. We really don't need the lock if we just 
 implement what I mentioned on that ticket 
 [here|https://issues.apache.org/jira/browse/MAPREDUCE-3274?focusedCommentId=13137214&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13137214].
 This has performance implications on MR AM with lots of tasks.





[jira] [Updated] (MAPREDUCE-3572) MR AM's dispatcher is blocked by heartbeats to ResourceManager

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3572:
-

Priority: Critical  (was: Major)

 MR AM's dispatcher is blocked by heartbeats to ResourceManager
 --

 Key: MAPREDUCE-3572
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3572
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3572-20111215.txt


 All the heartbeat processing is done in {{RMContainerAllocator}} while holding the 
 object's lock. The event processing also locks on this object, causing the dispatcher 
 to block and the rest of the AM to stall.
 The event processing should be in a separate thread.
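
A sketch of the suggested design, assuming a queue between the dispatcher and a dedicated worker thread. The class is hypothetical and is not the RMContainerAllocator code; the point is that handle() only enqueues, so the dispatcher never waits on whatever lock the heartbeat holds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical class; not the actual RMContainerAllocator code.
class AsyncEventSketch {
    // Sentinel compared by identity to tell the worker to exit.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();
    private final Thread worker = new Thread(this::drainLoop, "event-processor");

    void start() {
        worker.start();
    }

    /** Called by the dispatcher: only enqueues, so it returns immediately. */
    void handle(String event) {
        queue.add(event);
    }

    private void drainLoop() {
        try {
            while (true) {
                String event = queue.take();
                if (event == STOP) {
                    return;
                }
                processed.add(event); // slow work would happen here, off the dispatcher thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops the worker after it has handled everything queued so far. */
    List<String> stopAndGetProcessed() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while stopping", e);
        }
        return processed;
    }
}
```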





[jira] [Updated] (MAPREDUCE-3512) Batch jobHistory disk flushes

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3512:
-

Priority: Critical  (was: Major)

 Batch jobHistory disk flushes
 -

 Key: MAPREDUCE-3512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3512
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Priority: Critical

 The mr-am flushes each individual job history event to disk for AM recovery. 
 The history event handler ends up with a significant backlog for tests like 
 MAPREDUCE-3402. 
 History events could be batched up based on num records / time / 
 TaskFinishedEvents to reduce the number of DFS writes - with the potential 
 drawback of having to rerun some tasks during AM recovery.
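
A sketch of batching by record count and elapsed time, two of the triggers mentioned above (flushing on TaskFinishedEvents is left out). The class, thresholds, and in-memory "written" list are hypothetical stand-ins for the real history writer and DFS flush.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer; the "written" list stands in for data flushed to DFS.
class BatchedHistoryWriterSketch {
    private final int maxBatch;
    private final long maxDelayMs;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> written = new ArrayList<>();
    private long oldestBufferedAtMs;
    private int flushCount = 0;

    BatchedHistoryWriterSketch(int maxBatch, long maxDelayMs) {
        this.maxBatch = maxBatch;
        this.maxDelayMs = maxDelayMs;
    }

    /** Buffers the event and flushes when the batch is full or the oldest event is too old. */
    void append(String event, long nowMs) {
        if (buffer.isEmpty()) {
            oldestBufferedAtMs = nowMs;
        }
        buffer.add(event);
        if (buffer.size() >= maxBatch || nowMs - oldestBufferedAtMs >= maxDelayMs) {
            flush();
        }
    }

    /** One simulated DFS write for the whole batch. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        written.addAll(buffer);
        buffer.clear();
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }

    List<String> getWritten() {
        return written;
    }
}
```

Events still sitting in the buffer when the AM dies are the ones that would force task reruns during recovery, which is the drawback noted above.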





[jira] [Updated] (MAPREDUCE-3568) Optimize Job's progress calculations in MR AM

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3568:
---

Priority: Critical  (was: Major)

 Optimize Job's progress calculations in MR AM
 -

 Key: MAPREDUCE-3568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3568
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3568-20111215.1.txt, 
 MAPREDUCE-3568-20111220.txt, MAPREDUCE-3568-20111222.txt


 Besides catering to client requests, Job progress is calculated in every 
 heartbeat to the RM so as to print the MR AM's progress. Today the map and 
 reduce progress values are calculated by looking up each task in a big map, 
 while we can simply make do with a scan and aggregate. With a large number of 
 tasks, this can make a difference.
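
The difference between the two approaches can be sketched as follows, assuming (for illustration only) that job progress is the plain average of per-task progress. Names are hypothetical and this is not the actual job implementation.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper; progress is assumed to be a simple average of task progress.
class ProgressSketch {
    /** One map lookup per task id. */
    static float progressByLookup(Map<String, Float> progressByTask, List<String> taskIds) {
        if (taskIds.isEmpty()) {
            return 0f;
        }
        float sum = 0f;
        for (String id : taskIds) {
            sum += progressByTask.get(id);
        }
        return sum / taskIds.size();
    }

    /** Single scan over the values, with no per-task lookups. */
    static float progressByScan(Map<String, Float> progressByTask) {
        if (progressByTask.isEmpty()) {
            return 0f;
        }
        float sum = 0f;
        for (float progress : progressByTask.values()) {
            sum += progress;
        }
        return sum / progressByTask.size();
    }
}
```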





[jira] [Updated] (MAPREDUCE-3305) Fix -list-blacklisted-trackers to print the blacklisted NMs

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3305:
-

Priority: Critical  (was: Major)

 Fix -list-blacklisted-trackers to print the blacklisted NMs
 ---

 Key: MAPREDUCE-3305
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3305
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ramya Sunil
Priority: Critical
 Fix For: 0.23.1


 bin/mapred job -list-blacklisted-trackers currently prints 
 "getBlacklistedTrackers - Not implemented yet". This is a long-pending issue. 
 Could not find a tracking ticket, hence opening one.





[jira] [Updated] (MAPREDUCE-3596) Sort benchmark got hang after completion of 99% map phase

2011-12-22 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3596:
---

Summary: Sort benchmark got hang after completion of 99% map phase  (was: 
Job got hang after completion of 99% map phase with hadoop-0.23.1.1112091615 RE 
build)

Please avoid using internal numbering. It won't make sense to outsiders (like 
me) anyway :)

Regardless, can you please provide more information? AM logs are a good start. 
Thanks!

 Sort benchmark got hang after completion of 99% map phase
 -

 Key: MAPREDUCE-3596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Priority: Critical

 Courtesy [~vinaythota]
 {quote}
 Ran the sort benchmark a couple of times, and every time the job hung after 
 completing 99% of the map phase. Some map tasks failed, and some of the 
 pending map tasks were never scheduled.
 Cluster size is 350 nodes.
 Build Details:
 ==
 Compiled: Fri Dec 9 16:25:27 PST 2011 by someone from branches/branch-0.23/hadoop-common-project/hadoop-common
 ResourceManager version: revision 1212681 by someone source checksum on Fri Dec 9 16:52:07 PST 2011
 Hadoop version: revision 1212592 by someone Fri Dec 9 16:25:27 PST 2011
 {quote}





[jira] [Updated] (MAPREDUCE-3305) Fix -list-blacklisted-trackers to print the blacklisted NMs

2011-12-22 Thread Mahadev konar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-3305:
-

Priority: Major  (was: Critical)

Looks like we will have to remove this option from the mapred script, since we 
do not have blacklisting in the RM as of now.

 Fix -list-blacklisted-trackers to print the blacklisted NMs
 ---

 Key: MAPREDUCE-3305
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3305
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ramya Sunil
 Fix For: 0.23.1


 bin/mapred job -list-blacklisted-trackers currently prints 
 "getBlacklistedTrackers - Not implemented yet". This is a long-pending issue. 
 Could not find a tracking ticket, hence opening one.





[jira] [Assigned] (MAPREDUCE-3364) Job executed through ftp file system is failing with java.io.IOException: Seek not supported

2011-12-22 Thread Devaraj K (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reassigned MAPREDUCE-3364:


Assignee: Devaraj K

 Job executed through ftp file system is failing with java.io.IOException: 
 Seek not supported
 --

 Key: MAPREDUCE-3364
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3364
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Ramgopal N
Assignee: Devaraj K

 Instead of an HDFS file, I gave a local file through FTP as input to the job 
 and executed it. The job fails with this error:
 Error: java.io.IOException: Seek not supported
 at org.apache.hadoop.fs.ftp.FTPInputStream.seek(FTPInputStream.java:60)
 at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:47)
 at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:117)
 at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:484)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:710)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142)
 The same job executes successfully in V1.





[jira] [Commented] (MAPREDUCE-3568) Optimize Job's progress calculations in MR AM

2011-12-22 Thread Siddharth Seth (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175238#comment-13175238
 ] 

Siddharth Seth commented on MAPREDUCE-3568:
---

+1, Looks good. One minor change though - TestRMContainerAllocator.FakeJob 
isn't used any more and can be removed.

 Optimize Job's progress calculations in MR AM
 -

 Key: MAPREDUCE-3568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3568
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mr-am, mrv2, performance
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3568-20111215.1.txt, 
 MAPREDUCE-3568-20111220.txt, MAPREDUCE-3568-20111222.txt


 Besides catering to client requests, Job progress is calculated in every 
 heartbeat to the RM so as to print the MR AM's progress. Today the map and 
 reduce progress values are calculated by looking up each task in a big map, 
 while we can simply make do with a scan and aggregate. With a large number of 
 tasks, this can make a difference.





[jira] [Commented] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Amar Kamat (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175250#comment-13175250
 ] 

Amar Kamat commented on MAPREDUCE-3462:
---

Tested the fix on {{TestCompressionEmulationUtils}} and the test passed. I was 
wondering if it makes sense to add this to mapred-site.xml either at the top 
level (i.e. {{conf/mapred-site.xml}}) or just for tests (i.e. 
{{src/test/mapred-site.xml}}). I tried setting this property in 
{{src/test/mapred-site.xml}} but the test still failed. Somehow, we should make 
sure that the contrib tests load the {{src/test/mapred-site.xml}}.

 Job submission failing in JUnit tests
 -

 Key: MAPREDUCE-3462
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3462
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0
Reporter: Amar Kamat
Assignee: Ravi Prakash
Priority: Blocker
  Labels: junit, test
 Attachments: MAPREDUCE-3462.branch-0.23.patch


 When I run JUnit tests (e.g. TestDistCacheEmulation, TestSleepJob and 
 TestCompressionEmulationUtils), I see job submission failing with the 
 following error:
 {noformat}
 java.lang.IllegalStateException: Variable substitution depth too large: 20 ${fs.default.name}
 at org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:551)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
 at org.apache.hadoop.conf.Configuration.getStrings(Configuration.java:1020)
 at org.apache.hadoop.mapreduce.JobSubmitter.populateTokenCache(JobSubmitter.java:564)
 at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:353)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
 at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1176)
 at org.apache.hadoop.mapred.gridmix.Gridmix.launchGridmixJob(Gridmix.java:190)
 at org.apache.hadoop.mapred.gridmix.Gridmix.writeInputData(Gridmix.java:150)
 at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:425)
 at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
 at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
 at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
 at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
 {noformat}





[jira] [Updated] (MAPREDUCE-3596) Sort benchmark got hang after completion of 99% map phase

2011-12-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3596:
--

Attachment: logs.tar.bz2

Attached some parts of the AM and RM logs.
am1/rm1 - first 2 map failures
am2/rm2 - 3rd map failure
am3/rm3 - last bit before the job was killed.

The first failed map was retried successfully. The remaining 2 never got 
containers allocated.

Looks like this may be an issue on the RM (the RM logs aren't very useful, 
though, since DEBUG logging wasn't enabled). The AM side table looks ok. After 
the second failed map, 1 container was requested with priority=5 (never 
allocated):
{noformat}
2011-12-16 07:09:15,871 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=2 priority=5 resourceName=* numContainers=1 #asks=1
{noformat}

After the third failed map, 2 containers were requested with priority=5 (never 
allocated):
{noformat}
2011-12-16 07:26:07,641 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=2 priority=5 resourceName=* numContainers=2 #asks=1
{noformat}

Towards the end, all reduce tasks are around 0.3328 complete, pendingMaps stays 
at 2.

 Sort benchmark got hang after completion of 99% map phase
 -

 Key: MAPREDUCE-3596
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3596
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Priority: Critical
 Attachments: logs.tar.bz2


 Courtesy [~vinaythota]
 {quote}
 Ran the sort benchmark a couple of times, and every time the job hung after 
 completing 99% of the map phase. Some map tasks failed, and some of the 
 pending map tasks were never scheduled.
 Cluster size is 350 nodes.
 Build Details:
 ==
 Compiled: Fri Dec 9 16:25:27 PST 2011 by someone from branches/branch-0.23/hadoop-common-project/hadoop-common
 ResourceManager version: revision 1212681 by someone source checksum on Fri Dec 9 16:52:07 PST 2011
 Hadoop version: revision 1212592 by someone Fri Dec 9 16:25:27 PST 2011
 {quote}





[jira] [Commented] (MAPREDUCE-3462) Job submission failing in JUnit tests

2011-12-22 Thread Amar Kamat (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175259#comment-13175259
 ] 

Amar Kamat commented on MAPREDUCE-3462:
---

I think setting {{mapreduce.job.hdfs-servers}} to an empty string in 
{{src/java/mapred-default.xml}} should take care of the failures. Thoughts?
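For illustration only, here is a minimal sketch of how a depth-limited variable expansion can produce the error reported in this issue, and why an empty value would avoid it. It is a simplified model, not Hadoop's actual {{Configuration}} code: the class name, the {{substitute}} helper and the self-referencing {{fs.default.name}} entry are assumptions made for the example. Only the limit of 20 and the message format are taken from the stack trace in the report.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstitutionDepthSketch {
    // Limit taken from the error message in the report ("depth too large: 20").
    static final int MAX_SUBST = 20;
    // Matches ${name}; a simplified pattern chosen for this sketch.
    static final Pattern VAR = Pattern.compile("\\$\\{([^}$\\s]+)\\}");

    // Hypothetical helper: expands ${name} references using props,
    // giving up after MAX_SUBST rounds.
    static String substitute(Map<String, String> props, String expr) {
        if (expr == null) {
            return null;
        }
        String eval = expr;
        for (int i = 0; i < MAX_SUBST; i++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval;                 // nothing left to expand
            }
            String val = props.get(m.group(1));
            if (val == null) {
                return eval;                 // unknown variable: keep it literal
            }
            eval = eval.substring(0, m.start()) + val + eval.substring(m.end());
        }
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_SUBST + " " + expr);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // Assumed setup: a value that keeps expanding to itself.
        props.put("fs.default.name", "${fs.default.name}");
        props.put("mapreduce.job.hdfs-servers", "${fs.default.name}");
        try {
            substitute(props, props.get("mapreduce.job.hdfs-servers"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // With an empty value there is nothing to expand, so the lookup returns.
        props.put("mapreduce.job.hdfs-servers", "");
        System.out.println("[" + substitute(props, props.get("mapreduce.job.hdfs-servers")) + "]");
    }
}
```

Whether the test failures actually come from a self-reference like this is not established in the thread; the sketch only shows one way the depth limit can be hit.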






[jira] [Commented] (MAPREDUCE-2517) Porting Gridmix v3 system tests into trunk branch.

2011-12-22 Thread Amar Kamat (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175291#comment-13175291
 ] 

Amar Kamat commented on MAPREDUCE-2517:
---

Committed the backported patch to Hadoop branch-1.1 (0.20.206). Thanks Vinay!

 Porting Gridmix v3 system tests into trunk branch.
 --

 Key: MAPREDUCE-2517
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2517
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: contrib/gridmix
Reporter: Vinay Kumar Thota
Assignee: Vinay Kumar Thota
 Fix For: 0.23.0

 Attachments: MAPREDUCE-2517-h20-v1.0.patch, MAPREDUCE-2517-v2.patch, 
 MAPREDUCE-2517-v3.patch, MAPREDUCE-2517-v4.patch, MAPREDUCE-2517.patch


 Porting of Gridmix v3 system tests into trunk branch.





[jira] [Commented] (MAPREDUCE-3490) RMContainerAllocator counts failed maps towards Reduce ramp up

2011-12-22 Thread Sharad Agarwal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175319#comment-13175319
 ] 

Sharad Agarwal commented on MAPREDUCE-3490:
---

bq. I think we need to stop tracking this in RMContainerAllocator and rather 
rely on Job. For now, my patch seems the closest approximation to that (being 
conservative).
Doing it in Job or in RMContainerAllocator is a separate discussion. I don't 
think this patch deals with anything like that. It adds two new events for 
RMContainerAllocator itself. 
I am proposing that we don't need these extra events because this information 
(failed attempts info) is already available in RMContainerAllocator.

 

 RMContainerAllocator counts failed maps towards Reduce ramp up
 --

 Key: MAPREDUCE-3490
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3490
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.0
Reporter: Siddharth Seth
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, 
 MAPREDUCE-3490.patch, MAPREDUCE-3490.patch, MR-3490-alternate.patch


 The RMContainerAllocator does not differentiate between failed and successful 
 maps while calculating whether reduce tasks are ready to launch. Failed tasks 
 are also counted towards total completed tasks. 
 Example: 4 failed maps, 10 total maps. Map%complete = 4/14 * 100 instead of 
 being 0.
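
The arithmetic in the example can be sketched as follows. This is a hypothetical illustration, not the RMContainerAllocator code: the class and method names are invented, and it assumes one reading of the 4/14 figure, namely that failed maps are added to both the completed count and the total.

```java
public class MapProgressSketch {
    // Assumed model of the reported behaviour: failed maps are counted as
    // completed and also inflate the total, giving 4/14 in the example.
    static float percentCountingFailures(int succeededMaps, int failedMaps, int totalMaps) {
        int denominator = totalMaps + failedMaps;
        if (denominator == 0) {
            return 0f;
        }
        return 100f * (succeededMaps + failedMaps) / denominator;
    }

    // Expected behaviour per the report: only successful maps count.
    static float percentSuccessOnly(int succeededMaps, int totalMaps) {
        if (totalMaps == 0) {
            return 0f;
        }
        return 100f * succeededMaps / totalMaps;
    }

    public static void main(String[] args) {
        // 4 failed maps, 10 total maps, none succeeded yet.
        System.out.println(percentCountingFailures(0, 4, 10)); // roughly 28.6
        System.out.println(percentSuccessOnly(0, 10));         // 0.0
    }
}
```

With the inflated figure, a reduce ramp-up threshold could be crossed before any map has actually succeeded.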
