[ https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221053#comment-13221053 ]
Alejandro Abdelnur commented on MAPREDUCE-3837:
-----------------------------------------------

Mayank,

* Built branch-1 with your patch
* Configured the cluster and ran a test job; it is OK
* Configured the mapred-site.xml with 'mapred.jobtracker.restart.recover=true'
* Restarted the JT
* Created an IN data file in my HDFS home dir
* Submitted 5 wordcount jobs

{code}
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT0 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT1 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT2 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT3 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT4 &
{code}

* Waited until they were all running
* Killed the JT
* Restarted the JT

The jobs are not recovered, and what I see in the logs is:

{code}
2012-03-02 08:55:22,164 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0001. Deleting it!!
2012-03-02 08:55:22,194 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0002. Deleting it!!
2012-03-02 08:55:22,204 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0003. Deleting it!!
2012-03-02 08:55:22,224 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0004. Deleting it!!
2012-03-02 08:55:22,236 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0005. Deleting it!!
{code}

Am I missing some additional configuration?

> Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.
> --------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-3837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Fix For: 0.24.0, 0.22.1, 0.23.2
>
> Attachments: PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch
>
>
> If the job tracker crashes while jobs are running, and the job tracker's
> property mapreduce.jobtracker.restart.recover is true, then it should
> recover the jobs.
> However, the current behavior is as follows:
> the jobtracker tries to restore the jobs but cannot. After that, the
> jobtracker closes its handle to HDFS and nobody else can submit a job.
> Thanks,
> Mayank