[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221053#comment-13221053
 ] 

Alejandro Abdelnur commented on MAPREDUCE-3837:
-----------------------------------------------

Mayank,

* Built branch-1 with your patch
* Configured the cluster and ran a test job, which completed OK
* Configured the mapred-site.xml with 'mapred.jobtracker.restart.recover=true'
* Restarted the JT
* Created an IN data file in my HDFS home dir
* Submitted 5 wordcount jobs

{code}
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT0 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT1 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT2 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT3 &
bin/hadoop jar hadoop-*examples*jar wordcount IN OUT4 &
{code}

* Waited until they were all running
* Killed the JT
* Restarted the JT

The jobs are not recovered, and what I see in the logs is:

{code}
2012-03-02 08:55:22,164 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0001. Deleting it!!
2012-03-02 08:55:22,194 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0002. Deleting it!!
2012-03-02 08:55:22,204 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0003. Deleting it!!
2012-03-02 08:55:22,224 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0004. Deleting it!!
2012-03-02 08:55:22,236 INFO org.apache.hadoop.mapred.JobTracker: Found an incomplete job directory job_201203020852_0005. Deleting it!!
{code}
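
For reference, a minimal sketch of the recovery setting from the steps above as it would appear in mapred-site.xml. Only the property name and value come from my setup; the surrounding layout is the standard Hadoop configuration format, and the XML comment is just illustrative:

{code:xml}
<?xml version="1.0"?>
<configuration>
  <!-- Ask the JobTracker to recover running jobs after a restart
       (property name as used on branch-1). -->
  <property>
    <name>mapred.jobtracker.restart.recover</name>
    <value>true</value>
  </property>
</configuration>
{code}

Note that the issue description below uses the 0.22-style name, mapreduce.jobtracker.restart.recover, so the key may need adjusting depending on the branch under test.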

Am I missing some additional configuration?

                
> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3837
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>             Fix For: 0.24.0, 0.22.1, 0.23.2
>
>         Attachments: PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
> PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch
>
>
> If the JobTracker crashes while jobs are running, and the JobTracker's 
> property mapreduce.jobtracker.restart.recover is true, then it should 
> recover those jobs on restart.
> However, the current behavior is as follows:
> the JobTracker tries to restore the jobs but cannot. After that, the 
> JobTracker closes its handle to HDFS and nobody else can submit a job. 
> Thanks,
> Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
