[ 
https://issues.apache.org/jira/browse/OOZIE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731455#comment-13731455
 ] 

Robert Kanter commented on OOZIE-1483:
--------------------------------------

Discussion thread on dev mailing list here: 
http://mail-archives.apache.org/mod_mbox/oozie-dev/201308.mbox/%3cCAHz+ZFeCKbnx7v7=qmtxn90y7tfhmorg4yzgbfecpuj7tss...@mail.gmail.com%3e
                
> Support for Job Recoverability
> ------------------------------
>
>                 Key: OOZIE-1483
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1483
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>
> To support the JobTracker (JT) recovering jobs on restart, we need to 
> configure the launcher job to be restarted by the JT, but not any of the 
> launched jobs ({{mapreduce.job.restart.recover}}).  This way, the launcher 
> job will simply start over when the JT recovers it; if we allow the JT to 
> recover the actual jobs as well, they will interfere with the restarted 
> launcher.  We'll also need this for the same ability in YARN.
> This should be fairly trivial except for the MapReduce action because of the 
> optimization where the launcher finishes instead of waiting for the actual 
> job and Oozie does an "id swap".  Trying to add support for JT to recover the 
> MR action doesn't seem feasible as we'd run into a lot of trickiness and some 
> race conditions due to the id swap.  
> Instead, I think we should remove the MR optimization because it will allow 
> us to support recoverability for the MR action as well.  This also has 
> the benefit of simplifying the code because we'd be getting rid of all of the 
> id swap stuff and also making the MR action consistent with the other 
> actions.  The only downside is that the MR action will take an extra Map slot 
> just like the other actions.  
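
As a rough illustration of the configuration split described above, here is a minimal sketch, not Oozie's actual implementation. It assumes {{mapreduce.job.restart.recover}} is a per-job boolean property, and it uses plain maps in place of real Hadoop job configurations. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class RecoverySettingsSketch {
    // Property named in the issue description; treating it as a
    // "true"/"false" per-job flag is an assumption of this sketch.
    static final String RESTART_RECOVER = "mapreduce.job.restart.recover";

    /** Hypothetical helper: settings for the launcher job, which the JT may restart. */
    static Map<String, String> launcherSettings() {
        Map<String, String> conf = new HashMap<>();
        conf.put(RESTART_RECOVER, "true");
        return conf;
    }

    /** Hypothetical helper: settings for jobs the launcher submits, which the JT must not recover. */
    static Map<String, String> launchedJobSettings() {
        Map<String, String> conf = new HashMap<>();
        conf.put(RESTART_RECOVER, "false");
        return conf;
    }

    public static void main(String[] args) {
        System.out.println("launcher: " + launcherSettings());
        System.out.println("launched: " + launchedJobSettings());
    }
}
```

With this split, a recovered launcher starts over and resubmits its work, while the jobs it launched earlier are not brought back by the JT.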

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
