[ 
https://issues.apache.org/jira/browse/OOZIE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768465#comment-13768465
 ] 

Srikanth Sundarrajan commented on OOZIE-1534:
---------------------------------------------

Notes from my discussion with [~tucu00] offline:

This could be done if the log-scavenging logic we use to harvest MR jobs 
started by pig/hive/sqoop ran in real time (as opposed to after pig/hive/sqoop 
finishes) and the captured job IDs were written and fsynced to a file in HDFS 
in the action subdir. The action main class would then look for this file of 
job IDs at start time and, if it exists, kill all those jobs before 
proceeding. This would make the launcher job idempotent.
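
A rough sketch of the idea, not actual Oozie code: the class ChildJobLedger, 
its method names and the Consumer-based kill callback are all made up for 
illustration, and a local file stands in for the file in the HDFS action 
subdir. On HDFS the flush step would presumably go through something like 
FSDataOutputStream.hsync() rather than FileChannel.force().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

/** Hypothetical helper (not part of Oozie): a durable list of child job IDs. */
class ChildJobLedger {
    private final Path ledgerFile;

    ChildJobLedger(Path ledgerFile) {
        this.ledgerFile = ledgerFile;
    }

    /** Called as soon as a child job ID is seen in the pig/hive/sqoop log. */
    void record(String jobId) throws IOException {
        try (FileChannel ch = FileChannel.open(ledgerFile,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap((jobId + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Force the ID to stable storage before returning, so a relaunched
            // attempt can still see it even if this attempt dies right after.
            ch.force(true);
        }
    }

    /** Called by the action main at start time, before launching anything. */
    void killLeftovers(Consumer<String> killJob) throws IOException {
        if (!Files.exists(ledgerFile)) {
            return; // first attempt: nothing to clean up
        }
        for (String line : Files.readAllLines(ledgerFile, StandardCharsets.UTF_8)) {
            String jobId = line.trim();
            if (!jobId.isEmpty()) {
                killJob.accept(jobId);
            }
        }
        // Assumption: the ledger is dropped only after every kill was issued,
        // so a crash in the middle of the loop is retried on the next attempt.
        Files.delete(ledgerFile);
    }
}
```

The kill callback is left abstract on purpose; in a real launcher it would 
wrap whatever job client is available. Deleting the file after the kills is 
a design choice of this sketch, not something settled in the discussion.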

                
> Launcher job might rerun due to hadoop attempt relaunch - possibly causing 
> correctness issues
> ---------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1534
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1534
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Srikanth Sundarrajan
>
> The <prepare> section of the action allows the output dir to be cleaned up. 
> This is not sufficient, as MR jobs started by Pig/Hive may still be running. 
> We should look to kill any child MR jobs before launching new ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
