[ 
https://issues.apache.org/jira/browse/PIG-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley updated PIG-5273:
-----------------------------------------
    Status: Patch Available  (was: Open)

> _SUCCESS file should be created at the end of the job
> -----------------------------------------------------
>
>                 Key: PIG-5273
>                 URL: https://issues.apache.org/jira/browse/PIG-5273
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>         Attachments: PIG-5273-1.patch
>
>
> One of the users ran into issues because the _SUCCESS file was created by
> FileOutputCommitter.commitJob(), and storeCleanup(), which is called after
> that in PigOutputCommitter, failed to store the schema due to a network
> outage. abortJob was then called, and the StoreFunc.cleanupOnFailure method
> invoked from it deleted the output directory. Downstream jobs that had started
> because of the _SUCCESS file ran with empty data.
> Possible solutions:
> 1) Move storeCleanup before commit. The order was found to have been
> reversed in https://issues.apache.org/jira/browse/PIG-2642, probably because
> of FileOutputCommitter version 1, and it might not be a problem with
> FileOutputCommitter version 2. This would still not help when there are
> multiple outputs, as the main problem is cleanupOnFailure in abortJob deleting
> directories.
> 2) We can change cleanupOnFailure to not delete output directories. This
> still does not help: the Oozie action retry might kick in and delete the
> directory while the downstream job has already started running because of
> the _SUCCESS file.
> 3) It cannot be done in the OutputCommitter at all, as multiple output
> committers are called in parallel in Tez. We can have Pig suppress _SUCCESS
> creation and create the files all at the end in TezLauncher, if the job has
> succeeded, before calling cleanupOnSuccess. This can probably be added as a
> configurable setting and turned on by default in our clusters. This is
> probably the only viable solution.
> Thank you [~rohini] for finding the issue and providing the solution.
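
The ordering proposed in option 3 can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the contents of
PIG-5273-1.patch: the class DeferredSuccessMarker and its markAll method are
hypothetical names, and java.nio.file on a local directory stands in for the
Hadoop FileSystem API. It also assumes that the committers' own marker has
already been suppressed (in Hadoop this is controlled by the
mapreduce.fileoutputcommitter.marksuccessfuljobs property).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Pig or Hadoop. */
final class DeferredSuccessMarker {
    static final String MARKER = "_SUCCESS";

    private DeferredSuccessMarker() {}

    /**
     * Writes an empty _SUCCESS file into every output directory, but only
     * when the whole job succeeded and every directory still exists.
     * Returns the marker paths that are now present (empty if none written).
     */
    static List<Path> markAll(List<Path> outputDirs, boolean jobSucceeded)
            throws IOException {
        List<Path> markers = new ArrayList<>();
        if (!jobSucceeded) {
            return markers; // a failed job must never look successful
        }
        // Check every output first so that no marker is written if any
        // output directory is missing.
        for (Path dir : outputDirs) {
            if (!Files.isDirectory(dir)) {
                return markers;
            }
        }
        for (Path dir : outputDirs) {
            Path marker = dir.resolve(MARKER);
            if (Files.notExists(marker)) {
                Files.createFile(marker);
            }
            markers.add(marker);
        }
        return markers;
    }
}
```

In the scenario described above, a call like markAll would sit in the launcher
after all outputs have been committed and their store cleanup has finished, so
a failure while storing the schema leaves no _SUCCESS file behind for
downstream jobs to trigger on.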



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
