Satish Subhashrao Saley created PIG-5273:
--------------------------------------------

             Summary: _SUCCESS file should be created at the end of the job
                 Key: PIG-5273
                 URL: https://issues.apache.org/jira/browse/PIG-5273
             Project: Pig
          Issue Type: Bug
            Reporter: Satish Subhashrao Saley
            Assignee: Satish Subhashrao Saley


One of our users ran into a problem because the _SUCCESS file was created by 
FileOutputCommitter.commitJob(), and the storeCleanup() call that follows it in 
PigOutputCommitter then failed to store the schema due to a network outage. 
abortJob was then called, and the StoreFunc.cleanupOnFailure method invoked 
from it deleted the output directory. Downstream jobs that had already started 
because of the _SUCCESS file ran with empty data.

Possible solutions:
1) Move storeCleanup before commit. The order was reversed in 
https://issues.apache.org/jira/browse/PIG-2642, probably because of 
FileOutputCommitter version 1; restoring it might not be a problem with 
FileOutputCommitter version 2. This would still not help when there are 
multiple outputs, as the main problem is cleanupOnFailure in abortJob deleting 
directories.
2) Change cleanupOnFailure so that it does not delete output directories. This 
still does not help: the Oozie action retry might kick in and delete the 
directory while the downstream job has already started running because of the 
_SUCCESS file. 
3) This cannot be done in the OutputCommitter at all, as multiple output 
committers are called in parallel in Tez. We can have Pig suppress _SUCCESS 
creation and create all the files at the end in TezLauncher, if the job has 
succeeded, before calling cleanupOnSuccess. This can probably be added as a 
configurable setting and turned on by default in our clusters. This is probably 
the most feasible solution.
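
A minimal sketch of the ordering proposed in 3), not the actual Pig or Tez 
code. It uses java.nio on a local directory instead of the Hadoop FileSystem 
API, and the names SuccessMarkerSketch, CommitStep and finishJob are 
hypothetical. It assumes marker creation by the committer is already 
suppressed (in Hadoop this is controlled by 
mapreduce.fileoutputcommitter.marksuccessfuljobs), so the only place _SUCCESS 
is written is the final step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration: write _SUCCESS markers only after every
// output has been committed and cleaned up without error.
class SuccessMarkerSketch {
    static final String MARKER = "_SUCCESS";

    // Stand-in for commit plus storeCleanup of one output, with marker
    // creation suppressed. Not a Pig or Hadoop interface.
    interface CommitStep {
        void run(Path outputDir) throws IOException;
    }

    // Returns true and writes all markers only if every commit step
    // succeeded. On any commit failure no marker is written, so
    // downstream jobs that wait for _SUCCESS are never triggered.
    static boolean finishJob(List<Path> outputDirs, CommitStep commit)
            throws IOException {
        try {
            for (Path dir : outputDirs) {
                commit.run(dir);
            }
        } catch (IOException e) {
            // The caller would abort the job here; nothing was marked.
            return false;
        }
        // All outputs are complete: create the markers together at the end.
        // A failure while writing a marker propagates to the caller.
        for (Path dir : outputDirs) {
            Files.write(dir.resolve(MARKER), new byte[0]);
        }
        return true;
    }
}
```

With two outputs where the second commit step throws an IOException, 
finishJob returns false and neither directory contains _SUCCESS, which is the 
behavior that would have kept the downstream jobs from starting.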



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
