Feng Peng created HCATALOG-545:
----------------------------------

             Summary: Improve failure recovery for FileOutputCommitterContainer
                 Key: HCATALOG-545
                 URL: https://issues.apache.org/jira/browse/HCATALOG-545
             Project: HCatalog
          Issue Type: Bug
          Components: mapreduce
    Affects Versions: 0.4, 0.5
            Reporter: Feng Peng


When an M/R job creates partitions in multiple Hive tables, all partitions are 
committed in the same cleanup task via multiple instances of 
FileOutputCommitterContainer.

Currently, when one of the FileOutputCommitterContainer instances fails, the 
cleanup task exits with failure and is retried. However, the retry is blocked 
by a "partition exists" error caused by the partial commits left behind by the 
failed attempt.

Instead, when a commit fails, the cleanup task should roll back all commits 
already made to the other tables so that the next retry can proceed.
Also, if all retries of the cleanup task fail, no partial commit should be left 
in the Hive metastore.
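
To illustrate the proposed behavior, here is a minimal sketch in plain Java. 
It is not the HCatalog implementation: TableCommitter, inMemory and commitAll 
are hypothetical names, and a Set<String> stands in for the Hive metastore. 
It also assumes that rolling back a committer is safe even if its commit only 
partially succeeded.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RollbackSketch {

    /** Hypothetical stand-in for one per-table committer; not the HCatalog API. */
    interface TableCommitter {
        void commit() throws Exception; // e.g. register partitions in the metastore
        void rollback();                // undo whatever commit() registered
    }

    /** Hypothetical in-memory committer; the Set plays the role of the metastore. */
    static TableCommitter inMemory(final Set<String> metastore,
                                   final String partition,
                                   final boolean failAfterAdd) {
        return new TableCommitter() {
            @Override
            public void commit() throws Exception {
                if (!metastore.add(partition)) {
                    throw new IllegalStateException("partition exists: " + partition);
                }
                if (failAfterAdd) {
                    throw new Exception("simulated failure for " + partition);
                }
            }

            @Override
            public void rollback() {
                metastore.remove(partition);
            }
        };
    }

    /**
     * Commits every table in order. If any commit throws, rolls back every
     * committer that was started (most recent first) and rethrows, so that
     * a retry does not hit "partition exists".
     */
    static void commitAll(List<TableCommitter> committers) throws Exception {
        Deque<TableCommitter> started = new ArrayDeque<>();
        try {
            for (TableCommitter c : committers) {
                // Record before committing so a partially committed table
                // is rolled back as well (assumes rollback is idempotent).
                started.push(c);
                c.commit();
            }
        } catch (Exception e) {
            while (!started.isEmpty()) {
                try {
                    started.pop().rollback();
                } catch (RuntimeException re) {
                    e.addSuppressed(re); // keep rolling back the remaining tables
                }
            }
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        Set<String> metastore = new HashSet<>();
        try {
            commitAll(Arrays.asList(
                    inMemory(metastore, "t1/p=1", false),
                    inMemory(metastore, "t2/p=1", true)));
        } catch (Exception e) {
            System.out.println("first attempt failed, metastore: " + metastore);
        }
        // The retry succeeds because nothing was left behind.
        commitAll(Arrays.asList(
                inMemory(metastore, "t1/p=1", false),
                inMemory(metastore, "t2/p=1", false)));
        System.out.println("after retry, partitions: " + metastore.size());
    }
}
```

Rolling back in reverse order of commit is a design choice in this sketch, 
not something the issue prescribes. Errors raised during rollback are attached 
to the original exception so that one failed rollback does not stop the others.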


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
