Feng Peng created HCATALOG-545:
----------------------------------
Summary: Improve failure recovery for FileOutputCommitterContainer
Key: HCATALOG-545
URL: https://issues.apache.org/jira/browse/HCATALOG-545
Project: HCatalog
Issue Type: Bug
Components: mapreduce
Affects Versions: 0.4, 0.5
Reporter: Feng Peng
When a M/R job creates partitions in multiple Hive tables, all partitions are
committed in the same cleanup task via multiple instances of the
FileOutputCommitterContainer.
Currently, when one of the FileOutputCommitterContainer instances fails, the
cleanup task exits with failure and is retried. However, the retry is blocked
by a "partition exists" error caused by the partial commits.
Instead, the cleanup task should roll back all previous commits to the
different tables in case of failure so that the next retry can continue.
Also, if all retries of the cleanup task fail, no partial commit should be left
in the Hive metastore.
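
The proposed behavior can be sketched roughly as follows. This is only an
illustration of the all-or-nothing idea, not the actual HCatalog code: the
TableCommitter interface and the commitAll method are hypothetical names that
stand in for the per-table FileOutputCommitterContainer instances and the
cleanup task. The sketch also assumes that a commit() which throws leaves
nothing behind for its own table.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class AllOrNothingCommit {

    /** Hypothetical stand-in for one per-table committer; not the HCatalog API. */
    public interface TableCommitter {
        void commit() throws Exception;    // e.g. register the new partitions
        void rollback() throws Exception;  // e.g. drop the partitions again
    }

    /**
     * Commits every table in order. If any commit fails, the commits that
     * already succeeded are rolled back in reverse order, so a retry starts
     * from a clean state instead of hitting "partition exists".
     */
    public static void commitAll(List<? extends TableCommitter> committers)
            throws Exception {
        Deque<TableCommitter> committed = new ArrayDeque<>();
        try {
            for (TableCommitter c : committers) {
                c.commit();
                committed.push(c); // remember only the successful commits
            }
        } catch (Exception failure) {
            while (!committed.isEmpty()) {
                try {
                    committed.pop().rollback(); // most recent commit first
                } catch (Exception rollbackFailure) {
                    // Keep undoing the remaining commits; report this as well.
                    failure.addSuppressed(rollbackFailure);
                }
            }
            throw failure; // the cleanup task still fails and can be retried
        }
    }
}
```

Because the rollback runs on every failed attempt, the last failed retry also
leaves no partial commit behind, provided the rollbacks themselves succeed.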