[jira] [Commented] (HCATALOG-545) Improve failure recovery for FileOutputCommitterContainer

Feng Peng (JIRA) Mon, 05 Nov 2012 18:54:16 -0800

    [ 
https://issues.apache.org/jira/browse/HCATALOG-545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491152#comment-13491152
 ]


Feng Peng commented on HCATALOG-545:
------------------------------------

Here is an example:

{noformat}
%default PART 20121025T000000Z

test_data = LOAD 'test.txt' as (key: chararray, num1: int, num2: int, row_num: int, col_desc: chararray);

store test_data into 'db.test1' using org.apache.hcatalog.pig.HCatStorer('part_dt=$PART');
store test_data into 'db.test2' using org.apache.hcatalog.pig.HCatStorer('part_dt=$PART');
store test_data into 'db.test3' using org.apache.hcatalog.pig.HCatStorer('part_dt=$PART');
{noformat}

Each partition is created by its own instance of 
FileOutputCommitterContainer, which is created by the corresponding storer. All 
of these FileOutputCommitterContainers are called in the cleanup task of the 
same M/R job. If the first FileOutputCommitterContainer successfully commits 
the partition to table "db.test1" but the second FileOutputCommitterContainer 
fails, the next retry fails at table "db.test1" because the partition 
already exists.
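
The rollback proposed in the issue description could look roughly like the 
sketch below. It is only an illustration under stated assumptions: 
PartitionCommitter, FakeTable and commitAll are hypothetical names, not part of 
the HCatalog API, and the Hive metastore is replaced by an in-memory list so 
that the example runs on its own.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class CleanupRollbackSketch {

    /** Hypothetical stand-in for one per-table committer; NOT the HCatalog API. */
    interface PartitionCommitter {
        void commit() throws Exception; // register the partition
        void rollback();                // undo a commit that already succeeded
    }

    /** Assumption: an in-memory list plays the role of the table's partitions. */
    static class FakeTable implements PartitionCommitter {
        final String name;
        final String partition;
        final List<String> partitions = new ArrayList<>();
        boolean failOnCommit = false;   // toggled to simulate a failing committer

        FakeTable(String name, String partition) {
            this.name = name;
            this.partition = partition;
        }

        @Override
        public void commit() throws Exception {
            if (failOnCommit) {
                throw new Exception("commit failed for " + name);
            }
            if (partitions.contains(partition)) {
                // Mirrors the "partition exists" error that blocks the retry.
                throw new Exception("partition exists in " + name);
            }
            partitions.add(partition);
        }

        @Override
        public void rollback() {
            partitions.remove(partition);
        }
    }

    /**
     * Commits in order. If any commit fails, rolls back the commits that
     * already succeeded (most recent first) and rethrows the failure.
     */
    static void commitAll(List<? extends PartitionCommitter> committers) throws Exception {
        Deque<PartitionCommitter> done = new ArrayDeque<>();
        try {
            for (PartitionCommitter c : committers) {
                c.commit();
                done.push(c);
            }
        } catch (Exception e) {
            while (!done.isEmpty()) {
                done.pop().rollback();
            }
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        String part = "part_dt=20121025T000000Z";
        FakeTable t1 = new FakeTable("db.test1", part);
        FakeTable t2 = new FakeTable("db.test2", part);
        FakeTable t3 = new FakeTable("db.test3", part);
        List<FakeTable> tables = Arrays.asList(t1, t2, t3);

        t2.failOnCommit = true;         // first attempt: second committer fails
        try {
            commitAll(tables);
        } catch (Exception e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }

        t2.failOnCommit = false;        // retry: db.test1 was rolled back, so no conflict
        commitAll(tables);
        System.out.println("retry committed: " + t1.partitions + t2.partitions + t3.partitions);
    }
}
```

With the rollback in place, the failed first attempt leaves "db.test1" without 
the partition, so the retry no longer hits the "partition exists" error. A real 
implementation would also have to handle a rollback that itself fails, which 
this sketch ignores.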

                
> Improve failure recovery for FileOutputCommitterContainer
> ---------------------------------------------------------
>
>                 Key: HCATALOG-545
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-545
>             Project: HCatalog
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.4, 0.5
>            Reporter: Feng Peng
>
> When an M/R job creates partitions in multiple Hive tables, all partitions are 
> committed in the same cleanup task via multiple instances of the 
> FileOutputCommitterContainer.
> Currently, when one of the FileOutputCommitterContainers fails, the cleanup 
> task exits with failure and retries. However, the retry is blocked by the 
> "partition exists" error caused by the partial commits. 
> Instead, the cleanup task should roll back all previous commits to the 
> different tables in case of failure so that the next retry can continue.
> Also, if all retries of the cleanup task fail, no partial commit should be 
> left in the Hive metastore.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
