[ https://issues.apache.org/jira/browse/HCATALOG-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449829#comment-13449829 ]

Francis Liu commented on HCATALOG-451:
--------------------------------------

To avoid confusion: there are two Pig issues:

1. PIG-2712 (abortJob())
2. An antlr version mismatch between Pig and Hive

If #1 is fixed, then the issue raised by this Jira is resolved. It is an easy 
fix, and Rohini already has patches up for review in PIG-2712.
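
For context, the MapReduce committer contract is that commitJob() runs only 
when a job succeeds and abortJob() runs when it fails; if the runner never 
invokes abortJob() (the PIG-2712 bug), staged output and any publish-time side 
effects are left behind. Here is a minimal sketch of that contract, where 
registerPartitions()/deleteStagedOutput() are hypothetical placeholders, not 
HCatalog's actual methods:

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.JobStatus;
    import org.apache.hadoop.mapreduce.OutputCommitter;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;

    public class SketchCommitter extends OutputCommitter {

        @Override
        public void commitJob(JobContext ctx) throws IOException {
            registerPartitions(ctx);    // success path: publish output
        }

        @Override
        public void abortJob(JobContext ctx, JobStatus.State state)
                throws IOException {
            deleteStagedOutput(ctx);    // failure path: discard output
        }

        // Required per-task hooks; irrelevant to this bug, stubbed out.
        @Override public void setupJob(JobContext ctx) {}
        @Override public void setupTask(TaskAttemptContext ctx) {}
        @Override public boolean needsTaskCommit(TaskAttemptContext ctx) { return false; }
        @Override public void commitTask(TaskAttemptContext ctx) {}
        @Override public void abortTask(TaskAttemptContext ctx) {}

        // Hypothetical placeholders, for illustration only.
        private void registerPartitions(JobContext ctx) {}
        private void deleteStagedOutput(JobContext ctx) {}
    }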

The trunk patch posted in this Jira is not really a fix for the trunk problem 
but rather cleanup and unit tests written on the assumption that both #1 and 
#2 are fixed. That work could probably go into a separate Jira.

As for code rot, I'm willing to take responsibility for rebasing the patch. I 
don't see an urgent need to let another workaround in.

I'll give the antlr problem a quick look sometime this week or next and see 
how much of an effort it is.


                
> Partitions are created even when Jobs are aborted
> -------------------------------------------------
>
>                 Key: HCATALOG-451
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-451
>             Project: HCatalog
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.4, 0.5
>         Environment: Hadoop 1.0.2, non-dynamic partitions.
>            Reporter: Mithun Radhakrishnan
>            Assignee: Vandana Ayyalasomayajula
>             Fix For: 0.4.1
>
>         Attachments: HCAT-451-trunk.02.patch, HCATALOG-451.0.patch, 
> HCATALOG-451-branch-0.4.02.patch, HCATALOG-451-branch-0.4.03.patch, 
> HCATALOG-451-branch-0.4.patch
>
>
> If an MR job using HCatOutputFormat fails, and 
> FileOutputCommitterContainer::abortJob() is called, one would expect that 
> partitions aren't created/registered with HCatalog.
> When using dynamic-partitions, one sees that this behaves correctly. But when 
> static-partitions are used, partitions are created regardless of whether the 
> Job succeeded or failed.
> (This manifested as a failure when the job is repeated. The retry-job fails 
> to launch since the partitions already exist from the last failed run.)
> This is a result of bad code in FileOutputCommitterContainer::cleanupJob(), 
> which seems to do an unconditional partition-add. This can be fixed by 
> adding a check for the output directory before adding partitions (in the 
> !dynamicPartitioning case), since the directory is removed in abortJob().
> We'll have a patch for this shortly. As an aside, we ought to move the 
> partition-creation into commitJob(), where it logically belongs. cleanupJob() 
> is deprecated and common to both success and failure code paths.
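
A rough sketch of the guard described above, building on the committer sketch 
in the comment earlier in this thread (the "mapred.output.dir" lookup, the 
dynamicPartitioning flag, and registerPartitions() are illustrative stand-ins, 
not the actual patch):

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    private boolean dynamicPartitioning;  // false in the static-partition case

    @Override
    public void cleanupJob(JobContext ctx) throws IOException {
        Path out = new Path(ctx.getConfiguration().get("mapred.output.dir"));
        FileSystem fs = out.getFileSystem(ctx.getConfiguration());

        // abortJob() removes the output directory on failure, so its
        // absence means the job was aborted: only do the partition-add
        // when static partitioning is in play and the output exists.
        if (!dynamicPartitioning && fs.exists(out)) {
            registerPartitions(ctx);    // placeholder for the metastore call
        }
    }

Moving the partition-add into commitJob(), as the description suggests, would 
make this guard unnecessary, since commitJob() only runs on success.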
