[
https://issues.apache.org/jira/browse/HCATALOG-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449829#comment-13449829
]
Francis Liu commented on HCATALOG-451:
--------------------------------------
To avoid confusion: there are two Pig issues:
1. PIG-2712 (abortJob())
2. antlr mismatch between Pig and Hive
If #1 is fixed then the issue raised by this Jira is resolved. The patch is an
easy fix and Rohini already has patches up for review in PIG-2712.
The trunk patch posted in this Jira is not really a fix for the trunk problem
but more cleanup and unit tests, based on the assumption that both #1 and #2
are fixed. That could probably go into a separate JIRA.
As for code rot, I'm willing to take responsibility for rebasing the patch. I
don't see an urgent need to let another workaround in.
I'll give the antlr problem a quick look sometime this week or next and see how
much of an effort it is.
> Partitions are created even when Jobs are aborted
> -------------------------------------------------
>
> Key: HCATALOG-451
> URL: https://issues.apache.org/jira/browse/HCATALOG-451
> Project: HCatalog
> Issue Type: Bug
> Components: mapreduce
> Affects Versions: 0.4, 0.5
> Environment: Hadoop 1.0.2, non-dynamic partitions.
> Reporter: Mithun Radhakrishnan
> Assignee: Vandana Ayyalasomayajula
> Fix For: 0.4.1
>
> Attachments: HCAT-451-trunk.02.patch, HCATALOG-451.0.patch,
> HCATALOG-451-branch-0.4.02.patch, HCATALOG-451-branch-0.4.03.patch,
> HCATALOG-451-branch-0.4.patch
>
>
> If an MR job using HCatOutputFormat fails, and
> FileOutputCommitterContainer::abortJob() is called, one would expect that
> partitions aren't created/registered with HCatalog.
> When using dynamic-partitions, one sees that this behaves correctly. But when
> static-partitions are used, partitions are created regardless of whether the
> Job succeeded or failed.
> (This manifested as a failure when the job is repeated. The retry-job fails
> to launch since the partitions already exist from the last failed run.)
> This is a result of bad code in FileOutputCommitter::cleanupJob(), which
> seems to do an unconditional partition-add. This can be fixed by adding a
> check for the output directory before adding partitions (in the
> !dynamicPartitioning case), since the directory is removed in abortJob().
> We'll have a patch for this shortly. As an aside, we ought to move the
> partition-creation into commitJob(), where it logically belongs. cleanupJob()
> is deprecated and common to both success and failure code paths.
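The guard described above can be sketched as follows. This is a simplified, illustrative model, not HCatalog's actual committer: the class and method names are hypothetical, and a plain `Set` of paths stands in for the `FileSystem`. The point it demonstrates is that because `abortJob()` removes the job's output directory, checking for that directory before the partition-add (done in `commitJob()`, not the deprecated `cleanupJob()`) prevents a failed job from registering its static partition.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed fix for the static-partition case.
// Partition registration lives in commitJob() (success path only) and is
// guarded by a check that the job's output directory still exists;
// abortJob() deletes that directory, so a failed job adds no partition.
class StaticPartitionCommitter {
    private final Set<String> existingDirs;           // stand-in for the FileSystem
    final List<String> registeredPartitions = new ArrayList<>();

    StaticPartitionCommitter(Set<String> existingDirs) {
        this.existingDirs = existingDirs;
    }

    // abortJob(): clean up the output directory; nothing is registered.
    void abortJob(String outputDir) {
        existingDirs.remove(outputDir);
    }

    // commitJob(): register the partition only if the output directory
    // survived -- the unconditional add in cleanupJob() lacked this guard.
    void commitJob(String outputDir, String partitionSpec) {
        if (existingDirs.contains(outputDir)) {
            registeredPartitions.add(partitionSpec);
        }
    }
}
```

With this shape, a retried job no longer collides with a phantom partition left behind by the failed attempt, since the aborted run never reached the metastore.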