[ https://issues.apache.org/jira/browse/HCATALOG-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449873#comment-13449873 ]

Rohini Palaniswamy commented on HCATALOG-451:
---------------------------------------------

Just to note.

bq. The trunk patch posted in this Jira is not really a fix to the trunk 
problem but more cleanup and unit tests based on the assumption that both #1 
and #2 are fixed.
Even if #1 is fixed, the issue raised by this JIRA is not resolved. #1 is not 
the subject of this JIRA, and the patch here is not just unit tests that depend 
on #1 being fixed. The HCatalog code has been changed to do the expected work 
in commitJob() and abortJob()/cleanupJob(), whereas previously everything (both 
the commit and the abort code) was done in cleanupJob(), which caused 
partitions to be committed even when the job failed. Without this patch, the 
user has to manually drop the partition after a failed job before rerunning it. 
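
The split described above can be sketched as follows. This is a minimal, self-contained illustration (not the actual HCatalog code; the class and method names below are hypothetical stand-ins for FileOutputCommitterContainer's hooks): partition registration happens only in commitJob(), abortJob() removes the output directory so nothing can be registered for a failed job, and the deprecated cleanupJob(), which runs on both success and failure paths, no longer touches the metastore.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the committer restructuring; not HCatalog source.
class SketchOutputCommitter {
    final List<String> registeredPartitions = new ArrayList<>();
    boolean outputDirExists = true; // stands in for the job's output directory

    // Success path only: register the static partition with the metastore.
    void commitJob(String partitionSpec) {
        // Guard on the output directory, since abortJob() removes it
        // (this is the check proposed for the !dynamicPartitioning case).
        if (outputDirExists) {
            registeredPartitions.add(partitionSpec);
        }
    }

    // Failure path only: remove the output so no partition can be added.
    void abortJob() {
        outputDirExists = false;
    }

    // Deprecated hook, invoked on BOTH code paths: must not add partitions.
    void cleanupJob() {
        // temp-file cleanup only; no metastore calls here
    }
}
```

With this structure a failed-and-aborted job registers no partition, so the retry no longer fails on "partition already exists".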

 
                
> Partitions are created even when Jobs are aborted
> -------------------------------------------------
>
>                 Key: HCATALOG-451
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-451
>             Project: HCatalog
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.4, 0.5
>         Environment: Hadoop 1.0.2, non-dynamic partitions.
>            Reporter: Mithun Radhakrishnan
>            Assignee: Vandana Ayyalasomayajula
>             Fix For: 0.4.1
>
>         Attachments: HCAT-451-trunk.02.patch, HCATALOG-451.0.patch, 
> HCATALOG-451-branch-0.4.02.patch, HCATALOG-451-branch-0.4.03.patch, 
> HCATALOG-451-branch-0.4.patch
>
>
> If an MR job using HCatOutputFormat fails, and 
> FileOutputCommitterContainer::abortJob() is called, one would expect that 
> partitions aren't created/registered with HCatalog.
> When using dynamic-partitions, one sees that this behaves correctly. But when 
> static-partitions are used, partitions are created regardless of whether the 
> Job succeeded or failed.
> (This manifested as a failure when the job is repeated. The retry-job fails 
> to launch since the partitions already exist from the last failed run.)
> This is a result of bad code in FileOutputCommitterContainer::cleanupJob(), 
> which seems to do an unconditional partition-add. This can be fixed by adding 
> a check for the output directory before adding partitions (in the 
> !dynamicPartitioning case), since the directory is removed in abortJob().
> We'll have a patch for this shortly. As an aside, we ought to move the 
> partition-creation into commitJob(), where it logically belongs. cleanupJob() 
> is deprecated and common to both success and failure code paths.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
