[ https://issues.apache.org/jira/browse/HCATALOG-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates resolved HCATALOG-580.
---------------------------------
Resolution: Fixed
Patch 3 checked into branch 0.5 and trunk.
> Optimizations in HCAT-538 break e2e tests
> -----------------------------------------
>
> Key: HCATALOG-580
> URL: https://issues.apache.org/jira/browse/HCATALOG-580
> Project: HCatalog
> Issue Type: Bug
> Affects Versions: 0.5
> Environment: RH 5.8 (on AWS)
> Hadoop 1.1.2.17 (build)
> HCat 0.5 (build)
> Reporter: Sushanth Sowmyan
> Assignee: Daniel Dai
> Priority: Blocker
> Fix For: 0.5
>
> Attachments: HCATALOG-580-1.patch, HCATALOG-580-2.patch,
> HCATALOG-580-3.patch
>
>
> The optimizations brought in by HCATALOG-538 break dynamic partitioning in
> the e2e tests. They assume that if the first child in a directory is a
> directory, the rest are directories too, and that if the first child is a
> file, the rest are files. That assumption is incorrect. (Admittedly, the
> second half, treating a directory whose first child is a file as a leaf
> directory, is not a bad premise, but it is still incorrect.)
> The underlying FileOutputCommitter and OutputFormat behaviour determines
> whether you get files or directories, and whether any _temporary
> directories are left behind, for example.
> In the case I tested, a "leaf" directory contains a _temporary directory
> followed by part files. The optimization sees the _temporary directory,
> finds nothing inside it, so doesn't mkdir any parent, and then decides that
> the remaining children are directories too. When it moves on to the part
> file, it tries to rename it directly without mkdir-ing its parent directory.
> The e2e test conf in question is Pig_Checkin_7:
> {code}
> {
> 'num' => 7
> ,'hcat_prep'=>q\drop table if exists pig_checkin_7;
> create table pig_checkin_7 (name string, age int) partitioned by (ds string) STORED AS TEXTFILE;\
> ,'pig' => q\a = load 'studentparttab30k' using org.apache.hcatalog.pig.HCatLoader();
> b = foreach a generate name, age, ds;
> store b into 'pig_checkin_7' using org.apache.hcatalog.pig.HCatStorer();\
> ,'result_table' => 'pig_checkin_7'
> ,'sql' => "select name, age, ds from studentparttab30k;"
> ,'floatpostprocess' => 1
> ,'delimiter' => ' '
> }
> {code}
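
The failure mode described in the issue can be reproduced outside Hadoop with a small local-filesystem simulation. The sketch below is not HCatalog code and not the committed patch: the function names (make_layout, move_buggy, move_checked) and the partition value "ds=1" are hypothetical, and Python's os.rename stands in for the Hadoop FileSystem rename. It only illustrates why inferring every child's type from the first child fails when a leaf directory holds an empty _temporary directory next to part files.

```python
import os
import tempfile


def make_layout(root):
    """Build a leaf partition dir holding an empty _temporary dir and one part file."""
    leaf = os.path.join(root, "ds=1")  # hypothetical partition value
    os.makedirs(os.path.join(leaf, "_temporary"))
    with open(os.path.join(leaf, "part-00000"), "w") as f:
        f.write("alice\t20\n")
    return root


def move_buggy(src, dst):
    """Flawed shortcut: only mkdir the target when the FIRST child is a file."""
    if os.path.isfile(src):
        os.rename(src, dst)  # assumes dst's parent directory already exists
        return
    children = sorted(os.listdir(src))  # "_temporary" sorts before "part-00000"
    if children and os.path.isfile(os.path.join(src, children[0])):
        os.makedirs(dst, exist_ok=True)  # treated as a leaf directory
    for name in children:
        move_buggy(os.path.join(src, name), os.path.join(dst, name))


def move_checked(src, dst):
    """Check each child's type; mkdir the target once, on the first file seen."""
    made_parent = False
    for name in sorted(os.listdir(src)):
        s, d = os.path.join(src, name), os.path.join(dst, name)
        if os.path.isdir(s):
            move_checked(s, d)
        else:
            if not made_parent:
                os.makedirs(dst, exist_ok=True)
                made_parent = True
            os.rename(s, d)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = make_layout(os.path.join(tmp, "src"))
        try:
            move_buggy(src, os.path.join(tmp, "dst_buggy"))
        except FileNotFoundError as e:
            print("buggy move failed:", e)
        move_checked(src, os.path.join(tmp, "dst_ok"))
        print(os.listdir(os.path.join(tmp, "dst_ok", "ds=1")))
```

In this sketch move_checked still issues at most one mkdir per directory, so it keeps most of the saving the optimization was after while no longer depending on the order or type of the first child. Whether the committed patch takes the same approach is not stated in this message.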