[ https://issues.apache.org/jira/browse/HIVE-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt reassigned HIVE-2117:
----------------------------------
Assignee: Patrick Hunt
> insert overwrite ignoring partition location
> --------------------------------------------
>
> Key: HIVE-2117
> URL: https://issues.apache.org/jira/browse/HIVE-2117
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.7.0, 0.8.0
> Reporter: Patrick Hunt
> Assignee: Patrick Hunt
> Priority: Blocker
> Attachments: HIVE-2117_br07.patch, data.txt
>
>
> The following code works differently in 0.5.0 vs 0.7.0.
> In 0.5.0 the partition location is respected.
> However, in 0.7.0, while the initial partition is created with the specified
> location "<path>/parta", the "insert overwrite ..." results in the partition
> being written to "<path>/dt=a" (note that <path> is the same in both cases).
> {code}
> create table foo_stg (bar INT, car INT);
> load data local inpath 'data.txt' into table foo_stg;
>
> create table foo4 (bar INT, car INT) partitioned by (dt STRING) LOCATION
> '/user/hive/warehouse/foo4';
> alter table foo4 add partition (dt='a') location
> '/user/hive/warehouse/foo4/parta';
>
> from foo_stg fs insert overwrite table foo4 partition (dt='a') select *;
> {code}
> From what I can tell HIVE-1707 introduced this via a change to
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Path, String,
> Map<String, String>, boolean, boolean)
> specifically:
> {code}
> + Path partPath = new Path(tbl.getDataLocation().getPath(),
> + Warehouse.makePartPath(partSpec));
> +
> + Path newPartPath = new Path(loadPath.toUri().getScheme(), loadPath
> + .toUri().getAuthority(), partPath.toUri().getPath());
> {code}
> Reading the description on HIVE-1707, it seems that this may have been done
> purposefully. However, given that the partition location is explicitly
> specified for the partition in question, it seems like that should be honored
> (especially given that the table location has not changed).
> This difference in behavior is causing a regression in existing production
> Hive-based code. I'd like to take a stab at addressing this; any suggestions?
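
To make the difference described above concrete, here is a minimal, self-contained sketch of the two path-resolution strategies. It is not Hive code and not the attached patch: the class PartitionPathSketch and its methods are hypothetical names, paths are plain strings rather than Hadoop Path objects, and defaultPartName is a simplified stand-in for Warehouse.makePartPath that performs no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only; none of these names exist in Hive.
public class PartitionPathSketch {

    // Simplified stand-in for Warehouse.makePartPath: joins the partition
    // spec into "key=value/key=value" with no escaping of special characters.
    static String defaultPartName(Map<String, String> partSpec) {
        StringJoiner joiner = new StringJoiner("/");
        for (Map.Entry<String, String> e : partSpec.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    // Behavior reported for 0.7.0: the target directory is always derived
    // from the table location, so an explicit partition location is ignored.
    static String derivedFromTable(String tableLocation,
                                   Map<String, String> partSpec) {
        return tableLocation + "/" + defaultPartName(partSpec);
    }

    // Assumed alternative: if the partition already exists with its own
    // location, reuse it; otherwise fall back to the derived path.
    static String honoringExisting(String tableLocation,
                                   Map<String, String> partSpec,
                                   String existingPartLocation) {
        if (existingPartLocation != null) {
            return existingPartLocation;
        }
        return derivedFromTable(tableLocation, partSpec);
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("dt", "a");
        String table = "/user/hive/warehouse/foo4";
        String explicit = "/user/hive/warehouse/foo4/parta";

        System.out.println(derivedFromTable(table, spec));
        System.out.println(honoringExisting(table, spec, explicit));
        System.out.println(honoringExisting(table, spec, null));
    }
}
```

With the table and partition from the reproduction steps, the first strategy resolves to the "dt=a" directory, while the second keeps "parta" and only falls back to the derived name when no partition location has been recorded.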
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira