Szehon Ho created HIVE-6209:
-------------------------------
Summary: 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite
current data
Key: HIVE-6209
URL: https://issues.apache.org/jira/browse/HIVE-6209
Project: Hive
Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
Fix For: 0.13.0
In case where user loads data into table using overwrite, using a different
file, it is not being overwritten.
{code}
$ hdfs dfs -cat /tmp/data
aaa
bbb
ccc
$ hdfs dfs -cat /tmp/data2
ddd
eee
fff
$ hive
hive> create table test (id string);
hive> load data inpath '/tmp/data' overwrite into table test;
hive> select * from test;
aaa
bbb
ccc
hive> load data inpath '/tmp/data2' overwrite into table test;
hive> select * from test;
aaa
bbb
ccc
ddd
eee
fff
{code}
It seems it is broken by HIVE-3756 which added another condition to whether
"rmr" should be run on old directory, and skips in this case.
There is a workaround of set fs.hdfs.impl.disable.cache=true;
which sabotages this condition, but this condition should be removed in
long-term.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)