[ 
https://issues.apache.org/jira/browse/HIVE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745275#action_12745275
 ] 

Zheng Shao commented on HIVE-718:
---------------------------------

Let's get a consensus on this issue soon.

Atomicity is definitely wanted (either now or some time in the future). It's 
not just the move of all files need to be atomic - if the files are moved, the 
job should always return successful - otherwise most probably the client will 
try again and end up insert the data twice. It seems it's impossible to me with 
what we have.

Maybe we should not support "append" - user can always load into a new 
partition every time. Thoughts?


> Load data inpath into a new partition without overwrite does not move the file
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-718
>                 URL: https://issues.apache.org/jira/browse/HIVE-718
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Zheng Shao
>         Attachments: HIVE-718.1.patch, HIVE-718.2.patch, hive-718.txt
>
>
> The bug can be reproduced as following. Note that it only happens for 
> partitioned tables. The select after the first load returns nothing, while 
> the second returns the data correctly.
> insert.txt in the current local directory contains 3 lines: "a", "b" and "c".
> {code}
> > create table tmp_insert_test (value string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test;
> > select * from tmp_insert_test;
> a
> b
> c
> > create table tmp_insert_test_p ( value string) partitioned by (ds string) 
> > stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition 
> > (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition 
> > (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> a       2009-08-01
> b       2009-08-01
> d       2009-08-01
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to