[ https://issues.apache.org/jira/browse/HIVE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739170#action_12739170 ]

Todd Lipcon commented on HIVE-718:
----------------------------------

bq. I think it's not acceptable for a failed "insert" to corrupt the original data of the table.

Then we definitely have to move an entire directory of files in at once; otherwise an insert can partially succeed.

bq. We never have a table with sub directories (instead of files) inside. We will need some testing to make sure it actually works.

This is going to be a necessity to do non-overwrite loads into a table/partition, right?

bq. For unique name, maybe we can just prepend the job id.

This isn't always available (e.g. when running LOAD DATA from the CLI). I think we're stuck with java.util.UUID, as ugly as it may be.

I've spent the last hour or so trying to figure out any other way of generating a unique name inside a subdirectory. Because of the semantics of FileSystem.mkdirs and FileSystem.rename, I don't believe there's any way of doing this: mkdirs doesn't return false when the directory already exists, and if you rename(src, dst) and dst already exists as a directory, it will move src *inside* of dst.

> Load data inpath into a new partition without overwrite does not move the file
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-718
>                 URL: https://issues.apache.org/jira/browse/HIVE-718
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>         Attachments: HIVE-718.1.patch, HIVE-718.2.patch, hive-718.txt
>
>
> The bug can be reproduced as follows. Note that it only happens for
> partitioned tables. The select after the first load returns nothing, while
> the second returns the data correctly.
> insert.txt in the current local directory contains 3 lines: "a", "b" and "c".
> {code}
> > create table tmp_insert_test (value string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test;
> > select * from tmp_insert_test;
> a
> b
> c
> > create table tmp_insert_test_p ( value string) partitioned by (ds string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> a 2009-08-01
> b 2009-08-01
> c 2009-08-01
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
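
For illustration only, here is a minimal sketch of the idea in the comment above: stage all files of a load in a directory named with java.util.UUID, then publish that directory into the partition with a single move, so a load either shows up whole or not at all. It is not taken from any HIVE-718 patch. It uses java.nio.file on the local filesystem as a stand-in for Hadoop's FileSystem, whose mkdirs/rename semantics differ as described in the comment. The class name StagedLoad, the "load-" prefix, and the scratch directory are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.UUID;
import java.util.stream.Stream;

public class StagedLoad {

    /** Hypothetical naming scheme: a random UUID, so no job id is needed. */
    static String uniqueLoadName() {
        return "load-" + UUID.randomUUID();
    }

    /**
     * Copies every source file into a fresh staging directory, then moves the
     * whole staging directory under partitionDir in one step. Assumes that
     * scratchDir and partitionDir live on the same filesystem; otherwise the
     * atomic move is not supported and Files.move throws.
     */
    static Path loadInto(Path partitionDir, Path scratchDir, List<Path> sources)
            throws IOException {
        String name = uniqueLoadName();
        Path staging = Files.createDirectories(scratchDir).resolve(name);
        // Unlike Hadoop's mkdirs, createDirectory throws if the name is taken.
        Files.createDirectory(staging);
        for (Path src : sources) {
            Files.copy(src, staging.resolve(src.getFileName()));
        }
        Files.createDirectories(partitionDir);
        // The target name is new, so existing data in the partition is untouched.
        Path target = partitionDir.resolve(name);
        return Files.move(staging, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("staged-load-demo");
        Path src = Files.write(root.resolve("insert.txt"), List.of("a", "b", "c"));
        Path partition = root.resolve("tmp_insert_test_p").resolve("ds=2009-08-01");
        Path scratch = root.resolve("_scratch");

        // Two non-overwrite loads of the same file end up in two subdirectories.
        loadInto(partition, scratch, List.of(src));
        loadInto(partition, scratch, List.of(src));
        try (Stream<Path> entries = Files.list(partition)) {
            entries.forEach(System.out::println);
        }
    }
}
```

Note that this layout leaves subdirectories (rather than plain files) inside the partition, which is exactly the case the quoted remark says would need testing.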