[ https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094248#comment-16094248 ]
Lefty Leverenz commented on HIVE-15880: --------------------------------------- Good docs Vihang, thanks. I added some links from auto.purge in the list of table properties: * [DDL -- TBLPROPERTIES | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties] > Allow insert overwrite and truncate table query to use auto.purge table > property > -------------------------------------------------------------------------------- > > Key: HIVE-15880 > URL: https://issues.apache.org/jira/browse/HIVE-15880 > Project: Hive > Issue Type: Improvement > Reporter: Vihang Karajgaonkar > Assignee: Vihang Karajgaonkar > Fix For: 2.3.0, 3.0.0 > > Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, > HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, > HIVE-15880.06.patch > > > It seems inconsistent that auto.purge property is not considered when we do a > INSERT OVERWRITE while it is when we do a DROP TABLE > Drop table doesn't move table data to Trash when auto.purge is set to true > {noformat} > > create table temp(col1 string, col2 string); > No rows affected (0.064 seconds) > > alter table temp set tblproperties('auto.purge'='true'); > No rows affected (0.083 seconds) > > insert into temp values ('test', 'test'), ('test2', 'test2'); > No rows affected (25.473 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:03 > /user/hive/warehouse/temp/000000_0 > # > > drop table temp; > No rows affected (0.242 seconds) > # hdfs dfs -ls /user/hive/warehouse/temp > ls: `/user/hive/warehouse/temp': No such file or directory > # > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > # > {noformat} > INSERT OVERWRITE query moves the table data to Trash even when auto.purge is > set to true > {noformat} > > create table temp(col1 string, col2 string); > > alter table temp set tblproperties('auto.purge'='true'); > > insert into temp values ('test', 'test'), ('test2', 'test2'); > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 22 2017-02-09 13:07 > /user/hive/warehouse/temp/000000_0 > # > > insert overwrite table temp select * from dummy; > # hdfs dfs -ls /user/hive/warehouse/temp > Found 1 items > -rwxrwxrwt 3 hive hive 26 2017-02-09 13:08 > /user/hive/warehouse/temp/000000_0 > # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse > Found 1 items > drwx------ - hive hive 0 2017-02-09 13:08 > /user/hive/.Trash/Current/user/hive/warehouse/temp > # > {noformat} > While move operations are not very costly on HDFS it could be significant > overhead on slow FileSystems like S3. This could improve the performance of > {{INSERT OVERWRITE TABLE}} queries especially when there are large number of > partitions on tables located on S3 should the user wish to set auto.purge > property to true > Similarly {{TRUNCATE TABLE}} query on a table with {{auto.purge}} property > set true should not move the data to Trash -- This message was sent by Atlassian JIRA (v6.4.14#64029)