Delta files that are no longer needed are deleted asynchronously. For example, a query may be reading delta_0000002_0000002 when a minor compaction runs concurrently and creates delta_0000001_0000003. The compaction leaves delta_0000001_0000001, delta_0000002_0000002 and delta_0000003_0000003 in place to be cleaned up later. A query that starts after the compaction will use delta_0000001_0000003 and ignore the three original deltas, so it has fewer files to read and merge. The original deltas are deleted once the system determines that no query can still be using them.
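To make the selection rule concrete, here is a rough Python sketch of how a reader could decide which delta directories to use and which are already covered by a wider, compacted delta. It is only an illustration of the idea described above, not Hive's actual implementation; the function names parse_delta and split_deltas are made up for this example, and it ignores base directories and open transactions entirely.

```python
import re

# Delta directories are named delta_<minTxn>_<maxTxn>.
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def parse_delta(name):
    """Return (min_txn, max_txn) for a delta directory name."""
    match = DELTA_RE.match(name)
    if match is None:
        raise ValueError(f"not a delta directory: {name!r}")
    return int(match.group(1)), int(match.group(2))


def split_deltas(names):
    """Split delta names into (used, obsolete) for a newly started reader.

    Simplified assumption: a delta is obsolete when an already selected
    delta covers its whole transaction range.
    """
    # Sort by min txn ascending, then max txn descending, so that the
    # widest delta starting at a given txn is considered first.
    parsed = sorted(
        ((*parse_delta(n), n) for n in names),
        key=lambda d: (d[0], -d[1]),
    )
    used, obsolete = [], []
    covered_up_to = 0
    for _min_txn, max_txn, name in parsed:
        if max_txn > covered_up_to:
            used.append(name)
            covered_up_to = max_txn
        else:
            obsolete.append(name)
    return used, obsolete


if __name__ == "__main__":
    listing = [
        "delta_0000001_0000001",
        "delta_0000002_0000002",
        "delta_0000003_0000003",
        "delta_0000001_0000003",  # written by a minor compaction
    ]
    used, obsolete = split_deltas(listing)
    print("read:", used)
    print("waiting for cleanup:", obsolete)
```

With the four directories from the example, only delta_0000001_0000003 ends up in the list to read, and the three single-transaction deltas are the ones waiting for the cleaner. A listing that contains only single-transaction deltas puts every one of them in the list to read.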
Judging by the directory listing you sent, no major or minor compactions have run.

From: "r7raul1...@163.com" <r7raul1...@163.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, June 11, 2015 at 12:53 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Re: delta file compact take no effect

SHOW COMPACTIONS;

I can see some info:

Database  Table       Partition  Type   State      Worker  Start Time
default   u_data_txn  NULL       MAJOR  initiated  NULL    0
Time taken: 0.024 seconds, Fetched: 2 row(s)

But after that I still see many delta files.
________________________________
r7raul1...@163.com

From: Elliot West <tea...@gmail.com>
Date: 2015-06-11 15:25
To: user@hive.apache.org
Subject: Re: delta file compact take no effect

What do you see if you issue:

SHOW COMPACTIONS;

On Thursday, 11 June 2015, r7raul1...@163.com <r7raul1...@163.com> wrote:

I use hive 1.1.0 on hadoop 2.5.0.
After I do some update operation on table u_data_txn.
My table creates many delta files like:

drwxr-xr-x   - hdfs hive            0 2015-02-06 22:52 /user/hive/warehouse/u_data_txn/delta_0000001_0000001
-rw-r--r--   3 hdfs supergroup 346453 2015-02-06 22:52 /user/hive/warehouse/u_data_txn/delta_0000001_0000001/bucket_00000
-rw-r--r--   3 hdfs supergroup 415924 2015-02-06 22:52 /user/hive/warehouse/u_data_txn/delta_0000001_0000001/bucket_00001
drwxr-xr-x   - hdfs hive            0 2015-02-06 22:58 /user/hive/warehouse/u_data_txn/delta_0000002_0000002
-rw-r--r--   3 hdfs supergroup    807 2015-02-06 22:58 /user/hive/warehouse/u_data_txn/delta_0000002_0000002/bucket_00000
-rw-r--r--   3 hdfs supergroup    779 2015-02-06 22:58 /user/hive/warehouse/u_data_txn/delta_0000002_0000002/bucket_00001
drwxr-xr-x   - hdfs hive            0 2015-02-06 22:59 /user/hive/warehouse/u_data_txn/delta_0000003_0000003
-rw-r--r--   3 hdfs supergroup    817 2015-02-06 22:59 /user/hive/warehouse/u_data_txn/delta_0000003_0000003/bucket_00000
-rw-r--r--   3 hdfs supergroup    767 2015-02-06 22:59 /user/hive/warehouse/u_data_txn/delta_0000003_0000003/bucket_00001
drwxr-xr-x   - hdfs hive            0 2015-02-06 23:01 /user/hive/warehouse/u_data_txn/delta_0000004_0000004
-rw-r--r--   3 hdfs supergroup    817 2015-02-06 23:01 /user/hive/warehouse/u_data_txn/delta_0000004_0000004/bucket_00000
-rw-r--r--   3 hdfs supergroup    779 2015-02-06 23:01 /user/hive/warehouse/u_data_txn/delta_0000004_0000004/bucket_00001
drwxr-xr-x   - hdfs hive            0 2015-02-06 23:03 /user/hive/warehouse/u_data_txn/delta_0000005_0000005
-rw-r--r--   3 hdfs supergroup    817 2015-02-06 23:03 /user/hive/warehouse/u_data_txn/delta_0000005_0000005/bucket_00000
-rw-r--r--   3 hdfs supergroup    779 2015-02-06 23:03 /user/hive/warehouse/u_data_txn/delta_0000005_0000005/bucket_00001
drwxr-xr-x   - hdfs hive            0 2015-02-10 21:34 /user/hive/warehouse/u_data_txn/delta_0000006_0000006
-rw-r--r--   3 hdfs supergroup    821 2015-02-10 21:34 /user/hive/warehouse/u_data_txn/delta_0000006_0000006/bucket_00000
drwxr-xr-x   - hdfs hive            0 2015-02-10 21:35 /user/hive/warehouse/u_data_txn/delta_0000007_0000007
-rw-r--r--   3 hdfs supergroup    821 2015-02-10 21:35 /user/hive/warehouse/u_data_txn/delta_0000007_0000007/bucket_00000
drwxr-xr-x   - hdfs hive            0 2015-03-24 01:16 /user/hive/warehouse/u_data_txn/delta_0000008_0000008
-rw-r--r--   3 hdfs supergroup   1670 2015-03-24 01:16 /user/hive/warehouse/u_data_txn/delta_0000008_0000008/bucket_00000
-rw-r--r--   3 hdfs supergroup   1767 2015-03-24 01:16 /user/hive/warehouse/u_data_txn/delta_0000008_0000008/bucket_00001

I tried:

ALTER TABLE u_data_txn COMPACT 'MAJOR';

The deltas still exist. Then I tried:

ALTER TABLE u_data_txn COMPACT 'MINOR';

The deltas still exist. How do I merge the delta files?

My config is:

<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exe.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>4</value>
</property>
________________________________
r7raul1...@163.com