Hi Alan,

Thanks for your help. I set hive.compactor.initiator.on (= true) and hive.compactor.worker.threads (= 2) in hive-site.xml. After changing the configuration, I started Hive for its first run. What do you mean by "When you say you set hive.compactor.initiator.on (=true I hope) and hive.compactor.worker.threads, did you do that in your metastore process?" Do I need to configure something somewhere else?
To verify the compaction feature, I first executed an alter table t1_txn compact 'major' command. The request is enqueued, but its state is always initiated, even after I restart Hive. How do I make the request execute?

Then I set hive.compactor.delta.num.threshold (= 2) in hive-site.xml. There is supposed to be a minor compaction after two update or delete operations, but after 5 UPDATEs the compaction had not happened. The show compactions command lists only the request from the earlier alter table command.

Besides, I set hive.compactor.delta.pct.threshold (= 0.01). According to the documentation, it specifies the percentage (fractional) size of the delta files relative to the base that will trigger a major compaction. Since the base does not exist in the beginning, how does the system know when to trigger a major compaction?

So, my question is: how do I make compaction work? Is there a tutorial or other help?

Following is my hive-site.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <property>
    <name>hive.txn.timeout</name>
    <value>1000</value>
  </property>
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>2</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
  </property>
  <property>
    <name>hive.in.test</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.delta.num.threshold</name>
    <value>2</value>
  </property>
  <property>
    <name>hive.compactor.delta.pct.threshold</name>
    <value>0.01</value>
  </property>
</configuration>

At 2014-12-03 09:59:34, "Alan Gates"
<ga...@hortonworks.com> wrote:

The base directories will only exist after compaction has run. When you say you set hive.compactor.initiator.on (=true I hope) and hive.compactor.worker.threads, did you do that in your metastore process? If so, did you restart the metastore after changing the config values?

Alan.

vic0777 December 1, 2014 at 23:12

Hi All,

I am trying to use the new transaction feature in Hive 0.14. According to its documentation, every transactional table has a base directory and one delta directory per transaction in HDFS for data storage. But I cannot find the base directory in HDFS; there are only delta directories. These are the commands I used:

create table test_txn (id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
insert into table test_txn select * from test_text;
update test_txn set name="liu" where id = 10;

P.S. I have configured the parameters required by the transaction feature: hive.support.concurrency, hive.enforce.bucketing, hive.exec.dynamic.partition.mode, hive.txn.manager, hive.compactor.initiator.on, and hive.compactor.worker.threads.

Although I cannot find the base directory in HDFS, all SELECT, UPDATE and DELETE statements work fine and the data in the table is correct. I am wondering where the base directory is. Any help is appreciated.

Thanks,
Wantao
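For anyone following along, the statements discussed in this thread can be put together as a rough HiveQL sketch. It reuses the table and statements from the original post (test_txn, loaded from test_text). SHOW TRANSACTIONS is an addition that does not appear in the thread. The sketch assumes the compactor settings Alan asks about are visible to the metastore process, and it only shows how to request and observe a compaction; it does not by itself make a stuck request run.

```sql
-- Table from the original post: bucketed, ORC, transactional.
CREATE TABLE test_txn (id INT, name STRING)
CLUSTERED BY (id) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Writes land in delta directories; per Alan's reply, a base
-- directory only exists after compaction has run.
INSERT INTO TABLE test_txn SELECT * FROM test_text;
UPDATE test_txn SET name = 'liu' WHERE id = 10;

-- Queue a major compaction by hand (same form as the command in the thread).
ALTER TABLE test_txn COMPACT 'major';

-- List compaction requests and their current state.
SHOW COMPACTIONS;

-- Not in the thread: lists open and aborted transactions.
SHOW TRANSACTIONS;
```

If the request stays in the initiated state, as reported above, the questions in Alan's reply are the first things to check: whether the compactor settings were made for the metastore process, and whether the metastore was restarted afterwards.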