Hi Alan,

Thanks for your help. I set hive.compactor.initiator.on (=true) and 
hive.compactor.worker.threads (=2) in hive-site.xml, and then started Hive 
for its first run. What do you mean by "When you say you set 
hive.compactor.initiator.on (=true I hope) and hive.compactor.worker.threads, 
did you do that in your metastore process?" Do I need to configure something 
somewhere else? 

To verify the compaction feature, I first executed an alter table t1_txn 
compact 'major' command. The request is enqueued, but its state stays 
"initiated" even after I restart Hive. How do I get the request to execute? 

Then I set hive.compactor.delta.num.threshold (=2) in hive-site.xml. There is 
supposed to be a minor compaction after two UPDATE or DELETE operations, but 
after 5 UPDATEs no compaction had happened. The show compactions command lists 
only the request from the earlier alter table command. 

I also set hive.compactor.delta.pct.threshold (=0.01). According to the 
documentation, it specifies the fractional size of the delta files relative to 
the base that will trigger a major compaction. Since the base does not exist at 
the beginning, how does the system know when to trigger a major compaction? 

So my question is: how do I make compaction work? Is there a tutorial or other 
help available?
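
For reference, these are the two compaction statements mentioned above, written 
out in full. This is only a sketch of what I ran; t1_txn is my test table.

```sql
-- Manually request a major compaction of the transactional table.
ALTER TABLE t1_txn COMPACT 'major';

-- List the compaction requests and their current state.
SHOW COMPACTIONS;
```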

Following is my hive-site.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <property>
    <name>hive.txn.timeout</name>
    <value>1000</value>
  </property>
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>2</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
  </property>
  <property>
    <name>hive.in.test</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.delta.num.threshold</name>
    <value>2</value>
  </property>
  <property>
    <name>hive.compactor.delta.pct.threshold</name>
    <value>0.01</value>
  </property>
</configuration>







At 2014-12-03 09:59:34, "Alan Gates" <ga...@hortonworks.com> wrote:
The base directories will only exist after compaction has run.  When you say 
you set hive.compactor.initiator.on (=true I hope) and 
hive.compactor.worker.threads, did you do that in your metastore process?  If 
so, did you restart the metastore after changing the config values?

Alan.


vic0777
December 1, 2014 at 23:12
Hi All,

I am trying to use the new transaction feature in Hive 0.14. According to its 
documentation, every transactional table has a base directory and one delta 
directory per transaction in HDFS for data storage. But I cannot find the base 
directory in HDFS; there are only delta directories. These are the commands I 
used:

create table test_txn (id int ,name string ) clustered by (id) into 2 buckets 
stored as orc TBLPROPERTIES('transactional'='true');
insert into table test_txn select * from test_text;
update test_txn set name="liu" where id = 10;

P.S. I have configured the parameters required by the transaction feature:
  hive.support.concurrency,
  hive.enforce.bucketing,
  hive.exec.dynamic.partition.mode,
  hive.txn.manager,
  hive.compactor.initiator.on,
  hive.compactor.worker.threads.

Although I cannot find the base directory in HDFS, all SELECT, UPDATE and 
DELETE statements work fine and the data in the table is correct. I am 
wondering where the base directory is.

Any help is appreciated.

Thanks,
Wantao








