-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/
-----------------------------------------------------------

Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21917
    https://issues.apache.org/jira/browse/HIVE-21917


Repository: hive-git


Description
-------

The Initiator thread in the metastore repeatedly loops over entries in the 
COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
need to be compacted. However, entries are never removed from this table except 
by a completed Compactor run.

In a cluster where most tables / partitions are write-once read-many, this 
results in stale entries in this table never being cleaned up. In a small test 
cluster, we have observed approximately 45k entries in this table (virtually 
equal to the number of partitions in the cluster) while < 100 of these tables 
have delta files at all. Since most of the tables will never get enough writes 
to trigger a compaction (and in fact have only ever been written to once), the 
initiator thread keeps trying to evaluate them on every loop.


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 610cf05204 
  
ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java 
b28b57779b 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
 8253ccb9c9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 6281208247 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
 e840758c9d 


Diff: https://reviews.apache.org/r/71792/diff/1/


Testing
-------

Unit tests


Thanks,

Denys Kuzmenko

Reply via email to