-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/
-----------------------------------------------------------
(Updated Nov. 21, 2019, 5:35 p.m.)
Review request for hive, Laszlo Pinter and Peter Vary.
Bugs: HIVE-21917
https://issues.apache.org/jira/browse/HIVE-21917
Repository: hive-git
Description
-------
The Initiator thread in the metastore repeatedly loops over entries in the
COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might
need to be compacted. However, entries are never removed from this table except
by a completed Compactor run.
In a cluster where most tables / partitions are write-once read-many, stale
entries in this table are therefore never cleaned up. In a small test
cluster, we observed approximately 45k entries in this table (virtually
equal to the number of partitions in the cluster), while fewer than 100 of
these tables have delta files at all. Since most of the tables will never get
enough writes to trigger a compaction (and in fact have only ever been written
to once), the Initiator thread re-evaluates them on every loop.
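To make the cost concrete, here is a minimal, self-contained model of the behaviour described above. It is not the Hive code and not the change under review: the class InitiatorLoopSketch, the Entry type, the DELTA_THRESHOLD value and the entry counts are all hypothetical and exist only for illustration. The sketch assumes an entry leaves the table only when it qualifies for a compaction, which is how the description characterises the current state.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class InitiatorLoopSketch {

    // Hypothetical stand-in for one COMPLETED_TXN_COMPONENTS row.
    static final class Entry {
        final String partition;
        final int deltaCount; // delta files written to this partition so far

        Entry(String partition, int deltaCount) {
            this.partition = partition;
            this.deltaCount = deltaCount;
        }
    }

    // Assumed threshold for illustration; not read from any Hive setting.
    static final int DELTA_THRESHOLD = 10;

    /**
     * Models one pass of the loop: every entry is evaluated, and an entry is
     * removed only when it qualifies for a (modelled) completed compaction.
     * Returns the number of entries evaluated in this pass.
     */
    static int runOnePass(List<Entry> table) {
        int evaluated = 0;
        Iterator<Entry> it = table.iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            evaluated++;
            if (e.deltaCount >= DELTA_THRESHOLD) {
                it.remove(); // the only path that ever deletes an entry
            }
        }
        return evaluated;
    }

    public static void main(String[] args) {
        List<Entry> table = new ArrayList<>();
        // Write-once partitions: one delta each, never reaching the threshold.
        for (int i = 0; i < 1000; i++) {
            table.add(new Entry("p" + i, 1));
        }
        // A single frequently written partition that does get compacted.
        table.add(new Entry("hot", 12));

        for (int pass = 1; pass <= 3; pass++) {
            System.out.println("pass " + pass + " evaluated " + runOnePass(table));
        }
    }
}
```

In this model the first pass evaluates 1001 entries and every later pass still evaluates 1000, because the write-once entries never meet the removal condition.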
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 610cf05204
ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java b28b57779b
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java 8253ccb9c9
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 268038795b
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java e840758c9d
Diff: https://reviews.apache.org/r/71792/diff/2/
Changes: https://reviews.apache.org/r/71792/diff/1-2/
Testing
-------
Unit tests
Thanks,
Denys Kuzmenko