Csaba Ringhofer created HIVE-21931:
--------------------------------------
Summary: Slow compaction for tiny tables
Key: HIVE-21931
URL: https://issues.apache.org/jira/browse/HIVE-21931
Project: Hive
Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Csaba Ringhofer
I observed the issue in an Impala development environment when running (major)
compaction on insert_only transactional tables in Hive. The compaction could
take ~10 minutes even when it only had to merge 2 rows from 2 inserts. The
actual work was done much earlier: the new base file was correctly written to
HDFS, after which Hive seemed to wait without doing any work.
The compactions are started manually; hive.compactor.initiator.on=false is set
to avoid "surprise compactions" during tests. The compactor settings in use:
{code}
hive.compactor.abortedtxn.threshold=1000
hive.compactor.check.interval=300s
hive.compactor.cleaner.run.interval=5000ms
hive.compactor.compact.insert.only=true
hive.compactor.crud.query.based=false
hive.compactor.delta.num.threshold=10
hive.compactor.delta.pct.threshold=0.1
hive.compactor.history.reaper.interval=2m
hive.compactor.history.retention.attempted=2
hive.compactor.history.retention.failed=3
hive.compactor.history.retention.succeeded=3
hive.compactor.initiator.failed.compacts.threshold=2
hive.compactor.initiator.on=false
hive.compactor.max.num.delta=500
hive.compactor.worker.threads=4
hive.compactor.worker.timeout=86400s
{code}
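For reference, a minimal sketch of how the scenario described above might be reproduced. The table name tiny_insert_only and its single INT column are hypothetical and not taken from the original environment; the statements only illustrate two single-row inserts followed by a manually requested major compaction.
{code:sql}
-- Hypothetical table; any insert_only transactional table should behave the same way.
CREATE TABLE tiny_insert_only (i INT)
  TBLPROPERTIES ('transactional'='true',
                 'transactional_properties'='insert_only');

-- Two separate inserts, so there are two deltas to merge.
INSERT INTO tiny_insert_only VALUES (1);
INSERT INTO tiny_insert_only VALUES (2);

-- Request the major compaction manually (the initiator is turned off).
ALTER TABLE tiny_insert_only COMPACT 'major';

-- Poll the compaction queue to see how long the request stays unfinished.
SHOW COMPACTIONS;
{code}
Comparing the state reported by SHOW COMPACTIONS with the time at which the new base directory appears in HDFS should show how much of the elapsed time is spent waiting rather than merging.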