Denys Kuzmenko created HIVE-29567:
-------------------------------------
Summary: Data loss bug in query-based major compaction
Key: HIVE-29567
URL: https://issues.apache.org/jira/browse/HIVE-29567
Project: Hive
Issue Type: Bug
Reporter: Denys Kuzmenko
Root Cause
HIVE-28238 introduced a data loss bug in query-based major compaction. The
compaction INSERT query reads 0 rows because it treats the empty base
directory created by the preceding CREATE TABLE step as a valid (committed)
base, ignoring all delta directories.
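The effect described above can be illustrated with a minimal sketch. This is a simplified model, not Hive's actual directory-resolution code: the class AcidDirSketch, its methods, the shortened directory names, and the use of a plain set of committed transaction IDs as the reader's snapshot are all assumptions made for illustration. It shows how a base directory that is wrongly considered committed hides every delta at or below its write ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of ACID directory selection (not Hive's real code).
// Directory names are shortened to "base_<writeId>_v<txnId>" and "delta_<min>_<max>".
class AcidDirSketch {

    // Write ID up to which the base claims to contain all data.
    static long baseWriteId(String dir) {
        return Long.parseLong(dir.split("_")[1]);
    }

    // ID of the transaction that created the base (the "v" suffix).
    static long baseVisibilityTxn(String dir) {
        return Long.parseLong(dir.split("_")[2].substring(1));
    }

    // Highest write ID covered by a delta directory.
    static long deltaMaxWriteId(String dir) {
        return Long.parseLong(dir.split("_")[2]);
    }

    // Returns the directories a reader would scan, given the set of
    // transactions its snapshot considers committed (an assumed stand-in
    // for the valid transaction list).
    static List<String> dirsToRead(List<String> dirs, Set<Long> committedTxns) {
        String bestBase = null;
        for (String d : dirs) {
            boolean usable = d.startsWith("base_")
                    && committedTxns.contains(baseVisibilityTxn(d));
            if (usable && (bestBase == null || baseWriteId(d) > baseWriteId(bestBase))) {
                bestBase = d;
            }
        }
        List<String> result = new ArrayList<>();
        if (bestBase != null) {
            result.add(bestBase);
        }
        for (String d : dirs) {
            // A delta is only read if it is newer than the chosen base.
            if (d.startsWith("delta_")
                    && (bestBase == null || deltaMaxWriteId(d) > baseWriteId(bestBase))) {
                result.add(d);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The base is empty: it was just created by the compaction's own transaction (10).
        List<String> dirs = List.of(
                "delta_0000001_0000001",
                "delta_0000002_0000002",
                "base_0000002_v0000010");

        // Snapshot where transaction 10 is still open: both deltas are read.
        System.out.println(dirsToRead(dirs, Set.of(1L, 2L)));

        // Snapshot that wrongly treats transaction 10 as committed:
        // only the empty base is read, so the query sees 0 rows.
        System.out.println(dirsToRead(dirs, Set.of(1L, 2L, 10L)));
    }
}
```

In this model the second call returns only the empty base, which corresponds to the compaction INSERT reading 0 rows and writing an empty result over the real data.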
Before HIVE-28238, the incorrect VALID_TXNS_KEY value set by
AcidCompactionService was never actually used for data reads. Each compaction
subquery opened its own implicit transaction through the Driver, and this
sequence of events masked the bug.