[
https://issues.apache.org/jira/browse/HIVE-29567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denys Kuzmenko updated HIVE-29567:
----------------------------------
Status: Patch Available (was: Open)
> Data loss bug in query-based major compaction
> ---------------------------------------------
>
> Key: HIVE-29567
> URL: https://issues.apache.org/jira/browse/HIVE-29567
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 4.3.0
> Reporter: Denys Kuzmenko
> Assignee: Denys Kuzmenko
> Priority: Major
>
> *Root Cause*
>
> HIVE-28238 introduced a data loss bug in query-based major compaction. The
> compaction INSERT query reads 0 rows because it treats the empty base
> directory created by the preceding CREATE TABLE step as a valid (committed)
> base, ignoring all delta directories.
> Before HIVE-28238, the *buggy* VALID_TXNS_KEY set by AcidCompactionService
> was never actually used for data reads. Each compaction subquery *opened its
> own implicit transaction* through the Driver, and the compaction chain of
> events masked the bug.
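
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Hive's actual directory-selection code: the class BaseSelectionSketch, the Dir type, the visibleRows method, the trustUncommittedBase flag, the row counts, and the directory names are all hypothetical and exist only for illustration. It assumes a reader picks the newest base it considers valid and then adds only the deltas with a higher write ID.

```java
import java.util.Arrays;
import java.util.List;

public class BaseSelectionSketch {

    // Hypothetical, simplified stand-in for one ACID directory of a table.
    static final class Dir {
        final String name;
        final boolean isBase;
        final long writeId;      // highest write ID covered by this directory
        final boolean committed; // has the writing transaction committed?
        final int rows;          // rows stored in this directory

        Dir(String name, boolean isBase, long writeId, boolean committed, int rows) {
            this.name = name;
            this.isBase = isBase;
            this.writeId = writeId;
            this.committed = committed;
            this.rows = rows;
        }
    }

    // Counts the rows a reader would see: choose the newest base the snapshot
    // accepts, then add every committed delta with a write ID above that base.
    // trustUncommittedBase models a snapshot that wrongly accepts an
    // uncommitted base as valid.
    static int visibleRows(List<Dir> dirs, boolean trustUncommittedBase) {
        Dir best = null;
        for (Dir d : dirs) {
            boolean accepted = d.committed || trustUncommittedBase;
            if (d.isBase && accepted && (best == null || d.writeId > best.writeId)) {
                best = d;
            }
        }
        long floor = (best == null) ? 0L : best.writeId;
        int rows = (best == null) ? 0 : best.rows;
        for (Dir d : dirs) {
            if (!d.isBase && d.committed && d.writeId > floor) {
                rows += d.rows;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Dir> dirs = Arrays.asList(
            new Dir("delta_0000001_0000001", false, 1, true, 10),
            new Dir("delta_0000002_0000002", false, 2, true, 5),
            // Empty base created ahead of the compaction INSERT, not yet committed.
            new Dir("base_0000002", true, 2, false, 0));

        System.out.println(visibleRows(dirs, false)); // prints 15
        System.out.println(visibleRows(dirs, true));  // prints 0
    }
}
```

In this model, accepting the empty base hides both deltas, so the read returns 0 rows, which matches the symptom reported for the compaction INSERT query.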
--
This message was sent by Atlassian Jira
(v8.20.10#820010)