[ https://issues.apache.org/jira/browse/HIVE-29567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denys Kuzmenko updated HIVE-29567:
----------------------------------
Description:
*Root Cause*
HIVE-28238 introduced a data loss bug in query-based major compaction. The
compaction INSERT query reads 0 rows because it treats the empty base directory
created by the preceding CREATE TABLE step as a valid (committed) base,
ignoring all delta directories.
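
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hive's actual directory-selection code: the names BaseDir, DeltaDir and visible_rows, the write IDs, and the row counts are all hypothetical. It shows that once a base is accepted as committed, every delta at or below its write ID is skipped, so an empty base hides all existing rows.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class BaseDir:
    write_id: int  # hypothetical write ID carried by the base directory
    rows: int      # rows stored in the base (0 for a freshly created one)


@dataclass(frozen=True)
class DeltaDir:
    min_write_id: int
    max_write_id: int
    rows: int


def visible_rows(
    bases: List[BaseDir],
    deltas: List[DeltaDir],
    base_is_committed: Callable[[BaseDir], bool],
) -> int:
    """Toy snapshot read: newest committed base plus the deltas above it."""
    committed = [b for b in bases if base_is_committed(b)]
    best = max(committed, key=lambda b: b.write_id, default=None)
    if best is None:
        # No usable base: every delta contributes rows.
        return sum(d.rows for d in deltas)
    # Deltas at or below the chosen base are assumed to be covered by it.
    newer = [d for d in deltas if d.min_write_id > best.write_id]
    return best.rows + sum(d.rows for d in newer)


# Three deltas holding 6 rows in total (made-up values).
deltas = [DeltaDir(1, 1, 2), DeltaDir(2, 2, 3), DeltaDir(3, 3, 1)]
# Empty base standing in for the one left by the CREATE TABLE step.
empty_base = BaseDir(write_id=3, rows=0)

# Snapshot that wrongly treats the empty base as committed: 0 rows are read.
rows_buggy = visible_rows([empty_base], deltas, lambda b: True)
# Snapshot that does not see the empty base as committed: all 6 rows are read.
rows_expected = visible_rows([empty_base], deltas, lambda b: False)
```

In this model the only difference between the two reads is whether the snapshot considers the empty base committed, which matches the root cause stated above.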
Before HIVE-28238, the *buggy* VALID_TXNS_KEY set by AcidCompactionService was
never actually used for data reads. Each compaction subquery *opened its own
implicit transaction* through the Driver, and the compaction chain of events
masked the bug.
> Data loss bug in query-based major compaction
> ---------------------------------------------
>
> Key: HIVE-29567
> URL: https://issues.apache.org/jira/browse/HIVE-29567
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 4.3.0
> Reporter: Denys Kuzmenko
> Priority: Major