[ 
https://issues.apache.org/jira/browse/HIVE-29567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-29567:
----------------------------------
    Fix Version/s: 4.3.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> Data loss in query-based major compaction
> -----------------------------------------
>
>                 Key: HIVE-29567
>                 URL: https://issues.apache.org/jira/browse/HIVE-29567
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 4.1.0, 4.2.0
>            Reporter: Denys Kuzmenko
>            Assignee: Denys Kuzmenko
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.3.0
>
>
> *Root Cause*
>
>   HIVE-28238 introduced a data loss bug in query-based major compaction. The 
> compaction INSERT query reads 0 rows because it treats the empty base 
> directory created by the preceding CREATE TABLE step as a valid (committed) 
> base, ignoring all delta directories.
> Before HIVE-28238, the *buggy* VALID_TXNS_KEY set by AcidCompactionService 
> was never actually used for data reads. Each compaction subquery *opened its 
> own implicit transaction* through the Driver, and the compaction chain of 
> events masked the bug.
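
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hive's actual directory-selection code: the directory names are simplified (`base_2`, `delta_1_1`), the row counts are made up, and `rows_read` and `committed_bases` are hypothetical names introduced here. It assumes a reader that picks the newest base it considers committed and then only reads deltas whose write IDs lie above that base.

```python
import re
from typing import NamedTuple


class AcidDir(NamedTuple):
    """Hypothetical stand-in for one ACID directory and the rows it holds."""
    name: str
    rows: int


BASE = re.compile(r"base_(\d+)")
DELTA = re.compile(r"delta_(\d+)_(\d+)")


def rows_read(dirs, committed_bases):
    """Count rows a toy reader would see.

    The reader takes the highest base listed in `committed_bases`, then adds
    only deltas whose minimum write ID is above that base.
    """
    base_id = 0
    base_rows = 0
    for d in dirs:
        m = BASE.fullmatch(d.name)
        if m and d.name in committed_bases and int(m.group(1)) > base_id:
            base_id, base_rows = int(m.group(1)), d.rows

    delta_rows = 0
    for d in dirs:
        m = DELTA.fullmatch(d.name)
        if m and int(m.group(1)) > base_id:
            delta_rows += d.rows
    return base_rows + delta_rows


# Two deltas with data, plus the empty base created before the INSERT runs.
TABLE = [AcidDir("delta_1_1", 3), AcidDir("delta_2_2", 2), AcidDir("base_2", 0)]

# Snapshot that does not yet treat the empty base as committed: deltas are read.
correct = rows_read(TABLE, committed_bases=set())

# Snapshot that wrongly treats the empty base as committed: deltas are skipped.
buggy = rows_read(TABLE, committed_bases={"base_2"})
```

In this model `correct` counts the rows from both deltas, while `buggy` is 0 because the empty base hides every delta at or below its write ID, which matches the "reads 0 rows" symptom in the description.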



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
