[jira] [Updated] (HIVE-22639) Bucket file name does not match bucket id after query based major compaction

Aron Hamvas (Jira) Thu, 02 Jan 2020 05:15:36 -0800


     [ 
https://issues.apache.org/jira/browse/HIVE-22639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Aron Hamvas updated HIVE-22639:
-------------------------------
    Attachment: HIVE-22639.2.patch
        Status: Patch Available  (was: Open)

> Bucket file name does not match bucket id after query based major compaction
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-22639
>                 URL: https://issues.apache.org/jira/browse/HIVE-22639
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0, 3.0.0
>            Reporter: Aron Hamvas
>            Assignee: Aron Hamvas
>            Priority: Major
>         Attachments: HIVE-22639.1.patch, HIVE-22639.2.patch, HIVE-22639.patch
>
>
> While debugging 
> {{TestCrudCompactorOnTez#testCompactionWithSchemaEvolutionAndBuckets()}}, I 
> noticed that although the single bucket file in the delta directories is 
> named {{bucket_00001}} before compaction, the single bucket file in the new 
> base is named {{bucket_00000}}. At the same time, the bucket value in the 
> ROW__ID of the records remains the same and indicates that the bucket id is 1. 
> So the bucket id and the file name do not match. This could lead to problems.
> The test itself does not reveal this issue, although I think the tests 
> should check for it, too. At the same time, the tests assert an exact bucket 
> id value in cases where it cannot be predicted, and so they fail even though 
> the bucket id does not change after the compaction and the check should 
> really pass.
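
The consistency check described above could be sketched roughly as follows. This is an illustrative sketch only, not code from the attached patches: the class name BucketNameCheck and its methods are hypothetical, and the bit layout (bucket id in bits 16 to 27 of the encoded bucket value in ROW__ID, codec version in the top three bits) is an assumption based on Hive's V1 bucket encoding. It compares the bucket id parsed from a file name with the one decoded from a row's bucket value, without hard-coding any expected id.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Hive.
final class BucketNameCheck {
    // Assumption: bits 16..27 of the encoded bucket value hold the bucket id.
    private static final int BUCKET_ID_MASK = 0x0FFF0000;
    private static final Pattern BUCKET_FILE = Pattern.compile("bucket_(\\d+)");

    // Decode the bucket id from the encoded bucket value stored in ROW__ID.
    static int bucketIdFromProperty(int bucketProperty) {
        return (bucketProperty & BUCKET_ID_MASK) >>> 16;
    }

    // Parse the bucket id from a file name such as "bucket_00001".
    static int bucketIdFromFileName(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bucket file name: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // True when the file name and the row's encoded bucket value agree.
    static boolean matches(String fileName, int bucketProperty) {
        return bucketIdFromFileName(fileName) == bucketIdFromProperty(bucketProperty);
    }

    public static void main(String[] args) {
        // Assumed encoding of codec version 1 with bucket id 1.
        int bucketProperty = (1 << 29) | (1 << 16);
        System.out.println(matches("bucket_00001", bucketProperty)); // file in the deltas
        System.out.println(matches("bucket_00000", bucketProperty)); // file in the new base
    }
}
```

A test built along these lines would compare the two ids for every bucket file before and after compaction, rather than asserting a fixed bucket id value.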



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
