[ https://issues.apache.org/jira/browse/HIVE-22639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aron Hamvas updated HIVE-22639:
-------------------------------
    Attachment: HIVE-22639.2.patch
        Status: Patch Available  (was: Open)

> Bucket file name does not match bucket id after query based major compaction
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-22639
>                 URL: https://issues.apache.org/jira/browse/HIVE-22639
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0, 3.0.0
>            Reporter: Aron Hamvas
>            Assignee: Aron Hamvas
>            Priority: Major
>         Attachments: HIVE-22639.1.patch, HIVE-22639.2.patch, HIVE-22639.patch
>
>
> While debugging
> {{TestCrudCompactorOnTez#testCompactionWithSchemaEvolutionAndBuckets()}}, I
> noticed that before compaction the single bucket file in the delta
> directories is named {{bucket_00001}}, but in the new base the single bucket
> file is named {{bucket_00000}}. Meanwhile, the bucket value in the ROW__ID of
> the records remains the same and indicates that the bucket id is 1.
> So the bucket id and the file name do not match. This could lead to problems.
> The test itself does not reveal this issue, although I think the tests
> should check this, too. At the same time, the tests assume an exact bucket
> id value in cases where it cannot be predicted, and they fail even though the
> bucket id does not change after compaction, so the check should really
> pass.
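
The mismatch described above can be checked mechanically. The following is a minimal sketch, not code from the attached patches: `BucketNameCheck` and its methods are hypothetical names, and the bit layout is an assumption based on version 1 of Hive's `BucketCodec` encoding (bucket id stored in bits 16 to 27 of the ROW__ID bucket field), which should be verified against the Hive version in use.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: compares the bucket id implied by a
// file name such as "bucket_00001" with the bucket id packed into ROW__ID.
class BucketNameCheck {
    private static final Pattern BUCKET_FILE = Pattern.compile("bucket_(\\d+)");

    // Parses the numeric suffix of a bucket file name.
    static int bucketIdFromFileName(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bucket file name: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // ASSUMPTION: version 1 encoding, where bits 16..27 hold the bucket id.
    static int bucketIdFromRowId(int bucketProperty) {
        return (bucketProperty >>> 16) & 0xFFF;
    }

    // True when the file name and the ROW__ID bucket field agree.
    static boolean matches(String fileName, int bucketProperty) {
        return bucketIdFromFileName(fileName) == bucketIdFromRowId(bucketProperty);
    }
}
```

Under that assumed encoding, a ROW__ID bucket value of 536936448 decodes to bucket id 1, so `BucketNameCheck.matches("bucket_00000", 536936448)` returns false, which is the situation this issue reports for the new base, while `BucketNameCheck.matches("bucket_00001", 536936448)` returns true. A test could apply this check to every file in the compacted base instead of asserting a fixed bucket id.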