[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806755#comment-16806755 ]
Denys Kuzmenko commented on HIVE-9995:
--------------------------------------

rebased

> ACID compaction tries to compact a single file
> ----------------------------------------------
>
>                 Key: HIVE-9995
>                 URL: https://issues.apache.org/jira/browse/HIVE-9995
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Denys Kuzmenko
>            Priority: Major
>         Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch,
>                      HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch,
>                      HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch,
>                      HIVE-9995.09.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle()
> since there is an open txnId=23, this doesn't have any meaningful minor
> compaction work to do. The system still tries to compact a single delta file
> for 21-22 id range, and effectively copies the file onto itself.
> This is 1. inefficient and 2. can potentially affect a reader.
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:03 /user/hive/warehouse/t/base_0000016
> -rw-r--r--   1 ekoifman staff 602 2016-06-09 16:03 /user/hive/warehouse/t/base_0000016/bucket_00000
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/base_0000017
> -rw-r--r--   1 ekoifman staff 588 2016-06-09 16:07 /user/hive/warehouse/t/base_0000017/bucket_00000
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/delta_0000017_0000017_0000
> -rw-r--r--   1 ekoifman staff 514 2016-06-09 16:06 /user/hive/warehouse/t/delta_0000017_0000017_0000/bucket_00000
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/delta_0000018_0000018_0000
> -rw-r--r--   1 ekoifman staff 612 2016-06-09 16:07 /user/hive/warehouse/t/delta_0000018_0000018_0000/bucket_00000
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/base_0000017
> -rw-r--r--   1 ekoifman staff 588 2016-06-09 16:07 /user/hive/warehouse/t/base_0000017/bucket_00000
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:11 /user/hive/warehouse/t/delta_0000018_0000018
> -rw-r--r--   1 ekoifman staff 500 2016-06-09 16:11 /user/hive/warehouse/t/delta_0000018_0000018/bucket_00000
> drwxr-xr-x   - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/delta_0000018_0000018_0000
> -rw-r--r--   1 ekoifman staff 612 2016-06-09 16:07 /user/hive/warehouse/t/delta_0000018_0000018_0000/bucket_00000
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_0000018_0000018_
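
To illustrate the condition described in this issue, here is a minimal sketch of a guard that skips minor compaction when only one delta directory is eligible. It is not taken from any of the attached patches and does not use Hive's APIs: the class MinorCompactionCheck, its methods, and the eligibility rule (a delta counts only if its whole id range is above the current base and below the lowest open txn id) are assumptions made for illustration. The actual fix may decide this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hive): decides whether a minor
// compaction would have more than one delta to merge.
final class MinorCompactionCheck {

    // Matches delta_<min>_<max> with an optional _<statementId> suffix,
    // as in the directory names listed in the issue description.
    private static final Pattern DELTA =
        Pattern.compile("delta_(\\d+)_(\\d+)(?:_(\\d+))?");

    private MinorCompactionCheck() {
    }

    // Assumed rule: a delta is eligible when its id range starts above the
    // base id and ends below the lowest open transaction id.
    static List<String> eligibleDeltas(List<String> dirNames,
                                       long baseId,
                                       long lowestOpenTxnId) {
        List<String> eligible = new ArrayList<>();
        for (String name : dirNames) {
            Matcher m = DELTA.matcher(name);
            if (!m.matches()) {
                continue; // not a delta directory (for example base_N)
            }
            long min = Long.parseLong(m.group(1));
            long max = Long.parseLong(m.group(2));
            if (min > baseId && max < lowestOpenTxnId) {
                eligible.add(name);
            }
        }
        return eligible;
    }

    // Merging a single delta only rewrites it, so require at least two.
    static boolean hasMinorCompactionWork(List<String> dirNames,
                                          long baseId,
                                          long lowestOpenTxnId) {
        return eligibleDeltas(dirNames, baseId, lowestOpenTxnId).size() >= 2;
    }
}
```

With the directories from the cluster listing (base id 17, deltas delta_0000017_0000017_0000 and delta_0000018_0000018_0000, no open transaction passed as Long.MAX_VALUE), only delta_0000018_0000018_0000 is eligible under this rule, so hasMinorCompactionWork returns false and the copy into delta_0000018_0000018 would be skipped.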