[ https://issues.apache.org/jira/browse/HIVE-23966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-23966:
----------------------------------
    Labels: pull-request-available  (was: )

> Minor query-based compaction always results in delta dirs with minWriteId=1
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-23966
>                 URL: https://issues.apache.org/jira/browse/HIVE-23966
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Minor compaction after major/IOW will result in directories that look like:
>  * base_z_v
>  * delta_1_y_v
>  * delete_delta_1_y_v
> Should be:
>  * base_z_v
>  * delta_(z+1)_y_v
>  * delete_delta_(z+1)_y_v
> Issues this causes:
> For example, after running insert overwrite, then minor compaction, major
> compaction will fail with the following error:
> {noformat}
> Found 2 equal splits: OrcSplit
> [hdfs://.../warehouse/tablespace/managed/hive/bucketed/delta_0000001_0000006_v0001058/bucket_00004,
> start=0, length=722, isOriginal=false, fileLength=722, hasFooter=false,
> hasBase=true, deltas=1] and OrcSplit
> [hdfs://.../warehouse/tablespace/managed/hive/bucketed/base_0000001/bucket_00004_0,
> start=0, length=811, isOriginal=false, fileLength=811, hasFooter=false,
> hasBase=true, deltas=1]
> {noformat}
> or it can fail with:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Wrong sort order
> of Acid rows detected for the rows:
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@201be62b
> and
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@5f97bd3f
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
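
The naming rule in the report (the delta minWriteId should be z+1, where z is the write ID of the base) can be sketched as below. This is an illustration only, not Hive code: the function names are hypothetical, and the 7-digit zero padding is an assumption read off the directory names in the error message (base_0000001, delta_0000001_0000006_v0001058).

```python
# Sketch only: these helpers are hypothetical and are not part of Hive.
# Assumption: write IDs and the visibility txn ID are zero-padded to 7 digits,
# as in the directory names quoted in the error message.


def expected_minor_compaction_dirs(base_write_id, max_write_id, visibility_txn_id):
    """Return the delta dir names the report says minor compaction should produce."""
    # The first write ID not already covered by base_z is z + 1.
    min_write_id = base_write_id + 1
    suffix = f"{min_write_id:07d}_{max_write_id:07d}_v{visibility_txn_id:07d}"
    return [f"delta_{suffix}", f"delete_delta_{suffix}"]


def delta_overlaps_base(delta_dir, base_dir):
    """Return True if the delta's minWriteId is <= the base's write ID."""
    # Names look like delta_<min>_<max>[_v<txn>] or delete_delta_<min>_<max>[_v<txn>].
    parts = delta_dir.split("_")
    offset = 2 if delta_dir.startswith("delete_delta_") else 1
    min_write_id = int(parts[offset])
    # Names look like base_<writeId>[_v<txn>].
    base_write_id = int(base_dir.split("_")[1])
    return min_write_id <= base_write_id
```

With the names from the first error message, delta_overlaps_base("delta_0000001_0000006_v0001058", "base_0000001") is true, which matches the delta_1_y_v pattern described in the report. The names returned by expected_minor_compaction_dirs(1, 6, 1058) start at write ID 2 and do not overlap that base.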