[ https://issues.apache.org/jira/browse/HIVE-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388018#comment-17388018 ]
Karen Coppage commented on HIVE-24235:
--------------------------------------

This is useless because if the table is dropped, all related compactions are cleared from metadata. So here we're trying to fail a compaction that doesn't exist.
The actual fix for this issue is HIVE-25393.

> Drop and recreate table during MR compaction leaves behind base/delta
> directory
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-24235
>                 URL: https://issues.apache.org/jira/browse/HIVE-24235
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> If a table is dropped and recreated during MR compaction, the table directory
> and a base (or delta, if minor compaction) directory could be created, with
> or without data, while the table "does not exist".
> E.g.
> {code:java}
> create table c (i int) stored as orc tblproperties
> ("NO_AUTO_COMPACTION"="true", "transactional"="true");
> insert into c values (9);
> insert into c values (9);
> alter table c compact 'major';
> While compaction job is running: {
> drop table c;
> create table c (i int) stored as orc tblproperties
> ("NO_AUTO_COMPACTION"="true", "transactional"="true");
> }
> {code}
> The table directory should be empty, but table directory could look like this
> after the job is finished:
> {code:java}
> Oct 6 14:23 c/base_0000002_v0000101/._orc_acid_version.crc
> Oct 6 14:23 c/base_0000002_v0000101/.bucket_00000.crc
> Oct 6 14:23 c/base_0000002_v0000101/_orc_acid_version
> Oct 6 14:23 c/base_0000002_v0000101/bucket_00000
> {code}
> or perhaps just:
> {code:java}
> Oct 6 14:23 c/base_0000002_v0000101/._orc_acid_version.crc
> Oct 6 14:23 c/base_0000002_v0000101/_orc_acid_version
> {code}
> Insert another row and you have:
> {code:java}
> Oct 6 14:33 base_0000002_v0000101/
> Oct 6 14:33 base_0000002_v0000101/._orc_acid_version.crc
> Oct 6 14:33 base_0000002_v0000101/.bucket_00000.crc
> Oct 6 14:33 base_0000002_v0000101/_orc_acid_version
> Oct 6 14:33 base_0000002_v0000101/bucket_00000
> Oct 6 14:35 delta_0000001_0000001_0000/._orc_acid_version.crc
> Oct 6 14:35 delta_0000001_0000001_0000/.bucket_00000_0.crc
> Oct 6 14:35 delta_0000001_0000001_0000/_orc_acid_version
> Oct 6 14:35 delta_0000001_0000001_0000/bucket_00000_0
> {code}
> Selecting from the table will result in this error because the highest valid
> writeId for this table is 1:
> {code:java}
> thrift.ThriftCLIService: Error fetching results:
> org.apache.hive.service.cli.HiveSQLException: Unable to get the next row set
> at
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:482)
> ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> ...
> Caused by: java.io.IOException: java.lang.RuntimeException: ORC split
> generation failed with exception: java.io.IOException: Not enough history
> available for (1,x). Oldest available base:
> .../warehouse/b/base_0000004_v0000092
> {code}
> Solution: Resolve the table again after compaction is finished; compare the
> id with the table id from when compaction began. If the ids do not match,
> abort the compaction's transaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
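
For illustration only, the check described under "Solution" in the quoted issue can be sketched roughly as below. This is not Hive code: `Catalog`, `Outcome`, and `runCompaction` are hypothetical names, and the in-memory map merely stands in for the metastore. As the comment above notes, this check alone did not resolve the problem, and the actual fix is tracked in HIVE-25393.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

public class CompactionGuardSketch {

    /** Hypothetical stand-in for the metastore: maps a table name to a unique table id. */
    static class Catalog {
        private final Map<String, Long> tableIds = new HashMap<>();
        private long nextId = 1;

        void createTable(String name) { tableIds.put(name, nextId++); }

        void dropTable(String name) { tableIds.remove(name); }

        OptionalLong resolveTableId(String name) {
            Long id = tableIds.get(name);
            return id == null ? OptionalLong.empty() : OptionalLong.of(id);
        }
    }

    enum Outcome { COMMITTED, ABORTED }

    /**
     * Records the table id before the job, resolves the table again afterwards,
     * and reports ABORTED if the table is gone or its id has changed.
     */
    static Outcome runCompaction(Catalog catalog, String table, Runnable job) {
        OptionalLong idAtStart = catalog.resolveTableId(table);
        if (!idAtStart.isPresent()) {
            return Outcome.ABORTED; // nothing to compact
        }
        job.run(); // stands in for the MR compaction job

        OptionalLong idAtEnd = catalog.resolveTableId(table);
        boolean sameTable = idAtEnd.isPresent()
                && idAtEnd.getAsLong() == idAtStart.getAsLong();
        return sameTable ? Outcome.COMMITTED : Outcome.ABORTED;
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        catalog.createTable("c");

        // Table untouched while the job runs.
        Outcome undisturbed = runCompaction(catalog, "c", () -> { });

        // Table dropped and recreated while the job runs: same name, new id.
        Outcome recreated = runCompaction(catalog, "c", () -> {
            catalog.dropTable("c");
            catalog.createTable("c");
        });

        System.out.println(undisturbed + " " + recreated);
    }
}
```

Comparing ids rather than names is the point of the sketch: a recreated table keeps its name but receives a new id, so a name lookup alone would not detect the drop.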