[
https://issues.apache.org/jira/browse/HIVE-29512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18073334#comment-18073334
]
Venugopal Reddy K edited comment on HIVE-29512 at 4/14/26 3:59 AM:
-------------------------------------------------------------------
The HMS backend tables COMPACTION_QUEUE and COMPLETED_COMPACTIONS are used for
both Hive ACID and Iceberg table compaction flows. However, these records are
currently cleaned up only when a Hive ACID table or partition is dropped.
For Iceberg tables, it seems straightforward to extend the cleanup to drop
table at:
[https://github.com/VenuReddy2103/hive/blob/c9e7f5dd6191636232921279acc1a5dd5a6fcaff/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/AcidEventListener.java#L78]
However, the partition case is more complex. HMS partition APIs (such as add
partitions, drop partitions, etc.) are not invoked by HS2 for Iceberg tables,
so the existing cleanup hooks are not triggered.
One approach I am considering is introducing a new HMS API to explicitly clean
up compaction metadata for partitions. This API could be invoked by clients
when Iceberg partitions are dropped, thereby delegating the responsibility to
the clients.
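To make the idea concrete, here is a minimal in-memory sketch of the semantics
such an API might have. Everything in it is hypothetical and for illustration
only: the class name CompactionMetadataStore, the method cleanupPartitions, and
the two lists standing in for COMPACTION_QUEUE and COMPLETED_COMPACTIONS are
assumptions, not existing HMS code or a proposed signature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the two HMS backend tables; not actual HMS code.
final class CompactionMetadataStore {

    // One row, reduced to the columns needed to identify a table or partition.
    static final class Entry {
        final String db;
        final String table;
        final String partition; // null for an entry on an unpartitioned table

        Entry(String db, String table, String partition) {
            this.db = db;
            this.table = table;
            this.partition = partition;
        }
    }

    // Stand-ins for COMPACTION_QUEUE and COMPLETED_COMPACTIONS.
    private final List<Entry> compactionQueue = new ArrayList<>();
    private final List<Entry> completedCompactions = new ArrayList<>();

    void addQueued(String db, String table, String partition) {
        compactionQueue.add(new Entry(db, table, partition));
    }

    void addCompleted(String db, String table, String partition) {
        completedCompactions.add(new Entry(db, table, partition));
    }

    // Hypothetical API: a client calls this after dropping Iceberg partitions.
    // Returns the number of rows removed across both tables.
    int cleanupPartitions(String db, String table, Set<String> partitions) {
        return removeMatching(compactionQueue, db, table, partitions)
            + removeMatching(completedCompactions, db, table, partitions);
    }

    // Total number of rows remaining in both tables.
    int size() {
        return compactionQueue.size() + completedCompactions.size();
    }

    private static int removeMatching(List<Entry> rows, String db, String table,
                                      Set<String> partitions) {
        int before = rows.size();
        // Only rows of the given table whose partition is in the dropped set
        // are removed; table-level rows (partition == null) are left alone.
        rows.removeIf(e -> e.db.equals(db)
            && e.table.equals(table)
            && e.partition != null
            && partitions.contains(e.partition));
        return before - rows.size();
    }
}
```

In this sketch the caller passes the names of the dropped partitions, so
entries for other partitions and other tables are left untouched.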
I would appreciate the community’s feedback on the proposed approach.
was (Author: venureddy):
COMPACTION_QUEUE and COMPLETED_COMPACTIONS HMS backend db tables are used for
both HIVE acid table and iceberg table compaction flows. But these backend
table records are cleaned only upon hive acid table/partition drop.
It is straightforward to cleanup them upon drop iceberg table as well at -
[https://github.com/VenuReddy2103/hive/blob/c9e7f5dd6191636232921279acc1a5dd5a6fcaff/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/AcidEventListener.java#L78]
Whereas it is different for the partition case. Partition related HMS APIs(like
add partitions, drop partitions etc) are not invoked by HS2 for Iceberg tables.
Hence, thinking of adding a HMS API to cleanup these backend table records for
partition. API can be invoked by clients upon drop partitions of iceberg
tables, leaving the responsibility to clients. Request for community's
opinion/suggestions about it.
> COMPACTION_QUEUE and COMPLETED_COMPACTIONS table entries are not cleared upon
> iceberg table delete
> --------------------------------------------------------------------------------------------------
>
> Key: HIVE-29512
> URL: https://issues.apache.org/jira/browse/HIVE-29512
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Venugopal Reddy K
> Assignee: Venugopal Reddy K
> Priority: Major
>
> *[Description]*
> COMPACTION_QUEUE and COMPLETED_COMPACTIONS table entries are not cleared when
> an Iceberg table is deleted. As a result, such entries keep accumulating over
> time.
> Note: For ACID tables, entries are cleared.
> *[Steps to reproduce]*
> {code:sql}
> create database mydb;
> use mydb;
> create table icet1(i int) stored by iceberg
> tblproperties('compactor.threshold.target.size'='1000');
> insert into icet1 values(1),(2),(3);
> insert into icet1 values(1),(2),(3);
> -- Wait for COMPACTION_QUEUE to have an entry for the table.
> show compactions; -- check that the state is "initiated"
> drop table icet1;
> show compactions; -- the entry still exists and is still in the "initiated" state
> {code}
>
> 1. If the table is recreated (without adding data) before compaction is
> triggered, compaction fails with the exception below. The error message can
> be seen in the show compactions output.
> *java.lang.NullPointerException: Cannot invoke
> "org.apache.iceberg.Snapshot.snapshotId()" because the return value of
> "org.apache.iceberg.Table.currentSnapshot()" is null*
> 2. If the table is not recreated before compaction is triggered, compaction
> fails with the error message below, which can be seen in the show compactions
> output:
> *NoSuchObjectException(message:hive.mydb.icet1 table not found)*
> Note: In both cases, the COMPACTION_QUEUE entry for the table is moved to
> COMPLETED_COMPACTIONS upon compaction failure.