[ https://issues.apache.org/jira/browse/HIVE-27190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052022#comment-18052022 ]

Denys Kuzmenko edited comment on HIVE-27190 at 1/15/26 9:42 AM:
----------------------------------------------------------------

[~lisoda], is this a read-path performance issue, with no writes involved and therefore no statistics updates?

Basic stats:
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L475-L517

Table column stats:
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L706-L728

Partition column stats (Hive-4.1+):
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L731-L759

1. I don't see where this would call [BaseMetastoreTableOperations#refreshFromMetadataLocation|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L184]
2. Not sure whether any of these functions are invoked multiple times during planning, but if they are, caching would definitely help with column statistics retrieval.
3. Iceberg API used to retrieve basic partition stats:
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L572-L573


was (Author: dkuzmenko):
[~lisoda], is this a read-path performance issue, with no writes involved and therefore no statistics updates?

Basic stats:
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L475-L517

Table column stats:
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L706-L728

Partition column stats (Hive-4.1+):
https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L731-L759

1. I don't see where this would call [BaseMetastoreTableOperations#refreshFromMetadataLocation|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L184]
2. Not sure whether any of these functions are invoked multiple times during planning, but if they are, caching would definitely help with column statistics retrieval.


> Implement col stats cache for hive iceberg table
> -------------------------------------------------
>
>                 Key: HIVE-27190
>                 URL: https://issues.apache.org/jira/browse/HIVE-27190
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Simhadri Govindappa
>            Assignee: Simhadri Govindappa
>            Priority: Major
>


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
