[jira] [Comment Edited] (HIVE-27190) Implement col stats cache for hive iceberg table

Denys Kuzmenko (Jira) Wed, 14 Jan 2026 04:49:13 -0800


    [ 
https://issues.apache.org/jira/browse/HIVE-27190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051800#comment-18051800
 ]


Denys Kuzmenko edited comment on HIVE-27190 at 1/14/26 12:47 PM:
-----------------------------------------------------------------

Even better would be if you could provide a flamegraph, so we can pinpoint 
where CPU cycles are being spent inefficiently. 
Currently, for unpartitioned tables, each column stats object is stored in a 
separate Puffin blob. For partitioned tables, each blob holds a whole 
ColumnStatistics object. We read and aggregate the blobs for the relevant 
partitions, then discard any columns that are not part of the projection 
(Puffin has no secondary indexes, so a single column cannot be looked up 
inside a blob).
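
To make the partitioned read path above concrete, here is a minimal sketch in 
Python (not Hive's actual Java code). The names ColStats, merge and 
read_partitioned are hypothetical, the stats fields are simplified, and it 
assumes one blob per partition, modelled as a plain dict in place of a real 
Puffin reader.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass
class ColStats:
    """Hypothetical, simplified per-column statistics."""
    num_nulls: int
    min_value: int
    max_value: int


def merge(a: ColStats, b: ColStats) -> ColStats:
    """Combine the stats of one column coming from two partitions."""
    return ColStats(
        num_nulls=a.num_nulls + b.num_nulls,
        min_value=min(a.min_value, b.min_value),
        max_value=max(a.max_value, b.max_value),
    )


def read_partitioned(
    blobs: Dict[str, Dict[str, ColStats]],
    partitions: Iterable[str],
    projection: Set[str],
) -> Dict[str, ColStats]:
    """Aggregate stats over the relevant partitions, then apply the projection.

    `blobs` maps a partition name to the full set of column stats stored in
    that partition's blob (an assumption made for this sketch).
    """
    aggregated: Dict[str, ColStats] = {}
    for partition in partitions:
        # The whole blob is decoded: every column is visited,
        # including columns the query never asked for.
        for column, stats in blobs[partition].items():
            if column in aggregated:
                aggregated[column] = merge(aggregated[column], stats)
            else:
                aggregated[column] = stats
    # Only now are the unprojected columns discarded.
    return {c: s for c, s in aggregated.items() if c in projection}


if __name__ == "__main__":
    blobs = {
        "p=1": {"a": ColStats(1, 5, 10), "b": ColStats(0, 0, 3)},
        "p=2": {"a": ColStats(2, 1, 20), "b": ColStats(4, 2, 9)},
    }
    print(read_partitioned(blobs, ["p=1", "p=2"], {"a"}))
```

In this sketch the work grows with the number of partitions times the total 
number of columns, regardless of how narrow the projection is, which is the 
kind of cost a flamegraph would help confirm or rule out.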


was (Author: dkuzmenko):
Even better would be if you could provide a flamegraph, so we can pinpoint 
where CPU cycles are being spent inefficiently. 
Currently, for unpartitioned tables, each column stats object is stored in a 
separate Puffin blob. For partitioned tables - it's the whole ColumnStatistics. 
We read and aggregate the blobs for the relevant partitions, then discard any 
columns that are not part of the projection.

> Implement  col stats cache for hive iceberg table
> -------------------------------------------------
>
>                 Key: HIVE-27190
>                 URL: https://issues.apache.org/jira/browse/HIVE-27190
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Simhadri Govindappa
>            Assignee: Simhadri Govindappa
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
