Re: [PR] Docs: Document compute_partition_stats procedure [iceberg]

via GitHub Mon, 14 Jul 2025 11:47:29 -0700


szehon-ho commented on code in PR #13532:
URL: https://github.com/apache/iceberg/pull/13532#discussion_r2205581408



##########
docs/docs/spark-procedures.md:
##########
@@ -974,6 +974,38 @@ Collect statistics of the snapshot with id `snap1` of table `my_table` for colum
 CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => 'snap1', columns => array('col1', 'col2'));
 ```
 
+## Partition Statistics
+
+### `compute_partition_stats`
+
+This procedure computes the stats incrementally from the last snapshot that has `PartitionStatisticsFile` until the given 
+snapshot (uses current snapshot if not specified) and writes the combined result into a `PartitionStatisticsFile`
+after merging the partition stats. It performs a full compute if previous statistics file does not exist. It also registers the 

Review Comment:
   Let's get rid of 'merging the partition stats'? I feel it's implied by 'incrementally' and 'combined'.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


