ajantha-bhat commented on code in PR #15896:
URL: https://github.com/apache/iceberg/pull/15896#discussion_r3040250715
##########
format/spec.md:
##########
@@ -1007,7 +1007,7 @@ The schema of the partition statistics file is as follows:
 
 | v1 | v2 | v3 | Field id, name | Type | Description |
 |----|----|----|----------------|------|-------------|
-| _required_ | _required_ | _required_ | **`1 partition`** | `struct<..>` | Partition data tuple, schema based on the unified partition type considering all specs in a table |
+| _required_ | _required_ | _required_ | **`1 partition`** | `struct<..>` | Partition data tuple, schema based on the unified partition type considering all specs in a table, empty for unpartitioned tables |

Review Comment:
   This is still a behavior change for existing users. Partition stats were previously computed only for partitioned tables. Now you are enabling them for non-partitioned tables with a NULL partition.

   Plus, as I mentioned, it is still odd for a non-partitioned table to write this file as-is via the `compute_partition_stats` method. Why does a user have to call `compute_partition_stats` on a non-partitioned table?

   I can understand that you need quick table-level stats for the CBO in Trino without doing I/O on all the manifests.

   a) Can we introduce new table-level stats (along with the NDV Puffin file) in `compute_table_stats`? People could then refer to them for both partitioned and non-partitioned tables if they need whole-table info.

   b) Or can we check whether the snapshot summary already has the table-level stats you are looking for? (We would not have to do multiple I/Os on the files in that case.) If not, can we enhance the snapshot summary to include them?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
