ebyhr commented on code in PR #15896:
URL: https://github.com/apache/iceberg/pull/15896#discussion_r3038513638

##########
format/spec.md:
##########
@@ -1007,7 +1007,7 @@ The schema of the partition statistics file is as follows:
 | v1 | v2 | v3 | Field id, name | Type | Description |
 |----|----|----|----------------|------|-------------|
-| _required_ | _required_ | _required_ | **`1 partition`** | `struct<..>` | Partition data tuple, schema based on the unified partition type considering all specs in a table |
+| _required_ | _required_ | _required_ | **`1 partition`** | `struct<..>` | Partition data tuple, schema based on the unified partition type considering all specs in a table, empty for unpartitioned tables |

Review Comment:
   @ajantha-bhat Thanks, I'll post after exploring another approach, since Parquet writer can't write an empty struct.

   We want to obtain the record count more quickly and efficiently for CBO. As mentioned in the PR description, Trino currently needs to read manifest files to estimate table size. Even if v4 manifests address this, migration would take a long time. We need a solution that works for tables with format versions < v4 as well.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
