Hi all,

I have raised a PR [1] to populate the total_record_count field in partition statistics when it is computable from metadata (no equality deletes and no V2 position delete files). This follows the discussion in #12098 about using DV cardinalities for this.
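To make the condition concrete, here is a rough sketch of the rule as I understand it. It is not the code in the PR. The class and its field names are hypothetical, chosen only for illustration, and the subtraction assumes that the summed DV cardinalities give the exact number of deleted rows in the partition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartitionStatsSketch:
    """Hypothetical per-partition counters; names are illustrative only."""
    data_record_count: int              # rows across all data files
    dv_cardinality_sum: int             # sum of DV cardinalities (assumed exact)
    v2_position_delete_file_count: int  # V2 (non-DV) position delete files
    equality_delete_file_count: int     # equality delete files


def total_record_count(stats: PartitionStatsSketch) -> Optional[int]:
    """Return the live row count, or None when metadata alone cannot give it."""
    # Equality deletes: the number of matching rows is unknown without a scan.
    if stats.equality_delete_file_count > 0:
        return None
    # V2 position delete files: excluded, as stated in the proposal.
    if stats.v2_position_delete_file_count > 0:
        return None
    # Only DVs (or no deletes at all) remain, so subtract their cardinalities.
    return stats.data_record_count - stats.dv_cardinality_sum
```

A None result corresponds to leaving the field null in the stats file.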
During review, a question came up: since total_record_count is derivable from existing fields, should the Iceberg core library compute and persist it, or should this be left to engines?

For computing in core:
- the spec encourages it
- it avoids duplicating logic across engines
- it is immediately available from the stats file

For leaving it to engines:
- it is a derived value
- the implementation adds complexity around null handling in incremental computation
- it can only be populated for partitions without equality deletes

I would appreciate community input on the preferred approach.

[1] https://github.com/apache/iceberg/pull/15979

Thanks,
Hemanth Boyina
