ajantha-bhat commented on PR #15979: URL: https://github.com/apache/iceberg/pull/15979#issuecomment-4367997699
Replied on the mailing list:

```
I personally think these changes are not required for the following reasons:

Consistency: We can apply this same logic to positional deletes, not just DVs (Peter also mentioned this).

Spec Adherence: The total_record_count should reflect an accurate value in all cases. The spec defines it as the "Accurate count of records in a partition after applying deletes if any," which implies engines should account for equality deletes as well.

Redundancy: The current PR simply calculates total_record = data record - dv record. This is easily derived by the user from other existing stats, so specific handling isn't necessary.

Existing Logic: We don't force total_record = data record when no delete files exist. Since those stats are already easily inferred by the user, we should maintain that same approach here.
```
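To make the "Redundancy" point concrete, here is a minimal sketch of how a reader of partition stats might derive the same value on their own. The `PartitionStats` class, its field names, and the `derived_total_record_count` helper are hypothetical names chosen for illustration; they are not Iceberg API. The sketch also assumes that every positional or DV delete record removes exactly one distinct live row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartitionStats:
    # Hypothetical container for illustration; field names are assumptions.
    data_record_count: int
    position_delete_record_count: int  # assumed to cover DV-deleted rows too
    equality_delete_record_count: int


def derived_total_record_count(stats: PartitionStats) -> Optional[int]:
    """Derive a live-record count from the other counters, when that is possible."""
    if stats.equality_delete_record_count > 0:
        # An equality delete record can match any number of rows, so the
        # result cannot be computed from the counters alone.
        return None
    # Assumes each positional/DV delete removes one distinct row.
    return stats.data_record_count - stats.position_delete_record_count


no_deletes = PartitionStats(1000, 0, 0)
with_dvs = PartitionStats(1000, 40, 0)
with_eq_deletes = PartitionStats(1000, 40, 5)

print(derived_total_record_count(no_deletes))
print(derived_total_record_count(with_dvs))
print(derived_total_record_count(with_eq_deletes))
```

The `None` branch mirrors the "Spec Adherence" point: once equality deletes are present, an accurate count needs the deletes to be applied rather than subtracted.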
