gaborkaszab commented on PR #15979: URL: https://github.com/apache/iceberg/pull/15979#issuecomment-4342973474
I gave this some further thought. Since this is a derived stat, I think we shouldn't write it, and the spec shouldn't list it as part of the stat file's content either. Leaving the door open for engines that perform a full scan to write the highest-precision total_record_count stat seems very speculative. I'd rather populate it on the read path when all the conditions are met than write it to the stat file. I see @pvary is of a different opinion. We can wait to see whether anyone else chimes in. In the meantime, let me ping @ajantha-bhat one more time to see if he can share some additional context on the introduction of this stat.
