singhpk234 commented on PR #4517: URL: https://github.com/apache/iceberg/pull/4517#issuecomment-1091262450
Thanks @SreeramGarlapati for sharing the https://github.com/apache/iceberg/pull/2660#discussion_r650154461 thread, it really helps. Glad to know this was already considered during the implementation. It looks like we were on the fence about the reliability of this metric and hence decided to revisit it in the future, referring to this comment: https://github.com/apache/iceberg/pull/2660#discussion_r650322788

> SnapshotSummary is a free form dictionary & ADDED_FILES_PROP as a key in this dictionary - is NOT added to [Iceberg Spec](https://iceberg.apache.org/spec/#snapshots)

Agree with you on this.

> is not NOT populated by all engines yet

As per my understanding, engines don't own the responsibility of calculating / updating the stats; IIUC it's done entirely by the core library. The flow, for example: [commit() [from engine]](https://github.com/apache/iceberg/blob/master/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java#L251) -> [apply() in SnapshotProducer (core)](https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L296) -> [Summary calculation (core)](https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L202-L210).

Am I missing something here?
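To make the flow above concrete, here is a minimal, self-contained sketch of how I understand it. It does not use the real Iceberg classes: `SummaryFlowSketch`, `CoreSnapshotProducer` and `commitFromEngine` are hypothetical stand-ins, and the only thing taken from Iceberg is the key string, which I believe matches `SnapshotSummary.ADDED_FILES_PROP`. The point is only that the engine hands over the files and triggers the commit, while the counting happens on the core side.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SummaryFlowSketch {
  // Assumed to mirror the value of SnapshotSummary.ADDED_FILES_PROP in Iceberg core.
  static final String ADDED_FILES_PROP = "added-data-files";

  /** Hypothetical stand-in for the core-side snapshot producer. */
  static final class CoreSnapshotProducer {
    private final List<String> addedFiles;

    CoreSnapshotProducer(List<String> addedFiles) {
      this.addedFiles = List.copyOf(addedFiles);
    }

    // Stand-in for the summary calculation: core counts the files itself.
    private Map<String, String> summary() {
      Map<String, String> summary = new LinkedHashMap<>();
      summary.put(ADDED_FILES_PROP, Integer.toString(addedFiles.size()));
      return summary;
    }

    // Stand-in for apply(): builds the snapshot metadata, including the summary.
    Map<String, String> apply() {
      return summary();
    }

    // Stand-in for commit(): the only entry point the engine calls.
    Map<String, String> commit() {
      return apply();
    }
  }

  /** Hypothetical engine-side write: passes files to core and commits. */
  static Map<String, String> commitFromEngine(List<String> writtenFiles) {
    // The engine never touches the summary map directly.
    return new CoreSnapshotProducer(writtenFiles).commit();
  }

  public static void main(String[] args) {
    System.out.println(commitFromEngine(List.of("a.parquet", "b.parquet")));
  }
}
```

If this model is right, any engine that commits through the core library gets the added-files count populated without doing anything engine-specific.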
