alliasgher opened a new pull request, #874: URL: https://github.com/apache/iceberg-go/pull/874
## Summary

`removeSnapshotsUpdate.PostCommit()` collects manifest lists, manifests, and data files for deletion from object storage when snapshots are expired. It did not collect `StatisticsFile` or `PartitionStatisticsFile` paths, so those files were leaked indefinitely after expiration.

Fixes #837

## Changes (`table/updates.go`)

After the existing snapshot/manifest/data-file collection in `PostCommit`:

1. Build a set of removed snapshot IDs.
2. Iterate `preTable.Metadata().Statistics()` and `PartitionStatistics()`; for any entry whose `SnapshotID` is in the removed set, queue its `StatisticsPath` for deletion.
3. Symmetric to the existing manifest/data-file logic, remove from the deletion set any `StatisticsPath` that still exists in `postTable.Metadata().Statistics()` / `PartitionStatistics()`, so that files that are still referenced are not deleted.

## Dependency

This depends on the prune-on-`RemoveSnapshots` fix in #873. Without that, `postTable.Metadata().Statistics()` would still contain the removed snapshot's entries and the post-commit loop would skip the deletion. Either ordering of the merges works as long as both land.

## Verification

- `go build ./table/` passes
- `go vet ./table/` passes
- `go test ./table/` passes (existing `TestRemoveSnapshotsPostCommitSkipped` still passes)
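For reviewers, the three steps listed under Changes can be illustrated with a minimal, standalone sketch. This is not the code in `table/updates.go`: the `StatisticsFile` struct and the `orphanedStatistics` helper below are hypothetical stand-ins that model only the `SnapshotID` and `StatisticsPath` fields named above, and the paths are made up. The sketch handles a single slice; the assumption is that partition statistics would go through the same logic.

```go
package main

import (
	"fmt"
	"sort"
)

// StatisticsFile is a hypothetical stand-in for a table metadata statistics
// entry; only the two fields named in the PR description are modelled.
type StatisticsFile struct {
	SnapshotID     int64
	StatisticsPath string
}

// orphanedStatistics returns the statistics paths that belong to removed
// snapshots and are no longer referenced by the post-commit metadata.
func orphanedStatistics(pre, post []StatisticsFile, removed []int64) []string {
	// Step 1: build a set of removed snapshot IDs.
	removedSet := make(map[int64]struct{}, len(removed))
	for _, id := range removed {
		removedSet[id] = struct{}{}
	}

	// Step 2: queue paths from the pre-commit metadata whose snapshot was removed.
	toDelete := make(map[string]struct{})
	for _, s := range pre {
		if _, ok := removedSet[s.SnapshotID]; ok {
			toDelete[s.StatisticsPath] = struct{}{}
		}
	}

	// Step 3: keep any path the post-commit metadata still references.
	for _, s := range post {
		delete(toDelete, s.StatisticsPath)
	}

	// Return a sorted slice so the result is deterministic.
	paths := make([]string, 0, len(toDelete))
	for p := range toDelete {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	pre := []StatisticsFile{
		{SnapshotID: 1, StatisticsPath: "s3://bucket/stats/snap-1.puffin"},
		{SnapshotID: 2, StatisticsPath: "s3://bucket/stats/snap-2.puffin"},
	}

	// Post-commit metadata with snapshot 1's entry pruned.
	pruned := []StatisticsFile{pre[1]}
	fmt.Println(orphanedStatistics(pre, pruned, []int64{1}))

	// Post-commit metadata without pruning: snapshot 1's file is still referenced.
	fmt.Println(orphanedStatistics(pre, pre, []int64{1}))
}
```

The second call in `main` mirrors the situation described under Dependency: when the post-commit metadata still lists the removed snapshot's entry, step 3 takes its path back out of the deletion set and nothing is deleted.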
