paveon commented on code in PR #1118:
URL: https://github.com/apache/iceberg-go/pull/1118#discussion_r3296286204
##########
table/updates.go:
##########
@@ -506,6 +506,27 @@ func (u *removeSnapshotsUpdate) PostCommit(ctx context.Context, preTable *Table,
}
}
+	// Manifest paths kept alive by retained snapshots, plus their
+	// loaded manifest-file slices so the live-data-file pass below
+	// doesn't re-download each manifest list.
+	retainedManifests := make(map[string]struct{})
+	retainedSnapshotManifests := make(map[int64][]iceberg.ManifestFile)
+	for _, snap := range postTable.Metadata().Snapshots() {
+		mans, err := snap.Manifests(prefs)
+		if err != nil {
+			return err
+		}
+		retainedSnapshotManifests[snap.SnapshotID] = mans
+		for _, man := range mans {
+			retainedManifests[man.FilePath()] = struct{}{}
+		}
+	}
+
+	// Open each orphaned manifest at most once: skip manifests that
+	// retained snapshots still reference, and dedupe across expired
+	// snapshots that share manifests by reference.
+	visitedManifests := make(map[string]struct{})
Review Comment:
Added two new unit tests to verify this behaviour. I also refactored the
existing tests slightly to reduce the duplication of building the same
metadata strings over and over, if that's okay.
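
For reference, the skip-and-dedupe behaviour described in the code comments
above can be sketched roughly as follows. This is a simplified illustration
under stated assumptions, not the PR's code: `orphanedManifests` is a
hypothetical helper, and plain path strings stand in for
`iceberg.ManifestFile`.

```go
package main

import "fmt"

// orphanedManifests is a hypothetical helper (not part of iceberg-go).
// Each inner slice holds the manifest paths listed by one snapshot.
// It returns the paths referenced only by expired snapshots, each at
// most once, in first-seen order.
func orphanedManifests(retained, expired [][]string) []string {
	// Paths still referenced by at least one retained snapshot.
	retainedSet := make(map[string]struct{})
	for _, mans := range retained {
		for _, path := range mans {
			retainedSet[path] = struct{}{}
		}
	}

	// Paths already handled, so manifests shared between expired
	// snapshots are visited only once.
	visited := make(map[string]struct{})
	var orphaned []string
	for _, mans := range expired {
		for _, path := range mans {
			if _, ok := retainedSet[path]; ok {
				continue // still live, must not be touched
			}
			if _, ok := visited[path]; ok {
				continue // already seen via another expired snapshot
			}
			visited[path] = struct{}{}
			orphaned = append(orphaned, path)
		}
	}
	return orphaned
}

func main() {
	retained := [][]string{{"m3.avro", "m4.avro"}}
	expired := [][]string{
		{"m1.avro", "m3.avro"}, // m3.avro is shared with a retained snapshot
		{"m1.avro", "m2.avro"}, // m1.avro is shared with the snapshot above
	}
	fmt.Println(orphanedManifests(retained, expired)) // [m1.avro m2.avro]
}
```

The two cases exercised here are a manifest shared with a retained snapshot
(skipped) and a manifest shared between two expired snapshots (visited once).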
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]