jasonf20 commented on code in PR #10962:
URL: https://github.com/apache/iceberg/pull/10962#discussion_r1880327776
##########
core/src/main/java/org/apache/iceberg/ManifestFilterManager.java:
##########
@@ -363,6 +363,10 @@ private ManifestFile filterManifest(
   }
   private boolean canContainDeletedFiles(ManifestFile manifest, boolean trustManifestReferences) {
+    if (manifest.minSequenceNumber() > 0 && manifest.minSequenceNumber() < minSequenceNumber) {
+      return true;
+    }
Review Comment:
Perhaps it can be added as a table property. As an anecdote, I have
encountered tables with frequent updates that have 100K+ inactive delete files
in their manifests. And since Spark isn't used in that environment, performing
a cleanup isn't simple.

I think the above issue should be addressed, but perhaps we got a bit
sidetracked in the context of this bug. How would you like to proceed here?
This is still a valid bug fix. Can you think of a trick to make the test
reproduce the issue easily without reverting the change to
`dropDeleteFilesOlderThan`? If not, can we merge it regardless?
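
For reference, here is a minimal standalone sketch of the condition added in the hunk above, which might be a starting point for a smaller reproduction. It is not Iceberg code: the class `SequenceCheckSketch`, the method `mayContainDroppableDeletes`, and both parameter names are hypothetical. I am also assuming that the `> 0` guard is there to skip manifests without a meaningful sequence number, and that the second argument plays the role of the threshold set through `dropDeleteFilesOlderThan`.

```java
public class SequenceCheckSketch {

  // Hypothetical stand-in for the check in the hunk; plain longs replace
  // ManifestFile.minSequenceNumber() and the manager's minSequenceNumber field.
  public static boolean mayContainDroppableDeletes(long manifestMinSeq, long dropOlderThanSeq) {
    // Assumption: a min sequence number of 0 carries no usable information,
    // so the guard copied from the hunk excludes it.
    return manifestMinSeq > 0 && manifestMinSeq < dropOlderThanSeq;
  }

  public static void main(String[] args) {
    // Manifest holds entries older than the threshold: it must be scanned.
    System.out.println(mayContainDroppableDeletes(3L, 5L));
    // Manifest min sequence number is 0: this check alone does not force a scan.
    System.out.println(mayContainDroppableDeletes(0L, 5L));
    // All entries are at or above the threshold: this check alone does not force a scan.
    System.out.println(mayContainDroppableDeletes(7L, 5L));
  }
}
```

A `false` result here only means this particular condition does not apply; the rest of `canContainDeletedFiles` would still decide whether the manifest is filtered.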
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]