amogh-jahagirdar commented on code in PR #13222:
URL: https://github.com/apache/iceberg/pull/13222#discussion_r2183765078


##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -1130,6 +1132,11 @@ protected ManifestReader<DataFile> 
newManifestReader(ManifestFile manifest) {
     protected Set<DataFile> newFileSet() {
       return DataFileSet.create();
     }
+
+    @Override

Review Comment:
   This is in `DataFileFilterManager`, which is nested in 
`MergingSnapshotProducer`. I think this change makes sense: 
`DataFileFilterManager` probably shouldn't support removing dangling deletes, 
since it won't even be reading those delete entries.



##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -920,6 +920,8 @@ protected Map<String, String> summary() {
 
   @Override
   public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
+    Set<DataFile> filesToBeDeleted = filterManager.filesToBeDeleted();

Review Comment:
   Yeah, +1. I'm not entirely confident that `filesToBeDeleted()` is a 
sufficient source of truth for passing to `removeDanglingDeletesFor`. Does 
`filesToBeDeleted()` encompass entries that match row filters, path-based 
deletes, and partition-based drops?



##########
core/src/main/java/org/apache/iceberg/ManifestFilterManager.java:
##########
@@ -224,7 +235,9 @@ List<ManifestFile> filterManifests(Schema tableSchema, 
List<ManifestFile> manife
   private boolean canTrustManifestReferences(List<ManifestFile> manifests) {
     Set<String> manifestLocations =
         manifests.stream().map(ManifestFile::path).collect(Collectors.toSet());
-    return allDeletesReferenceManifests && 
manifestLocations.containsAll(manifestsWithDeletes);

Review Comment:
   @nastra I think the additional check is fine, but is there a situation where 
`allDeletesReferenceManifests` is true and `manifestsWithDeletes` is empty? 
Every delete operation would either invalidate `allDeletesReferenceManifests` 
or, if it's a `delete(File f)` where the file has a defined manifest location, 
add that location to the `manifestsWithDeletes` set.
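
   To make the question concrete, here is a minimal sketch of the bookkeeping 
as described above. It is a simplified, hypothetical model and not the actual 
`ManifestFilterManager`: the class name, the method names other than 
`canTrustManifestReferences`, and the use of plain strings for manifest 
locations are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the state discussed in this comment.
// It is NOT the real ManifestFilterManager.
public class ManifestReferenceSketch {
  private boolean allDeletesReferenceManifests = true;
  private final Set<String> manifestsWithDeletes = new HashSet<>();

  // Models a delete with no known manifest location
  // (for example by path, row filter, or partition).
  public void deleteWithoutManifestLocation() {
    allDeletesReferenceManifests = false;
  }

  // Models delete(File f): track the manifest location when it is known,
  // otherwise invalidate the flag.
  public void deleteFile(String manifestLocationOrNull) {
    if (manifestLocationOrNull != null) {
      manifestsWithDeletes.add(manifestLocationOrNull);
    } else {
      allDeletesReferenceManifests = false;
    }
  }

  // The state asked about: flag still true, but nothing tracked.
  public boolean flagSetButNoTrackedManifests() {
    return allDeletesReferenceManifests && manifestsWithDeletes.isEmpty();
  }

  // Same shape as the check on the removed line in the hunk above.
  public boolean canTrustManifestReferences(List<String> manifestLocations) {
    Set<String> locations = new HashSet<>(manifestLocations);
    return allDeletesReferenceManifests && locations.containsAll(manifestsWithDeletes);
  }
}
```

   In this model, the flag is true with an empty set only before any delete 
has been registered. Whether the real class can reach that state by another 
route is exactly the open question.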



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]