rdblue commented on code in PR #15006:
URL: https://github.com/apache/iceberg/pull/15006#discussion_r2719132780


##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -1073,6 +1088,125 @@ private List<ManifestFile> newDeleteFilesAsManifests() {
     return cachedNewDeleteManifests;
   }
 
+  // Merge duplicates, internally takes care of updating newDeleteFilesBySpec 
to remove
+  // duplicates and add the newly merged DV
+  private void mergeDVsAndWrite() {

Review Comment:
   I think this method should return a `List<DeleteFile>` that represents all 
of the DVs after merging has been done. It should probably also use a cache to 
ensure that multiple calls produce the same list and reuse as much work as 
possible, although this could be done in a follow up and I could be convinced 
to cache at a different level (like delete manifests).
   
   Then the caller should take the list produced here, combine it with the v2 
`DeleteFile` list, group by spec, and write delete manifests. That would 
require no side-effects other than caching.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to