amogh-jahagirdar commented on code in PR #15006:
URL: https://github.com/apache/iceberg/pull/15006#discussion_r2729998925


##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -86,8 +89,8 @@ abstract class MergingSnapshotProducer<ThisT> extends 
SnapshotProducer<ThisT> {
   // update data
   private final Map<Integer, DataFileSet> newDataFilesBySpec = 
Maps.newHashMap();
   private Long newDataFilesDataSequenceNumber;
-  private final Map<Integer, DeleteFileSet> newDeleteFilesBySpec = 
Maps.newHashMap();
-  private final Set<String> newDVRefs = Sets.newHashSet();
+  private final List<DeleteFile> positionAndEqualityDeletes = 
Lists.newArrayList();
+  private final Map<String, List<DeleteFile>> dvsByReferencedFile = 
Maps.newLinkedHashMap();

Review Comment:
   @rdblue These are 2 disjoint fields, one for a list of v2 deletes and a 
multimap for DVs.
   The map is a `LinkedHashMap` because we have a bunch of tests which have 
expectations on the exact orders of entries in a manifest. The previous change 
didn't require anything because we worked with the deleteFilesBySpec, and 
inherently preserved the order. 
   
   I personally think our tests should probably get away from expecting a 
certain order in manifests, and just assert the contents (or at least have 
validate methods that express either being strict on the ordering or not). As 
we get into V4, maybe we'll make implementation choices for ordering entries in 
a certain way but in the current state of things, it was kind of a hinderance 
to making changes here.
   
   I didn't make the test change since it's fairly large, and can be 
distracting from this change and I figured the linkedhashma has negligible 
overhead so we can just preserve the existing behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to