rdblue commented on a change in pull request #2925:
URL: https://github.com/apache/iceberg/pull/2925#discussion_r711814105



##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -242,6 +245,53 @@ private ManifestFile copyManifest(ManifestFile manifest) {
         current.formatVersion(), toCopy, current.specsById(), newManifestPath, snapshotId(), appendedManifestsSummary);
   }
 
+  /**
+   * Validates that no files matching given partitions have been added to the table since a starting snapshot.
+   *
+   * @param base table metadata to validate
+   * @param startingSnapshotId id of the snapshot current at the start of the operation
+   * @param partitionSet a set of partitions to check against, or none if check is to be against all files
+   */
+  protected void validateAddedDataFiles(TableMetadata base, Long startingSnapshotId,

Review comment:
       Rather than using an `Optional<PartitionSet>`, couldn't this use 
`validateAddedDataFiles` with filter `Expressions.alwaysTrue()`? That would 
make this a bit more straightforward and avoid the `isPresent` check inside a 
loop (which probably isn't a good idea anyway).

##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -242,6 +245,53 @@ private ManifestFile copyManifest(ManifestFile manifest) {
         current.formatVersion(), toCopy, current.specsById(), newManifestPath, snapshotId(), appendedManifestsSummary);
   }
 
+  /**
+   * Validates that no files matching given partitions have been added to the table since a starting snapshot.
+   *
+   * @param base table metadata to validate
+   * @param startingSnapshotId id of the snapshot current at the start of the operation
+   * @param partitionSet a set of partitions to check against, or none if check is to be against all files
+   */
+  protected void validateAddedDataFiles(TableMetadata base, Long startingSnapshotId,

Review comment:
       Rather than using an `Optional<PartitionSet>`, couldn't this use 
`validateAddedDataFiles` with filter `Expressions.alwaysTrue()` for 
unpartitioned cases? That would make this a bit more straightforward and avoid 
the `isPresent` check inside a loop (which probably isn't a good idea anyway).
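
To make the suggestion concrete, here is a rough, self-contained sketch of the two shapes being compared. It does not use Iceberg's types: `AddedFile`, `countConflictsWithOptional`, `countConflicts`, `alwaysTrue`, and `inPartitions` are hypothetical stand-ins, and a plain `java.util.function.Predicate` plays the role that an expression filter such as `Expressions.alwaysTrue()` would play in the real method. Treat it as an illustration of the idea, not as the code in this PR.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch only; none of these names come from Iceberg.
final class AddedFilesValidationSketch {

  // Stand-in for a data file: only the partition value matters here.
  static final class AddedFile {
    final String path;
    final String partition;

    AddedFile(String path, String partition) {
      this.path = path;
      this.partition = partition;
    }
  }

  // Shape questioned in the review: the Optional is re-checked for every file.
  static long countConflictsWithOptional(
      List<AddedFile> added, Optional<Set<String>> partitions) {
    long conflicts = 0;
    for (AddedFile file : added) {
      if (!partitions.isPresent() || partitions.get().contains(file.partition)) {
        conflicts++;
      }
    }
    return conflicts;
  }

  // Suggested shape: one code path that always takes a filter.
  static long countConflicts(List<AddedFile> added, Predicate<AddedFile> filter) {
    return added.stream().filter(filter).count();
  }

  // "Check against all files" becomes a filter that matches everything.
  static Predicate<AddedFile> alwaysTrue() {
    return file -> true;
  }

  // The partition-scoped check is just another filter.
  static Predicate<AddedFile> inPartitions(Set<String> partitions) {
    return file -> partitions.contains(file.partition);
  }
}
```

With this shape, the caller chooses the filter once up front (`alwaysTrue()` for the unpartitioned case, `inPartitions(...)` otherwise), so the loop itself never branches on whether a partition set was supplied.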




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


