rdblue commented on a change in pull request #2925:
URL: https://github.com/apache/iceberg/pull/2925#discussion_r711814105
##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -242,6 +245,53 @@ private ManifestFile copyManifest(ManifestFile manifest) {
current.formatVersion(), toCopy, current.specsById(), newManifestPath,
snapshotId(), appendedManifestsSummary);
}
+ /**
+ * Validates that no files matching given partitions have been added to the table since a starting snapshot.
+ *
+ * @param base table metadata to validate
+ * @param startingSnapshotId id of the snapshot current at the start of the operation
+ * @param partitionSet a set of partitions to check against, or none if check is to be against all files
+ */
+ protected void validateAddedDataFiles(TableMetadata base, Long startingSnapshotId,
Review comment:
Rather than using an `Optional<PartitionSet>`, couldn't this use
`validateAddedDataFiles` with filter `Expressions.alwaysTrue()`? That would
make this a bit more straightforward and avoid the `isPresent` check inside a
loop (which probably isn't a good idea anyway).
##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -242,6 +245,53 @@ private ManifestFile copyManifest(ManifestFile manifest) {
current.formatVersion(), toCopy, current.specsById(), newManifestPath,
snapshotId(), appendedManifestsSummary);
}
+ /**
+ * Validates that no files matching given partitions have been added to the table since a starting snapshot.
+ *
+ * @param base table metadata to validate
+ * @param startingSnapshotId id of the snapshot current at the start of the operation
+ * @param partitionSet a set of partitions to check against, or none if check is to be against all files
+ */
+ protected void validateAddedDataFiles(TableMetadata base, Long startingSnapshotId,
Review comment:
Rather than using an `Optional<PartitionSet>`, couldn't this use
`validateAddedDataFiles` with filter `Expressions.alwaysTrue()` for
unpartitioned cases? That would make this a bit more straightforward and avoid
the `isPresent` check inside a loop (which probably isn't a good idea anyway).
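
To illustrate the shape of the suggestion, here is a rough, self-contained sketch. It does not use Iceberg's classes: `AddedFile`, `validateNoAddedFiles`, `inPartitions`, and the local `alwaysTrue()` are hypothetical stand-ins, with `java.util.function.Predicate` playing the role of a filter expression. The point is only that the caller picks the filter once, so the loop never has to test an `Optional`.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class ValidationSketch {
  // Hypothetical stand-in for a data file added since the starting snapshot.
  static final class AddedFile {
    final String path;
    final String partition;

    AddedFile(String path, String partition) {
      this.path = path;
      this.partition = partition;
    }
  }

  // Stand-in for a match-everything filter (assumed analogue of Expressions.alwaysTrue()).
  static Predicate<AddedFile> alwaysTrue() {
    return file -> true;
  }

  // Stand-in for a partition-based filter: matches files in any of the given partitions.
  static Predicate<AddedFile> inPartitions(Set<String> partitions) {
    return file -> partitions.contains(file.partition);
  }

  // Single validation path: no Optional and no isPresent check inside the loop.
  static void validateNoAddedFiles(List<AddedFile> addedSinceStart, Predicate<AddedFile> filter) {
    for (AddedFile file : addedSinceStart) {
      if (filter.test(file)) {
        throw new IllegalStateException("Found conflicting added file: " + file.path);
      }
    }
  }

  public static void main(String[] args) {
    List<AddedFile> added = List.of(new AddedFile("data/a.parquet", "day=1"));

    // Partitioned case: only files in the listed partitions count as conflicts.
    validateNoAddedFiles(added, inPartitions(Set.of("day=2")));

    // Unpartitioned case: every added file counts, so this call throws.
    try {
      validateNoAddedFiles(added, alwaysTrue());
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Whether the real `validateAddedDataFiles` overload can be reused this way depends on details of the PR that are not shown in the hunk above.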
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]