aokolnychyi commented on a change in pull request #3069:
URL: https://github.com/apache/iceberg/pull/3069#discussion_r710380785
##########
File path: core/src/main/java/org/apache/iceberg/BaseRowDelta.java
##########
@@ -94,9 +102,12 @@ protected void validate(TableMetadata base) {
       validateDataFilesExist(base, startingSnapshotId, referencedDataFiles, !validateDeletes);
     }
-    // TODO: does this need to check new delete files?
-    if (conflictDetectionFilter != null) {
-      validateAddedDataFiles(base, startingSnapshotId, conflictDetectionFilter, caseSensitive);
+    if (appendConflictDetectionFilter != null) {
+      validateAddedDataFiles(base, startingSnapshotId, appendConflictDetectionFilter, caseSensitive);
+    }
+
+    if (deleteConflictDetectionFilter != null) {
+      validateNoNewDeletes(base, startingSnapshotId, deleteConflictDetectionFilter, caseSensitive);
Review comment:
> you mean it will be over-aggressive and report false negatives even if
> rows do not actually conflict, until we make the optimization.
Yeah, it may report false positives. The data filter is helpful, but I think
it won't help much within the same partition. Position deletes are scoped to a
partition, so the data filter should help us when there is a concurrent delete
in another partition. Within the partition, though, most position deletes
will match that row filter because we don't persist the deleted row (by
default), so there are no row values to evaluate the filter against.
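
To make the reasoning above concrete, here is a simplified sketch. It is not the actual Iceberg code: `DeleteConflictSketch`, `PositionDeleteFile`, and `mayConflict` are hypothetical names, and the partition is modeled as a plain string. It only illustrates why a filter can rule out delete files in other partitions but usually not those in the same partition.

```java
import java.util.function.IntPredicate;

public class DeleteConflictSketch {

  /** Hypothetical stand-in for a position delete file added by a concurrent commit. */
  static final class PositionDeleteFile {
    final String partition;
    // Value of the filtered column in the deleted row;
    // null when the deleted row is not persisted (the default case).
    final Integer deletedRowId;

    PositionDeleteFile(String partition, Integer deletedRowId) {
      this.partition = partition;
      this.deletedRowId = deletedRowId;
    }
  }

  /** Returns true when the delete file cannot be ruled out as a conflict. */
  static boolean mayConflict(PositionDeleteFile file, String filterPartition, IntPredicate rowFilter) {
    // Position deletes are scoped to a partition, so a different partition is safe.
    if (!file.partition.equals(filterPartition)) {
      return false;
    }
    // Without the persisted row there is nothing to test the row filter against,
    // so the file must be treated as a possible conflict (a potential false positive).
    if (file.deletedRowId == null) {
      return true;
    }
    // Only when the deleted row is persisted can the row filter narrow things down.
    return rowFilter.test(file.deletedRowId);
  }

  public static void main(String[] args) {
    IntPredicate idEquals5 = id -> id == 5;
    System.out.println(mayConflict(new PositionDeleteFile("p=2", null), "p=1", idEquals5)); // false
    System.out.println(mayConflict(new PositionDeleteFile("p=1", null), "p=1", idEquals5)); // true
    System.out.println(mayConflict(new PositionDeleteFile("p=1", 7), "p=1", idEquals5));    // false
  }
}
```

The second call is the case I am worried about: the concurrent delete may touch a completely different row, but we still have to report a conflict.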
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]