openinx commented on a change in pull request #3069:
URL: https://github.com/apache/iceberg/pull/3069#discussion_r702826793



##########
File path: core/src/main/java/org/apache/iceberg/BaseOverwriteFiles.java
##########
@@ -129,6 +119,7 @@ protected void validate(TableMetadata base) {
 
     if (conflictDetectionFilter != null && base.currentSnapshot() != null) {
       validateAddedDataFiles(base, startingSnapshotId, 
conflictDetectionFilter, caseSensitive);

Review comment:
       Q:  Should the OverwriteFiles also conflict with the REPLACE operation 
when checking the added data files in `validateAddedDataFiles` ?  Take the 
following example: 
   
   1. The table has 2 data files:  FILE_A,  FILE_B; 
   2. Start the CopyOnWrite to delete all rows in FILE_A.  Let's say txn1,  
it's just started but not committed; 
   3. Someone start the REPLACE txn to rewrite the FILE_A + FILE_B to FILE_C.  
Let's say txn2 , started but not committed; 
   4.  Committed the txn2; 
   5.  Committed the txn1
   
   Finally,  we will see all the rows from FILE_A again because the REPLACE 
operation has added them to table again.
   
   I raise this question because I see the 
[validateAddedDataFiles](https://github.com/apache/iceberg/pull/3069/files#diff-f2d4265e007483f23a69703b88348743aecf99cc98395cd1730de3900a609eb6R261)
 only check the `APPEND` & `OVERWRITE` operations ( it's 
VALIDATE_ADDED_FILES_OPERATIONS ).  So `REPLACE` operations should be also 
considered ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to