[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #4464: Core: Fix Conflicting transactions are ignored during commit

GitBox Sun, 03 Apr 2022 20:10:49 -0700


ajantha-bhat commented on code in PR #4464:
URL: https://github.com/apache/iceberg/pull/4464#discussion_r841337405



##########
core/src/main/java/org/apache/iceberg/BaseTransaction.java:
##########
@@ -437,6 +430,25 @@ private void commitSimpleTransaction() {
     }
   }
 
+  private void applyUpdates(TableOperations underlyingOps) {
+    if (base != underlyingOps.refresh()) {
+      TableMetadata oldBase = base;
+      // use refreshed the metadata
+      this.base = underlyingOps.current();
+      this.current = underlyingOps.current();
+      for (PendingUpdate update : updates) {
+        // re-commit each update in the chain to apply it and update current
+        try {
+          update.commit();
+        } catch (CommitFailedException e) {
+          // fallback to old base, so that it refreshes again on retry and 
apply pending updates.
+          base = oldBase;

Review Comment:
   >  The transaction fails to commit because of a concurrent update
   The base is updated to the refreshed metadata to re-apply commits, but one 
commit fails
   The retry runs again but detects that base is the same as the refreshed 
metadata, so the updates are not reapplied
   Because the updates aren't reapplied, commit is called with the refreshed 
base and the last inner commit to succeed
   
   > Is that right?
   
   Yes, exactly that is what happening.
   
   > In that case, I think that the right thing to do is to avoid catching the 
update's exception and throw it instead of retrying. If an update cannot be 
re-committed we should fail immediately rather than resetting to the original 
metadata to exhaust retries.
   
   Thanks for the suggestion. But I am wondering what is the point of retry 
configuration?
   In case of single update, your suggestion holds good. But I think it will 
impact multiple updates.
   
   Say I have 3 updates to apply, first update is success and the second update 
is failed because the first update is success. But doing a retry with rebase 
may help in second commit to get success right?  
   
   There are some operations like concurrent append, which can get success on 
retry and some operations like concurrent schema update which cannot even 
success after retry. So, we may need to retry when more than one update is 
present and it fails right?
   
   WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #4464: Core: Fix Conflicting transactions are ignored during commit

Reply via email to