ajantha-bhat commented on code in PR #4464:
URL: https://github.com/apache/iceberg/pull/4464#discussion_r841337405
##########
core/src/main/java/org/apache/iceberg/BaseTransaction.java:
##########
@@ -437,6 +430,25 @@ private void commitSimpleTransaction() {
}
}
+ private void applyUpdates(TableOperations underlyingOps) {
+ if (base != underlyingOps.refresh()) {
+ TableMetadata oldBase = base;
+ // use refreshed the metadata
+ this.base = underlyingOps.current();
+ this.current = underlyingOps.current();
+ for (PendingUpdate update : updates) {
+ // re-commit each update in the chain to apply it and update current
+ try {
+ update.commit();
+ } catch (CommitFailedException e) {
+ // fallback to old base, so that it refreshes again on retry and
apply pending updates.
+ base = oldBase;
Review Comment:
> The transaction fails to commit because of a concurrent update
The base is updated to the refreshed metadata to re-apply commits, but one
commit fails
The retry runs again but detects that base is the same as the refreshed
metadata, so the updates are not reapplied
Because the updates aren't reapplied, commit is called with the refreshed
base and the last inner commit to succeed
> Is that right?
Yes, exactly that is what happening.
> In that case, I think that the right thing to do is to avoid catching the
update's exception and throw it instead of retrying. If an update cannot be
re-committed we should fail immediately rather than resetting to the original
metadata to exhaust retries.
Thanks for the suggestion. But I am wondering what is the point of retry
configuration?
In case of single update, your suggestion holds good. But I think it will
impact multiple updates.
Say I have 3 updates to apply, first update is success and the second update
is failed because the first update is success. But doing a retry with rebase
may help in second commit to get success right?
There are some operations like concurrent append, which can get success on
retry and some operations like concurrent schema update which cannot even
success after retry. So, we may need to retry when more than one update is
present and it fails right?
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]