alessandro-nori opened a new issue, #15050:
URL: https://github.com/apache/iceberg/issues/15050

   ### Apache Iceberg version
   
   1.10.1 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   Currently, HTTP 503 responses are not retried, yet they are still classified 
as cleanable failures for CreateTable transactions (a staged create followed by 
an updateTable request). This can corrupt the table when the catalog 
successfully persists the commit but an intermediate component returns a 503 to 
the client.
   
   
https://github.com/apache/iceberg/blob/7bac8650f65279c470d7d2c005c40a858933134a/core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java#L172
   
   In our setup, Spark communicates with the catalog through Envoy (acting as a 
reverse proxy). When Envoy returns a 503 due to a transient downstream issue, 
the client assumes the commit failed and proceeds with cleanup. However, the 
catalog may have already committed the transaction successfully. As a result, 
valid manifest files can be incorrectly cleaned up, leaving the table in a 
corrupted state.
   
   This behavior makes 503 responses unsafe to treat as cleanable failures, 
especially in deployments with proxies between the client and the catalog.
   
   Should the CREATE updateType also use `tableCommitErrorHandler` instead of 
`tableErrorHandler`, as REPLACE and SIMPLE already do?
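   
   To make the difference concrete, here is a minimal, self-contained sketch. 
It does not use Iceberg's classes: `CommitOutcomeSketch`, both `classify...` 
methods, `filesToDelete`, and the file names are hypothetical, and the set of 
status codes treated as "unknown" (500, 502, 503, 504) is an assumption made 
for illustration. The first classifier models the behavior described above for 
CREATE; the second models a commit-aware handler that refuses to clean up when 
the outcome cannot be known.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of commit-outcome classification; not Iceberg's API. */
final class CommitOutcomeSketch {

  /** What the client may conclude from the HTTP status of a commit request. */
  enum Outcome { COMMITTED, FAILED_CLEANABLE, UNKNOWN }

  /** Models the reported behavior: any non-2xx status is a definite failure. */
  static Outcome classifyAsDefiniteFailure(int status) {
    if (status >= 200 && status < 300) {
      return Outcome.COMMITTED;
    }
    return Outcome.FAILED_CLEANABLE;
  }

  /**
   * Models a commit-aware handler. Assumption: 500, 502, 503 and 504 may be
   * produced after the catalog has already persisted the commit (for example
   * by a proxy), so the outcome is unknown.
   */
  static Outcome classifyCommitAware(int status) {
    if (status >= 200 && status < 300) {
      return Outcome.COMMITTED;
    }
    switch (status) {
      case 500:
      case 502:
      case 503:
      case 504:
        return Outcome.UNKNOWN;
      default:
        return Outcome.FAILED_CLEANABLE;
    }
  }

  /** Files written for the commit are deleted only after a definite failure. */
  static List<String> filesToDelete(Outcome outcome, List<String> written) {
    if (outcome == Outcome.FAILED_CLEANABLE) {
      return new ArrayList<>(written);
    }
    return new ArrayList<>();
  }

  public static void main(String[] args) {
    // Hypothetical file names written during the staged create.
    List<String> written = List.of("metadata/m0.avro", "metadata/snap-1.avro");

    // A proxy answers 503 although the catalog may have committed.
    System.out.println(filesToDelete(classifyAsDefiniteFailure(503), written));
    System.out.println(filesToDelete(classifyCommitAware(503), written));
  }
}
```

   With the first classifier a 503 schedules every written file for deletion, 
even though the catalog may already reference those files. With the second, 
nothing is deleted, and the caller has to resolve the unknown state some other 
way, for example by reloading the table.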
   
   Previous related work:
   https://github.com/apache/iceberg/pull/13619 and 
[thread](https://lists.apache.org/thread/oqonscy1b4qlmovnjtbcohz38kgprgmq)
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [x] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

