Hi all,
I’d like to propose a change to Iceberg’s REST API to make mutation
requests safely retryable.
*The Problem*
If a POST mutation (e.g., updateTable) succeeds in the catalog but the
client doesn’t receive the response (timeout, connection closed, etc.), a
retry can hit 409 Conflict because the first attempt already applied the
change. The client interprets the 409 as a failed commit and deletes the
associated metadata files, even though the catalog now references them,
causing catalog/storage inconsistency.
*The Proposed Solution*
Introduce an optional Idempotency-Key HTTP header on REST mutation
endpoints and have the Iceberg client pass it through.
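To make the client side concrete, here is a minimal sketch of the
pass-through, not the actual Iceberg client code. The function name
commit_with_retry and the send callable are hypothetical stand-ins for the
HTTP layer, and using a random UUID as the key is an assumption. The point
it illustrates is that the key is generated once per logical request and
reused on every attempt.

```python
import uuid


def commit_with_retry(send, payload, max_attempts=3):
    """Send one logical mutation, reusing a single Idempotency-Key.

    `send` is a hypothetical callable (headers, payload) -> response that
    stands in for the real HTTP client.
    """
    # Generated once per logical request, NOT once per attempt.
    key = str(uuid.uuid4())  # assumption: a random UUID is an acceptable key
    headers = {"Idempotency-Key": key}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(headers, payload)
        except (TimeoutError, ConnectionError) as exc:
            # The response was lost, so the outcome is unknown.
            # Retry with the same key so the server can replay the result.
            last_error = exc
    raise last_error
```

If the first attempt actually succeeded, a server that honors the key can
return the original result on the retry instead of a 409.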
*Semantics* (first processed request wins):
- Same key + same canonical payload -> return the original result (no
  re-execution).
- Same key + different payload -> 422 (Unprocessable Content).
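The two rules above could be sketched on the server side roughly as
follows. This is only an illustration under stated assumptions: the class
IdempotencyStore, the in-memory dict, and the choice of sorted-key JSON
hashed with SHA-256 as the "canonical payload" are all hypothetical, and
the sketch ignores TTL/expiry and concurrent requests, which a real catalog
would have to handle.

```python
import hashlib
import json


class IdempotencyStore:
    """In-memory illustration of 'first processed request wins'."""

    def __init__(self):
        self._records = {}  # key -> (payload fingerprint, stored response)

    @staticmethod
    def _fingerprint(payload):
        # Assumption: canonical form = JSON with sorted keys, compact separators.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key, payload, execute):
        fingerprint = self._fingerprint(payload)
        record = self._records.get(key)
        if record is None:
            # First time this key is seen: execute and remember the result.
            response = execute(payload)
            self._records[key] = (fingerprint, response)
            return response
        stored_fingerprint, stored_response = record
        if stored_fingerprint == fingerprint:
            # Same key + same payload: replay, do not re-execute.
            return stored_response
        # Same key + different payload: reject.
        return {"status": 422, "error": "Idempotency-Key reused with a different payload"}
```

Comparing a fingerprint rather than the raw body means two requests that
differ only in JSON key order are treated as the same payload.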
*Capability discovery:* catalogs can advertise support and retention so
clients know when a retry is safe, e.g.
{
  "idempotency-tokens-respected": true,
  "idempotency-token-lifetime": "30m"
}
*Scope in Iceberg:* update the OpenAPI spec to include the header, and add
client pass-through plus support for capability discovery. No server
implementation is mandated; catalogs (e.g., Polaris) can implement
storage/TTL/replay as they choose.
*Standards alignment:* uses the industry-standard header name and matches
the IETF HTTPAPI Idempotency-Key draft
<https://datatracker.ietf.org/doc/html/draft-ietf-httpapi-idempotency-key-header>
semantics.
*Compatibility:* fully backward compatible. Servers that don’t support it
can ignore the header; clients can detect support via capability discovery.
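As a rough sketch of how a client might use the two capability fields from
the example earlier in this mail, the snippet below decides whether a retry
is still safe. The helper names parse_lifetime and retry_is_safe are
hypothetical, and the lifetime format (an integer followed by s, m, or h)
is an assumption extrapolated from the "30m" example; the real spec may
choose a different encoding.

```python
import re
from datetime import timedelta

# Assumption: lifetime is "<integer><unit>" with unit s, m, or h.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_lifetime(text):
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unrecognized lifetime: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def retry_is_safe(config, elapsed):
    """Return True only if the catalog advertises support and the key
    has not outlived the advertised lifetime.

    `config` is the already-parsed capability dict; `elapsed` is a
    timedelta since the first attempt was sent.
    """
    if config.get("idempotency-tokens-respected") is not True:
        return False  # server may ignore the header, so a replay is not guaranteed
    lifetime = config.get("idempotency-token-lifetime")
    if lifetime is None:
        return False  # no advertised retention, so be conservative
    return elapsed < parse_lifetime(lifetime)
```

A client talking to a catalog that advertises nothing falls through to
False and keeps today's behavior.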
Here is the proposal
<https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0>.
Looking forward to your thoughts.
Thanks,
Huaxin