Hi everyone,

I’d like to start a discussion on how we handle credentials for encryption (for KMS, or for Vault, which has recently been under discussion) in the REST catalog.

As we know, unlike other catalogs, the REST catalog mints credentials at the table level for the client to use in subsequent operations, and we already have the dedicated /credentials endpoint in place to handle refreshing these.

While reviewing the recent encryption PR, a few architectural concerns came up that I believe we need to resolve before we mark the REST catalog as "ready" for supporting encryption:

- *Separation of KMS/Vault Creds from Storage Creds:* How should we handle external key managers like Vault? The current PR [1] advocates for including KMS credentials within the storage credentials object. If we end up supporting Vault, its credentials can't be mixed with storage credentials. *(Side note: catalogs have historically mixed KMS creds into this object for things like SSE, but that is entirely an object-store-level concept.)* We need a clear path forward for how REST will return per-table credentials specifically for Vault/KMS stores.

- *Catalog Awareness & Client-Side Assertions:* If the catalog returns credentials, they override the client-side credentials. This means a naive catalog that is unaware of encryption and just treats metadata as-is will have no clue that it needs to vend these specific KMS credentials. If it forgets to do that, there is no way for the client to know (for object store cases) except to fail at runtime. Should the catalog fail such requests *as part of its contract of supporting v3* (we don't need this in the spec)? Or should the catalog send some signal, along the lines of "I sent you creds for encryption too", so that the client can fail early if it doesn't find them?

- *Backward Compatibility Risks:* If we release the client now with the expectation that "storage credentials" will always contain KMS credentials, and we later introduce a dedicated field for encryption credentials in the loadTable response, we will be forced to maintain backward compatibility to support both ways of returning credentials.
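
To make the client-side assertion idea a bit more concrete, here is a rough Python sketch of what an early check could look like. It is only an illustration: the response is modeled as a plain dict, the `kms.` config-key prefix and the helper and exception names are hypothetical, and treating the `encryption.key-id` table property as the marker for an encrypted table is an assumption rather than something the spec or the PR [1] defines.

```python
# Illustrative sketch only; the names below are assumptions, not spec or PR [1] API.
ENCRYPTION_KEY_PROPERTY = "encryption.key-id"  # assumed marker for an encrypted table
KMS_CONFIG_PREFIX = "kms."  # hypothetical prefix for vended KMS config keys


class MissingEncryptionCredentials(Exception):
    """Raised when an encrypted table is loaded without any KMS credentials."""


def assert_encryption_credentials(load_table_response: dict) -> None:
    """Fail early if the table looks encrypted but no KMS config was vended."""
    properties = load_table_response.get("metadata", {}).get("properties", {})
    if ENCRYPTION_KEY_PROPERTY not in properties:
        return  # table is not encrypted, nothing to check

    credentials = load_table_response.get("storage-credentials", [])
    # Look for at least one config key that carries KMS material.
    has_kms = any(
        key.startswith(KMS_CONFIG_PREFIX)
        for cred in credentials
        for key in cred.get("config", {})
    )
    if not has_kms:
        raise MissingEncryptionCredentials(
            "table is encrypted but the catalog vended no KMS credentials"
        )
```

A client would call something like this right after loadTable, so that a catalog that silently omits KMS credentials produces a clear error up front instead of a failure deep inside a read or write. A dedicated response field or an explicit signal from the catalog would make this kind of key-prefix guessing unnecessary.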

To be clear, I do not want to block the progress on the current PR. I really appreciate all the hard work that has gone into it! However, I think it is crucial that we align on these design points for the REST catalog's encryption architecture before finalizing it. I would appreciate any thoughts or feedback on how we should structure this.

[1] https://github.com/apache/iceberg/pull/13225

Thanks,
Prashant Singh
