Hi everyone,

I’d like to start a discussion about how we handle credentials for
encryption key managers (such as KMS, or Vault, which has come up
recently) in the REST catalog.

As we know, unlike other catalogs, the REST catalog mints credentials at
the table level for the client to use in subsequent operations, and we
already have the dedicated /credentials endpoint in place to handle
refreshing these.
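
For context, here is a minimal Python sketch of how a client might pick the
vended storage credential for a file location today. It assumes the
storage-credentials list in the loadTable and /credentials responses, where
each entry has a prefix and a config map, and that the client prefers the
longest matching prefix, as I understand the spec. The function name and the
sample values are placeholders of mine, not taken from any client.

```python
from typing import Optional


def select_storage_credential(storage_credentials: list, location: str) -> Optional[dict]:
    """Pick the credential whose prefix is the longest match for the location."""
    matches = [c for c in storage_credentials if location.startswith(c["prefix"])]
    if not matches:
        return None  # the client would fall back to its own configuration
    return max(matches, key=lambda c: len(c["prefix"]))


# Abbreviated, made-up response fragment; the values are placeholders.
storage_credentials = [
    {"prefix": "s3://bucket/", "config": {"s3.access-key-id": "placeholder-wide"}},
    {"prefix": "s3://bucket/db/tbl/", "config": {"s3.access-key-id": "placeholder-table"}},
]

chosen = select_storage_credential(storage_credentials, "s3://bucket/db/tbl/data/f.parquet")
```

Nothing in this selection logic tells the client whether a credential also
covers a key manager, which is where the concerns below come from.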

While reviewing the recent encryption PR, a few architectural concerns came
up that I believe we need to resolve before we mark the REST catalog as
"ready" for supporting encryption:

   - *Separation of KMS/Vault Creds from Storage Creds:* How should we handle
     external key managers like Vault? The current PR [1] advocates including
     KMS credentials within the storage credentials object. If we end up
     supporting Vault, its credentials can't be mixed into the storage
     credentials. *(Side note: catalogs have historically mixed KMS creds
     into this object for things like SSE, but that is entirely an
     object-store-level concept.)* We need a clear path forward for how REST
     will return per-table credentials specifically for Vault/KMS stores.
   - *Catalog Awareness & Client-Side Assertions:* If the catalog returns
     credentials, they override the client-side credentials. This means a
     naive catalog that is unaware of encryption and just treats metadata
     as-is will have no clue that it needs to vend these specific KMS
     credentials. If it forgets to do so, the client has no way of knowing
     (for object store cases) other than failing at runtime. Should the
     catalog fail such requests *as part of its contract for supporting v3*
     (we don't need this in the spec)? Or should the catalog send some
     signal, along the lines of "I am sending you creds for encryption too",
     so that the client can fail early if it doesn't find them?
   - *Backward Compatibility Risks:* If we release the client now with the
     expectation that "storage credentials" will always contain KMS
     credentials, and later introduce a dedicated field for encryption
     credentials in the loadTable response, we will be forced to maintain
     backward compatibility and support both ways of returning credentials.
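
To make the last two points concrete, here is a rough Python sketch of what
a client might end up doing if we ship the mixed approach now and add a
dedicated field later. Everything named here is hypothetical and only for
illustration: the "encryption-credentials" field, the "kms." key prefix, the
exception, and the use of the "encryption.key-id" table property as the
signal that a table is encrypted. None of this is in the REST spec or the PR.

```python
from typing import Optional

# Hypothetical names, for illustration only; not part of the REST spec.
ENCRYPTION_CREDS_FIELD = "encryption-credentials"  # a possible dedicated field
LEGACY_KMS_KEY_PREFIX = "kms."                     # assumed prefix for KMS keys mixed into storage creds


class MissingEncryptionCredentials(Exception):
    """Raised at load time so the client does not fail later, mid read or write."""


def table_is_encrypted(load_table_response: dict) -> bool:
    # Assumption: an encrypted table carries an "encryption.key-id" property.
    properties = load_table_response.get("metadata", {}).get("properties", {})
    return "encryption.key-id" in properties


def resolve_encryption_credentials(load_table_response: dict) -> Optional[dict]:
    """Return key-manager credentials, or None if the table is not encrypted."""
    if not table_is_encrypted(load_table_response):
        return None

    # Path 1: a dedicated field, if the catalog sends one.
    dedicated = load_table_response.get(ENCRYPTION_CREDS_FIELD)
    if dedicated:
        return dict(dedicated)

    # Path 2: the legacy layout, with KMS entries mixed into storage credentials.
    for credential in load_table_response.get("storage-credentials", []):
        kms_config = {
            key: value
            for key, value in credential.get("config", {}).items()
            if key.startswith(LEGACY_KMS_KEY_PREFIX)
        }
        if kms_config:
            return kms_config

    # The catalog vended credentials but none for the key manager: fail early.
    raise MissingEncryptionCredentials(
        "table is encrypted but the catalog returned no encryption credentials"
    )
```

The two lookup paths are exactly the compatibility burden described in the
third point, and the final raise is the "fail early" behaviour asked about in
the second one.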

To be clear, I do not want to block the progress on the current PR. I
really appreciate all the hard work that has gone into it! However, I think
it is crucial that we align on these design points for the REST catalog's
encryption architecture before finalizing it.

I would appreciate any thoughts or feedback on how we should structure this.

[1] https://github.com/apache/iceberg/pull/13225

Thanks,

Prashant Singh
