Thanks everyone, this helps clarify the discussion.

I think we should separate two related but different topics:

   1. KMS/Vault credential vending to clients via Iceberg REST.
   2. KMS configuration used by Polaris itself for server-side operations.

I agree that #1 should be discussed on the Iceberg side and should not be
invented as Polaris-specific behavior. I’m happy to participate in that
discussion: I already have a working dev setup with REST client-side
encryption enabled, plus a POC for catalog-level KMS configuration. I can
help brainstorm and test concrete options, but I do see this as a parallel
workstream.

For Polaris, though, I think #2 will be needed regardless of the final REST
credential-vending implementation. Iceberg table encryption is coming, and
Polaris server-side operations that read encrypted Iceberg artifacts will
need KMS support. The immediate example is drop table with purge / table
cleanup, where Polaris reads manifest lists and manifests to enumerate files
for deletion. Those paths will need an EncryptingFileIO initialized with
catalog-level KMS configuration.

I also agree with the RFC that metadata integrity protection should be part
of the first Polaris effort, since metadata.json is not encrypted and
Polaris should detect out-of-band modification before trusting it for
encrypted tables.
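
To make the integrity idea concrete, here is a minimal sketch of one possible
check. It assumes the catalog stores a keyed digest of metadata.json at commit
time and compares it before trusting the file. The helper names
(record_digest, verify_metadata) and the HMAC-SHA256 scheme are hypothetical
and only for illustration; the RFC may well settle on a different mechanism.

```python
import hashlib
import hmac

# Hypothetical helpers for illustration; not a Polaris or Iceberg API.


def record_digest(metadata_bytes: bytes, catalog_secret: bytes) -> str:
    """Compute the keyed digest the catalog would persist at commit time."""
    return hmac.new(catalog_secret, metadata_bytes, hashlib.sha256).hexdigest()


def verify_metadata(metadata_bytes: bytes, catalog_secret: bytes,
                    stored_digest: str) -> bool:
    """Return True only if the bytes read from storage match the stored digest."""
    actual = record_digest(metadata_bytes, catalog_secret)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, stored_digest)
```

With something like this, a metadata.json that was modified out-of-band would
fail verification, and Polaris could refuse to use it for an encrypted table.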

So my suggested first phase would be limited to:

   - catalog-level KMS configuration (separate from storage configuration)
   - AWS KMS wiring for Polaris server-side operations
   - metadata integrity checks for encrypted tables
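
As a rough sketch of the first item, the model below keeps the
table-encryption KMS configuration as its own optional field next to the
storage configuration, so the two providers do not have to match. All class
and field names here are hypothetical and only meant to illustrate the shape;
they are not existing Polaris types.

```python
from dataclasses import dataclass
from typing import Optional

# All names below are hypothetical, for illustration only.


@dataclass(frozen=True)
class StorageConfig:
    storage_type: str   # e.g. "S3" or "GCS"
    base_location: str


@dataclass(frozen=True)
class TableEncryptionKmsConfig:
    kms_type: str       # e.g. "AWS_KMS"
    region: str


@dataclass(frozen=True)
class CatalogConfig:
    name: str
    storage: StorageConfig
    # Independent of storage: None means no table-encryption KMS is configured.
    table_encryption_kms: Optional[TableEncryptionKmsConfig] = None

    def supports_encrypted_tables(self) -> bool:
        return self.table_encryption_kms is not None


# Example: GCS storage combined with AWS KMS for table encryption.
example = CatalogConfig(
    name="analytics",
    storage=StorageConfig("GCS", "gs://bucket/warehouse"),
    table_encryption_kms=TableEncryptionKmsConfig("AWS_KMS", "us-east-1"),
)
```

Keeping the KMS field optional means existing catalogs without table
encryption would not need any change.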

The current RFC seems structured around a broader end-to-end
table-encryption story (including client credential vending, key rotation,
governance/lifecycle topics, and general Iceberg encryption background).
Those are important, but I think it would be easier to make progress if we
first split out and design the narrower Polaris server-side building block
above, and discuss the broader pieces separately.

Does that separation sound reasonable?

Cheers,
Adam

On Thu, 28 May 2026 at 03:26, Yufei Gu <[email protected]> wrote:

> Thanks Adam for raising this. I think it's a great feature to have.
>
> Agreed on what Prashant said. We need some work on the IRC side to avoid
> any premature implementation in Polaris.
>
> Yufei
>
>
> On Wed, May 27, 2026 at 9:14 AM Prashant Singh via dev <
> [email protected]> wrote:
>
> > Hey Adam,
> >
> > Thanks for starting a thread on this in the Polaris community.
> > I believe we need a dedicated field in the loadTable response in IRC to
> > vend KMS credentials. Currently, KMS credentials are mixed with storage
> > credentials to achieve SSE, but there is no consistent way to enforce
> this
> > because the spec is silent about it.
> > With CSE (Iceberg v3 encryption), things get more involved because one
> can
> > use Vault with S3 as the combination of their KMS and ObjectStore.
> > Consequently, a catalog cannot provide access to both as part of the
> > loadTable response. My take here is that if the catalog grants access to a
> > caller because it has SELECT privilege, it should provide access to both
> > KMS and Storage.
> >
> > I have an open thread in the *Iceberg community* [1]. Let's conclude there
> > what the IRC response should look like after consulting with the broader
> > Iceberg catalog community (I added REST catalog encryption support to the
> > last catalog community sync agenda, but we ran out of time [2]), and then
> > we can circle back in the Polaris community to see what it would look like
> > to support it here.
> >
> > Best,
> > Prashant
> >
> > [1] https://lists.apache.org/thread/z48t5wgx778j17pzto9kqxwysw4ysxxo
> > [2] https://docs.google.com/document/d/1iPGVCIcr-M0XtAiudOguWAvmqIdVgpYN5vz5ohO8PKw/edit?tab=t.0#heading=h.cr6o1g2rn5hc
> >
> > On Wed, May 27, 2026 at 8:38 AM Alexandre Dutra <[email protected]>
> wrote:
> >
> > > Hi Adam, hi all,
> > >
> > > I did some archaeology on this topic and (unless I'm reading this
> > > wrong) it seems there is some previous work on this topic by Anand
> > > Sankaran. He sent his proposal to the Polaris dev mailing list in
> > > February [1] and wrote a design doc: [2]. Yufei also opened an issue a
> > > while ago: [3].
> > >
> > > I think that the best next step would be to revive Anand's design doc
> > > and see if it aligns with what you have in mind.
> > >
> > > I agree that this feature should be prioritized as it is extremely
> > > useful for users running on untrusted storage providers. However, if I
> > > understand the situation correctly, it seems that on the Iceberg side
> > > the feature is already in the REST spec, but client-side support is
> > > still pending [4] – it's been under review for a year. Is that
> > > assessment correct? (If so, this would be a good candidate for a
> > > feature branch on our side, while we wait for the 1.12 release to
> > > land.)
> > >
> > > Thanks,
> > > Alex
> > >
> > > [1]: https://lists.apache.org/thread/mpg46o0w2bzy75hyhx2j74dgwzjh2ob7
> > > [2]: https://docs.google.com/document/d/1f4Mgg5W1t4NT6R7KLq5K3S4pHlAwYwXTFwUR9uNNpSU/edit?tab=t.0#heading=h.7ucqpo88io4u
> > > [3]: https://github.com/apache/polaris/issues/2829
> > > [4]: https://github.com/apache/iceberg/pull/13225
> > >
> > > On Wed, May 27, 2026 at 10:55 AM Adam Szita <[email protected]> wrote:
> > > >
> > > > Thanks for your replies Dmitri and JB,
> > > >
> > > > IIUC, the KMS integration you’re referring to is closely tied to AWS
> S3
> > > > storage. It is storage-layer encryption at rest: Polaris can record
> AWS
> > > KMS
> > > > key ARNs in the S3 storage configuration, and during storage
> credential
> > > > vending it grants the vended AWS credentials the required KMS
> > permissions
> > > > such as decrypt/encrypt/data-key operations. That lets clients
> > read/write
> > > > SSE-KMS encrypted S3 objects, but it is still a low-level storage
> > concern
> > > > and does not know whether the object is an Iceberg data file,
> manifest,
> > > or
> > > > anything else.
> > > >
> > > > Iceberg table encryption is different. It is one abstraction level
> > higher
> > > > and is table-format aware:
> > > >
> > > >    - under the hood an EncryptingFileIO is used to access encrypted
> > > >    artifacts
> > > >    - it uses envelope encryption to encrypt data files, manifest
> files
> > > and
> > > >    snapshot files, defining a master table key to be managed in a KMS
> > > (for
> > > >    some more context: https://www.youtube.com/watch?v=G7Y2eNS_d-s)
> > > >    - table metadata carries encryption metadata and key references; a
> > > >    KMS-backed `KeyManagementClient` wraps/unwraps the keys.
> > > >    - it provides better portability of encrypted tables, it's vendor
> > > >    independent - in theory you could have a combination of S3 storage
> > > with GCP
> > > >    KMS, or even a custom KMS client implementation should enterprise
> > > users
> > > >    favor that
> > > >    - supporting catalogs would have to bear additional
> responsibilities
> > > >    such as protecting metadata integrity and preventing master
> > > encryption key
> > > >    changes (which is an Iceberg table property)
> > > >
> > > > The catalog-level KMS config I’m proposing is for Iceberg table
> > > > encryption, not for S3 SSE-KMS. It also shouldn't be modeled as storage
> > > > configuration because the storage backend and table-encryption KMS
> > > > provider do not have to match. Perhaps we could use a more concrete name
> > > > such as icebergTableEncryptionKmsConfigInfo to avoid confusion.
> > > > In any case I'm happy to draft a design doc and share it here.
> > > >
> > > > Cheers,
> > > > Adam
> > > >
> > > >
> > > >
> > > > On Wed, 27 May 2026 at 08:07, Jean-Baptiste Onofré <[email protected]>
> > > wrote:
> > > >
> > > > > Hi Adam,
> > > > >
> > > > > Thanks for the proposal.
> > > > >
> > > > > I share Dmitri's question; my understanding is that this pertains
> to
> > > > > client-side encryption. I can confirm that KMS should work, as I
> > > recall an
> > > > > issue regarding this being fixed in the past.
> > > > >
> > > > > Adam, could you please clarify the scope of this work?
> > > > >
> > > > > Regards,
> > > > > JB
> > > > >
> > > > >
> > > > > On Tue, May 26, 2026 at 8:01 PM Dmitri Bourlatchkov <
> > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi Adam,
> > > > > >
> > > > > > Thanks for this proposal!
> > > > > >
> > > > > > Polaris should already support storage-side KMS in AWS (and
> > > compatible
> > > > > > systems) via [2802] (cf. [1]).
> > > > > >
> > > > > > I guess the new features you mention relate to client-side
> > > encryption,
> > > > > > right?
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > >
> >
> https://polaris.apache.org/blog/2025/12/24/securing-s3-data-with-aws-kms/
> > > > > >
> > > > > > [2802] https://github.com/apache/polaris/pull/2802
> > > > > >
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Tue, May 26, 2026 at 11:06 AM Adam Szita <[email protected]>
> > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Iceberg 1.11 shipped the base implementation for table
> > encryption,
> > > > > > > including KMS-based key wrapping/unwrapping and encrypted
> > > data/delete,
> > > > > > > > manifest, and manifest-list files. REST catalog support is also
> > > > > > > > being worked on in Iceberg (see
> > > > > > > > https://github.com/apache/iceberg/pull/13225).
> > > > > > >
> > > > > > > I have been testing Polaris with Iceberg REST client-side
> > > encryption
> > > > > > > enabled. Basic catalog operations such as loadTable,
> commit/drop
> > > > > without
> > > > > > > purge, list, etc. work without Polaris changes because Polaris
> > only
> > > > > needs
> > > > > > > the table metadata JSON for those paths, and metadata.json is
> not
> > > > > > > encrypted.
> > > > > > >
> > > > > > > The places where Polaris does need encryption awareness are the
> > > > > > server-side
> > > > > > > paths that read encrypted Iceberg artifacts. The first concrete
> > > example
> > > > > > is
> > > > > > > drop table with purge: TableCleanupTask reads snapshot manifest
> > > lists
> > > > > and
> > > > > > > manifests to enumerate files for deletion, so it needs to use
> an
> > > > > > > EncryptingFileIO. The same would apply to any Polaris-side
> table
> > > > > > > maintenance/optimization, orphan/snapshot cleanup logic, or any
> > > future
> > > > > > > remote scan/planning capability that reads manifests or
> > data/delete
> > > > > > files.
> > > > > > >
> > > > > > > There is also a related but separate topic around vending KMS
> > > > > credentials
> > > > > > > to clients. That likely needs Iceberg REST spec work first,
> > > similar in
> > > > > > > spirit to current storage credential vending, so I think it
> > should
> > > be
> > > > > > > designed for but not required as the first Polaris step.
> > > > > > >
> > > > > > > The first Polaris-side building block I would propose is to
> allow
> > > > > Iceberg
> > > > > > > catalogs to carry KMS configuration, similarly to how catalogs
> > > > > currently
> > > > > > > carry StorageConfigurationInfo. This should be separate from
> > > storage
> > > > > > > configuration because the storage backend and KMS provider may
> > > differ,
> > > > > > for
> > > > > > > example GCS storage with AWS KMS. AWS KMS would be a reasonable
> > > first
> > > > > > > implementation target, using Iceberg’s existing
> > > KeyManagementClient/AWS
> > > > > > KMS
> > > > > > > support, while leaving the model extensible for Azure and GCP.
> > > > > > >
> > > > > > > I have already been experimenting with this locally and would
> be
> > > happy
> > > > > to
> > > > > > > work on the Polaris changes. A possible first PR could be
> limited
> > > to:
> > > > > > >
> > > > > > > 1. Add catalog-level KMS configuration model/API support.
> > > > > > > 2. Add AWS KMS server-side configuration wiring.
> > > > > > >
> > > > > > > Any feedback is welcome.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Adam
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>
