Hi folks,

On testing in a realistic setting: I think the best way is to apply the
yet-to-be-merged PR https://github.com/apache/iceberg/pull/13225 on the
client side. With that patch applied, an Iceberg client can be configured
to create and write an encrypted Iceberg table using the existing
AWS/Azure/GCP KMS support. Simple catalog operations then work with
Polaris, with the caveat Yufei also mentioned.

As for starting the work on the Polaris side incrementally: I have put
together a proposal doc, available at:
https://docs.google.com/document/d/1Ui5bYVci7BtEkTITxjgW7Pp9SjrTyq8pW2Z7lVt4TQA
It largely describes my proposed approach for letting catalogs be
configured with KMS connection info.
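
To make the separation discussed in this thread concrete, here is a rough
sketch of a catalog that carries KMS configuration independently of its
storage configuration. It is an illustration only and is not taken from the
proposal doc: every class, field, and function name below is hypothetical,
and none of them are Polaris or Iceberg APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# NOTE: all names here are hypothetical, for illustration only.


@dataclass(frozen=True)
class StorageConfig:
    storage_type: str   # e.g. "S3", "GCS", "AZURE"
    base_location: str


@dataclass(frozen=True)
class KmsConfig:
    kms_type: str       # e.g. "AWS", "AZURE", "GCP"
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Catalog:
    name: str
    storage: StorageConfig
    # Kept separate from storage: the KMS provider need not match the
    # storage backend. None means no table-encryption KMS is configured.
    kms: Optional[KmsConfig] = None


def can_read_encrypted_artifacts(catalog: Catalog) -> bool:
    """Server-side paths that read encrypted manifests need a KMS config."""
    return catalog.kms is not None


# GCS storage combined with AWS KMS, the mixed case mentioned in the thread.
mixed = Catalog(
    name="demo",
    storage=StorageConfig("GCS", "gs://example-bucket/warehouse"),
    kms=KmsConfig("AWS", {"region": "us-east-1"}),
)
plain = Catalog(name="plain", storage=StorageConfig("S3", "s3://example-bucket/wh"))
```

In this sketch a catalog without a KMS config can still serve the
metadata-only operations, while server-side paths such as purge would check
for a KMS config before trying to read encrypted manifests.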

Let me know what you think.

Cheers,
Adam

On Tue, 9 Jun 2026 at 18:32, Yufei Gu <[email protected]> wrote:

> > if it is practically possible right now to produce an Iceberg table with
> encrypted data files so that Polaris could be tested in a realistic
> setting?
>
> Yes with the caveat that certain operations are not possible as we
> discussed, like drop-by-purge and future scan planning.
>
> Yufei
>
>
> On Tue, Jun 9, 2026 at 8:53 AM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Adam,
> >
> > Working incrementally on this makes sense. I agree that handling internal
> > Polaris workflows that deal with encrypted files sounds like a good
> > starting point.
> >
> > I wonder, though, if it is practically possible right now to produce an
> > Iceberg table with encrypted data files so that Polaris could be tested
> in
> > a realistic setting? Do you mean something like storing encrypted files
> > directly from a client and later registering the table with Polaris? This
> > is not a blocker for starting KMS work of course. I'm just trying to
> > understand how much of that feature can be practically usable ATM.
> >
> > Cheers,
> > Dmitri.
> >
> > On Tue, Jun 2, 2026 at 10:55 AM Adam Szita <[email protected]> wrote:
> >
> > > Thanks everyone, this helps clarify the discussion.
> > >
> > > I think we should separate two related but different topics:
> > >
> > >    1. KMS/Vault credential vending to clients via Iceberg REST.
> > >    2. KMS configuration used by Polaris itself for server-side
> > operations.
> > >
> > > I agree that #1 should be discussed on the Iceberg side and should not
> be
> > > invented as Polaris-specific behavior. I’m also happy to participate in
> > it
> > > as I already have a working dev setup with REST client-side encryption
> > > enabled, plus a POC for catalog-level KMS configuration. I can help
> > > brainstorm/test concrete options but I do see this as a parallel
> > > workstream.
> > >
> > > For Polaris, though, I think #2 will be needed regardless of the final
> > REST
> > > credential-vending implementation.
> > > Iceberg table encryption is coming, and Polaris server-side operations
> > that
> > > read encrypted Iceberg artifacts will need KMS support. The immediate
> > > example is drop table with purge / table cleanup, where Polaris reads
> > > manifest lists and manifests to enumerate files for deletion. Those
> paths
> > > will need an EncryptingFileIO initialized with catalog-level KMS
> > > configuration.
> > >
> > > I also agree with the RFC that metadata integrity protection should be
> > part
> > > of the first Polaris effort, since metadata.json is not encrypted and
> > > Polaris should detect out-of-band modification before trusting it for
> > > encrypted tables.
> > >
> > > So my suggested first phase would be limited to:
> > >
> > >    - catalog-level KMS configuration (separate from storage
> > configuration)
> > >    - AWS KMS wiring for Polaris server-side operations
> > >    - metadata integrity checks for encrypted tables
> > >
> > > The current RFC seems structured around a broader end-to-end
> > > table-encryption story (including client credential vending, key
> > rotation,
> > > governance/lifecycle topics, and general Iceberg encryption
> background).
> > > Those are important, but I think it would be easier to make progress if
> > we
> > > first split out and design the narrower Polaris server-side building
> > block
> > > above, and discuss the broader pieces separately.
> > >
> > > Does that separation sound reasonable?
> > >
> > > Cheers,
> > > Adam
> > >
> > > On Thu, 28 May 2026 at 03:26, Yufei Gu <[email protected]> wrote:
> > >
> > > > Thanks Adam for raising this. I think it's a great feature to have.
> > > >
> > > > Agreed on what Prashant said. We need some work on the IRC side to
> > avoid
> > > > any premature implementation in Polaris.
> > > >
> > > > Yufei
> > > >
> > > >
> > > > On Wed, May 27, 2026 at 9:14 AM Prashant Singh via dev <
> > > > [email protected]> wrote:
> > > >
> > > > > Hey Adam,
> > > > >
> > > > > Thanks for starting a thread on this in the Polaris community.
> > > > > I believe we need a dedicated field in the loadTable response in
> IRC
> > to
> > > > > vend KMS credentials. Currently, KMS credentials are mixed with
> > storage
> > > > > credentials to achieve SSE, but there is no consistent way to
> enforce
> > > > this
> > > > > because the spec is silent about it.
> > > > > With CSE (Iceberg v3 encryption), things get more involved because
> > one
> > > > can
> > > > > use Vault with S3 as the combination of their KMS and ObjectStore.
> > > > > > Consequently, a catalog cannot provide access to both as part of
> > > > > > the loadTable response. My take is that if the catalog grants
> > > > > > access to a caller because it has the SELECT privilege, it
> > > > > > should provide access to both KMS and storage.
> > > > >
> > > > > I have an open thread in the *Iceberg community* [1] . Let's
> conclude
> > > > there
> > > > > what the IRC response should look like after consulting with the
> > > broader
> > > > > Iceberg catalog community (I added REST catalog encryption support
> in
> > > the
> > > > > > last catalog community sync agenda but we ran out of time [2]), and
> > > > > > then we can circle back in the Polaris community to see what it
> > > > > > would look like to support this here.
> > > > >
> > > > > Best,
> > > > > Prashant
> > > > >
> > > > > [1]
> https://lists.apache.org/thread/z48t5wgx778j17pzto9kqxwysw4ysxxo
> > > > > [2]
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1iPGVCIcr-M0XtAiudOguWAvmqIdVgpYN5vz5ohO8PKw/edit?tab=t.0#heading=h.cr6o1g2rn5hc
> > > > >
> > > > > On Wed, May 27, 2026 at 8:38 AM Alexandre Dutra <[email protected]
> >
> > > > wrote:
> > > > >
> > > > > > Hi Adam, hi all,
> > > > > >
> > > > > > I did some archaeology on this topic and (unless I'm reading this
> > > > > > wrong) it seems there is some previous work on this topic by
> Anand
> > > > > > Sankaran. He sent his proposal to the Polaris dev mailing list in
> > > > > > February [1] and wrote a design doc: [2]. Yufei also opened an
> > issue
> > > a
> > > > > > while ago: [3].
> > > > > >
> > > > > > I think that the best next step would be to revive Anand's design
> > doc
> > > > > > and see if it aligns with what you have in mind.
> > > > > >
> > > > > > I agree that this feature should be prioritized as it is
> extremely
> > > > > > useful for users running on untrusted storage providers. However,
> > if
> > > I
> > > > > > understand the situation correctly, it seems that on the Iceberg
> > side
> > > > > > the feature is already in the REST spec, but client-side support
> is
> > > > > > still pending [4] – it's been under review for a year. Is that
> > > > > > assessment correct? (If so, this would be a good candidate for a
> > > > > > feature branch on our side, while we wait for the 1.12 release to
> > > > > > land.)
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > [1]:
> > > https://lists.apache.org/thread/mpg46o0w2bzy75hyhx2j74dgwzjh2ob7
> > > > > > [2]:
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1f4Mgg5W1t4NT6R7KLq5K3S4pHlAwYwXTFwUR9uNNpSU/edit?tab=t.0#heading=h.7ucqpo88io4u
> > > > > > [3]: https://github.com/apache/polaris/issues/2829
> > > > > > [4]: https://github.com/apache/iceberg/pull/13225
> > > > > >
> > > > > > On Wed, May 27, 2026 at 10:55 AM Adam Szita <[email protected]>
> > > wrote:
> > > > > > >
> > > > > > > Thanks for your replies Dmitri and JB,
> > > > > > >
> > > > > > > IIUC, the KMS integration you’re referring to is closely tied
> to
> > > AWS
> > > > S3
> > > > > > > storage. It is storage-layer encryption at rest: Polaris can
> > record
> > > > AWS
> > > > > > KMS
> > > > > > > key ARNs in the S3 storage configuration, and during storage
> > > > credential
> > > > > > > vending it grants the vended AWS credentials the required KMS
> > > > > permissions
> > > > > > > such as decrypt/encrypt/data-key operations. That lets clients
> > > > > read/write
> > > > > > > SSE-KMS encrypted S3 objects, but it is still a low-level
> storage
> > > > > concern
> > > > > > > and does not know whether the object is an Iceberg data file,
> > > > manifest,
> > > > > > or
> > > > > > > anything else.
> > > > > > >
> > > > > > > Iceberg table encryption is different. It is one abstraction
> > level
> > > > > higher
> > > > > > > and is table-format aware:
> > > > > > >
> > > > > > >    - under the hood an EncryptingFileIO is used to access
> > encrypted
> > > > > > >    artifacts
> > > > > > >    - it uses envelope encryption to encrypt data files,
> manifest
> > > > files
> > > > > > and
> > > > > > >    snapshot files, defining a master table key to be managed
> in a
> > > KMS
> > > > > > (for
> > > > > > >    some more context:
> > https://www.youtube.com/watch?v=G7Y2eNS_d-s)
> > > > > > >    - table metadata carries encryption metadata and key
> > > references; a
> > > > > > >    KMS-backed `KeyManagementClient` wraps/unwraps the keys.
> > > > > > >    - it provides better portability of encrypted tables, it's
> > > vendor
> > > > > > >    independent - in theory you could have a combination of S3
> > > storage
> > > > > > with GCP
> > > > > > >    KMS, or even a custom KMS client implementation should
> > > enterprise
> > > > > > users
> > > > > > >    favor that
> > > > > > >    - supporting catalogs would have to bear additional
> > > > responsibilities
> > > > > > >    such as protecting metadata integrity and preventing master
> > > > > > encryption key
> > > > > > >    changes (which is an Iceberg table property)
> > > > > > >
> > > > > > > > The catalog-level KMS config I’m proposing is for Iceberg table
> > > > > > > > encryption, not for S3 SSE-KMS. It also shouldn't be modeled as
> > > > > > > > storage configuration because the storage backend and
> > > > > > > > table-encryption KMS provider do not have to match. Perhaps we
> > > > > > > > could use a more concrete name such as
> > > > > > > > icebergTableEncryptionKmsConfigInfo to avoid confusion.
> > > > > > > > In any case I'm happy to draft a design doc and share it here.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Adam
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 27 May 2026 at 08:07, Jean-Baptiste Onofré <
> > > [email protected]>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Adam,
> > > > > > > >
> > > > > > > > Thanks for the proposal.
> > > > > > > >
> > > > > > > > I share Dmitri's question; my understanding is that this
> > pertains
> > > > to
> > > > > > > > client-side encryption. I can confirm that KMS should work,
> as
> > I
> > > > > > recall an
> > > > > > > > issue regarding this being fixed in the past.
> > > > > > > >
> > > > > > > > Adam, could you please clarify the scope of this work?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > JB
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, May 26, 2026 at 8:01 PM Dmitri Bourlatchkov <
> > > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Adam,
> > > > > > > > >
> > > > > > > > > Thanks for this proposal!
> > > > > > > > >
> > > > > > > > > Polaris should already support storage-side KMS in AWS (and
> > > > > > compatible
> > > > > > > > > systems) via [2802] (cf. [1]).
> > > > > > > > >
> > > > > > > > > I guess the new features you mention relate to client-side
> > > > > > encryption,
> > > > > > > > > right?
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://polaris.apache.org/blog/2025/12/24/securing-s3-data-with-aws-kms/
> > > > > > > > >
> > > > > > > > > [2802] https://github.com/apache/polaris/pull/2802
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Dmitri.
> > > > > > > > >
> > > > > > > > > On Tue, May 26, 2026 at 11:06 AM Adam Szita <
> > [email protected]>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > Iceberg 1.11 shipped the base implementation for table
> > > > > encryption,
> > > > > > > > > > including KMS-based key wrapping/unwrapping and encrypted
> > > > > > data/delete,
> > > > > > > > > > manifest, and manifest-list files. REST catalog support
> is
> > > also
> > > > > > being
> > > > > > > > > > worked on in Iceberg (see
> > > > > > https://github.com/apache/iceberg/pull/13225
> > > > > > > > ).
> > > > > > > > > >
> > > > > > > > > > I have been testing Polaris with Iceberg REST client-side
> > > > > > encryption
> > > > > > > > > > enabled. Basic catalog operations such as loadTable,
> > > > commit/drop
> > > > > > > > without
> > > > > > > > > > purge, list, etc. work without Polaris changes because
> > > Polaris
> > > > > only
> > > > > > > > needs
> > > > > > > > > > the table metadata JSON for those paths, and
> metadata.json
> > is
> > > > not
> > > > > > > > > > encrypted.
> > > > > > > > > >
> > > > > > > > > > The places where Polaris does need encryption awareness
> are
> > > the
> > > > > > > > > server-side
> > > > > > > > > > paths that read encrypted Iceberg artifacts. The first
> > > concrete
> > > > > > example
> > > > > > > > > is
> > > > > > > > > > drop table with purge: TableCleanupTask reads snapshot
> > > manifest
> > > > > > lists
> > > > > > > > and
> > > > > > > > > > manifests to enumerate files for deletion, so it needs to
> > use
> > > > an
> > > > > > > > > > EncryptingFileIO. The same would apply to any
> Polaris-side
> > > > table
> > > > > > > > > > maintenance/optimization, orphan/snapshot cleanup logic,
> or
> > > any
> > > > > > future
> > > > > > > > > > remote scan/planning capability that reads manifests or
> > > > > data/delete
> > > > > > > > > files.
> > > > > > > > > >
> > > > > > > > > > There is also a related but separate topic around vending
> > KMS
> > > > > > > > credentials
> > > > > > > > > > to clients. That likely needs Iceberg REST spec work
> first,
> > > > > > similar in
> > > > > > > > > > spirit to current storage credential vending, so I think
> it
> > > > > should
> > > > > > be
> > > > > > > > > > designed for but not required as the first Polaris step.
> > > > > > > > > >
> > > > > > > > > > The first Polaris-side building block I would propose is
> to
> > > > allow
> > > > > > > > Iceberg
> > > > > > > > > > catalogs to carry KMS configuration, similarly to how
> > > catalogs
> > > > > > > > currently
> > > > > > > > > > carry StorageConfigurationInfo. This should be separate
> > from
> > > > > > storage
> > > > > > > > > > configuration because the storage backend and KMS
> provider
> > > may
> > > > > > differ,
> > > > > > > > > for
> > > > > > > > > > example GCS storage with AWS KMS. AWS KMS would be a
> > > reasonable
> > > > > > first
> > > > > > > > > > implementation target, using Iceberg’s existing
> > > > > > KeyManagementClient/AWS
> > > > > > > > > KMS
> > > > > > > > > > support, while leaving the model extensible for Azure and
> > > GCP.
> > > > > > > > > >
> > > > > > > > > > I have already been experimenting with this locally and
> > would
> > > > be
> > > > > > happy
> > > > > > > > to
> > > > > > > > > > work on the Polaris changes. A possible first PR could be
> > > > limited
> > > > > > to:
> > > > > > > > > >
> > > > > > > > > > 1. Add catalog-level KMS configuration model/API support.
> > > > > > > > > > 2. Add AWS KMS server-side configuration wiring.
> > > > > > > > > >
> > > > > > > > > > Any feedback is welcome.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Adam
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>