Hi Adam, hi all,

I did some archaeology on this topic and, unless I'm reading this
wrong, there is previous work by Anand Sankaran. He sent his proposal
to the Polaris dev mailing list in February [1] and wrote a design
doc [2]. Yufei also opened an issue a while ago [3].

I think that the best next step would be to revive Anand's design doc
and see if it aligns with what you have in mind.

I agree that this feature should be prioritized, as it is extremely
useful for users running on untrusted storage providers. However, if I
understand the situation correctly, on the Iceberg side the feature is
already in the REST spec, but client-side support is still pending [4];
it has been under review for a year. Is that assessment correct? (If
so, this would be a good candidate for a feature branch on our side
while we wait for the 1.12 release to land.)

Thanks,
Alex

[1]: https://lists.apache.org/thread/mpg46o0w2bzy75hyhx2j74dgwzjh2ob7
[2]: https://docs.google.com/document/d/1f4Mgg5W1t4NT6R7KLq5K3S4pHlAwYwXTFwUR9uNNpSU/edit?tab=t.0#heading=h.7ucqpo88io4u
[3]: https://github.com/apache/polaris/issues/2829
[4]: https://github.com/apache/iceberg/pull/13225

On Wed, May 27, 2026 at 10:55 AM Adam Szita <[email protected]> wrote:
>
> Thanks for your replies Dmitri and JB,
>
> IIUC, the KMS integration you’re referring to is closely tied to AWS S3
> storage. It is storage-layer encryption at rest: Polaris can record AWS KMS
> key ARNs in the S3 storage configuration, and during storage credential
> vending it grants the vended AWS credentials the required KMS permissions
> such as decrypt/encrypt/data-key operations. That lets clients read/write
> SSE-KMS encrypted S3 objects, but it is still a low-level storage concern
> and does not know whether the object is an Iceberg data file, manifest, or
> anything else.
>
> Iceberg table encryption is different. It is one abstraction level higher
> and is table-format aware:
>
>    - under the hood an EncryptingFileIO is used to access encrypted
>    artifacts
>    - it uses envelope encryption to encrypt data files, manifest files and
>    snapshot files, defining a master table key to be managed in a KMS (for
>    some more context: https://www.youtube.com/watch?v=G7Y2eNS_d-s)
>    - table metadata carries encryption metadata and key references; a
>    KMS-backed `KeyManagementClient` wraps/unwraps the keys.
>    - it provides better portability of encrypted tables, it's vendor
>    independent - in theory you could have a combination of S3 storage with GCP
>    KMS, or even a custom KMS client implementation should enterprise users
>    favor that
>    - supporting catalogs would have to bear additional responsibilities
>    such as protecting metadata integrity and preventing master encryption key
>    changes (which is an Iceberg table property)
>
> The catalog-level KMS config I’m proposing is for Iceberg table encryption,
> not for S3 SSE-KMS. It also shouldn't be modeled as storage configuration
> because the storage backend and table-encryption KMS provider do not have
> to match, perhaps we could use a more concrete naming such
> as icebergTableEncryptionKmsConfigInfo to avoid confusion.
> In any case I'm happy to draft a design doc and share it here.
>
> Cheers,
> Adam
>
>
>
> On Wed, 27 May 2026 at 08:07, Jean-Baptiste Onofré <[email protected]> wrote:
>
> > Hi Adam,
> >
> > Thanks for the proposal.
> >
> > I share Dmitri's question; my understanding is that this pertains to
> > client-side encryption. I can confirm that KMS should work, as I recall an
> > issue regarding this being fixed in the past.
> >
> > Adam, could you please clarify the scope of this work?
> >
> > Regards,
> > JB
> >
> >
> > On Tue, May 26, 2026 at 8:01 PM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Adam,
> > >
> > > Thanks for this proposal!
> > >
> > > Polaris should already support storage-side KMS in AWS (and compatible
> > > systems) via [2802] (cf. [1]).
> > >
> > > I guess the new features you mention relate to client-side encryption,
> > > right?
> > >
> > > [1] https://polaris.apache.org/blog/2025/12/24/securing-s3-data-with-aws-kms/
> > >
> > > [2802] https://github.com/apache/polaris/pull/2802
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Tue, May 26, 2026 at 11:06 AM Adam Szita <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Iceberg 1.11 shipped the base implementation for table encryption,
> > > > including KMS-based key wrapping/unwrapping and encrypted data/delete,
> > > > manifest, and manifest-list files. REST catalog support is also being
> > > > worked on in Iceberg (see https://github.com/apache/iceberg/pull/13225).
> > > >
> > > > I have been testing Polaris with Iceberg REST client-side encryption
> > > > enabled. Basic catalog operations such as loadTable, commit/drop without
> > > > purge, list, etc. work without Polaris changes because Polaris only needs
> > > > the table metadata JSON for those paths, and metadata.json is not
> > > > encrypted.
> > > >
> > > > The places where Polaris does need encryption awareness are the
> > > > server-side paths that read encrypted Iceberg artifacts. The first
> > > > concrete example is drop table with purge: TableCleanupTask reads
> > > > snapshot manifest lists and manifests to enumerate files for deletion,
> > > > so it needs to use an EncryptingFileIO. The same would apply to any
> > > > Polaris-side table maintenance/optimization, orphan/snapshot cleanup
> > > > logic, or any future remote scan/planning capability that reads
> > > > manifests or data/delete files.
> > > >
> > > > There is also a related but separate topic around vending KMS
> > > > credentials to clients. That likely needs Iceberg REST spec work first,
> > > > similar in spirit to current storage credential vending, so I think it
> > > > should be designed for but not required as the first Polaris step.
> > > >
> > > > The first Polaris-side building block I would propose is to allow
> > > > Iceberg catalogs to carry KMS configuration, similarly to how catalogs
> > > > currently carry StorageConfigurationInfo. This should be separate from
> > > > storage configuration because the storage backend and KMS provider may
> > > > differ, for example GCS storage with AWS KMS. AWS KMS would be a
> > > > reasonable first implementation target, using Iceberg’s existing
> > > > KeyManagementClient/AWS KMS support, while leaving the model extensible
> > > > for Azure and GCP.
> > > >
> > > > I have already been experimenting with this locally and would be happy
> > > > to work on the Polaris changes. A possible first PR could be limited to:
> > > >
> > > > 1. Add catalog-level KMS configuration model/API support.
> > > > 2. Add AWS KMS server-side configuration wiring.
> > > >
> > > > Any feedback is welcome.
> > > >
> > > > Cheers,
> > > > Adam
> > > >
> > >
> >
