Thanks everyone, this helps clarify the discussion. I think we should separate two related but different topics:

1. KMS/Vault credential vending to clients via Iceberg REST.
2. KMS configuration used by Polaris itself for server-side operations.

I agree that #1 should be discussed on the Iceberg side and should not be invented as Polaris-specific behavior. I’m also happy to participate in it as I already have a working dev setup with REST client-side encryption enabled, plus a POC for catalog-level KMS configuration. I can help brainstorm/test concrete options but I do see this as a parallel workstream.

For Polaris, though, I think #2 will be needed regardless of the final REST credential-vending implementation. Iceberg table encryption is coming, and Polaris server-side operations that read encrypted Iceberg artifacts will need KMS support. The immediate example is drop table with purge / table cleanup, where Polaris reads manifest lists and manifests to enumerate files for deletion. Those paths will need an EncryptingFileIO initialized with catalog-level KMS configuration.

I also agree with the RFC that metadata integrity protection should be part of the first Polaris effort, since metadata.json is not encrypted and Polaris should detect out-of-band modification before trusting it for encrypted tables.

So my suggested first phase would be limited to:

- catalog-level KMS configuration (separate from storage configuration)
- AWS KMS wiring for Polaris server-side operations
- metadata integrity checks for encrypted tables

The current RFC seems structured around a broader end-to-end table-encryption story (including client credential vending, key rotation, governance/lifecycle topics, and general Iceberg encryption background). Those are important, but I think it would be easier to make progress if we first split out and design the narrower Polaris server-side building block above, and discuss the broader pieces separately.

Does that separation sound reasonable?

Cheers,
Adam

On Thu, 28 May 2026 at 03:26, Yufei Gu <[email protected]> wrote:

> Thanks Adam for raising this.
> I think it's a great feature to have.
>
> Agreed on what Prashant said. We need some work on the IRC side to avoid
> any premature implementation in Polaris.
>
> Yufei
>
> On Wed, May 27, 2026 at 9:14 AM Prashant Singh via dev <[email protected]> wrote:
>
> > Hey Adam,
> >
> > Thanks for starting a thread on this in the Polaris community.
> > I believe we need a dedicated field in the loadTable response in IRC
> > to vend KMS credentials. Currently, KMS credentials are mixed with
> > storage credentials to achieve SSE, but there is no consistent way to
> > enforce this because the spec is silent about it.
> > With CSE (Iceberg v3 encryption), things get more involved because
> > one can use Vault with S3 as the combination of their KMS and
> > ObjectStore. Consequently, a catalog cannot provide access to both as
> > part of the loadTable response. My take here is that if the catalog
> > grants access to a caller because it has SELECT privilege, it should
> > provide access to both KMS and Storage.
> >
> > I have an open thread in the *Iceberg community* [1]. Let's conclude
> > there what the IRC response should look like after consulting with
> > the broader Iceberg catalog community (I added REST catalog
> > encryption support in the last catalog community sync agenda but we
> > ran out of time [2]), and then we can circle back in the Polaris
> > community to see what it would look like to support here.
> >
> > Best,
> > Prashant
> >
> > [1] https://lists.apache.org/thread/z48t5wgx778j17pzto9kqxwysw4ysxxo
> > [2] https://docs.google.com/document/d/1iPGVCIcr-M0XtAiudOguWAvmqIdVgpYN5vz5ohO8PKw/edit?tab=t.0#heading=h.cr6o1g2rn5hc
> >
> > On Wed, May 27, 2026 at 8:38 AM Alexandre Dutra <[email protected]> wrote:
> >
> > > Hi Adam, hi all,
> > >
> > > I did some archaeology on this topic and (unless I'm reading this
> > > wrong) it seems there is some previous work on this topic by Anand
> > > Sankaran.
> > > He sent his proposal to the Polaris dev mailing list in February
> > > [1] and wrote a design doc: [2]. Yufei also opened an issue a while
> > > ago: [3].
> > >
> > > I think that the best next step would be to revive Anand's design
> > > doc and see if it aligns with what you have in mind.
> > >
> > > I agree that this feature should be prioritized as it is extremely
> > > useful for users running on untrusted storage providers. However,
> > > if I understand the situation correctly, it seems that on the
> > > Iceberg side the feature is already in the REST spec, but
> > > client-side support is still pending [4] – it's been under review
> > > for a year. Is that assessment correct? (If so, this would be a
> > > good candidate for a feature branch on our side, while we wait for
> > > the 1.12 release to land.)
> > >
> > > Thanks,
> > > Alex
> > >
> > > [1]: https://lists.apache.org/thread/mpg46o0w2bzy75hyhx2j74dgwzjh2ob7
> > > [2]: https://docs.google.com/document/d/1f4Mgg5W1t4NT6R7KLq5K3S4pHlAwYwXTFwUR9uNNpSU/edit?tab=t.0#heading=h.7ucqpo88io4u
> > > [3]: https://github.com/apache/polaris/issues/2829
> > > [4]: https://github.com/apache/iceberg/pull/13225
> > >
> > > On Wed, May 27, 2026 at 10:55 AM Adam Szita <[email protected]> wrote:
> > >
> > > > Thanks for your replies Dmitri and JB,
> > > >
> > > > IIUC, the KMS integration you’re referring to is closely tied to
> > > > AWS S3 storage. It is storage-layer encryption at rest: Polaris
> > > > can record AWS KMS key ARNs in the S3 storage configuration, and
> > > > during storage credential vending it grants the vended AWS
> > > > credentials the required KMS permissions such as
> > > > decrypt/encrypt/data-key operations. That lets clients read/write
> > > > SSE-KMS encrypted S3 objects, but it is still a low-level storage
> > > > concern and does not know whether the object is an Iceberg data
> > > > file, manifest, or anything else.
> > > >
> > > > Iceberg table encryption is different. It is one abstraction
> > > > level higher and is table-format aware:
> > > >
> > > >    - under the hood an EncryptingFileIO is used to access
> > > >    encrypted artifacts
> > > >    - it uses envelope encryption to encrypt data files, manifest
> > > >    files and snapshot files, defining a master table key to be
> > > >    managed in a KMS (for some more context:
> > > >    https://www.youtube.com/watch?v=G7Y2eNS_d-s)
> > > >    - table metadata carries encryption metadata and key
> > > >    references; a KMS-backed `KeyManagementClient` wraps/unwraps
> > > >    the keys.
> > > >    - it provides better portability of encrypted tables, it's
> > > >    vendor independent - in theory you could have a combination of
> > > >    S3 storage with GCP KMS, or even a custom KMS client
> > > >    implementation should enterprise users favor that
> > > >    - supporting catalogs would have to bear additional
> > > >    responsibilities such as protecting metadata integrity and
> > > >    preventing master encryption key changes (which is an Iceberg
> > > >    table property)
> > > >
> > > > The catalog-level KMS config I’m proposing is for Iceberg table
> > > > encryption, not for S3 SSE-KMS. It also shouldn't be modeled as
> > > > storage configuration because the storage backend and
> > > > table-encryption KMS provider do not have to match, perhaps we
> > > > could use a more concrete naming such as
> > > > icebergTableEncryptionKmsConfigInfo to avoid confusion.
> > > > In any case I'm happy to draft a design doc and share it here.
> > > >
> > > > Cheers,
> > > > Adam
> > > >
> > > > On Wed, 27 May 2026 at 08:07, Jean-Baptiste Onofré <[email protected]> wrote:
> > > >
> > > > > Hi Adam,
> > > > >
> > > > > Thanks for the proposal.
> > > > >
> > > > > I share Dmitri's question; my understanding is that this
> > > > > pertains to client-side encryption.
> > > > > I can confirm that KMS should work, as I recall an issue
> > > > > regarding this being fixed in the past.
> > > > >
> > > > > Adam, could you please clarify the scope of this work?
> > > > >
> > > > > Regards,
> > > > > JB
> > > > >
> > > > > On Tue, May 26, 2026 at 8:01 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > >
> > > > > > Hi Adam,
> > > > > >
> > > > > > Thanks for this proposal!
> > > > > >
> > > > > > Polaris should already support storage-side KMS in AWS (and
> > > > > > compatible systems) via [2802] (cf. [1]).
> > > > > >
> > > > > > I guess the new features you mention relate to client-side
> > > > > > encryption, right?
> > > > > >
> > > > > > [1] https://polaris.apache.org/blog/2025/12/24/securing-s3-data-with-aws-kms/
> > > > > >
> > > > > > [2802] https://github.com/apache/polaris/pull/2802
> > > > > >
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Tue, May 26, 2026 at 11:06 AM Adam Szita <[email protected]> wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Iceberg 1.11 shipped the base implementation for table
> > > > > > > encryption, including KMS-based key wrapping/unwrapping and
> > > > > > > encrypted data/delete, manifest, and manifest-list files.
> > > > > > > REST catalog support is also being worked on in Iceberg
> > > > > > > (see https://github.com/apache/iceberg/pull/13225).
> > > > > > >
> > > > > > > I have been testing Polaris with Iceberg REST client-side
> > > > > > > encryption enabled. Basic catalog operations such as
> > > > > > > loadTable, commit/drop without purge, list, etc. work
> > > > > > > without Polaris changes because Polaris only needs the
> > > > > > > table metadata JSON for those paths, and metadata.json is
> > > > > > > not encrypted.
> > > > > > >
> > > > > > > The places where Polaris does need encryption awareness are
> > > > > > > the server-side paths that read encrypted Iceberg
> > > > > > > artifacts. The first concrete example is drop table with
> > > > > > > purge: TableCleanupTask reads snapshot manifest lists and
> > > > > > > manifests to enumerate files for deletion, so it needs to
> > > > > > > use an EncryptingFileIO. The same would apply to any
> > > > > > > Polaris-side table maintenance/optimization,
> > > > > > > orphan/snapshot cleanup logic, or any future remote
> > > > > > > scan/planning capability that reads manifests or
> > > > > > > data/delete files.
> > > > > > >
> > > > > > > There is also a related but separate topic around vending
> > > > > > > KMS credentials to clients. That likely needs Iceberg REST
> > > > > > > spec work first, similar in spirit to current storage
> > > > > > > credential vending, so I think it should be designed for
> > > > > > > but not required as the first Polaris step.
> > > > > > >
> > > > > > > The first Polaris-side building block I would propose is to
> > > > > > > allow Iceberg catalogs to carry KMS configuration,
> > > > > > > similarly to how catalogs currently carry
> > > > > > > StorageConfigurationInfo. This should be separate from
> > > > > > > storage configuration because the storage backend and KMS
> > > > > > > provider may differ, for example GCS storage with AWS KMS.
> > > > > > > AWS KMS would be a reasonable first implementation target,
> > > > > > > using Iceberg’s existing KeyManagementClient/AWS KMS
> > > > > > > support, while leaving the model extensible for Azure and
> > > > > > > GCP.
> > > > > > >
> > > > > > > I have already been experimenting with this locally and
> > > > > > > would be happy to work on the Polaris changes. A possible
> > > > > > > first PR could be limited to:
> > > > > > >
> > > > > > > 1. Add catalog-level KMS configuration model/API support.
> > > > > > > 2. Add AWS KMS server-side configuration wiring.
> > > > > > >
> > > > > > > Any feedback is welcome.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Adam
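
For illustration only, the separation discussed in this thread (a catalog carrying table-encryption KMS configuration independently of its storage configuration) could be modeled roughly as below. This is a minimal sketch and not Polaris code: every class, field, and function name here is hypothetical, and the `encryption.key-id` table property is an assumption about how an encrypted table names its master key.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class StorageType(Enum):
    S3 = "S3"
    GCS = "GCS"
    AZURE = "AZURE"


class KmsType(Enum):
    AWS_KMS = "AWS_KMS"  # first target; other providers could be added later


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical stand-in for the catalog's storage configuration."""
    storage_type: StorageType
    allowed_locations: Tuple[str, ...]


@dataclass(frozen=True)
class TableEncryptionKmsConfig:
    """Hypothetical catalog-level KMS config for Iceberg table encryption."""
    kms_type: KmsType
    region: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Catalog:
    """The two configs are independent fields, so providers may differ."""
    name: str
    storage: StorageConfig
    table_encryption_kms: Optional[TableEncryptionKmsConfig] = None


# Assumed name of the table property that references the master key.
ENCRYPTION_KEY_PROPERTY = "encryption.key-id"


def kms_config_for_server_side_read(
    catalog: Catalog, table_properties: Dict[str, str]
) -> Optional[TableEncryptionKmsConfig]:
    """Pick the KMS config a server-side task (e.g. purge) would need.

    Returns None for unencrypted tables and raises if the table is
    encrypted but the catalog carries no KMS configuration.
    """
    if ENCRYPTION_KEY_PROPERTY not in table_properties:
        return None
    if catalog.table_encryption_kms is None:
        raise ValueError(
            f"catalog {catalog.name!r} has no KMS configuration "
            "but the table is encrypted"
        )
    return catalog.table_encryption_kms
```

With this shape, a catalog on GCS storage can carry an AWS KMS configuration without either config knowing about the other, and a cleanup task can fail early when it meets an encrypted table it has no KMS configuration for.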
