Hi Dennis,

I've approved Tornike's PR [3699] in GH. As far as I can tell the PR is
aligned with previous discussions regarding credential vending SPI changes.

Please post your review too.

[3699] https://github.com/apache/polaris/pull/3699

Thanks,
Dmitri.


On Thu, Apr 9, 2026 at 6:44 PM Dennis Huo <[email protected]> wrote:

> Thanks for updating the PR! I'll dive into the details there. And generally
> I think we can be flexible on the specifics to do whatever we find to be
> cleanest as long as the described use case (lease/commit offloaded
> credential vending) is preserved.
>
> Your suggestion about passing the List<PolarisEntity> sounds good to me; it
> aligns with plans for more advanced composition of inheritance in the
> general path hierarchy instead of assuming that the base CatalogEntity is
> the only authoritative definition of storage and connection configuration.
>
> On Sat, Apr 4, 2026 at 2:23 AM Tornike Gurgenidze <[email protected]>
> wrote:
>
> > Hi Dennis,
> >
> > Thanks a lot for such a detailed review. I understand the use case and
> the
> > reasons for the current design much better now.
> >
> > 1. Solidify some variation of PolarisCredentialVendor as the interface
> >
> > definition for *using* a PolarisStorageIntegration to get a credential.
> > > This doesn't *need* to be implemented by monolithic MetaStoreManager
> > impls,
> > > but it still needs to exist as an optionally persistence-aware
> interface
> >
> >
> > That sounds good to me, we can bring the interface back. iiuc, oss impl
> > will no longer have anything to do with MetastoreManager, while custom
> impl
> > will be free to integrate with IntegrationPersistence. We will simply
> have
> > to keep PolarisCredentialVendor as an SPI + relevant methods in
> > IntegrationPersistence around even if unused in oss. Let's iterate on the
> > actual interface signature in the PR.
> >
> > 2. Solidify some variation of PolarisStorageIntegrationProvider as the
> >
> > interface definition for *constructing/leasing* a new
> > > PolarisStorageIntegration strictly for CreateCatalog (or Create for any
> >
> > entity that might hold a StorageConfig)
> >
> >
> > Makes sense. PR changes PolarisStorageIntegrations to be singletons (one
> > per integration type) and PolarisStorageIntegration objects accept
> > StorageConfig directly. I'm assuming that will probably be too awkward
> with
> > config/integration separation you described in mind. I can change oss
> > PolarisStorageIntegrationProvider impl to serve per-storageConfig
> > singletons instead. That way we can avoid including StorageConfig into
> the
> > StorageIntegration method signature.
> >
> > 3. Maybe reconcile those two "using" and "constructing" methods under a
> > > single unified interface that is still injectable
> >
> > I'm sure I won't be able to come up with a good enough name for the
> > combined SPI :)
> >
> > 6. We can evolve the PolarisCredentialVendor method interface to take a
> > > whole PolarisEntity instead of entityId, type, etc., to potentially
> > avoid a
> > > re-lookup. Such a syntactic change is still *semantically* compatible
> > with
> > > the SPI because any impls that used to need enttyId can still call
> > > entity.getId() if we pass in the whole entity
> >
> >
> > Let's iterate on the PR. I think a single PolarisEntity might not be
> > sufficient unfortunately because of the storage name override PR (
> > https://github.com/apache/polaris/pull/4023). The resolution there
> > effectively depends on constructing StorageConfig from several entities
> > (plus application config). Current implementation in this PR hoped to
> make
> > that easy by bypassing PolarisEntity altogether and passing constructed
> > StorageConfig directly, but now I understand that will complicate
> > persistence for your custom CredentialVendor implementation. Maybe
> passing
> > List<PolarisEntity> for the whole resolved path instead of a single
> > "resolved" PolarisEntity is the way to go?
> >
> > 7. Alternatively, if we really want to get callsites to call
> >
> > getSubscopedCreds directly on a PolarisStorageIntegration object, maybe
> > > it's possible to model the persistence cooperation in subclasses of
> > > PolarisStorageIntegration itself, but we still need a way to
> distinguish
> > > between whether a caller is *constructing* a new StorageIntegration or
> >
> > *using* an entity-linked StorageIntegration.
> >
> >
> > I think that would make the design more coupled and complicated, probably
> > not worth it.
> >
> > Once again, thanks for a thorough review. I'll start adapting the PR with
> > these in mind.
> >
> > On Fri, Apr 3, 2026 at 4:15 AM Dennis Huo <[email protected]> wrote:
> >
> > > Thanks for starting this discussion! Some refactoring does seem helpful
> > in
> > > this area. However, I think there are some elements of the currently
> > > entrenched SPIs to discuss. Specifically, the coordination with the
> > > persistence layer is an intentional feature, even though we should
> > probably
> > > encapsulate it to simplify the cases where it's not needed by the
> > > persistence layer. Reposting some of my comments from the PR:
> > >
> > > Though we should indeed be able to *decouple* the various interfaces
> > pulled
> > > in/aggregated by `PolarisMetaStoreManager`:
> > >
> > >         PolarisSecretsManager,
> > >         PolarisGrantManager,
> > >         PolarisCredentialVendor,
> > >         PolarisPolicyMappingManager,
> > >         PolarisEventManager,
> > >         PolarisMetricsManager
> > >
> > > those interfaces were specifically written as part of SPIs, and at
> least
> > > the same *functionality* provided by those extensibility points needs
> to
> > be
> > > preserved more strictly than syntax evolution of those interfaces,
> > unless a
> > > much deeper attempt is made amongst Polaris users to agree on a
> migration
> > > path off of such functionality.
> > >
> > > There's some discussion about the list of core SPIs in here:
> > > https://lists.apache.org/thread/0nj24zro7kyctqfnlml08ppo7zs9xcqs
> > >
> > > The way that applies to this PR is that the interaction with the
> > > persistence layer is intentional, especially for the
> > > TransactionalMetaStoreManager path.
> > >
> > > The basic use case if if a custom Polaris deployment uses subclasses
> that
> > > leverage those SPIs in order to perform additional book-keeping around
> > > storage-integration functionality and needs that to be transactional
> with
> > > entity creation/updates. This is best highlighted by thinking about the
> > > interaction between the subset of `PolarisCredentialVendor` method
> > > implementations with `IntegrationPersistence` which has these methods:
> > >
> > >     createStorageIntegration
> > >     persistStorageIntegrationIfNeeded
> > >     loadPolarisStorageIntegration
> > >
> > > These are in the interface specifically to define the SPI by which
> users
> > > customize a "lease, commit, use" model for secondary metadata necessary
> > for
> > > StorageIntegrations. In this current PR, it's not clear what happens to
> > > those methods now -- will
> > > `IntegrationPersistence::loadPolarisStorageIntegration` just be
> ignored?
> > >
> > > Overall, I think the refactoring on the credential caching layer should
> > > still be possible, while preserving the `PolarisCredentialVendor`
> > interface
> > > so that implementations of `PolarisCredentialVendor` still work
> > correctly.
> > >
> > > The use case to preserve is as follows:
> > >
> > > 1. Suppose the running Polaris service doesn't just use a static/fixed
> > > source IAM User from environment variables, but instead maintains a
> > *pool*
> > > of different source-IAM-user credentials in an offloaded sidecar
> > > 2. Each new Catalog/StorageConfig created is intended to lease a new
> > > identity from the *pool* and keep track of that lease atomically with
> the
> > > CatalogEntity
> > > 3. The CreateCatalog flow as traced through
> > > `TransactionalMetaStoreManagerImpl.createcatalog` is as follows:
> > >     a. Prepare catalog entity info in-memory
> > >     b. Call IntegrationPersistence.createStorageIntegrationInCurrentTxn
> > --
> > > this will call the sidecar credential-pool layer to *lease* a new IAM
> > user
> > > by reference
> > >     c. Start *transaction* for Polaris persistence layer
> > >     d. Lookups to validate current transaction state
> > >     e. In the same transaction that persists the new CatalogEntity,
> > > calls persistStorageIntegrationIfNeededInCurrentTxn - this may be
> custom
> > to
> > > different Polaris service owners, where a book-keeping table is updated
> > > cooperatively with the sidecar credential-pool layer to consume the
> > leased
> > > IAM user and store the fact that the new CatalogEntity's entityId is
> > > tightly coupled to that leased IAM user
> > >     f. Commit transaction all in one piece
> > > 4. The getSubscopedCreds flow would then be as follows:
> > >     a. Load entity again based on id (this is indeed probably redundant
> > and
> > > we could probably refactor this)
> > >     b. Call
> > > IntegrationPersistence.loadPolarisStorageIntegrationInCurrentTxn with
> the
> > > retrieved linked PolarisEntity
> > >     c. The custom service instance can have an impl of
> > > IntegrationPersistence here that uses the *stateful* book-keeping of
> the
> > > leased credential-pool object to return a handle into that sidecar
> > > credential-pool
> > >     d. The StorageIntegration returned may not be the same thing as the
> > > original object from "createStorageIntegrationInCurrentTxn" - it might
> > be a
> > > handle to a temporary authenticated connection into the sidecar
> > > credential-vendor, for example
> > >     e. This new stateful and entity-coupled PolarisStorageIntegration
> is
> > > ultimately used to getSubscopedCreds
> > >
> > > Today, I think the main cause of confusion is that we only have
> > > "placeholder" implementations in the main repo of
> > > loadPolarisStorageIntegrationInCurrentTxn that conflate
> "lease/construct
> > > StorageIntegration" with "fetch a vending-ready StorageIntegration".
> This
> > > was because credential-pooling semantics tend to be specific to each
> > > individual Polaris service-runner, and it'd be a lot of complexity to
> > have
> > > a "toy" implementation of credential-pooling for single-tenant use
> cases
> > > that don't need it.
> > >
> > > I think the overall flow could still be preserved while performing most
> > of
> > > the proposed refactors if we:
> > >
> > > 1. Solidify some variation of PolarisCredentialVendor as the interface
> > > definition for *using* a PolarisStorageIntegration to get a credential.
> > > This doesn't *need* to be implemented by monolithic MetaStoreManager
> > impls,
> > > but it still needs to exist as an optionally persistence-aware
> interface
> > > 2. Solidify some variation of PolarisStorageIntegrationProvider as the
> > > interface definition for *constructing/leasing* a new
> > > PolarisStorageIntegration strictly for CreateCatalog (or Create for any
> > > entity that might hold a StorageConfig)
> > > 3. Maybe reconcile those two "using" and "constructing" methods under a
> > > single unified interface that is still injectable
> > > 4. If we want some persistence types like the NoSql stack to always
> stay
> > > away from being in the business of facilitating credential vending, the
> > > callsite should obtain a PolarisCredentialVendor from some injection or
> > > factory method -- the default PolarisCredentialVendor can indeed be the
> > > current dumb in-memory "Construct from application-level configs
> > in-memory
> > > and vend immediately" implementation.
> > > 5. Persistence impls that need stateful cooperation with a credential
> > pool
> > > can still configure to provide their existing
> "CustomMetaStoreManagerImpl
> > > implements PolarisCredentialVendor" as the thing that's returned to
> > > callsites wanting to get credentials
> > > 6. We can evolve the PolarisCredentialVendor method interface to take a
> > > whole PolarisEntity instead of entityId, type, etc., to potentially
> > avoid a
> > > re-lookup. Such a syntactic change is still *semantically* compatible
> > with
> > > the SPI because any impls that used to need enttyId can still call
> > > entity.getId() if we pass in the whole entity
> > > 7. Alternatively, if we really want to get callsites to call
> > > getSubscopedCreds directly on a PolarisStorageIntegration object, maybe
> > > it's possible to model the persistence cooperation in subclasses of
> > > PolarisStorageIntegration itself, but we still need a way to
> distinguish
> > > between whether a caller is *constructing* a new StorageIntegration or
> > > *using* an entity-linked StorageIntegration
> > >
> > > On Wed, Apr 1, 2026 at 1:34 PM Tornike Gurgenidze <
> > [email protected]>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to bring up a refactoring effort around credential vending
> > that
> > > > I've been working on in PR #3699
> > > > <https://github.com/apache/polaris/pull/3699>. Dmitri has been
> > providing
> > > > feedback and helping a lot along the way, but I wanted to open this
> up
> > > for
> > > > broader discussion before iterating further.
> > > >
> > > > Motivation
> > > >
> > > > The current credential vending flow is deeply entangled with the
> > > > persistence layer. When a client requests scoped credentials (e.g.
> for
> > > S3,
> > > > GCS, or Azure), the request goes through: StorageCredentialsVendor ->
> > > > PolarisCredentialVendor -> MetaStoreManager -> persistence layer ->
> > back
> > > > out through PolarisStorageIntegrationProvider. This means credential
> > > > vending re-loads entities from persistence even though the caller
> > already
> > > > has them, and MetaStoreManager implementations are burdened with
> > > credential
> > > > vending logic that doesn't belong in persistence.
> > > >
> > > > Overall, the sheer amount of complexity and the amount of layers that
> > > > credential vending flow goes through makes further changes
> particularly
> > > > challenging as evidenced by some recent efforts around cache key
> > > > generation, storage info resolution, additional storage backends and
> so
> > > on.
> > > >
> > > > What the PR does
> > > >
> > > > 1. Removes credential vending from MetaStoreManager. The
> > > > PolarisCredentialVendor interface, StorageCredentialsVendor, and
> > > > getSubscopedCredsForEntity() implementations are removed from
> > > > MetaStoreManager. This cleans up both the transactional and NoSQL
> > > backends.
> > > >
> > > > 2. Moves orchestration into StorageAccessConfigProvider. This
> > > > application-scoped bean now directly resolves the storage integration
> > and
> > > > delegates to it, cutting out the persistence round-trip.
> > > >
> > > > 3. Moves caching into storage integrations. Each
> > > PolarisStorageIntegration
> > > > subclass (AWS, GCP, Azure) now owns its StorageCredentialCache
> > > interaction
> > > > and builds cloud-specific cache keys, rather than using a
> > > one-size-fits-all
> > > > key.
> > > >
> > > > I'd appreciate any feedback on the overall direction, concerns about
> > API
> > > > compatibility in polaris-core, or suggestions for how to best land
> > these
> > > > changes.
> > > >
> > > > Thanks,
> > > > Tornike
> > > >
> > >
> >
>

Reply via email to