Hi Dennis, I've approved Tornike's PR [3699] in GH. As far as I can tell the PR is aligned with previous discussions regarding credential vending SPI changes.
Please post your review too. [3699] https://github.com/apache/polaris/pull/3699 Thanks, Dmitri. On Thu, Apr 9, 2026 at 6:44 PM Dennis Huo <[email protected]> wrote: > Thanks for updating the PR! I'll dive into the details there. And generally > I think we can be flexible on the specifics to do whatever we find to be > cleanest as long as the described use case (lease/commit offloaded > credential vending) is preserved. > > Your suggestion about passing the List<PolarisEntity> sounds good to me; it > aligns with plans for more advanced composition of inheritance in the > general path hierarchy instead of assuming that the base CatalogEntity is > the only authoritative definition of storage and connection configuration. > > On Sat, Apr 4, 2026 at 2:23 AM Tornike Gurgenidze <[email protected]> > wrote: > > > Hi Dennis, > > > > Thanks a lot for such a detailed review. I understand the use case and > the > > reasons for the current design much better now. > > > > 1. Solidify some variation of PolarisCredentialVendor as the interface > > > > definition for *using* a PolarisStorageIntegration to get a credential. > > > This doesn't *need* to be implemented by monolithic MetaStoreManager > > impls, > > > but it still needs to exist as an optionally persistence-aware > interface > > > > > > That sounds good to me, we can bring the interface back. iiuc, oss impl > > will no longer have anything to do with MetastoreManager, while custom > impl > > will be free to integrate with IntegrationPersistence. We will simply > have > > to keep PolarisCredentialVendor as an SPI + relevant methods in > > IntegrationPersistence around even if unused in oss. Let's iterate on the > > actual interface signature in the PR. > > > > 2. Solidify some variation of PolarisStorageIntegrationProvider as the > > > > interface definition for *constructing/leasing* a new > > > PolarisStorageIntegration strictly for CreateCatalog (or Create for any > > > > entity that might hold a StorageConfig) > > > > > > Makes sense. PR changes PolarisStorageIntegrations to be singletons (one > > per integration type) and PolarisStorageIntegration objects accept > > StorageConfig directly. I'm assuming that will probably be too awkward > with > > config/integration separation you described in mind. I can change oss > > PolarisStorageIntegrationProvider impl to serve per-storageConfig > > singletons instead. That way we can avoid including StorageConfig into > the > > StorageIntegration method signature. > > > > 3. Maybe reconcile those two "using" and "constructing" methods under a > > > single unified interface that is still injectable > > > > I'm sure I won't be able to come up with a good enough name for the > > combined SPI :) > > > > 6. We can evolve the PolarisCredentialVendor method interface to take a > > > whole PolarisEntity instead of entityId, type, etc., to potentially > > avoid a > > > re-lookup. Such a syntactic change is still *semantically* compatible > > with > > > the SPI because any impls that used to need enttyId can still call > > > entity.getId() if we pass in the whole entity > > > > > > Let's iterate on the PR. I think a single PolarisEntity might not be > > sufficient unfortunately because of the storage name override PR ( > > https://github.com/apache/polaris/pull/4023). The resolution there > > effectively depends on constructing StorageConfig from several entities > > (plus application config). Current implementation in this PR hoped to > make > > that easy by bypassing PolarisEntity altogether and passing constructed > > StorageConfig directly, but now I understand that will complicate > > persistence for your custom CredentialVendor implementation. Maybe > passing > > List<PolarisEntity> for the whole resolved path instead of a single > > "resolved" PolarisEntity is the way to go? > > > > 7. Alternatively, if we really want to get callsites to call > > > > getSubscopedCreds directly on a PolarisStorageIntegration object, maybe > > > it's possible to model the persistence cooperation in subclasses of > > > PolarisStorageIntegration itself, but we still need a way to > distinguish > > > between whether a caller is *constructing* a new StorageIntegration or > > > > *using* an entity-linked StorageIntegration. > > > > > > I think that would make the design more coupled and complicated, probably > > not worth it. > > > > Once again, thanks for a thorough review. I'll start adapting the PR with > > these in mind. > > > > On Fri, Apr 3, 2026 at 4:15 AM Dennis Huo <[email protected]> wrote: > > > > > Thanks for starting this discussion! Some refactoring does seem helpful > > in > > > this area. However, I think there are some elements of the currently > > > entrenched SPIs to discuss. Specifically, the coordination with the > > > persistence layer is an intentional feature, even though we should > > probably > > > encapsulate it to simplify the cases where it's not needed by the > > > persistence layer. Reposting some of my comments from the PR: > > > > > > Though we should indeed be able to *decouple* the various interfaces > > pulled > > > in/aggregated by `PolarisMetaStoreManager`: > > > > > > PolarisSecretsManager, > > > PolarisGrantManager, > > > PolarisCredentialVendor, > > > PolarisPolicyMappingManager, > > > PolarisEventManager, > > > PolarisMetricsManager > > > > > > those interfaces were specifically written as part of SPIs, and at > least > > > the same *functionality* provided by those extensibility points needs > to > > be > > > preserved more strictly than syntax evolution of those interfaces, > > unless a > > > much deeper attempt is made amongst Polaris users to agree on a > migration > > > path off of such functionality. > > > > > > There's some discussion about the list of core SPIs in here: > > > https://lists.apache.org/thread/0nj24zro7kyctqfnlml08ppo7zs9xcqs > > > > > > The way that applies to this PR is that the interaction with the > > > persistence layer is intentional, especially for the > > > TransactionalMetaStoreManager path. > > > > > > The basic use case if if a custom Polaris deployment uses subclasses > that > > > leverage those SPIs in order to perform additional book-keeping around > > > storage-integration functionality and needs that to be transactional > with > > > entity creation/updates. This is best highlighted by thinking about the > > > interaction between the subset of `PolarisCredentialVendor` method > > > implementations with `IntegrationPersistence` which has these methods: > > > > > > createStorageIntegration > > > persistStorageIntegrationIfNeeded > > > loadPolarisStorageIntegration > > > > > > These are in the interface specifically to define the SPI by which > users > > > customize a "lease, commit, use" model for secondary metadata necessary > > for > > > StorageIntegrations. In this current PR, it's not clear what happens to > > > those methods now -- will > > > `IntegrationPersistence::loadPolarisStorageIntegration` just be > ignored? > > > > > > Overall, I think the refactoring on the credential caching layer should > > > still be possible, while preserving the `PolarisCredentialVendor` > > interface > > > so that implementations of `PolarisCredentialVendor` still work > > correctly. > > > > > > The use case to preserve is as follows: > > > > > > 1. Suppose the running Polaris service doesn't just use a static/fixed > > > source IAM User from environment variables, but instead maintains a > > *pool* > > > of different source-IAM-user credentials in an offloaded sidecar > > > 2. Each new Catalog/StorageConfig created is intended to lease a new > > > identity from the *pool* and keep track of that lease atomically with > the > > > CatalogEntity > > > 3. The CreateCatalog flow as traced through > > > `TransactionalMetaStoreManagerImpl.createcatalog` is as follows: > > > a. Prepare catalog entity info in-memory > > > b. Call IntegrationPersistence.createStorageIntegrationInCurrentTxn > > -- > > > this will call the sidecar credential-pool layer to *lease* a new IAM > > user > > > by reference > > > c. Start *transaction* for Polaris persistence layer > > > d. Lookups to validate current transaction state > > > e. In the same transaction that persists the new CatalogEntity, > > > calls persistStorageIntegrationIfNeededInCurrentTxn - this may be > custom > > to > > > different Polaris service owners, where a book-keeping table is updated > > > cooperatively with the sidecar credential-pool layer to consume the > > leased > > > IAM user and store the fact that the new CatalogEntity's entityId is > > > tightly coupled to that leased IAM user > > > f. Commit transaction all in one piece > > > 4. The getSubscopedCreds flow would then be as follows: > > > a. Load entity again based on id (this is indeed probably redundant > > and > > > we could probably refactor this) > > > b. Call > > > IntegrationPersistence.loadPolarisStorageIntegrationInCurrentTxn with > the > > > retrieved linked PolarisEntity > > > c. The custom service instance can have an impl of > > > IntegrationPersistence here that uses the *stateful* book-keeping of > the > > > leased credential-pool object to return a handle into that sidecar > > > credential-pool > > > d. The StorageIntegration returned may not be the same thing as the > > > original object from "createStorageIntegrationInCurrentTxn" - it might > > be a > > > handle to a temporary authenticated connection into the sidecar > > > credential-vendor, for example > > > e. This new stateful and entity-coupled PolarisStorageIntegration > is > > > ultimately used to getSubscopedCreds > > > > > > Today, I think the main cause of confusion is that we only have > > > "placeholder" implementations in the main repo of > > > loadPolarisStorageIntegrationInCurrentTxn that conflate > "lease/construct > > > StorageIntegration" with "fetch a vending-ready StorageIntegration". > This > > > was because credential-pooling semantics tend to be specific to each > > > individual Polaris service-runner, and it'd be a lot of complexity to > > have > > > a "toy" implementation of credential-pooling for single-tenant use > cases > > > that don't need it. > > > > > > I think the overall flow could still be preserved while performing most > > of > > > the proposed refactors if we: > > > > > > 1. Solidify some variation of PolarisCredentialVendor as the interface > > > definition for *using* a PolarisStorageIntegration to get a credential. > > > This doesn't *need* to be implemented by monolithic MetaStoreManager > > impls, > > > but it still needs to exist as an optionally persistence-aware > interface > > > 2. Solidify some variation of PolarisStorageIntegrationProvider as the > > > interface definition for *constructing/leasing* a new > > > PolarisStorageIntegration strictly for CreateCatalog (or Create for any > > > entity that might hold a StorageConfig) > > > 3. Maybe reconcile those two "using" and "constructing" methods under a > > > single unified interface that is still injectable > > > 4. If we want some persistence types like the NoSql stack to always > stay > > > away from being in the business of facilitating credential vending, the > > > callsite should obtain a PolarisCredentialVendor from some injection or > > > factory method -- the default PolarisCredentialVendor can indeed be the > > > current dumb in-memory "Construct from application-level configs > > in-memory > > > and vend immediately" implementation. > > > 5. Persistence impls that need stateful cooperation with a credential > > pool > > > can still configure to provide their existing > "CustomMetaStoreManagerImpl > > > implements PolarisCredentialVendor" as the thing that's returned to > > > callsites wanting to get credentials > > > 6. We can evolve the PolarisCredentialVendor method interface to take a > > > whole PolarisEntity instead of entityId, type, etc., to potentially > > avoid a > > > re-lookup. Such a syntactic change is still *semantically* compatible > > with > > > the SPI because any impls that used to need enttyId can still call > > > entity.getId() if we pass in the whole entity > > > 7. Alternatively, if we really want to get callsites to call > > > getSubscopedCreds directly on a PolarisStorageIntegration object, maybe > > > it's possible to model the persistence cooperation in subclasses of > > > PolarisStorageIntegration itself, but we still need a way to > distinguish > > > between whether a caller is *constructing* a new StorageIntegration or > > > *using* an entity-linked StorageIntegration > > > > > > On Wed, Apr 1, 2026 at 1:34 PM Tornike Gurgenidze < > > [email protected]> > > > wrote: > > > > > > > Hi all, > > > > > > > > I'd like to bring up a refactoring effort around credential vending > > that > > > > I've been working on in PR #3699 > > > > <https://github.com/apache/polaris/pull/3699>. Dmitri has been > > providing > > > > feedback and helping a lot along the way, but I wanted to open this > up > > > for > > > > broader discussion before iterating further. > > > > > > > > Motivation > > > > > > > > The current credential vending flow is deeply entangled with the > > > > persistence layer. When a client requests scoped credentials (e.g. > for > > > S3, > > > > GCS, or Azure), the request goes through: StorageCredentialsVendor -> > > > > PolarisCredentialVendor -> MetaStoreManager -> persistence layer -> > > back > > > > out through PolarisStorageIntegrationProvider. This means credential > > > > vending re-loads entities from persistence even though the caller > > already > > > > has them, and MetaStoreManager implementations are burdened with > > > credential > > > > vending logic that doesn't belong in persistence. > > > > > > > > Overall, the sheer amount of complexity and the amount of layers that > > > > credential vending flow goes through makes further changes > particularly > > > > challenging as evidenced by some recent efforts around cache key > > > > generation, storage info resolution, additional storage backends and > so > > > on. > > > > > > > > What the PR does > > > > > > > > 1. Removes credential vending from MetaStoreManager. The > > > > PolarisCredentialVendor interface, StorageCredentialsVendor, and > > > > getSubscopedCredsForEntity() implementations are removed from > > > > MetaStoreManager. This cleans up both the transactional and NoSQL > > > backends. > > > > > > > > 2. Moves orchestration into StorageAccessConfigProvider. This > > > > application-scoped bean now directly resolves the storage integration > > and > > > > delegates to it, cutting out the persistence round-trip. > > > > > > > > 3. Moves caching into storage integrations. Each > > > PolarisStorageIntegration > > > > subclass (AWS, GCP, Azure) now owns its StorageCredentialCache > > > interaction > > > > and builds cloud-specific cache keys, rather than using a > > > one-size-fits-all > > > > key. > > > > > > > > I'd appreciate any feedback on the overall direction, concerns about > > API > > > > compatibility in polaris-core, or suggestions for how to best land > > these > > > > changes. > > > > > > > > Thanks, > > > > Tornike > > > > > > > > > >
