Thanks for the ping, I verified that the updates match what I had in mind, thanks for all the rounds of updates, and sorry it dragged on for so long!
Approved the PR. On Tue, Jun 2, 2026 at 3:45 PM Dmitri Bourlatchkov <[email protected]> wrote: > Hi Dennis, > > I've approved Tornike's PR [3699] in GH. As far as I can tell the PR is > aligned with previous discussions regarding credential vending SPI changes. > > Please post your review too. > > [3699] https://github.com/apache/polaris/pull/3699 > > Thanks, > Dmitri. > > > On Thu, Apr 9, 2026 at 6:44 PM Dennis Huo <[email protected]> wrote: > > > Thanks for updating the PR! I'll dive into the details there. And > generally > > I think we can be flexible on the specifics to do whatever we find to be > > cleanest as long as the described use case (lease/commit offloaded > > credential vending) is preserved. > > > > Your suggestion about passing the List<PolarisEntity> sounds good to me; > it > > aligns with plans for more advanced composition of inheritance in the > > general path hierarchy instead of assuming that the base CatalogEntity is > > the only authoritative definition of storage and connection > configuration. > > > > On Sat, Apr 4, 2026 at 2:23 AM Tornike Gurgenidze < > [email protected]> > > wrote: > > > > > Hi Dennis, > > > > > > Thanks a lot for such a detailed review. I understand the use case and > > the > > > reasons for the current design much better now. > > > > > > 1. Solidify some variation of PolarisCredentialVendor as the interface > > > > > > definition for *using* a PolarisStorageIntegration to get a credential. > > > > This doesn't *need* to be implemented by monolithic MetaStoreManager > > > impls, > > > > but it still needs to exist as an optionally persistence-aware > > interface > > > > > > > > > That sounds good to me, we can bring the interface back. iiuc, oss impl > > > will no longer have anything to do with MetastoreManager, while custom > > impl > > > will be free to integrate with IntegrationPersistence. We will simply > > have > > > to keep PolarisCredentialVendor as an SPI + relevant methods in > > > IntegrationPersistence around even if unused in oss. Let's iterate on > the > > > actual interface signature in the PR. > > > > > > 2. Solidify some variation of PolarisStorageIntegrationProvider as the > > > > > > interface definition for *constructing/leasing* a new > > > > PolarisStorageIntegration strictly for CreateCatalog (or Create for > any > > > > > > entity that might hold a StorageConfig) > > > > > > > > > Makes sense. PR changes PolarisStorageIntegrations to be singletons > (one > > > per integration type) and PolarisStorageIntegration objects accept > > > StorageConfig directly. I'm assuming that will probably be too awkward > > with > > > config/integration separation you described in mind. I can change oss > > > PolarisStorageIntegrationProvider impl to serve per-storageConfig > > > singletons instead. That way we can avoid including StorageConfig into > > the > > > StorageIntegration method signature. > > > > > > 3. Maybe reconcile those two "using" and "constructing" methods under a > > > > single unified interface that is still injectable > > > > > > I'm sure I won't be able to come up with a good enough name for the > > > combined SPI :) > > > > > > 6. We can evolve the PolarisCredentialVendor method interface to take a > > > > whole PolarisEntity instead of entityId, type, etc., to potentially > > > avoid a > > > > re-lookup. Such a syntactic change is still *semantically* compatible > > > with > > > > the SPI because any impls that used to need enttyId can still call > > > > entity.getId() if we pass in the whole entity > > > > > > > > > Let's iterate on the PR. I think a single PolarisEntity might not be > > > sufficient unfortunately because of the storage name override PR ( > > > https://github.com/apache/polaris/pull/4023). The resolution there > > > effectively depends on constructing StorageConfig from several entities > > > (plus application config). Current implementation in this PR hoped to > > make > > > that easy by bypassing PolarisEntity altogether and passing constructed > > > StorageConfig directly, but now I understand that will complicate > > > persistence for your custom CredentialVendor implementation. Maybe > > passing > > > List<PolarisEntity> for the whole resolved path instead of a single > > > "resolved" PolarisEntity is the way to go? > > > > > > 7. Alternatively, if we really want to get callsites to call > > > > > > getSubscopedCreds directly on a PolarisStorageIntegration object, maybe > > > > it's possible to model the persistence cooperation in subclasses of > > > > PolarisStorageIntegration itself, but we still need a way to > > distinguish > > > > between whether a caller is *constructing* a new StorageIntegration > or > > > > > > *using* an entity-linked StorageIntegration. > > > > > > > > > I think that would make the design more coupled and complicated, > probably > > > not worth it. > > > > > > Once again, thanks for a thorough review. I'll start adapting the PR > with > > > these in mind. > > > > > > On Fri, Apr 3, 2026 at 4:15 AM Dennis Huo <[email protected]> wrote: > > > > > > > Thanks for starting this discussion! Some refactoring does seem > helpful > > > in > > > > this area. However, I think there are some elements of the currently > > > > entrenched SPIs to discuss. Specifically, the coordination with the > > > > persistence layer is an intentional feature, even though we should > > > probably > > > > encapsulate it to simplify the cases where it's not needed by the > > > > persistence layer. Reposting some of my comments from the PR: > > > > > > > > Though we should indeed be able to *decouple* the various interfaces > > > pulled > > > > in/aggregated by `PolarisMetaStoreManager`: > > > > > > > > PolarisSecretsManager, > > > > PolarisGrantManager, > > > > PolarisCredentialVendor, > > > > PolarisPolicyMappingManager, > > > > PolarisEventManager, > > > > PolarisMetricsManager > > > > > > > > those interfaces were specifically written as part of SPIs, and at > > least > > > > the same *functionality* provided by those extensibility points needs > > to > > > be > > > > preserved more strictly than syntax evolution of those interfaces, > > > unless a > > > > much deeper attempt is made amongst Polaris users to agree on a > > migration > > > > path off of such functionality. > > > > > > > > There's some discussion about the list of core SPIs in here: > > > > https://lists.apache.org/thread/0nj24zro7kyctqfnlml08ppo7zs9xcqs > > > > > > > > The way that applies to this PR is that the interaction with the > > > > persistence layer is intentional, especially for the > > > > TransactionalMetaStoreManager path. > > > > > > > > The basic use case if if a custom Polaris deployment uses subclasses > > that > > > > leverage those SPIs in order to perform additional book-keeping > around > > > > storage-integration functionality and needs that to be transactional > > with > > > > entity creation/updates. This is best highlighted by thinking about > the > > > > interaction between the subset of `PolarisCredentialVendor` method > > > > implementations with `IntegrationPersistence` which has these > methods: > > > > > > > > createStorageIntegration > > > > persistStorageIntegrationIfNeeded > > > > loadPolarisStorageIntegration > > > > > > > > These are in the interface specifically to define the SPI by which > > users > > > > customize a "lease, commit, use" model for secondary metadata > necessary > > > for > > > > StorageIntegrations. In this current PR, it's not clear what happens > to > > > > those methods now -- will > > > > `IntegrationPersistence::loadPolarisStorageIntegration` just be > > ignored? > > > > > > > > Overall, I think the refactoring on the credential caching layer > should > > > > still be possible, while preserving the `PolarisCredentialVendor` > > > interface > > > > so that implementations of `PolarisCredentialVendor` still work > > > correctly. > > > > > > > > The use case to preserve is as follows: > > > > > > > > 1. Suppose the running Polaris service doesn't just use a > static/fixed > > > > source IAM User from environment variables, but instead maintains a > > > *pool* > > > > of different source-IAM-user credentials in an offloaded sidecar > > > > 2. Each new Catalog/StorageConfig created is intended to lease a new > > > > identity from the *pool* and keep track of that lease atomically with > > the > > > > CatalogEntity > > > > 3. The CreateCatalog flow as traced through > > > > `TransactionalMetaStoreManagerImpl.createcatalog` is as follows: > > > > a. Prepare catalog entity info in-memory > > > > b. Call > IntegrationPersistence.createStorageIntegrationInCurrentTxn > > > -- > > > > this will call the sidecar credential-pool layer to *lease* a new IAM > > > user > > > > by reference > > > > c. Start *transaction* for Polaris persistence layer > > > > d. Lookups to validate current transaction state > > > > e. In the same transaction that persists the new CatalogEntity, > > > > calls persistStorageIntegrationIfNeededInCurrentTxn - this may be > > custom > > > to > > > > different Polaris service owners, where a book-keeping table is > updated > > > > cooperatively with the sidecar credential-pool layer to consume the > > > leased > > > > IAM user and store the fact that the new CatalogEntity's entityId is > > > > tightly coupled to that leased IAM user > > > > f. Commit transaction all in one piece > > > > 4. The getSubscopedCreds flow would then be as follows: > > > > a. Load entity again based on id (this is indeed probably > redundant > > > and > > > > we could probably refactor this) > > > > b. Call > > > > IntegrationPersistence.loadPolarisStorageIntegrationInCurrentTxn with > > the > > > > retrieved linked PolarisEntity > > > > c. The custom service instance can have an impl of > > > > IntegrationPersistence here that uses the *stateful* book-keeping of > > the > > > > leased credential-pool object to return a handle into that sidecar > > > > credential-pool > > > > d. The StorageIntegration returned may not be the same thing as > the > > > > original object from "createStorageIntegrationInCurrentTxn" - it > might > > > be a > > > > handle to a temporary authenticated connection into the sidecar > > > > credential-vendor, for example > > > > e. This new stateful and entity-coupled PolarisStorageIntegration > > is > > > > ultimately used to getSubscopedCreds > > > > > > > > Today, I think the main cause of confusion is that we only have > > > > "placeholder" implementations in the main repo of > > > > loadPolarisStorageIntegrationInCurrentTxn that conflate > > "lease/construct > > > > StorageIntegration" with "fetch a vending-ready StorageIntegration". > > This > > > > was because credential-pooling semantics tend to be specific to each > > > > individual Polaris service-runner, and it'd be a lot of complexity to > > > have > > > > a "toy" implementation of credential-pooling for single-tenant use > > cases > > > > that don't need it. > > > > > > > > I think the overall flow could still be preserved while performing > most > > > of > > > > the proposed refactors if we: > > > > > > > > 1. Solidify some variation of PolarisCredentialVendor as the > interface > > > > definition for *using* a PolarisStorageIntegration to get a > credential. > > > > This doesn't *need* to be implemented by monolithic MetaStoreManager > > > impls, > > > > but it still needs to exist as an optionally persistence-aware > > interface > > > > 2. Solidify some variation of PolarisStorageIntegrationProvider as > the > > > > interface definition for *constructing/leasing* a new > > > > PolarisStorageIntegration strictly for CreateCatalog (or Create for > any > > > > entity that might hold a StorageConfig) > > > > 3. Maybe reconcile those two "using" and "constructing" methods > under a > > > > single unified interface that is still injectable > > > > 4. If we want some persistence types like the NoSql stack to always > > stay > > > > away from being in the business of facilitating credential vending, > the > > > > callsite should obtain a PolarisCredentialVendor from some injection > or > > > > factory method -- the default PolarisCredentialVendor can indeed be > the > > > > current dumb in-memory "Construct from application-level configs > > > in-memory > > > > and vend immediately" implementation. > > > > 5. Persistence impls that need stateful cooperation with a credential > > > pool > > > > can still configure to provide their existing > > "CustomMetaStoreManagerImpl > > > > implements PolarisCredentialVendor" as the thing that's returned to > > > > callsites wanting to get credentials > > > > 6. We can evolve the PolarisCredentialVendor method interface to > take a > > > > whole PolarisEntity instead of entityId, type, etc., to potentially > > > avoid a > > > > re-lookup. Such a syntactic change is still *semantically* compatible > > > with > > > > the SPI because any impls that used to need enttyId can still call > > > > entity.getId() if we pass in the whole entity > > > > 7. Alternatively, if we really want to get callsites to call > > > > getSubscopedCreds directly on a PolarisStorageIntegration object, > maybe > > > > it's possible to model the persistence cooperation in subclasses of > > > > PolarisStorageIntegration itself, but we still need a way to > > distinguish > > > > between whether a caller is *constructing* a new StorageIntegration > or > > > > *using* an entity-linked StorageIntegration > > > > > > > > On Wed, Apr 1, 2026 at 1:34 PM Tornike Gurgenidze < > > > [email protected]> > > > > wrote: > > > > > > > > > Hi all, > > > > > > > > > > I'd like to bring up a refactoring effort around credential vending > > > that > > > > > I've been working on in PR #3699 > > > > > <https://github.com/apache/polaris/pull/3699>. Dmitri has been > > > providing > > > > > feedback and helping a lot along the way, but I wanted to open this > > up > > > > for > > > > > broader discussion before iterating further. > > > > > > > > > > Motivation > > > > > > > > > > The current credential vending flow is deeply entangled with the > > > > > persistence layer. When a client requests scoped credentials (e.g. > > for > > > > S3, > > > > > GCS, or Azure), the request goes through: StorageCredentialsVendor > -> > > > > > PolarisCredentialVendor -> MetaStoreManager -> persistence layer -> > > > back > > > > > out through PolarisStorageIntegrationProvider. This means > credential > > > > > vending re-loads entities from persistence even though the caller > > > already > > > > > has them, and MetaStoreManager implementations are burdened with > > > > credential > > > > > vending logic that doesn't belong in persistence. > > > > > > > > > > Overall, the sheer amount of complexity and the amount of layers > that > > > > > credential vending flow goes through makes further changes > > particularly > > > > > challenging as evidenced by some recent efforts around cache key > > > > > generation, storage info resolution, additional storage backends > and > > so > > > > on. > > > > > > > > > > What the PR does > > > > > > > > > > 1. Removes credential vending from MetaStoreManager. The > > > > > PolarisCredentialVendor interface, StorageCredentialsVendor, and > > > > > getSubscopedCredsForEntity() implementations are removed from > > > > > MetaStoreManager. This cleans up both the transactional and NoSQL > > > > backends. > > > > > > > > > > 2. Moves orchestration into StorageAccessConfigProvider. This > > > > > application-scoped bean now directly resolves the storage > integration > > > and > > > > > delegates to it, cutting out the persistence round-trip. > > > > > > > > > > 3. Moves caching into storage integrations. Each > > > > PolarisStorageIntegration > > > > > subclass (AWS, GCP, Azure) now owns its StorageCredentialCache > > > > interaction > > > > > and builds cloud-specific cache keys, rather than using a > > > > one-size-fits-all > > > > > key. > > > > > > > > > > I'd appreciate any feedback on the overall direction, concerns > about > > > API > > > > > compatibility in polaris-core, or suggestions for how to best land > > > these > > > > > changes. > > > > > > > > > > Thanks, > > > > > Tornike > > > > > > > > > > > > > > >
