Hi Dennis,

Interesting proposal!

I left a few comments in the doc. From my POV, we first need to clarify the
high-level requirements. I do not insist on using the requirements approach,
but the doc already uses that terminology. Alternatively, we could focus on
the use cases Open Sharing enables. In any case, I believe it would be
beneficial to draw a clean line between what we would like to achieve and
how we achieve it.

I tend to think we need to distinguish Consumption aspects from Governance
aspects. The former will generally be an Engine-side concern, while the
latter will be the data owner's concern. Different actors will be involved
on each side of this line.

On the Consumption side, using IRC does seem like the best solution
nowadays.

On the Governance side, I believe some aspects of this proposal can benefit
from more discussion. In particular:

* It is not clear whether / how the Polaris Authorizer is involved on the
  "shared" access path. Are new authorizable operations expected to be
  added to the Authorizer SPI?
* It is not clear whether all (external) Share users have the same or
  different access rights to the shared tables.
* It is not clear how external IdPs can be integrated. The doc mentions
  OIDC_FEDERATION, but not how it relates to ExternalConsumer. Where / how
  is OIDC federation configured?
* It would be nice to delineate Authentication aspects from Authorization
  aspects in Shares/Listings. I hope this will make it easier to evolve the
  system by adding different AuthN methods later.
* In the case of AWS S3, what controls the role (plus External ID, etc.)
  used for minting STS session (vended) credentials? Are they supposed to
  be the same or different for different external clients?
* If a view is shared, how can we be sure its SQL definition is resolvable
  (in terms of following table names/paths) in the context of a share?

Thanks,
Dmitri.

On Thu, May 28, 2026 at 1:46 AM Dennis Huo <[email protected]> wrote:

> Hi All,
>
> One big emerging enterprise use case coming up as more people consolidate
> Data Lakehouses and Catalogs is something commonly known as "Data Sharing",
> and more specifically over the course of adoption of open table formats
> "Open Data Sharing".
>
> Examples of existing managed service providers' Data Sharing features:
>
> https://www.databricks.com/product/delta-sharing
> https://docs.snowflake.com/en/user-guide/data-sharing-intro
> https://docs.aws.amazon.com/redshift/latest/dg/datashare-overview.html
> https://docs.cloud.google.com/bigquery/docs/analytics-hub-introduction
> https://learn.microsoft.com/en-us/fabric/governance/external-data-sharing-overview
>
> The basic idea is that when you share data between different companies, you
> need a first-class governance/management layer and extra bells-and-whistles
> that are distinct from just the basic capabilities of RBAC or generalized
> access-control (i.e. if you're sharing across partially-untrusted org
> boundaries, you don't just let the consumer organization log into your
> datalake like one of your own employees).
>
> JB and I put together this high-level proposal for supporting Open Sharing
> in Polaris:
>
> https://docs.google.com/document/d/1Y0yQi5iWbmuTHPkFiIs7WjIiC3EXJTl1PzZ-wtoRnZ0/edit?usp=sharing
>
> Tentatively, it means adding ~5 logical data model constructs, some of
> which may be a first-class PolarisEntity type, others subtypes of existing
> entities, and others just a nested construct:
>
> - ShareEntity (would behave similarly to a Catalog)
> - ExternalConsumer (mostly inherits from Principal)
> - Listing (Similar to a "role grant" but has different metadata)
> - EndpointConfig (nested config under Listing)
> - ShareMembership (Similar to a "securable grant" but different
>   metadata)
>
> Feedback/comments welcome! I'll also bring it up for live discussion if
> there's time in the community sync.
>
> Cheers,
> Dennis
>
