Hi Romain,

I actually think code modularization is part of the discussion here.

More modules don't necessarily lead to better code. They also introduce additional complexity, especially around dependency management. Common issues include circular dependencies and module proliferation. For example, if two modules end up sharing common classes, we may have to introduce yet another module just to hold those shared classes so the original modules can remain decoupled. That can easily lead to module proliferation without providing much real value.

If modules were a silver bullet for code organization, we wouldn't still rely on Java packages and classes to structure code within a module. In practice, modules are just one tool, and I think they should be introduced when they solve a concrete problem, rather than becoming the default design choice.

Yufei

On Fri, Jun 26, 2026 at 4:41 PM Yufei Gu <[email protected]> wrote:

> Hi Alex,
>
> I'm not sure the security vulnerability argument is really related to modularization.
>
> A CVE is generally reported against the Polaris project as a whole, regardless of whether a particular API lives in one module or another. By that logic, we could make the same argument against almost any PR that introduces new functionality, since any new code could potentially contain a security issue in the future.
>
> I think the more relevant question is whether the API is conceptually part of the Polaris core model. If we agree it is, then the possibility of future vulnerabilities doesn't seem like a strong reason to split it into a separate module.
>
> Yufei
>
> On Fri, Jun 26, 2026 at 8:47 AM Alexandre Dutra <[email protected]> wrote:
>
>> Hi Romain,
>>
>> You bring up a good point about docker images: in a world where features are modularized, what the default image should contain is indeed up for debate. But I don't see this as an argument against modularization, e.g. I could see us providing 2 flavors: a "thin" one with just the essential stuff, then a "full" one with "all the things".
>>
>> We also discussed [1] an assembly tool for Polaris. Such a tool would lower the barrier for creating custom Polaris distros.
>>
>> About H2: I'd say that's slightly different because H2 is a dependency, not a Polaris module. But yes, in general we should not ship dependencies if they are not useful for a majority of users. In the case of H2, as you know, we've been leaning towards having it by default in the official image because it improves the onboarding experience [2].
>>
>> Thanks,
>> Alex
>>
>> [1]: https://lists.apache.org/thread/gd7s3dgqqr5olm5go5wst998cogk05n4
>> [2]: https://lists.apache.org/thread/yw8l026g2smdk7gdg7k61tdcvdwcncqw
>>
>> On Fri, Jun 26, 2026 at 2:57 PM Romain Manni-Bucau <[email protected]> wrote:
>> >
>> > I think there are two levels:
>> >
>> > * code itself -> I don't think there is a debate about modularity there; it is easier to integrate, refactor, potentially drop, etc.
>> > * docker image -> while I agree it is better to have an adjusted bundle, it is also true that end users will want a supported runtime, so the default is the real question, and being forced to build a custom distro defeats the default build and increases support work. Also note this is true for the JDBC driver, so H2 must not come in the default image following the "minimal surface" logic. So my 2cts would be to get something in between, with a promotion logic that moves a feature into the default build once it is mature enough.
>> >
>> > Hope it makes sense
>> >
>> > Romain Manni-Bucau
>> > @rmannibucau <https://x.com/rmannibucau> | .NET Blog <https://dotnetbirdie.github.io/> | Blog <https://rmannibucau.github.io/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/en-us/product/java-ee-8-high-performance-9781788473064>
>> > Javaccino founder (Java/.NET service - contact via linkedin)
>> >
>> > On Fri, Jun 26, 2026 at 1:11 PM Alexandre Dutra <[email protected]> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I am not a fan of gating an entire API behind a feature flag.
>> > >
>> > > Another reason not mentioned yet is: if a security vulnerability is detected in the new code, and that code is shipped unconditionally in polaris-runtime-service, then all deployments of that artifact will be flagged by security scans, regardless of whether they opted out of it via the feature flag. If the CVE targets a separate module instead, only users of that module would be affected.
>> > >
>> > > That is another reason why I think isolating the API in its own module is a better design choice.
>> > >
>> > > Thanks,
>> > > Alex
>> > >
>> > > On Fri, Jun 26, 2026 at 1:02 AM Yufei Gu <[email protected]> wrote:
>> > >
>> > > > Hi Dmitri,
>> > > >
>> > > > Thanks for the clarification.
>> > > >
>> > > > Could you elaborate on why having an empty HTTP layer is a concern for downstream systems? If the feature is disabled, couldn't we simply return a 404 or 501, similar to how Quarkus behaves when an endpoint is not registered?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Yufei
>> > > >
>> > > > On Wed, Jun 24, 2026 at 12:38 PM Dmitri Bourlatchkov <[email protected]> wrote:
>> > > >
>> > > > > Hi Yufei,
>> > > > >
>> > > > > As I commented in this thread earlier, storing OSI data as Polaris entities is a reasonable approach.
>> > > > >
>> > > > > However, adding hard dependencies from `runtime/service` to the new OSI REST API impl. is not acceptable from my POV, as it forces downstream projects into exposing the OSI API without explicit opt-in. Feature flags are not relevant here because they work only after REST requests are accepted at the HTTP layer.
>> > > > >
>> > > > > This is discussed from a more general perspective in [1].
>> > > > >
>> > > > > All in all, I do not see any disadvantage to using separate modules for new REST API implementations, but disadvantages in bundling them into runtime/service do exist.
>> > > > >
>> > > > > [1] https://lists.apache.org/thread/d9dj3w8ktwdn6w27z7tvvgkljgw3n43b
>> > > > >
>> > > > > Cheers,
>> > > > > Dmitri.
>> > > > >
>> > > > > On Tue, Jun 23, 2026 at 8:58 PM Yufei Gu <[email protected]> wrote:
>> > > > >
>> > > > > > Hi Dmitri,
>> > > > > >
>> > > > > > I see the value of keeping Polaris modular, but I have a slightly different view on this particular case.
>> > > > > >
>> > > > > > To me, semantic models are closer to tables, views, and policies than to metrics or events. The proposal introduces a new Polaris entity type with its own lifecycle, authorization model, and metadata management. In that sense, semantic model support is part of the core Polaris metadata model rather than an optional auxiliary capability like metrics.
>> > > > > >
>> > > > > > For that reason, I would lean toward treating semantic models similarly to other Polaris entities and keeping the API as part of the core Polaris service. We already provide a feature flag to disable the functionality, which gives operators and downstream distributions the flexibility to turn it off when it is not needed.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Yufei
>> > > > > >
>> > > > > > On Mon, Jun 22, 2026 at 2:28 PM Dmitri Bourlatchkov <[email protected]> wrote:
>> > > > > >
>> > > > > > > Hi Yufei,
>> > > > > > >
>> > > > > > > Persisting OSI data as Polaris entities sounds reasonable to me.
>> > > > > > >
>> > > > > > > However, I believe the REST API layer for OSI should be structured as a module with opt in/out opportunities for downstream builds (similar to the Metric query API). This is not a feature flag concern, but a point about the composition of the Polaris code. A modular approach promotes code clarity and allows both including the new API in default Polaris images and flexibility for downstream projects. I do not see any downside to the modular approach.
>> > > > > > >
>> > > > > > > Feature flags can certainly be supported in the new API modules.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Dmitri.
>> > > > > > >
>> > > > > > > On Mon, Jun 22, 2026 at 5:00 PM Yufei Gu <[email protected]> wrote:
>> > > > > > >
>> > > > > > > > Anand, thanks for chiming in. Looking forward to working together on it.
>> > > > > > > >
>> > > > > > > > Dmitri, Adam, Adnan, thanks for the clarification. I think we can separate a few concerns here.
>> > > > > > > >
>> > > > > > > > Apache Ossie specifies the OSI model spec itself, but not the CRUD REST endpoints for managing OSI documents in Polaris. Polaris has the opportunity to define those APIs. As Adam mentioned, the validator is intended for Ossie schema validation. That should definitely be version based, so Polaris can validate the submitted document against the corresponding OSI spec version while keeping the REST API contract under Polaris control.
>> > > > > > > >
>> > > > > > > > On the "first class" point, I think Adnan's interpretation is correct. The intent is that a semantic model is a Polaris entity in the same sense as an Iceberg table, view, generic table, or policy. It participates in the Polaris metadata model, authorization model, and lifecycle as a managed entity. In that sense, it is different from metrics or events, which are auxiliary data associated with entities rather than entities themselves.
>> > > > > > > >
>> > > > > > > > On the "always active" point, providing a feature flag makes sense; this is already included in PR 4816. We can run the OSI API by default in the Apache Polaris build, but allow downstream admins to turn it off if they do not need it in their deployment.
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > Yufei
>> > > > > > > >
>> > > > > > > > On Mon, Jun 22, 2026 at 1:37 PM Anand Kumar Sankaran via dev <[email protected]> wrote:
>> > > > > > > >
>> > > > > > > > > JB and Yufei,
>> > > > > > > > >
>> > > > > > > > > Thanks for doing this. We have customers asking for this as well. Happy to help in any way possible.
>> > > > > > > > >
>> > > > > > > > > -
>> > > > > > > > > Anand
>> > > > > > > > >
>> > > > > > > > > From: Adnan Hemani via dev <[email protected]>
>> > > > > > > > > Date: Monday, June 22, 2026 at 12:18 PM
>> > > > > > > > > To: [email protected] <[email protected]>
>> > > > > > > > > Cc: Adnan Hemani <[email protected]>
>> > > > > > > > > Subject: Re: [DISCUSS] Semantic Layer Support in Apache Polaris
>> > > > > > > > >
>> > > > > > > > > Hi Adam, Dmitri, Yufei,
>> > > > > > > > >
>> > > > > > > > > Adding in a clarification: I believe "first class" in the context of OSI would mean that, as a Polaris entity, it is given the same level of importance as a Table or View would be. Is that generally correct?
>> > > > > > > > >
>> > > > > > > > > Best,
>> > > > > > > > > Adnan Hemani
>> > > > > > > > >
>> > > > > > > > > On Mon, Jun 22, 2026 at 10:50 AM Adam Christian <[email protected]> wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Dmitri,
>> > > > > > > > > >
>> > > > > > > > > > This proposal [1] includes a second tab with the detailed design. It shows the REST APIs that handle the CRUD operations for OSI Semantic Models. The Semantic Model will be validated in the OsiDocumentValidator, which I assume will validate against the Apache Ossie version. In my reading, Polaris does not control it; we will leverage the upstream spec.
>> > > > > > > > > >
>> > > > > > > > > > Regarding OSI functionality: if this feature is always active, I assume users would benefit from it being active. If an admin user does not want to leverage OSI inside their Polaris instance, they simply won't grant the privileges to the consuming users.
>> > > > > > > > > >
>> > > > > > > > > > [1] - https://docs.google.com/document/d/1ZdI-1w_5LbyCMhvUhLCtOt-N1Z89L2P-oiGLaYayCZg/edit?usp=sharing
>> > > > > > > > > >
>> > > > > > > > > > On Thu, Jun 18, 2026 at 6:13 PM Dmitri Bourlatchkov <[email protected]> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Yufei,
>> > > > > > > > > > >
>> > > > > > > > > > > Sorry for getting to this proposal late. I posted some comments on PR 4816, recounting the key points here in more detail.
>> > > > > > > > > > >
>> > > > > > > > > > > * From the proposal doc: Goal G1: "Store OSI 0.1.x documents as first-class Polaris entities, scoped under a Namespace"
>> > > > > > > > > > >
>> > > > > > > > > > > I believe this needs a bit more discussion before we proceed to concrete code changes. The idea of persisting OSI data is totally valid. However, I'm not sure what "first class" means in this context. Does it mean that OSI functionality has to be active all the time?
>> > > > > > > > > > >
>> > > > > > > > > > > My initial perception of this proposal is that as a use case it is similar to persisting Metrics (or Events) in Polaris.
>> > > > > > > > > > > That is, the feature is valuable, but downstream projects may want to have the flexibility of deciding whether to include it or not.
>> > > > > > > > > > >
>> > > > > > > > > > > * Another point I'd like to clarify is about the REST API definition. Are API endpoints going to be defined and controlled by the Polaris project?
>> > > > > > > > > > >
>> > > > > > > > > > > * Are REST API payload types defined and controlled by Polaris or by Apache Ossie [1]?
>> > > > > > > > > > >
>> > > > > > > > > > > [1] https://www.mail-archive.com/[email protected]/msg86564.html
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks,
>> > > > > > > > > > > Dmitri.
>> > > > > > > > > > >
>> > > > > > > > > > > On Fri, May 29, 2026 at 6:34 PM Yufei Gu <[email protected]> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi folks,
>> > > > > > > > > > > >
>> > > > > > > > > > > > As AI agents, BI tools, notebooks, and query engines increasingly consume the same data, semantic definitions such as metrics and dimensions are often duplicated across multiple systems. This leads to inconsistent definitions, duplicated effort, and governance challenges.
>> > > > > > > > > > > > The rise of AI agents further amplifies this problem, as agents rely on semantic context to understand data and reason about business concepts. Without a shared semantic layer, organizations often end up maintaining multiple versions of the same business definitions across tools and applications.
>> > > > > > > > > > > >
>> > > > > > > > > > > > JB and I would like to start a discussion on adding semantic layer support to Apache Polaris so semantic models can be defined once, governed centrally, and consumed consistently across tools. The proposal[1] introduces semantic models as a first class Polaris entity using the Open Semantic Interchange (OSI)[2] specification[3]. At a high level, the proposal adds:
>> > > > > > > > > > > >
>> > > > > > > > > > > > - A new SEMANTIC_MODEL entity type
>> > > > > > > > > > > > - CRUD APIs for semantic models
>> > > > > > > > > > > > - Schema validation and authorization
>> > > > > > > > > > > >
>> > > > > > > > > > > > Polaris remains a metadata service and does not execute metrics or semantic queries.
>> > > > > > > > > > > > Feedback on the overall direction, design, and OSI adoption would be greatly appreciated.
>> > > > > > > > > > > >
>> > > > > > > > > > > > 1. https://docs.google.com/document/d/1ZdI-1w_5LbyCMhvUhLCtOt-N1Z89L2P-oiGLaYayCZg/edit?usp=sharing
>> > > > > > > > > > > > 2. https://open-semantic-interchange.org
>> > > > > > > > > > > > 3. https://github.com/open-semantic-interchange/OSI/blob/main/core-spec/spec.md
>> > > > > > > > > > > >
>> > > > > > > > > > > > Yufei
>> > > > > > >
>> > > > > > > --
>> > > > > > > Dmitri Bourlatchkov
>> > > > > > > Senior Staff Software Engineer, Dremio
