Re: [I] [SIP-182] Semantic Layer Support in Apache Superset [superset]

via GitHub Tue, 06 Jan 2026 11:14:04 -0800


barakalon commented on issue #35003:
URL: https://github.com/apache/superset/issues/35003#issuecomment-3715970055


   👋 Hello!
   
   I work on Minerva, Airbnb's in-house semantic layer.
   
   IIUC this proposal would make it _easier_ to use Superset as a thin, 
read-only layer on top of a "headless" semantic layer like Minerva, Cube, etc. 
I think this approach is very practical given that Superset is 
"dataset-centric".
   
   We've been using Superset like this for many years. It works. But it has its 
limitations - mainly, **analytics development lifecycle velocity**.
   
   As an example, let's look at the workflow you mention above:
   1. Power user explores raw data with SQL
   2. Power user defines the semantic layer in an external system (e.g. 
semantic view DDL for Snowflake, or git-based config files in Minerva, Cube, 
MetricFlow, etc.).
   3. Power user registers the semantic layer in Superset
   4. Non-power users build charts on top of the semantic layer
   
   However, there are some complications, including:
   - **Slow iteration** - The semantic layer might have concepts that can't 
easily be tested with raw SQL in step 1, so simply developing a metric spans 
steps 1-4, which involves multiple different systems.
   - **Fragile version control** - If a particular explorable/metric/etc is 
widely used, you have to be careful when changing its definition. Without 
first-class version control in Superset, it can be very hard to test how a 
change might affect existing charts and dashboards.
   
   So you end up with a heavyweight, waterfall-style lifecycle that 
_necessitates_ power users. Many non-power users end up bypassing the semantic 
layer, modeling data directly in Superset's thin semantic layer. This works 
well for isolated, quick analyses but lacks many of the benefits of a more 
powerful semantic layer (collaborative data modeling, preaggregation, 
integration with other applications, etc.).
   
   So I love the direction of this proposal, and I think it's worthwhile, but I 
think it's still fundamentally limited compared to tools like Looker and Tableau, 
which have more sophisticated semantic layers.
   
   I'm new to the Superset OSS community, and this might be tangential to other 
goals, overly ambitious, or already discussed, but instead of keeping 
Superset's semantic layer thin, maybe we should be making it fatter? My dream 
is building Minerva directly into Superset...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


