GitHub user dosubot[bot] added a comment to the discussion: RLS condition inconsistently applied inside dataset metric subqueries
<!-- Answer --> Your hypothesis is close, but the root cause is slightly different than you might expect. Sqlglot **does** detect `tableau_brand` as an `exp.Table` node in your metric subquery — the issue is in the **RLS predicate lookup**, not in table detection. ## How RLS Predicate Matching Actually Works When `apply_rls()` processes your metric subquery `SELECT * FROM tableau_brand`, it: 1. Extracts the table name `"tableau_brand"` via `traverse_scope()` [[1]](https://github.com/apache/superset/blob/829e4d92d91ceae4b43b1ed3b063ffe45377799c/superset/sql/parse.py#L1316-L1330) 2. Looks up RLS predicates by querying `SqlaTable` with an **exact name match** [[2]](https://github.com/apache/superset/blob/829e4d92d91ceae4b43b1ed3b063ffe45377799c/superset/utils/rls.py#L85-L102): ```python dataset = db.session.query(SqlaTable).filter( and_( SqlaTable.database_id == database.id, catalog_predicate, SqlaTable.schema == table.schema, SqlaTable.table_name == table.table, # exact match on "tableau_brand" ) ).one_or_none() if not dataset: return [] # No predicates found → no RLS injection return [str(predicate.compile(...)) for predicate in dataset.get_sqla_row_level_filters()] ``` ## Why RLS Isn't Injected in Your Case The most likely explanation: **your RLS rules are associated with the underlying physical tables** (the tables referenced *inside* the virtual dataset's SQL), not with the `tableau_brand` virtual dataset record itself. When the metric subquery says `FROM tableau_brand`: - Sqlglot sees it as `exp.Table(name="tableau_brand")` - The lookup finds the `SqlaTable` record for `tableau_brand` - But if no RLS rules are directly associated with that `SqlaTable` record → empty predicates → no injection **There is no recursive resolution.** Superset does NOT: - Detect that `tableau_brand` is a virtual dataset - Parse its underlying SQL to find physical tables - Cascade RLS rules from those physical tables into the metric subquery This is an architectural limitation — `apply_rls()` receives only `(database, catalog, schema, parsed_statement)` and has no access to the dataset object or its virtual SQL [[3]](https://github.com/apache/superset/blob/829e4d92d91ceae4b43b1ed3b063ffe45377799c/superset/utils/rls.py#L32-L36). ## Why the Outer Query Gets RLS Correctly For the main dataset query, Superset uses a completely different code path: `get_sqla_row_level_filters()` is called directly on the dataset object [[4]](https://github.com/apache/superset/blob/829e4d92d91ceae4b43b1ed3b063ffe45377799c/superset/models/helpers.py#L1928), which knows its own RLS rules. The outer query RLS is applied via SQLAlchemy filter expressions at the ORM level, not through the sqlglot-based `apply_rls()` mechanism. ## The `kpi_groupe` Difference If `kpi_groupe` appears to work, check whether: - It has RLS rules directly associated with the virtual dataset record itself (not just underlying tables) - Or the consistent results are coincidental (the metric doesn't filter out rows that RLS would affect) ## Debugging To confirm which `exp.Table` nodes are detected and what predicates are resolved, you can add temporary logging in `superset/utils/rls.py` at the `apply_rls()` function [[5]](https://github.com/apache/superset/blob/829e4d92d91ceae4b43b1ed3b063ffe45377799c/superset/utils/rls.py#L47-L54): ```python for table in parsed_statement.tables: logger.debug(f"RLS: detected table={table}, predicates={get_predicates_for_table(table, database, catalog)}") ``` ## Recommended Workarounds 1. **Associate RLS rules directly with the `tableau_brand` virtual dataset** — this way the exact name match succeeds and predicates are injected into the subquery. 2. **Reference the physical table name in your metric subqueries** instead of the virtual dataset name — e.g., `SELECT SUM(value_2::int) FROM physical_table_name AS sub WHERE ...`. Physical table RLS rules will then apply. 3. **Avoid subqueries in metrics entirely** — restructure as simple aggregations (`SUM(CASE WHEN type = 'ts' THEN value_2::int END)`) and let the dataset's main query handle RLS uniformly at the outer level. 4. **Embed security in the virtual dataset SQL** using Jinja: ```sql SELECT * FROM physical_brands WHERE {{ filter_values('warehouse_id') }} ``` Option 3 is the safest architectural pattern — it ensures RLS is applied once, consistently, at the dataset query level without relying on recursive subquery injection. <!-- Dosu Comment Footer --> *To reply, just mention [@dosu](https://go.dosu.dev/dosubot?utm_source=github&utm_medium=bot-comment&utm_campaign=github-comment-footer-20260415&utm_content=reply-with-mention&utm_term=apache%2Fsuperset).* --- Docs are dead. Just use [Dosu](https://dosu.dev?utm_source=github&utm_medium=bot-comment&utm_campaign=github-comment-footer-20260415&utm_content=static-docs-tagline&utm_term=apache%2Fsuperset). [](https://app.dosu.dev/response-feedback?utm_source=github&utm_medium=bot-comment&utm_campaign=github-comment-footer-20260415&utm_content=static-docs-feedback&utm_term=apache%2Fsuperset&message_id=ce0dd4d6-adea-4f5c-907f-a73b12522bef) [](https://github.dosu.com/apache/superset?utm_source=github&utm_medium=bot-comment&utm_campaign=github-comment-footer-20260415&utm_content=static-docs-ask-repo&utm_term=apache%2Fsuperset) [](https://app.dosu.dev/signup?referrer=openSource&source=github-footer&utm_source=github&utm_medium=bot-comment&utm_campaign=github-comment-footer-20260415&utm_content=static-docs-share-team&utm_term=apache%2Fsuperset) GitHub link: https://github.com/apache/superset/discussions/40400#discussioncomment-17040759 ---- This is an automatically sent email for [email protected]. To unsubscribe, please send an email to: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
