GitHub user mohamedsigservice added a comment to the discussion: RLS condition 
inconsistently applied inside dataset metric subqueries

@dosu

Thanks, this clarifies a lot.

I would like more details about how sqlglot detects tables/subqueries during 
RLS injection.

For example:

* Does Superset/sqlglot traverse the full AST recursively and inject RLS into 
every `SELECT` node?
* What exactly is considered a "table reference"?
* Are CTEs (`WITH` clauses), scalar subqueries, correlated subqueries, and 
derived tables all treated the same way?
* Is the behavior based only on `FROM/JOIN` detection?
* Why would some metric expressions trigger recursive RLS injection while 
others do not?

I am trying to better understand which SQL patterns are considered safe vs 
unsafe when creating dataset metrics.

For example, are these cases handled differently?

```sql
SUM(amount)
```

vs

```sql
(
  SELECT SUM(s.amount)
  FROM sales s
  WHERE s.product_id = main.product_id
)
```

vs

```sql
SUM(
  CASE WHEN EXISTS (
    SELECT 1
    FROM stock st
    WHERE st.product_id = product_id
  )
  THEN qty
END)
```

Also:

* Does sqlglot resolve aliases/scopes before injecting RLS?
* Is there a way to inspect/debug the transformed AST or the SQL after RLS 
injection but before execution?
* Does the sqlglot/RLS parsing happen before or after Jinja templating is 
rendered?

For example, if a metric contains Jinja-generated SQL, is the AST built from 
the raw template or from the final rendered SQL after Jinja execution?

Thanks.


GitHub link: 
https://github.com/apache/superset/discussions/40400#discussioncomment-17039345

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: 
[email protected]