dosubot[bot] commented on issue #40465: URL: https://github.com/apache/superset/issues/40465#issuecomment-4554592705
Hi @fl0-m! I'm [Dosu](https://go.dosu.dev/dosubot) and I’m helping the superset team.

Great bug report. I can confirm both issues trace directly to the streaming export path introduced in PR #35478 [[1]](https://github.com/apache/superset/pull/35478). The root cause is that `_execute_query_and_stream` in `superset/commands/streaming_export/base.py` executes raw SQL via `connection.execute(text(sql))` without going through the standard pipeline that non-streaming exports use via `Database.get_df()` [[2]](https://github.com/apache/superset/blob/829e4d92d91ceae4b43b1ed3b063ffe45377799c/superset/models/core.py#L679-L682). Specifically:

1. **No `mutate_sql_based_on_config()`**: the non-streaming path parses SQL into statements and calls this for each one, which is where trailing semicolons get stripped and sqlglot reformatting happens. The streaming path skips this entirely.
2. **No user context on engine acquisition**: the streaming path calls `get_sqla_engine()` without passing catalog/schema/user context, whereas the standard path uses parameterized engine acquisition to enable impersonation headers.

**Related prior fix (partial):** PR #40194 [[3]](https://github.com/apache/superset/pull/40194) addressed a related gap: PostgreSQL prequeries (e.g., `SET search_path`) were also not running on the streaming connection. That PR moved prequery execution into SQLAlchemy `connect` events on `get_sqla_engine()`. However, it did **not** address SQL mutation or user impersonation.

**Potential fix direction:** There's a unified SQL execution API introduced in PRs #36529/#36739 [[4]](https://github.com/apache/superset/pull/36529) [[5]](https://github.com/apache/superset/pull/36739) that properly applies `mutate_sql_based_on_config()` and other preprocessing.
Integrating the streaming path with that API (or at minimum calling `mutate_sql_based_on_config()` on the SQL and passing `user_name` to the engine context) would resolve both bugs while preserving the memory-efficient streaming architecture.

As a short-term workaround for Bug 1, a `SQL_QUERY_MUTATOR` that strips trailing semicolons would prevent the Trino crash, but it won't address Bug 2 (impersonation bypass).
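
The short-term workaround mentioned above could be sketched roughly as follows, as an entry in `superset_config.py`. This is a minimal sketch under assumptions, not a verified fix: the helper name `strip_trailing_semicolons` is hypothetical, and the `(sql, **kwargs)` hook signature matches recent Superset defaults as far as I know but may differ in older releases. Whether the streaming export path actually invokes the mutator is also not confirmed here, since the analysis above suggests that path skips `mutate_sql_based_on_config()`, so this should be tested against a streaming export before relying on it.

```python
from typing import Any


def strip_trailing_semicolons(sql: str) -> str:
    """Hypothetical helper: drop trailing semicolons and whitespace.

    Only characters at the very end of the string are removed, so a
    semicolon inside a quoted literal is left alone as long as the
    statement ends with the closing quote.
    """
    return sql.rstrip("; \t\r\n")


def SQL_QUERY_MUTATOR(sql: str, **kwargs: Any) -> str:
    """Config hook (assumed signature): normalize SQL before execution."""
    return strip_trailing_semicolons(sql)
```

For example, `SQL_QUERY_MUTATOR("SELECT 1;\n")` returns `"SELECT 1"`. As noted in the comment above, this would only touch the trailing-semicolon crash and does nothing for the impersonation bypass.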
