codeant-ai-for-open-source[bot] commented on code in PR #40473:
URL: https://github.com/apache/superset/pull/40473#discussion_r3328098579
##########
superset/mcp_service/dashboard/schemas.py:
##########
@@ -288,6 +288,33 @@ def validate_search_and_filters(self) ->
"ListDashboardsRequest":
return self
+DEFAULT_GET_DASHBOARD_INFO_COLUMNS: List[str] = [
+ "id",
+ "dashboard_title",
+ "slug",
+ "description",
+ "certified_by",
+ "certification_details",
+ "published",
+ "is_managed_externally",
+ "external_url",
+ "created_on",
+ "changed_on",
+ "uuid",
+ "url",
+ "created_on_humanized",
+ "changed_on_humanized",
+ "chart_count",
+ "tags",
+ "charts",
+ "native_filters",
+ "cross_filters_enabled",
+ "is_permalink_state",
+ "permalink_key",
+ "filter_state",
Review Comment:
**Suggestion:** The default dashboard projection still includes
`filter_state`, which is the large permalink payload this change intended to
avoid by default. When `permalink_key` is used, this can bloat responses back
into truncation by `ResponseSizeGuardMiddleware`, so remove `filter_state` from
the default list and require callers to opt in explicitly when they need it.
[performance]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ get_dashboard_info with permalink returns oversized, truncation-prone
payloads.
- ⚠️ ResponseSizeGuardMiddleware truncates dashboard info for complex
filters.
- ⚠️ LLM agents receive partial dashboard filter context unexpectedly.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Inspect the default projection for dashboard info in
`superset/mcp_service/dashboard/schemas.py:22-46`. The
`DEFAULT_GET_DASHBOARD_INFO_COLUMNS` list includes `"filter_state"` as one
of the default
top-level fields (line 45 in that file, line 314 in the PR hunk).
2. Inspect how `GetDashboardInfoRequest.select_columns` is populated in
`superset/mcp_service/dashboard/schemas.py:73-95`. The field uses
`default_factory=lambda:
list(DEFAULT_GET_DASHBOARD_INFO_COLUMNS)` and the
`@field_validator("select_columns",
mode="before")` returns `list(DEFAULT_GET_DASHBOARD_INFO_COLUMNS)` whenever
the incoming
value is `None` or parses to an empty list, so any call that omits
`select_columns`
receives a default that *includes* `"filter_state"`.
3. Follow the execution of the MCP tool in
`superset/mcp_service/dashboard/tool/get_dashboard_info.py:107-118` and
`:163-175` plus
`:251-254`. FastMCP calls `get_dashboard_info(request:
GetDashboardInfoRequest, ctx:
Context)`, which uses `ModelGetInfoCore` to produce a `DashboardInfo`
instance and then
serializes it via `result.model_dump(mode="json", context={"select_columns":
request.select_columns})`. Because `DashboardInfo` implements
`_filter_fields_by_context`
at `superset/mcp_service/dashboard/schemas.py:75-90` (lines 75-90 of that
class), the
presence of `"filter_state"` in `request.select_columns` guarantees that
`filter_state` is
included in the serialized payload whenever the field is non-null.
4. Examine how `filter_state` is populated and used when permalink state is
present.
`_apply_permalink_state` in
`superset/mcp_service/dashboard/tool/get_dashboard_info.py:75-89` builds a
payload from
`result.model_dump(mode="python")`, sets `payload["permalink_key"] =
permalink_key`, and
sets `payload["filter_state"]` to a sanitized copy of the permalink state
dict (which can
contain nested `dataMask`, `activeTabs`, etc.), then revalidates it into
`DashboardInfo`.
The `DashboardInfo.filter_state` field in
`superset/mcp_service/dashboard/schemas.py:47-63` is a `Dict[str, Any]` and
is expected to
contain the full permalink state. The unit test
`test_get_dashboard_info_permalink_does_not_double_sanitize` in
`tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py:214-250`
and
`:440-50` confirms that, for a call `client.call_tool("get_dashboard_info",
{"request":
{"identifier": 1, "permalink_key": "permalink-1"}})` with no
`select_columns` provided,
the response includes a deeply nested `filter_state` structure with repeated
string values
(e.g. labels, URLs, filters). Since `select_columns` defaults to
`DEFAULT_GET_DASHBOARD_INFO_COLUMNS`, this large `filter_state` is *always*
present by
default whenever a permalink is used. That enlarged payload is then passed
through
`ResponseSizeGuardMiddleware`
(`superset/mcp_service/middleware.py:1158-1200` and
`:1228-95`), which estimates token size and truncates or blocks responses
that exceed the
token limit for info tools (see tests
`test_middleware_response_truncation_for_get_dashboard_info` and related
cases in
`tests/unit_tests/mcp_service/test_middleware.py:676-693` where
`context.message.name =
"get_dashboard_info"` and a large payload triggers truncation). Thus, a real
MCP call to
`get_dashboard_info` with only `identifier` and `permalink_key` and no
explicit
`select_columns` will, for dashboards with large permalink state, include
the full
`filter_state` by default, significantly increasing the response size and
making
truncation by `ResponseSizeGuardMiddleware` more likely than if
`filter_state` were
omitted from the default projection and only included when explicitly
requested.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=44b7505f1d8f4322a6bc0cce4e0d4a1f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=44b7505f1d8f4322a6bc0cce4e0d4a1f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/dashboard/schemas.py
**Line:** 314:314
**Comment:**
*Performance: The default dashboard projection still includes
`filter_state`, which is the large permalink payload this change intended to
avoid by default. When `permalink_key` is used, this can bloat responses back
into truncation by `ResponseSizeGuardMiddleware`, so remove `filter_state` from
the default list and require callers to opt in explicitly when they need it.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40473&comment_hash=d359f61f5d93488c12bf83c6464b358236d53200645d5484f05a51e9e34069d2&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40473&comment_hash=d359f61f5d93488c12bf83c6464b358236d53200645d5484f05a51e9e34069d2&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]