codeant-ai-for-open-source[bot] commented on code in PR #40473:
URL: https://github.com/apache/superset/pull/40473#discussion_r3328098579


##########
superset/mcp_service/dashboard/schemas.py:
##########
@@ -288,6 +288,33 @@ def validate_search_and_filters(self) -> 
"ListDashboardsRequest":
         return self
 
 
+DEFAULT_GET_DASHBOARD_INFO_COLUMNS: List[str] = [
+    "id",
+    "dashboard_title",
+    "slug",
+    "description",
+    "certified_by",
+    "certification_details",
+    "published",
+    "is_managed_externally",
+    "external_url",
+    "created_on",
+    "changed_on",
+    "uuid",
+    "url",
+    "created_on_humanized",
+    "changed_on_humanized",
+    "chart_count",
+    "tags",
+    "charts",
+    "native_filters",
+    "cross_filters_enabled",
+    "is_permalink_state",
+    "permalink_key",
+    "filter_state",

Review Comment:
   **Suggestion:** The default dashboard projection still includes 
`filter_state`, which is the large permalink payload this change intended to 
avoid by default. When `permalink_key` is used, this can bloat responses back 
into truncation by `ResponseSizeGuardMiddleware`, so remove `filter_state` from 
the default list and require callers to opt in explicitly when they need it. 
[performance]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ get_dashboard_info with permalink returns oversized, truncation-prone 
payloads.
   - ⚠️ ResponseSizeGuardMiddleware truncates dashboard info for complex 
filters.
   - ⚠️ LLM agents receive partial dashboard filter context unexpectedly.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Inspect the default projection for dashboard info in
   `superset/mcp_service/dashboard/schemas.py:22-46`. The
   `DEFAULT_GET_DASHBOARD_INFO_COLUMNS` list includes `"filter_state"` as one 
of the default
   top-level fields (line 45 in that file, line 314 in the PR hunk).
   
   2. Inspect how `GetDashboardInfoRequest.select_columns` is populated in
   `superset/mcp_service/dashboard/schemas.py:73-95`. The field uses 
`default_factory=lambda:
   list(DEFAULT_GET_DASHBOARD_INFO_COLUMNS)` and the 
`@field_validator("select_columns",
   mode="before")` returns `list(DEFAULT_GET_DASHBOARD_INFO_COLUMNS)` whenever 
the incoming
   value is `None` or parses to an empty list, so any call that omits 
`select_columns`
   receives a default that *includes* `"filter_state"`.
   
   3. Follow the execution of the MCP tool in
   `superset/mcp_service/dashboard/tool/get_dashboard_info.py:107-118` and 
`:163-175` plus
   `:251-254`. FastMCP calls `get_dashboard_info(request: 
GetDashboardInfoRequest, ctx:
   Context)`, which uses `ModelGetInfoCore` to produce a `DashboardInfo` 
instance and then
   serializes it via `result.model_dump(mode="json", context={"select_columns":
   request.select_columns})`. Because `DashboardInfo` implements 
`_filter_fields_by_context`
   at `superset/mcp_service/dashboard/schemas.py:75-90` (lines 75-90 of that 
class), the
   presence of `"filter_state"` in `request.select_columns` guarantees that 
`filter_state` is
   included in the serialized payload whenever the field is non-null.
   
   4. Examine how `filter_state` is populated and used when permalink state is 
present.
   `_apply_permalink_state` in
   `superset/mcp_service/dashboard/tool/get_dashboard_info.py:75-89` builds a 
payload from
   `result.model_dump(mode="python")`, sets `payload["permalink_key"] = 
permalink_key`, and
   sets `payload["filter_state"]` to a sanitized copy of the permalink state 
dict (which can
   contain nested `dataMask`, `activeTabs`, etc.), then revalidates it into 
`DashboardInfo`.
   The `DashboardInfo.filter_state` field in
   `superset/mcp_service/dashboard/schemas.py:47-63` is a `Dict[str, Any]` and 
is expected to
   contain the full permalink state. The unit test
   `test_get_dashboard_info_permalink_does_not_double_sanitize` in
   
`tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py:214-250` 
and
   `:440-50` confirms that, for a call `client.call_tool("get_dashboard_info", 
{"request":
   {"identifier": 1, "permalink_key": "permalink-1"}})` with no 
`select_columns` provided,
   the response includes a deeply nested `filter_state` structure with repeated 
string values
   (e.g. labels, URLs, filters). Since `select_columns` defaults to
   `DEFAULT_GET_DASHBOARD_INFO_COLUMNS`, this large `filter_state` is *always* 
present by
   default whenever a permalink is used. That enlarged payload is then passed 
through
   `ResponseSizeGuardMiddleware` 
(`superset/mcp_service/middleware.py:1158-1200` and
   `:1228-95`), which estimates token size and truncates or blocks responses 
that exceed the
   token limit for info tools (see tests
   `test_middleware_response_truncation_for_get_dashboard_info` and related 
cases in
   `tests/unit_tests/mcp_service/test_middleware.py:676-693` where 
`context.message.name =
   "get_dashboard_info"` and a large payload triggers truncation). Thus, a real 
MCP call to
   `get_dashboard_info` with only `identifier` and `permalink_key` and no 
explicit
   `select_columns` will, for dashboards with large permalink state, include 
the full
   `filter_state` by default, significantly increasing the response size and 
making
   truncation by `ResponseSizeGuardMiddleware` more likely than if 
`filter_state` were
   omitted from the default projection and only included when explicitly 
requested.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=44b7505f1d8f4322a6bc0cce4e0d4a1f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=44b7505f1d8f4322a6bc0cce4e0d4a1f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/dashboard/schemas.py
   **Line:** 314:314
   **Comment:**
        *Performance: The default dashboard projection still includes 
`filter_state`, which is the large permalink payload this change intended to 
avoid by default. When `permalink_key` is used, this can bloat responses back 
into truncation by `ResponseSizeGuardMiddleware`, so remove `filter_state` from 
the default list and require callers to opt in explicitly when they need it.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40473&comment_hash=d359f61f5d93488c12bf83c6464b358236d53200645d5484f05a51e9e34069d2&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40473&comment_hash=d359f61f5d93488c12bf83c6464b358236d53200645d5484f05a51e9e34069d2&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to