codeant-ai-for-open-source[bot] commented on code in PR #37200:
URL: https://github.com/apache/superset/pull/37200#discussion_r2760256987
##########
superset/mcp_service/middleware.py:
##########
@@ -755,3 +759,167 @@ def _filter_response(self, response: Any, object_type: str, user: Any) -> Any:
             "Unknown response type for field filtering: %s", type(response)
         )
         return response
+
+
+class ResponseSizeGuardMiddleware(Middleware):
+    """
+    Middleware that prevents oversized responses from overwhelming LLM clients.
+
+    When a tool response exceeds the configured token limit, this middleware
+    intercepts it and returns a helpful error message with suggestions for
+    reducing the response size.
+
+    This is critical for protecting LLM clients like Claude Desktop which can
+    crash or become unresponsive when receiving extremely large responses.
+
+    Configuration via MCP_RESPONSE_SIZE_CONFIG in superset_config.py:
+    - enabled: Toggle the guard on/off (default: True)
+    - token_limit: Maximum estimated tokens per response (default: 25,000)
+    - warn_threshold_pct: Log warnings above this % of limit (default: 80%)
+    - excluded_tools: Tools to skip checking
+    """
+
+    def __init__(
+        self,
+        token_limit: int = DEFAULT_TOKEN_LIMIT,
+        warn_threshold_pct: int = DEFAULT_WARN_THRESHOLD_PCT,
+        excluded_tools: list[str] | None = None,
+    ) -> None:
+        self.token_limit = token_limit
+        self.warn_threshold_pct = warn_threshold_pct
+        self.warn_threshold = int(token_limit * warn_threshold_pct / 100)
+        self.excluded_tools = set(excluded_tools or [])
+
+    async def on_call_tool(
+        self,
+        context: MiddlewareContext,
+        call_next: Callable[[MiddlewareContext], Awaitable[Any]],
+    ) -> Any:
+        """Check response size after tool execution."""
+        tool_name = getattr(context.message, "name", "unknown")
+
+        # Skip excluded tools
+        if tool_name in self.excluded_tools:
+            return await call_next(context)
+
+        # Execute the tool
+        response = await call_next(context)
+
+        # Estimate response token count (guard against huge responses causing OOM)
+        from superset.mcp_service.utils.token_utils import (
+            estimate_response_tokens,
+            format_size_limit_error,
+        )
+
+        try:
+            estimated_tokens = estimate_response_tokens(response)
+        except MemoryError as me:
+            logger.warning(
+                "MemoryError while estimating tokens for %s: %s", tool_name, me
+            )
+            # Treat as over limit to avoid further serialization
+            estimated_tokens = self.token_limit + 1
+        except Exception as e:  # noqa: BLE001
+            logger.warning(
+                "Failed to estimate response tokens for %s: %s", tool_name, e
+            )
+            # Conservative fallback: block rather than risk OOM
+            estimated_tokens = self.token_limit + 1
+
+        # Log warning if approaching limit
+        if estimated_tokens > self.warn_threshold:
+            logger.warning(
+                "Response size warning for %s: ~%d tokens (%.0f%% of %d limit)",
+                tool_name,
+                estimated_tokens,
+                (estimated_tokens / self.token_limit) * 100,
Review Comment:
**Suggestion:** When the token limit is misconfigured to 0 or a negative value, the percentage calculation in the warning log divides by zero, raising ZeroDivisionError at runtime and failing the tool call instead of merely logging a warning. [possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ MCP tool calls via ResponseSizeGuardMiddleware may crash.
- ⚠️ Warning logging for large responses becomes a runtime exception.
```
</details>
```suggestion
            # Avoid division by zero if token_limit is misconfigured to 0 or negative
            if self.token_limit > 0:
                percent_of_limit = (estimated_tokens / self.token_limit) * 100
            else:
                percent_of_limit = 0
            logger.warning(
                "Response size warning for %s: ~%d tokens (%.0f%% of %d limit)",
                tool_name,
                estimated_tokens,
                percent_of_limit,
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Add or modify MCP_RESPONSE_SIZE_CONFIG so token_limit is 0 (Flask app config).
   - File: superset/mcp_service/mcp_config.py (or runtime Flask config), then ensure
     create_response_size_guard_middleware() is used during service startup.
   - Reference: create_response_size_guard_middleware in superset/mcp_service/middleware.py
     (lines 894-915 in PR hunk), which passes token_limit into ResponseSizeGuardMiddleware.
2. ResponseSizeGuardMiddleware is constructed with token_limit=0.
   - Constructor: ResponseSizeGuardMiddleware.__init__ in superset/mcp_service/middleware.py
     (lines ~782-792 in PR hunk) sets self.token_limit to 0 and computes warn_threshold.
3. Trigger any MCP tool call that produces a response (normal operation).
   - Execution path: ResponseSizeGuardMiddleware.on_call_tool in
     superset/mcp_service/middleware.py (lines ~793-881). After the tool runs,
     estimated_tokens > self.warn_threshold (warn_threshold == 0) becomes True.
4. Logging executes the warning calculation and divides by zero:
   - The percentage expression (estimated_tokens / self.token_limit) at
     superset/mcp_service/middleware.py line ~835 in PR hunk raises ZeroDivisionError,
     causing the middleware (and the tool call) to raise instead of just logging.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/middleware.py
**Line:** 831:835
**Comment:**
Possible Bug: When the token limit is misconfigured to 0 or a negative value, the
percentage calculation in the warning log will perform a division by zero, raising a
runtime ZeroDivisionError and breaking tool calls instead of just logging a warning.
Validate the correctness of the flagged issue. If correct, how can I resolve this?
If you propose a fix, implement it and please make it concise.
```
</details>
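The failure mode can be reproduced in isolation. The following is a standalone sketch that mirrors the middleware's arithmetic (the names are illustrative, not the actual Superset code):

```python
# Standalone sketch of the flagged failure mode: with token_limit=0, the warn
# threshold is also 0, so any non-empty response enters the warning branch and
# the percentage expression divides by zero.
token_limit = 0                               # misconfigured limit
warn_threshold = int(token_limit * 80 / 100)  # == 0, so every response "warns"
estimated_tokens = 1

try:
    if estimated_tokens > warn_threshold:
        pct = (estimated_tokens / token_limit) * 100  # ZeroDivisionError here
    outcome = "logged warning"
except ZeroDivisionError:
    outcome = "tool call crashed"

print(outcome)  # -> tool call crashed
```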
##########
superset/mcp_service/middleware.py:
##########
@@ -755,3 +759,167 @@ def _filter_response(self, response: Any, object_type:
str, user: Any) -> Any:
"Unknown response type for field filtering: %s", type(response)
)
return response
+
+
+class ResponseSizeGuardMiddleware(Middleware):
+ """
+ Middleware that prevents oversized responses from overwhelming LLM clients.
+
+ When a tool response exceeds the configured token limit, this middleware
+ intercepts it and returns a helpful error message with suggestions for
+ reducing the response size.
+
+ This is critical for protecting LLM clients like Claude Desktop which can
+ crash or become unresponsive when receiving extremely large responses.
+
+ Configuration via MCP_RESPONSE_SIZE_CONFIG in superset_config.py:
+ - enabled: Toggle the guard on/off (default: True)
+ - token_limit: Maximum estimated tokens per response (default: 25,000)
+ - warn_threshold_pct: Log warnings above this % of limit (default: 80%)
+ - excluded_tools: Tools to skip checking
+ """
+
+ def __init__(
+ self,
+ token_limit: int = DEFAULT_TOKEN_LIMIT,
+ warn_threshold_pct: int = DEFAULT_WARN_THRESHOLD_PCT,
+ excluded_tools: list[str] | None = None,
+ ) -> None:
+ self.token_limit = token_limit
+ self.warn_threshold_pct = warn_threshold_pct
+ self.warn_threshold = int(token_limit * warn_threshold_pct / 100)
+ self.excluded_tools = set(excluded_tools or [])
Review Comment:
**Suggestion:** The excluded_tools configuration is converted to a set directly, so if a single string is provided instead of a list, it is split into individual characters and the intended tool name will never match, causing the guard to run for tools that were meant to be excluded. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Excluded tools processed by ResponseSizeGuardMiddleware unexpectedly.
- ⚠️ Intended exclusions (e.g., health_check) ignored.
- ⚠️ Administrators may misconfigure exclusion semantics.
```
</details>
```suggestion
        # Normalise excluded_tools so a single string doesn't get treated as characters
        if isinstance(excluded_tools, str):
            excluded = [excluded_tools]
        else:
            excluded = excluded_tools or []
        self.excluded_tools = set(excluded)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Provide a single string for excluded_tools in MCP_RESPONSE_SIZE_CONFIG (incorrect shape).
   - e.g. MCP_RESPONSE_SIZE_CONFIG = {"excluded_tools": "list_charts"} in Flask config.
   - create_response_size_guard_middleware reads this and passes it into the constructor
     (superset/mcp_service/middleware.py lines ~894-915).
2. ResponseSizeGuardMiddleware.__init__ receives excluded_tools as a str.
   - Constructor: ResponseSizeGuardMiddleware.__init__ in superset/mcp_service/middleware.py
     (lines ~782-792). It executes self.excluded_tools = set(excluded_tools or []),
     converting the string into a set of characters.
3. The tool exclusion check tests membership against that set of characters.
   - on_call_tool does "if tool_name in self.excluded_tools: return await call_next(context)"
     at superset/mcp_service/middleware.py lines ~799-804; the full tool name (e.g.
     "list_charts") will not match any single character, so the tool is not excluded.
4. Result: a tool intended to be skipped is processed by the guard (possible blocking).
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/middleware.py
**Line:** 791:791
**Comment:**
Logic Error: The excluded_tools configuration is converted to a set directly, so if a
single string is provided instead of a list, it is split into individual characters and
the intended tool name will never match, causing the guard to run for tools that were
meant to be excluded.
Validate the correctness of the flagged issue. If correct, how can I resolve this?
If you propose a fix, implement it and please make it concise.
```
</details>
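The character-splitting behaviour is easy to confirm in isolation. The following standalone sketch (not the actual Superset code) shows both the current behaviour and the effect of the suggested normalisation:

```python
# Standalone sketch: set() over a string yields its individual characters,
# so a misconfigured excluded_tools="list_charts" never matches the tool name.
excluded_tools = "list_charts"          # wrong shape: str instead of list[str]
excluded = set(excluded_tools or [])    # what the constructor currently does
print("list_charts" in excluded)        # -> False (exclusion silently ignored)
print("l" in excluded)                  # -> True (only characters are members)

# With the suggested normalisation, the full tool name matches:
if isinstance(excluded_tools, str):
    normalized = {excluded_tools}
else:
    normalized = set(excluded_tools or [])
print("list_charts" in normalized)      # -> True
```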
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.