codeant-ai-for-open-source[bot] commented on code in PR #37200:
URL: https://github.com/apache/superset/pull/37200#discussion_r2763126225


##########
superset/mcp_service/utils/token_utils.py:
##########
@@ -0,0 +1,414 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Token counting and response size utilities for MCP service.
+
+This module provides utilities to estimate token counts and generate smart
+suggestions when responses exceed configured limits. This prevents large
+responses from overwhelming LLM clients like Claude Desktop.
+"""
+
+from __future__ import annotations
+
+import logging
+from typing import Any, Dict, List, Union
+
+from pydantic import BaseModel
+from typing_extensions import TypeAlias
+
+logger = logging.getLogger(__name__)
+
+# Type alias for MCP tool responses (Pydantic models, dicts, lists, strings, bytes)
+ToolResponse: TypeAlias = Union[BaseModel, Dict[str, Any], List[Any], str, bytes]
+
+# Approximate characters per token for estimation
+# Claude tokenizer averages ~4 chars per token for English text
+# JSON tends to be more verbose, so we use a slightly lower ratio
+CHARS_PER_TOKEN = 3.5
+
+
+def estimate_token_count(text: str | bytes) -> int:
+    """
+    Estimate the token count for a given text.
+
+    Uses a character-based heuristic since we don't have direct access to
+    the actual tokenizer. This is conservative to avoid underestimating.
+
+    Args:
+        text: The text to estimate tokens for (string or bytes)
+
+    Returns:
+        Estimated number of tokens
+    """
+    if isinstance(text, bytes):
+        text = text.decode("utf-8", errors="replace")
+
+    # Simple heuristic: ~3.5 characters per token for JSON/code
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def estimate_response_tokens(response: ToolResponse) -> int:
+    """
+    Estimate token count for an MCP tool response.
+
+    Handles various response types including Pydantic models, dicts, and strings.
+
+    Args:
+        response: The response object to estimate
+
+    Returns:
+        Estimated number of tokens
+    """
+    try:
+        from superset.utils import json
+
+        # Convert response to JSON string for accurate estimation
+        if hasattr(response, "model_dump"):
+            # Pydantic model
+            response_str = json.dumps(response.model_dump())
+        elif isinstance(response, (dict, list)):
+            response_str = json.dumps(response)

Review Comment:
   **Suggestion:** In the token estimation for tool responses, bytes are manually decoded with the default UTF-8 settings, so non‑UTF‑8 or partially invalid byte payloads will raise UnicodeDecodeError and force the function into the conservative 100,000‑token fallback instead of using the more robust bytes handling already implemented in estimate_token_count. [possible bug]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ estimate_response_tokens may overestimate binary responses.
   - ⚠️ ResponseSizeGuardMiddleware may wrongly block some responses.
   - ⚠️ Affects tools returning raw bytes (binary payloads).
   ```
   </details>
   
   ```suggestion
               elif isinstance(response, str):
                   response_str = response
               elif isinstance(response, bytes):
                    # Let estimate_token_count handle decoding of raw bytes safely
                   return estimate_token_count(response)
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Start MCP service with the PR code where token utilities are loaded from superset/mcp_service/utils/token_utils.py and the function estimate_response_tokens is defined (approx. lines 63-93 in that file).
   
   2. Cause an MCP tool handler (the new ResponseSizeGuardMiddleware described in the PR) to call estimate_response_tokens(...) with a response that is bytes containing invalid/non-UTF-8 sequences instead of a JSON string (the call site is the middleware which uses this util to estimate tokens; the util code is in token_utils.py: estimate_response_tokens).
   
   3. In estimate_response_tokens (token_utils.py: ~82-89), the code reaches the branch `elif isinstance(response, (str, bytes)):` and calls response.decode() without error handling; Python will raise UnicodeDecodeError for invalid byte sequences.
   
   4. The UnicodeDecodeError propagates to the try/except in estimate_response_tokens, which catches and logs it, then returns the conservative fallback of 100000 tokens — causing the middleware to treat the response as oversized and block it. This shows the current behavior; improved code delegates bytes decoding to estimate_token_count to avoid raising here.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/utils/token_utils.py
   **Line:** 84:85
   **Comment:**
        *Possible Bug: In the token estimation for tool responses, bytes are manually decoded with the default UTF-8 settings, so non‑UTF‑8 or partially invalid byte payloads will raise UnicodeDecodeError and force the function into the conservative 100,000‑token fallback instead of using the more robust bytes handling already implemented in estimate_token_count.
   
   Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
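
   A minimal standalone sketch of the flagged behavior (illustrative only, not code from the PR): strict `decode()` raises on malformed bytes, while the `errors="replace"` path already used by estimate_token_count degrades gracefully.
   
   ```python
   # Illustrative payload that is not valid UTF-8
   payload = b"\xff\xfe raw binary payload"
   
   try:
       payload.decode()  # strict UTF-8 decode, as in the flagged branch
   except UnicodeDecodeError as exc:
       print(f"strict decode raises: {exc}")
   
   # The lenient decode used by estimate_token_count never raises; invalid
   # bytes become U+FFFD replacement characters, so the chars-per-token
   # heuristic still gets a usable length.
   text = payload.decode("utf-8", errors="replace")
   print(int(len(text) / 3.5))  # rough estimate, mirroring CHARS_PER_TOKEN
   ```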



##########
superset/mcp_service/utils/token_utils.py:
##########
@@ -0,0 +1,414 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Token counting and response size utilities for MCP service.
+
+This module provides utilities to estimate token counts and generate smart
+suggestions when responses exceed configured limits. This prevents large
+responses from overwhelming LLM clients like Claude Desktop.
+"""
+
+from __future__ import annotations
+
+import logging
+from typing import Any, Dict, List, Union
+
+from pydantic import BaseModel
+from typing_extensions import TypeAlias
+
+logger = logging.getLogger(__name__)
+
+# Type alias for MCP tool responses (Pydantic models, dicts, lists, strings, bytes)
+ToolResponse: TypeAlias = Union[BaseModel, Dict[str, Any], List[Any], str, bytes]
+
+# Approximate characters per token for estimation
+# Claude tokenizer averages ~4 chars per token for English text
+# JSON tends to be more verbose, so we use a slightly lower ratio
+CHARS_PER_TOKEN = 3.5
+
+
+def estimate_token_count(text: str | bytes) -> int:
+    """
+    Estimate the token count for a given text.
+
+    Uses a character-based heuristic since we don't have direct access to
+    the actual tokenizer. This is conservative to avoid underestimating.
+
+    Args:
+        text: The text to estimate tokens for (string or bytes)
+
+    Returns:
+        Estimated number of tokens
+    """
+    if isinstance(text, bytes):
+        text = text.decode("utf-8", errors="replace")
+
+    # Simple heuristic: ~3.5 characters per token for JSON/code
+    return int(len(text) / CHARS_PER_TOKEN)
+
+
+def estimate_response_tokens(response: ToolResponse) -> int:
+    """
+    Estimate token count for an MCP tool response.
+
+    Handles various response types including Pydantic models, dicts, and strings.
+
+    Args:
+        response: The response object to estimate
+
+    Returns:
+        Estimated number of tokens
+    """
+    try:
+        from superset.utils import json
+
+        # Convert response to JSON string for accurate estimation
+        if hasattr(response, "model_dump"):
+            # Pydantic model
+            response_str = json.dumps(response.model_dump())
+        elif isinstance(response, (dict, list)):
+            response_str = json.dumps(response)
+        elif isinstance(response, (str, bytes)):
+            response_str = response if isinstance(response, str) else response.decode()
+        else:
+            response_str = str(response)
+
+        return estimate_token_count(response_str)
+    except Exception as e:  # noqa: BLE001
+        logger.warning("Failed to estimate response tokens: %s", e)
+        # Return a high estimate to be safe (conservative fallback)
+        return 100000
+
+
+def get_response_size_bytes(response: ToolResponse) -> int:
+    """
+    Get the size of a response in bytes.
+
+    Args:
+        response: The response object
+
+    Returns:
+        Size in bytes
+    """
+    try:
+        from superset.utils import json
+
+        if hasattr(response, "model_dump"):
+            response_str = json.dumps(response.model_dump())
+        elif isinstance(response, (dict, list)):
+            response_str = json.dumps(response)
+        elif isinstance(response, bytes):
+            return len(response)
+        elif isinstance(response, str):
+            return len(response.encode("utf-8"))
+        else:
+            response_str = str(response)
+
+        return len(response_str.encode("utf-8"))
+    except Exception as e:  # noqa: BLE001
+        logger.warning("Failed to get response size: %s", e)
+        # Return a conservative large value to avoid allowing oversized responses
+        # to bypass size checks (returning 0 would underestimate)
+        return 1_000_000  # 1MB fallback
+
+
+def extract_query_params(params: Dict[str, Any] | None) -> Dict[str, Any]:
+    """
+    Extract relevant query parameters from tool params for suggestions.
+
+    Args:
+        params: The tool parameters dict
+
+    Returns:
+        Extracted parameters relevant for size reduction suggestions
+    """
+    if not params:
+        return {}
+
+    # Handle nested request object (common pattern in MCP tools)
+    if "request" in params and isinstance(params["request"], dict):
+        params = params["request"]
+
+    # Keys to extract from params
+    extract_keys = [
+        # Pagination
+        "page_size",
+        "limit",
+        # Column selection
+        "select_columns",
+        "columns",
+        # Filters
+        "filters",
+        # Search
+        "search",
+    ]
+    return {k: params[k] for k in extract_keys if k in params}
+
+
+def generate_size_reduction_suggestions(
+    tool_name: str,
+    params: Dict[str, Any] | None,
+    estimated_tokens: int,
+    token_limit: int,
+    response: ToolResponse | None = None,
+) -> List[str]:
+    """
+    Generate smart suggestions for reducing response size.
+
+    Analyzes the tool and parameters to provide actionable recommendations.
+
+    Args:
+        tool_name: Name of the MCP tool
+        params: The tool parameters
+        estimated_tokens: Estimated token count of the response
+        token_limit: Configured token limit
+        response: Optional response object for additional analysis
+
+    Returns:
+        List of suggestion strings
+    """

Review Comment:
   **Suggestion:** In size-reduction suggestions, the code multiplies page_size/limit directly by a float, assuming the value is numeric, but these parameters come from external input and may be strings or other types, which will raise a TypeError and prevent suggestions from being generated instead of failing gracefully. [type error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Suggestion generation can raise TypeError and fail.
   - ⚠️ format_size_limit_error may error during error formatting.
   - ⚠️ ResponseSizeGuardMiddleware suggestions become unavailable.
   ```
   </details>
   
   ```suggestion
           try:
               numeric_page_size = (
                    int(current_page_size) if current_page_size is not None else None
               )
           except (TypeError, ValueError):
               numeric_page_size = None
           if numeric_page_size and numeric_page_size > 0:
               # Calculate suggested new limit based on reduction needed
               suggested_limit = max(
                   1,
                   int(numeric_page_size * (token_limit / estimated_tokens))
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Prepare a call that results in generate_size_reduction_suggestions(...) being invoked in superset/mcp_service/utils/token_utils.py (function present at approx. lines 163-258). This happens during formatting of size-limit errors (format_size_limit_error) when the system analyzes params.
   
   2. Pass params with a page_size value coming from external input as a string (e.g., params = {"page_size": "100"}) — this is realistic because HTTP/query parameters and some RPC callers often provide numeric values as strings.
   
   3. generate_size_reduction_suggestions reads current_page_size = "100" and reaches the expression int(current_page_size * (token_limit / estimated_tokens)). Multiplying a str by a float raises TypeError.
   
   4. The TypeError is not caught within generate_size_reduction_suggestions, so it propagates back to format_size_limit_error / middleware, breaking suggestion generation and potentially causing the middleware or its caller to error instead of returning a helpful message.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/utils/token_utils.py
   **Line:** 182:184
   **Comment:**
        *Type Error: In size-reduction suggestions, the code multiplies page_size/limit directly by a float, assuming the value is numeric, but these parameters come from external input and may be strings or other types, which will raise a TypeError and prevent suggestions from being generated instead of failing gracefully.
   
   Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
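
   A minimal standalone sketch reproducing the flagged TypeError with illustrative values (not code from the PR), followed by the coercion guard the suggestion proposes:
   
   ```python
   # Query/RPC parameters frequently arrive as strings
   current_page_size = "100"
   token_limit, estimated_tokens = 25_000, 50_000
   
   try:
       int(current_page_size * (token_limit / estimated_tokens))  # str * float
   except TypeError as exc:
       print(f"current expression raises: {exc}")
   
   # Guarded version: coerce first, fall back to None on bad input
   try:
       numeric_page_size = int(current_page_size)
   except (TypeError, ValueError):
       numeric_page_size = None
   
   if numeric_page_size and numeric_page_size > 0:
       suggested_limit = max(1, int(numeric_page_size * (token_limit / estimated_tokens)))
       print(suggested_limit)  # 50
   ```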



##########
superset/mcp_service/utils/token_utils.py:
##########
@@ -0,0 +1,414 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Token counting and response size utilities for MCP service.
+
+This module provides utilities to estimate token counts and generate smart
+suggestions when responses exceed configured limits. This prevents large
+responses from overwhelming LLM clients like Claude Desktop.
+"""
+
+from __future__ import annotations
+
+import logging
+from typing import Any, Dict, List, Union
+
+from pydantic import BaseModel
+from typing_extensions import TypeAlias
+
+logger = logging.getLogger(__name__)
+
+# Type alias for MCP tool responses (Pydantic models, dicts, lists, strings, bytes)
+ToolResponse: TypeAlias = Union[BaseModel, Dict[str, Any], List[Any], str, bytes]
+
+# Approximate characters per token for estimation
+# Claude tokenizer averages ~4 chars per token for English text
+# JSON tends to be more verbose, so we use a slightly lower ratio
+CHARS_PER_TOKEN = 3.5
+
+
+def estimate_token_count(text: str | bytes) -> int:
+    """
+    Estimate the token count for a given text.
+
+    Uses a character-based heuristic since we don't have direct access to
+    the actual tokenizer. This is conservative to avoid underestimating.
+
+    Args:
+        text: The text to estimate tokens for (string or bytes)
+
+    Returns:
+        Estimated number of tokens
+    """
+    if isinstance(text, bytes):
+        text = text.decode("utf-8", errors="replace")
+
+    # Simple heuristic: ~3.5 characters per token for JSON/code
+    return int(len(text) / CHARS_PER_TOKEN)

Review Comment:
   **Suggestion:** The heuristic token counter can return 0 tokens for any non‑empty text shorter than CHARS_PER_TOKEN characters due to integer truncation, which contradicts the "conservative" expectation and can propagate zero into downstream logic that assumes a positive token count. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Token estimates for very short responses become zero.
   - ⚠️ generate_size_reduction_suggestions uses fallback branches.
   - ⚠️ Error messages / suggestions may be misleading.
   ```
   </details>
   
   ```suggestion
       text_length = len(text)
       if text_length == 0:
           return 0
       return max(1, int(text_length / CHARS_PER_TOKEN))
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Call estimate_token_count(text) in superset/mcp_service/utils/token_utils.py (function defined at lines ~45-62) with a short non-empty string, e.g. text = "OK" or text = "a".
   
   2. The function computes int(len(text) / CHARS_PER_TOKEN). For len("a") == 1 and CHARS_PER_TOKEN == 3.5 this yields int(0.285...) == 0, so estimate_token_count returns 0.
   
   3. The calling code (for example generate_size_reduction_suggestions in token_utils.py: ~163-258 and format_size_limit_error later in the same file) checks `if estimated_tokens` in places; a 0 value is falsy and causes code paths that assume "no estimate" to be taken, producing misleading suggestions or using fallback behavior.
   
   4. This demonstrates a real misestimation observable by unit tests that call estimate_token_count() with very short strings and then by invoking functions that consume that value (generate_size_reduction_suggestions/format_size_limit_error).
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/utils/token_utils.py
   **Line:** 61:62
   **Comment:**
        *Logic Error: The heuristic token counter can return 0 tokens for any non‑empty text shorter than CHARS_PER_TOKEN characters due to integer truncation, which contradicts the "conservative" expectation and can propagate zero into downstream logic that assumes a positive token count.
   
   Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
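
   A minimal standalone sketch of the truncation-to-zero behavior and the max(1, ...) floor the suggestion proposes (illustrative only, not code from the PR):
   
   ```python
   CHARS_PER_TOKEN = 3.5
   
   # Current behavior: any non-empty text shorter than CHARS_PER_TOKEN
   # truncates to zero tokens, which is falsy downstream.
   print(int(len("OK") / CHARS_PER_TOKEN))  # 0
   
   # Suggested behavior: floor non-empty estimates at one token.
   def estimate(text: str) -> int:
       return max(1, int(len(text) / CHARS_PER_TOKEN)) if text else 0
   
   print(estimate("OK"))  # 1
   print(estimate(""))    # 0
   ```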



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

