aminghadersohi commented on code in PR #38562:
URL: https://github.com/apache/superset/pull/38562#discussion_r2930497699


##########
superset/mcp_service/flask_singleton.py:
##########
@@ -53,61 +52,33 @@
         # Use _get_current_object() to get the actual Flask app, not the 
LocalProxy
         app = current_app._get_current_object()
     else:
-        # Either appbuilder is not initialized (standalone MCP server),
-        # or appbuilder is initialized but we're not in an app context
-        # (edge case - should rarely happen). In both cases, create a minimal 
app.
+        # Standalone MCP server — Superset models are deeply coupled to
+        # appbuilder, security_manager, event_logger, encrypted_field_factory,
+        # etc. so we use create_app() for full initialization rather than
+        # trying to init a minimal subset (which leads to cascading failures).
         #
-        # We avoid calling create_app() which would run full FAB initialization
-        # and could corrupt the shared appbuilder singleton if main app starts.
-        from superset.app import SupersetApp
+        # create_app() is safe here because in standalone mode the main

Review Comment:
   Done — added an explicit `elif appbuilder_initialized` branch that raises 
`RuntimeError` instead of silently calling `create_app()`. See ca64067.



##########
superset/mcp_service/server.py:
##########
@@ -151,6 +155,110 @@ def create_event_store(config: dict[str, Any] | None = 
None) -> Any | None:
         return None
 
 
+def _strip_titles(obj: Any, in_properties_map: bool = False) -> Any:
+    """Recursively strip schema metadata ``title`` keys.
+
+    Keeps real field names inside ``properties`` (e.g. a property literally
+    named ``title``), while removing auto-generated schema title metadata.
+    """
+    if isinstance(obj, dict):
+        result: dict[str, Any] = {}
+        for key, value in obj.items():
+            if key == "title" and not in_properties_map:
+                continue
+            result[key] = _strip_titles(value, in_properties_map=(key == 
"properties"))
+        return result
+    if isinstance(obj, list):
+        return [_strip_titles(item, in_properties_map=False) for item in obj]
+    return obj
+
+
+def _serialize_tools_without_output_schema(
+    tools: Sequence[Any],
+) -> list[dict[str, Any]]:
+    """Serialize tools to JSON, stripping outputSchema and titles to reduce 
tokens.
+
+    LLMs only need inputSchema to call tools. outputSchema accounts for
+    50-80% of the per-tool schema size, and auto-generated 'title' fields
+    add ~12% bloat. Stripping both cuts search result tokens significantly.
+    """
+    results = []
+    for tool in tools:
+        data = tool.to_mcp_tool().model_dump(mode="json", exclude_none=True)
+        data.pop("outputSchema", None)
+        if input_schema := data.get("inputSchema"):
+            data["inputSchema"] = _strip_titles(input_schema)
+        results.append(data)
+    return results
+
+
+def _fix_call_tool_schema(transform: Any) -> None:
+    """Patch the call_tool proxy to emit a clean ``type: object`` schema.
+
+    FastMCP's BaseSearchTransform defines ``arguments`` as
+    ``dict[str, Any] | None`` which emits an ``anyOf`` JSON Schema.
+    Some MCP bridges (mcp-remote, Claude Desktop) don't handle ``anyOf``
+    and strip it, leaving the field without a ``type`` — causing all
+    call_tool invocations to fail with "Input should be a valid dictionary".
+
+    This patches the transform's ``_make_call_tool`` to post-process the
+    schema, replacing the ``anyOf`` with a flat ``type: object``.
+    """
+    original_make = transform._make_call_tool
+
+    def patched_make_call_tool() -> Any:
+        tool = original_make()
+        if "arguments" in (props := (tool.parameters or {}).get("properties", 
{})):
+            props["arguments"] = {
+                "additionalProperties": True,
+                "default": None,
+                "description": "Arguments to pass to the tool",
+                "type": "object",
+            }
+        return tool
+
+    import types
+
+    transform._make_call_tool = types.MethodType(

Review Comment:
   Done — refactored from monkey-patching to subclassing 
(`_FixedBM25SearchTransform`, `_FixedRegexSearchTransform`). Also extracted the 
schema fix into a standalone `_fix_call_tool_arguments` function with its own 
unit tests. See ca64067.



##########
superset/mcp_service/chart/schemas.py:
##########


Review Comment:
   Done — trimmed verbose Field descriptions in `dashboard/schemas.py` and 
`system/schemas.py` to match the chart schema treatment. See ca64067.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to