Hi everyone, I'd like to open a discussion on a new Superset Improvement Proposal titled:
*SIP-171 (Revised): Model Context Protocol (MCP) Service for Apache Superset*

This SIP proposes implementing a Model Context Protocol (MCP) service that enables LLM agents (Claude, GPT, etc.) to interact with Superset through standardized JSON-RPC 2.0 protocols. The proposal introduces a FastMCP-based standalone service that uses Superset as a library, allowing AI agents to naturally create charts, manage dashboards, query data, and perform analytics workflows through conversational interfaces.

*Key highlights:*
- *Revised architecture*: Based on learnings from POC development, this updates the original SIP-171 approach to use a standalone FastMCP service rather than ASGI-Flask integration
- *Library-first design*: MCP service uses Superset's DAOs, Commands, and models directly without web framework dependencies
- *Security reuse*: Leverages existing RBAC, authentication, and authorization infrastructure, with no additional permission grants required
- *Preview-first workflows*: Optimized for exploratory LLM conversations with cached form data and iterative chart refinement before persistence
- *Extensible architecture*: Plugin-based system allows extensions to register custom MCP tools

*Why MCP?*
The Model Context Protocol (developed by Anthropic) provides a standardized way for AI agents to interact with external tools and services. By implementing MCP for Superset, we enable natural language-driven analytics workflows while maintaining full security controls and code reusability.

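For readers new to the wire format, here is a minimal sketch of what an MCP tool invocation looks like as a JSON-RPC 2.0 message. It uses the protocol's `tools/call` method and list_dashboards, one of the tool names from this proposal. The helper `make_tool_call` and the `page_size` argument are hypothetical and only for illustration; the actual argument schema is defined by the SIP and the POC, not by this sketch.

```python
import json
from itertools import count

# JSON-RPC 2.0 requests carry an id so responses can be matched to requests.
_ids = count(1)


def make_tool_call(name, arguments=None):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; not part of Superset or FastMCP.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# `page_size` is an assumed argument name, not taken from the SIP.
request = make_tool_call("list_dashboards", {"page_size": 10})
print(json.dumps(request, indent=2))
```

In practice an MCP client library would build and send these messages for the agent; the sketch only shows the message shape that the proposed service would receive.
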
*Example tools:*
- Chart management: generate_chart(), update_chart(), get_chart_preview()
- Discovery: list_dashboards(), list_charts(), list_datasets()
- SQL Lab: execute_sql(), generate_explore_link()
- Dashboard composition: generate_dashboard(), add_chart_to_existing_dashboard()

🔗 *POC Pull Request:* https://github.com/apache/superset/pull/33976
🔗 *Full SIP document:* https://github.com/apache/superset/issues/35498

The POC implementation demonstrates chart listing/info tools, dataset discovery, dashboard management, and SQL execution, all with preview-first workflows optimized for AI agent interactions.

*Deployment flexibility:*
- Standalone service: superset mcp run --port 5008
- Docker Compose integration with independent scaling
- Production-ready with middleware for auth, rate limiting, and audit logging

Please feel free to share feedback or suggestions directly on the GitHub thread. I'm particularly interested in thoughts on:
1. The library-first architectural approach
2. Preview-first workflows for exploratory AI interactions
3. Extension/plugin patterns for adding custom MCP tools
4. Security model and permission enforcement approach

Looking forward to hearing your thoughts!

Thanks,
Amin Ghadersohi
