[PR] feat(mcp): PR3 - add dashboard and dataset listing and info tools [superset]

via GitHub Fri, 24 Oct 2025 17:18:43 -0700


aminghadersohi opened a new pull request, #35836:
URL: https://github.com/apache/superset/pull/35836


   ### SUMMARY
   
   This PR adds six new read-only MCP tools for listing and inspecting 
dashboards and datasets, extending the MCP service's read access to two more 
of Superset's core resources.
   
   **Built on top of:** PR #6 (feat/mcp_service_pr2_chart_listing_and_info) - 
Chart listing and info tools
   
   **What's Included:**
   
   ## Dashboard Tools (3 tools)
   
   **list_dashboards** - Paginated dashboard listing
   - Filter by: dashboard_title, published, favorite
   - Search across dashboard fields
   - Sort by: id, dashboard_title, slug, published, changed_on, created_on
   - Returns: DashboardList with UUID, slug, charts, owners
   - 1-based pagination (page, page_size parameters)
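
As a rough illustration of the 1-based pagination used by the listing tools, the sketch below turns `page` and `page_size` into a 0-based offset and limit. The helper name `page_bounds` is hypothetical and is not taken from the PR.

```python
from __future__ import annotations


def page_bounds(page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-based page number into a 0-based (offset, limit) pair."""
    if page < 1:
        raise ValueError("page is 1-based, so it must be >= 1")
    if page_size < 1:
        raise ValueError("page_size must be >= 1")
    return (page - 1) * page_size, page_size


# Page 2 with 10 items per page covers items 10..19 of a 25-item collection.
items = list(range(25))
offset, limit = page_bounds(page=2, page_size=10)
second_page = items[offset : offset + limit]
```

With this convention, `page=1` maps to offset 0, so the first page is never skipped by an off-by-one.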
   
   **get_dashboard_info** - Retrieve dashboard details
   - Lookup by integer ID, UUID string, or slug
   - Returns: Complete dashboard metadata (title, position_json, charts, 
owners, roles)
   - Consistent error responses with DashboardError schema
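
One way to tell the three identifier types apart is sketched below. This is an assumption about how such a lookup could be dispatched, not the PR's actual resolution logic; `classify_identifier` is a hypothetical helper, and treating numeric strings as integer IDs is a choice made for this example.

```python
from __future__ import annotations

import uuid


def classify_identifier(identifier: int | str) -> str:
    """Return 'id', 'uuid', or 'slug' for a dashboard identifier."""
    if isinstance(identifier, int):
        return "id"
    if identifier.isdigit():
        # Assumption: a purely numeric string is meant as an integer ID.
        return "id"
    try:
        uuid.UUID(identifier)
        return "uuid"
    except ValueError:
        # Anything that is neither numeric nor a valid UUID is a slug.
        return "slug"


kinds = [
    classify_identifier(42),
    classify_identifier("123e4567-e89b-12d3-a456-426614174000"),
    classify_identifier("sales-dashboard"),
]
```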
   
   **get_dashboard_available_filters** - List filterable columns
   - Returns available filter fields and operators
   - Helps LLMs discover what dashboard fields can be filtered
   - Returns: DashboardAvailableFilters with column/operator metadata
   
   ## Dataset Tools (3 tools)
   
   **list_datasets** - Paginated dataset listing
   - Filter by: table_name, schema, owner, favorite
   - Search across dataset fields
   - Sort by: id, table_name, schema, changed_on, created_on
   - Returns: DatasetList with columns and metrics for each dataset
   - 1-based pagination (page, page_size parameters)
   
   **get_dataset_info** - Retrieve dataset details
   - Lookup by integer ID or UUID string
   - Returns: Complete dataset metadata including:
     - Column definitions with types and expressions
     - Metrics with SQL expressions
     - Schema, database, table information
   - Consistent error responses with DatasetError schema
   
   **get_dataset_available_filters** - List filterable columns
   - Returns available filter fields and operators
   - Helps LLMs discover what dataset fields can be filtered
   - Returns: DatasetAvailableFilters with column/operator metadata
   
   ## Core Infrastructure (579 lines)
   
   **mcp_core.py** (expanded to 506 lines, +209 lines):
   - **ModelGetAvailableFiltersCore**: Generic base class for 
get_*_available_filters tools
     - Type-safe with `Generic[T]` for proper type inference
     - Works with any Superset DAO (DashboardDAO, DatasetDAO, etc.)
     - Returns filterable columns and supported operators
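
The sketch below shows the general shape of a `Generic[T]` core class that takes a DAO and an output schema. It is a simplified assumption, not the code in mcp_core.py: `AvailableFiltersCore`, `FakeDashboardDAO`, `AvailableFilters`, and the `filterable_columns` attribute are all hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass
class AvailableFilters:
    """Hypothetical response schema: column name -> supported operators."""

    column_operators: dict[str, list[str]]


class FakeDashboardDAO:
    """Hypothetical stand-in for a DAO; not Superset's DashboardDAO."""

    filterable_columns = {"dashboard_title": ["eq", "sw"], "published": ["eq"]}


class AvailableFiltersCore(Generic[T]):
    """Build a filter-metadata response of type T from any DAO-like class."""

    def __init__(self, dao_class: type, output_schema: Callable[..., T]) -> None:
        self.dao_class = dao_class
        self.output_schema = output_schema

    def run_tool(self) -> T:
        # Shallow copy of the DAO's class-level column-to-operators mapping.
        columns = dict(self.dao_class.filterable_columns)
        return self.output_schema(column_operators=columns)


core = AvailableFiltersCore(FakeDashboardDAO, AvailableFilters)
result = core.run_tool()  # a type checker can infer AvailableFilters here
```

Because the output schema is a type parameter, the same class can serve a dataset DAO by swapping the two constructor arguments.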
   
   **utils/retry_utils.py** (340 lines):
   - RetryableOperation class for database resilience
   - Exponential backoff for transient failures
   - Used by DAO operations throughout MCP service
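
For readers unfamiliar with the technique, here is a minimal sketch of retrying with exponential backoff. It does not reproduce the `RetryableOperation` class from the PR; the function name, parameters, and defaults below are assumptions chosen for illustration.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, doubling the wait after each transient failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Demo: fail twice, then succeed; record delays instead of really sleeping.
calls = {"count": 0}
delays: list[float] = []


def flaky_query() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = retry_with_backoff(flaky_query, sleep=delays.append)
```

Injecting `sleep` keeps the example fast to run and makes the delay schedule observable; errors outside `retry_on` propagate immediately without a retry.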
   
   **CLAUDE.md** (431 lines):
   - Comprehensive guide for LLM agents working on MCP service
   - Architecture overview and directory structure
   - Tool development patterns and conventions
   - Critical pitfalls to avoid
   - Quick checklist for adding new tools
   
   ## Schemas (1,107 lines)
   
   **dashboard/schemas.py** (418 lines):
   - `DashboardInfo`: Detailed dashboard metadata model
   - `DashboardList`: Paginated list response with PaginationInfo
   - `DashboardError`: Consistent error responses
   - `DashboardFilter`: Filter specification (col, opr, value)
   - `ListDashboardsRequest`, `GetDashboardInfoRequest`: Tool request models
   - `DashboardAvailableFilters`, `GetDashboardAvailableFiltersRequest`: Filter 
metadata
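
To make the `(col, opr, value)` filter specification concrete, the sketch below applies such filters to in-memory rows. It uses a plain dataclass as a stand-in for the Pydantic model, and the operator table and the AND semantics are assumptions; the operators the real tools accept may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical operator table for illustration only.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda actual, expected: actual == expected,
    "ne": lambda actual, expected: actual != expected,
    "sw": lambda actual, expected: str(actual).startswith(str(expected)),
}


@dataclass(frozen=True)
class Filter:
    col: str
    opr: str
    value: Any

    def __post_init__(self) -> None:
        # Reject unknown operators when the filter is built.
        if self.opr not in OPERATORS:
            raise ValueError(f"unsupported operator: {self.opr!r}")


def apply_filters(
    rows: list[dict[str, Any]], filters: list[Filter]
) -> list[dict[str, Any]]:
    """Keep rows that satisfy every filter (filters are AND-ed together)."""
    return [
        row
        for row in rows
        if all(OPERATORS[f.opr](row.get(f.col), f.value) for f in filters)
    ]


rows = [
    {"dashboard_title": "Sales", "published": True},
    {"dashboard_title": "Drafts", "published": False},
]
published = apply_filters(rows, [Filter(col="published", opr="eq", value=True)])
```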
   
   **dataset/schemas.py** (349 lines):
   - `DatasetInfo`: Detailed dataset metadata model with columns/metrics
   - `DatasetList`: Paginated list response with PaginationInfo
   - `DatasetError`: Consistent error responses
   - `DatasetFilter`: Filter specification (col, opr, value)
   - `ListDatasetsRequest`, `GetDatasetInfoRequest`: Tool request models
   - `DatasetAvailableFilters`, `GetDatasetAvailableFiltersRequest`: Filter 
metadata
   
   **chart/schemas.py** (expanded to 793 lines, +510 lines):
   - Added `serialize_chart_object()` helper for dashboard chart serialization
   - Enhanced chart metadata extraction
   - Used by dashboard tools to serialize embedded charts
   
   **system/schemas.py** (expanded to 111 lines, +24 lines):
   - Enhanced type safety with TYPE_CHECKING imports
   - Additional shared schemas used across tools
   
   ## Testing (1,804 lines)
   
   **test_dashboard_tools.py** (573 lines, 15 tests):
   - Schema validation tests for DashboardInfo, DashboardList, DashboardFilter
   - Request/response model validation
   - list_dashboards: Basic, filters, search, custom columns, sorting tests
   - get_dashboard_info: ID lookup, UUID lookup, slug lookup, not found cases
   - Filter specification tests
   
   **test_dataset_tools.py** (1,231 lines, 20 tests):
   - Schema validation tests for DatasetInfo, DatasetList, DatasetFilter
   - Request/response model validation
   - list_datasets: Basic, filters, search, custom columns, sorting tests
   - get_dataset_info: ID lookup, UUID lookup, not found cases, columns/metrics
   - Filter specification tests
   - Default ordering tests
   
   **test_chart_schemas.py** (160 lines, 11 tests):
   - XYChartConfig validation tests (label uniqueness, duplicate detection)
   - TableChartConfig validation tests
   - Chart configuration edge cases
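
The label-uniqueness check these tests exercise could look roughly like the sketch below. The real validation lives in the chart config schemas and may be structured differently; `find_duplicate_labels` and `validate_unique_labels` are hypothetical helpers.

```python
from __future__ import annotations

from collections import Counter


def find_duplicate_labels(labels: list[str]) -> list[str]:
    """Return labels that appear more than once, in first-seen order."""
    counts = Counter(labels)
    return [label for label in counts if counts[label] > 1]


def validate_unique_labels(labels: list[str]) -> list[str]:
    """Raise ValueError if any label is repeated; otherwise return labels."""
    duplicates = find_duplicate_labels(labels)
    if duplicates:
        raise ValueError(f"duplicate labels: {', '.join(duplicates)}")
    return labels


duplicates = find_duplicate_labels(["Revenue", "Cost", "Revenue"])
```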
   
   **Total: 62 MCP tests passing** (11 chart + 15 dashboard + 20 dataset + 16 
core)
   
   ## Integration
   
   **app.py** (13 new lines):
   ```python
   from superset.mcp_service.dashboard.tool import (  # noqa: F401, E402
       get_dashboard_available_filters,
       get_dashboard_info,
       list_dashboards,
   )
   from superset.mcp_service.dataset.tool import (  # noqa: F401, E402
       get_dataset_available_filters,
       get_dataset_info,
       list_datasets,
   )
   ```
   
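   This follows the established pattern: importing the tool modules in app.py 
runs their `@mcp.tool` decorators, which register the tools with the MCP 
service. The imports are otherwise unused, hence the `noqa: F401` markers.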
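
The registration-on-import pattern can be sketched with a plain decorator and a module-level registry. This is a generic illustration of the mechanism, not fastmcp's implementation; `TOOL_REGISTRY` and `tool` are hypothetical names.

```python
from __future__ import annotations

from typing import Any, Callable

# Hypothetical registry standing in for the MCP server's tool table.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register fn under its own name as a side effect of decoration."""
    TOOL_REGISTRY[fn.__name__] = fn
    return fn


@tool
def list_dashboards(page: int = 1, page_size: int = 10) -> dict[str, int]:
    # Placeholder body; a real tool would query the database.
    return {"page": page, "page_size": page_size}


registered = sorted(TOOL_REGISTRY)
```

Since the decorator runs when the module body executes, importing the module is enough to register the tool, even if the imported name is never used.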
   
   ## Architecture & Patterns
   
   All 6 tools follow the same reusable patterns established in PR #6:
   
   1. **Use core classes**: ModelListCore for listing, ModelGetInfoCore for 
retrieval, ModelGetAvailableFiltersCore for filter metadata
   2. **Pydantic schemas**: Type-safe requests/responses with Field 
descriptions for LLMs
   3. **DAO pattern**: DashboardDAO and DatasetDAO for database access
   4. **Authentication**: All tools use @mcp_auth_hook for security
   5. **Type safety**: Modern Python 3.10+ type hints (`T | None` instead of 
`Optional[T]`)
   6. **Consistent errors**: Structured error responses with timestamps
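
Points 5 and 6 can be illustrated together: the sketch below returns a structured error carrying a timestamp instead of raising, and uses `T | None` style hints. Dataclasses stand in for the Pydantic schemas here, and `ToolError`, `DashboardSummary`, and `get_dashboard` are hypothetical names, not the PR's classes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ToolError:
    error: str
    error_type: str
    # Timezone-aware creation time, filled in automatically.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class DashboardSummary:
    id: int
    dashboard_title: str


def get_dashboard(
    identifier: int, store: dict[int, DashboardSummary]
) -> DashboardSummary | ToolError:
    """Return the dashboard, or a structured error instead of raising."""
    found: DashboardSummary | None = store.get(identifier)
    if found is None:
        return ToolError(
            error=f"Dashboard {identifier} not found", error_type="not_found"
        )
    return found


store = {1: DashboardSummary(id=1, dashboard_title="Sales")}
missing = get_dashboard(99999, store)
```

Returning the error as data gives an LLM client a predictable shape to inspect, rather than an exception trace.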
   
   **Reusability:**
   - Core classes work with any Superset entity (charts, dashboards, datasets, 
etc.)
   - Just provide the DAO class and schema - the rest is handled automatically
   - Same pattern will be used for upcoming chart creation and SQL Lab tools
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   
   N/A - Backend MCP service infrastructure only, no UI changes.
   
   **Example tool usage via Claude Desktop:**
   
   ```
   User: "List all published dashboards"
   Claude calls: list_dashboards({
     "filters": [{"col": "published", "opr": "eq", "value": true}]
   })
   
   User: "Show me the Sales Dashboard details"
   Claude calls: get_dashboard_info({"identifier": "sales-dashboard"})
   
   User: "List datasets in the analytics schema"
   Claude calls: list_datasets({
     "filters": [{"col": "schema", "opr": "eq", "value": "analytics"}]
   })
   
   User: "What columns are in dataset 42?"
   Claude calls: get_dataset_info({"identifier": 42})
   ```
   
   ### TESTING INSTRUCTIONS
   
   **Prerequisites:**
   - Superset running with PR #6 (chart tools) merged
   - Python 3.10 or 3.11
   - fastmcp installed (`pip install -e ".[development]"`; the quotes keep 
shells such as zsh from expanding the brackets)
   
   **Setup:**
   
   ```bash
   # 1. Ensure database is initialized
   export FLASK_APP=superset
   superset db upgrade
   superset init
   
   # 2. Create admin user (if not already done)
   superset fab create-admin \
     --username admin \
     --firstname Admin \
     --lastname Admin \
     --email admin@localhost \
     --password admin
   
   # 3. Load example data (for testing)
   superset load-examples
   ```
   
   **Run Tests:**
   
   ```bash
   # Run all MCP service unit tests
   pytest tests/unit_tests/mcp_service/ -v
   
   # Should see:
   # - test_chart_schemas.py: 11 chart schema tests passing
   # - test_list_charts.py: Chart tool tests passing
   # - test_dashboard_tools.py: 15 dashboard tests passing
   # - test_dataset_tools.py: 20 dataset tests passing
   # - test_mcp_core.py: Core infrastructure tests passing
   # Total: 62 tests passing
   ```
   
   **Test with MCP Server:**
   
   ```bash
   # Terminal 1: Start Superset
   superset run -p 9001
   
   # Terminal 2: Start MCP service
   superset mcp run --port 5008 --debug
   
   # Terminal 3: Test with curl
   curl http://localhost:5008/health
   
   # Expected: {"status": "ok", "timestamp": "...", ...}
   ```
   
   **Test Tools via Claude Desktop:**
   
   Configure Claude Desktop with:
   ```json
   {
     "mcpServers": {
       "superset": {
         "command": "superset",
         "args": ["mcp", "run", "--port", "5008"]
       }
     }
   }
   ```
   
   Then test queries:
   - "List all dashboards" → Should call list_dashboards tool
   - "Show me dashboard 1" → Should call get_dashboard_info tool
   - "List all datasets" → Should call list_datasets tool
   - "What columns are in dataset 5?" → Should call get_dataset_info tool
   - "What can I filter dashboards by?" → Should call 
get_dashboard_available_filters
   
   **Verify Error Handling:**
   
   ```python
   # Test in Python shell with Flask app context
   from superset.mcp_service.dashboard.tool import (
       get_dashboard_info,
       list_dashboards,
   )
   from superset.mcp_service.dataset.tool import get_dataset_info, list_datasets
   
   # Test pagination
   result = list_dashboards.fn(page=1, page_size=10)
   assert result.pagination.page == 1
   
   # Test filtering
   result = list_dashboards.fn(
       filters=[{"col": "published", "opr": "eq", "value": True}]
   )
   assert all(d.published for d in result.dashboards)
   
   # Test not found
   result = get_dashboard_info.fn(identifier=99999)
   assert result.error is not None
   
   # Test dataset with columns
   result = get_dataset_info.fn(identifier=1)
   assert result.columns is not None
   assert len(result.columns) > 0
   ```
   
   ### ADDITIONAL INFORMATION
   
   - [ ] Has associated issue:
   - [ ] Required feature flags: None
   - [ ] Changes UI: No
   - [ ] Includes DB Migration: No
   - [x] Introduces new feature or API: Yes - MCP dashboard and dataset tools
   - [ ] Removes existing feature or API: No
   
   **Stats:**
   - 24 files changed
   - 5,294 insertions, 19 deletions
   - 1,804 lines of tests (35 test cases: 15 dashboard + 20 dataset)
   - 100% mypy compliant
   - All pre-commit hooks passing
   
   **Future PRs will add:**
   - Chart creation tools (generate_chart, update_chart, generate_chart_preview)
   - Dashboard creation tools (generate_dashboard, add_chart_to_dashboard)
   - SQL Lab integration tools (execute_sql, open_sql_lab_with_context)
   - Advanced authentication (JWT validation, user impersonation)
   - Field-level permissions and audit logging
   
   **Notes:**
   - Builds cleanly on PR #6 (chart tools)
   - Minimal changes to existing code (only app.py imports)
   - No database migrations
   - No UI changes
   - Optional dependency (fastmcp in development requirements only)
   - All tools follow established patterns from PR #6
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


