[
https://issues.apache.org/jira/browse/SPARK-52806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vara Bonthu updated SPARK-52806:
--------------------------------
Environment: This solution works with any Spark History server deployment,
irrespective of the cloud provider (was: h2. Environment:
- This solution works with any Spark History server deployment, irrespective of
the cloud provider)
> SPIP: AI-Native Observability for Apache Spark History Server via Model
> Context Protocol
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-52806
> URL: https://issues.apache.org/jira/browse/SPARK-52806
> Project: Spark
> Issue Type: New Feature
> Components: Documentation, Web UI
> Affects Versions: 3.5.6, 4.0.0
> Environment: This solution works with any Spark History server
> deployment, irrespective of the cloud provider
> Reporter: Vara Bonthu
> Priority: Major
> Labels: AI, MCP, SPIP, historyserver, observability
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> This SPIP proposes adding AI-native observability capabilities to Apache
> Spark through a Model Context Protocol (MCP) server that enables natural
> language querying and analysis of Spark History Server data.
> h2. Summary
> We propose creating a bridge between AI assistants (Claude, GPT, Amazon Q)
> and Apache Spark History Server data, enabling users to ask questions like
> "Why is my Spark job slow?" and receive AI-powered analysis with actionable
> recommendations.
> h2. Key Features
> * Natural language interface for Spark diagnostics
> * 17+ pre-built diagnostic tools for common performance scenarios
> * AI-powered root cause analysis and optimization recommendations
> * Zero modifications required to existing Spark installations
> * Compatible with multiple AI assistants via Model Context Protocol
> h2. Community Value
> * 10x faster troubleshooting workflows
> * Lower barrier to entry for Spark performance optimization
> * Positions Apache Spark as AI-ready for next-generation observability
> * Addresses growing demand for AI-powered developer tools
> h2. Implementation Approach
> * Standalone MCP server consuming existing Spark History Server REST APIs
> * No changes to Spark core required
> * Kubernetes-native deployment with Helm charts or on any virtual machine
> * Built on the emerging MCP standard for AI-tool integration
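As an illustration of the first bullet above, here is a minimal sketch of how a standalone tool might consume the existing History Server REST API. It is not code from the proposed project. The `/api/v1/applications` and `/api/v1/applications/[app-id]/stages` endpoints belong to Spark's documented monitoring REST API. The server address, the function names, and the "slowest stages" heuristic are assumptions made for this example.

```python
import json
from urllib.request import urlopen

# Assumption: a History Server reachable here (18080 is Spark's default port).
HISTORY_SERVER = "http://localhost:18080"


def fetch_json(url):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def list_applications(base_url=HISTORY_SERVER, fetch=fetch_json):
    """Return (id, name) pairs from the /api/v1/applications endpoint."""
    apps = fetch(f"{base_url}/api/v1/applications")
    return [(app["id"], app["name"]) for app in apps]


def slowest_stages(app_id, top_n=3, base_url=HISTORY_SERVER, fetch=fetch_json):
    """Hypothetical diagnostic: rank completed stages by executorRunTime."""
    stages = fetch(f"{base_url}/api/v1/applications/{app_id}/stages")
    done = [s for s in stages if s.get("status") == "COMPLETE"]
    done.sort(key=lambda s: s.get("executorRunTime", 0), reverse=True)
    return [
        (s["stageId"], s["name"], s.get("executorRunTime", 0))
        for s in done[:top_n]
    ]
```

Passing `fetch` as a parameter keeps the functions read-only against the History Server and lets them run against canned JSON, which matches the "no changes to Spark core" constraint. An MCP server would presumably expose functions like these as tools, but that wiring is not shown here.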
> h2. Related Work
> * We are not aware of any related projects that address this problem
> * This project is currently hosted under a neutral org:
> [https://github.com/DeepDiagnostix-AI/spark-history-server-mcp]
> h2. Who maintains
> - Currently maintained by Vara Bonthu (AWS Open Source Specialist SA) and
> Manabu McCloskey (AWS, Open Source Engineer), along with Amazon EMR service
> teams, until we build the community.
>
> We have also submitted a proposal to Kubeflow:
> [https://github.com/kubeflow/community/issues/872]. We would like to hear
> from the Apache Spark community on this step forward for AI observability,
> and we are willing to support this project.
>
> The full SPIP document, with the detailed technical design, timeline, and
> success metrics, will be attached as a comment.
> This proposal aligns with Apache Spark's mission to make big data processing
> accessible while positioning the project at the forefront of AI-native
> tooling.
>
> *NOTE: We are happy to demo this solution to the community if you provide
> the opportunity for us to present.*
>
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)