[ 
https://issues.apache.org/jira/browse/SPARK-52806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vara Bonthu updated SPARK-52806:
--------------------------------
    Environment: This solution works with any Spark History server deployment, 
irrespective of the cloud provider  (was: h2. Environment:

- This solution works with any Spark History server deployment, irrespective of 
the cloud provider)

> SPIP: AI-Native Observability for Apache Spark History Server via Model 
> Context Protocol
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-52806
>                 URL: https://issues.apache.org/jira/browse/SPARK-52806
>             Project: Spark
>          Issue Type: New Feature
>          Components: Documentation, Web UI
>    Affects Versions: 3.5.6, 4.0.0
>         Environment: This solution works with any Spark History server 
> deployment, irrespective of the cloud provider
>            Reporter: Vara Bonthu
>            Priority: Major
>              Labels: AI, MCP, SPIP, historyserver, observability
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This SPIP proposes adding AI-native observability capabilities to Apache 
> Spark through a Model Context Protocol (MCP) server that enables natural 
> language querying and analysis of Spark History Server data.
> h2. Summary
> We propose creating a bridge between AI assistants (Claude, GPT, Amazon Q) 
> and Apache Spark History Server data, enabling users to ask questions like 
> "Why is my Spark job slow?" and receive AI-powered analysis with actionable 
> recommendations.
> h2. Key Features
>  * Natural language interface for Spark diagnostics
>  * 17+ pre-built diagnostic tools for common performance scenarios
>  * AI-powered root cause analysis and optimization recommendations
>  * Zero modifications required to existing Spark installations
>  * Compatible with multiple AI assistants via Model Context Protocol
> h2. Community Value
>  * 10x faster troubleshooting workflows
>  * Lower barrier to entry for Spark performance optimization
>  * Positions Apache Spark as AI-ready for next-generation observability
>  * Addresses growing demand for AI-powered developer tools
> h2. Implementation Approach
>  * Standalone MCP server consuming existing Spark History Server REST APIs
>  * No changes to Spark core required
>  * Kubernetes-native deployment with Helm charts or on any virtual machine
>  * Built on the emerging MCP standard for AI-tool integration
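
As a rough illustration of the first bullet above, the sketch below shows how a standalone tool might read stage data from the History Server's existing REST API (`/api/v1/applications/[app-id]/stages`) and rank stages by executor run time. It is not taken from the proposed project: the function names, the base URL, and the "slowest stages" heuristic are assumptions made for illustration, and the MCP wiring that would expose such a function to an AI assistant is omitted.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the History Server listens on its usual default port, 18080.
DEFAULT_BASE_URL = "http://localhost:18080"


def stages_url(app_id, base_url=DEFAULT_BASE_URL):
    """Build the REST URL that lists all stages of one application."""
    return f"{base_url.rstrip('/')}/api/v1/applications/{quote(app_id, safe='')}/stages"


def fetch_stages(app_id, base_url=DEFAULT_BASE_URL, timeout=10):
    """Fetch the stage list as parsed JSON (needs a reachable History Server)."""
    with urlopen(stages_url(app_id, base_url), timeout=timeout) as resp:
        return json.load(resp)


def slowest_stages(stages, top_n=3):
    """Hypothetical diagnostic: rank completed stages by executor run time."""
    completed = [s for s in stages if s.get("status") == "COMPLETE"]
    ranked = sorted(completed, key=lambda s: s.get("executorRunTime", 0), reverse=True)
    return [
        {
            "stageId": s.get("stageId"),
            "name": s.get("name"),
            "executorRunTimeMs": s.get("executorRunTime", 0),
        }
        for s in ranked[:top_n]
    ]
```

A tool like this would call `slowest_stages(fetch_stages(app_id))` and hand the compact result to the assistant. Because it only reads from the REST API, it needs no change to the Spark installation itself.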
> h2. Related Work
>  * We are not aware of any existing project that addresses this problem
>  * The project is currently hosted under a neutral GitHub organization: 
> [https://github.com/DeepDiagnostix-AI/spark-history-server-mcp]
> h2. Maintainers
> - Currently Vara Bonthu (AWS, Open Source Specialist SA) and Manabu McCloskey 
> (AWS, Open Source Engineer), along with the Amazon EMR service teams, until 
> we build the community. 
>  
> We have also submitted a proposal to Kubeflow: 
> [https://github.com/kubeflow/community/issues/872]. We want to hear from the 
> Apache Spark community on this amazing step forward for AI observability, and 
> we are willing to support this project. 
>  
> The full SPIP document, with detailed technical design, timeline, and success 
> metrics, will be attached as a comment.
> This proposal aligns with Apache Spark's mission to make big data processing 
> accessible while positioning the project at the forefront of AI-native 
> tooling.
>  
> *NOTE: We would be happy to demo this solution to the community if given the 
> opportunity to present.*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

