[jira] [Updated] (SPARK-52806) SPIP: AI-Native Observability for Apache Spark History Server via Model Context Protocol

Vara Bonthu (Jira) Tue, 15 Jul 2025 17:15:04 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-52806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vara Bonthu updated SPARK-52806:
--------------------------------
    Description: 
This SPIP proposes adding AI-native observability capabilities to Apache Spark 
through a Model Context Protocol (MCP) server that enables natural language 
querying and analysis of Spark History Server data.
h2. Summary

We propose creating a bridge between AI assistants (Claude, GPT, Amazon Q) and 
Apache Spark History Server data, enabling users to ask questions like "Why is 
my Spark job slow?" and receive AI-powered analysis with actionable 
recommendations.
h2. Key Features
 * Natural language interface for Spark diagnostics
 * 17+ pre-built diagnostic tools for common performance scenarios
 * AI-powered root cause analysis and optimization recommendations
 * Zero modifications required to existing Spark installations
 * Compatible with multiple AI assistants via Model Context Protocol
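
As a rough illustration of what one such diagnostic tool could look like at the protocol level, the sketch below describes a tool with a name, a description, and a JSON Schema for its input, and dispatches MCP-style {{tools/list}} and {{tools/call}} JSON-RPC requests to it. The tool name {{get_slowest_stages}}, its parameters, and the stub handler are hypothetical and are not taken from the proposed project; a real server would normally use an MCP SDK rather than hand-rolled dispatch.

```python
import json

# Hypothetical tool descriptor: name and parameters are illustrative only.
SLOW_STAGES_TOOL = {
    "name": "get_slowest_stages",
    "description": "List the longest-running stages of a Spark application.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "app_id": {"type": "string"},
            "top_n": {"type": "integer", "minimum": 1},
        },
        "required": ["app_id"],
    },
}


def get_slowest_stages(app_id, top_n=3):
    # Stub handler (assumption): a real tool would query the History Server here.
    return json.dumps({"app_id": app_id, "top_n": top_n, "stages": []})


TOOLS = {
    SLOW_STAGES_TOOL["name"]: {
        "descriptor": SLOW_STAGES_TOOL,
        "handler": get_slowest_stages,
    }
}


def _error(request, code, message):
    # Build a JSON-RPC 2.0 error response for the given request.
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "error": {"code": code, "message": message},
    }


def handle_request(request, tools=TOOLS):
    """Dispatch a JSON-RPC 2.0 request for tools/list or tools/call."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [t["descriptor"] for t in tools.values()]}
    elif method == "tools/call":
        params = request.get("params", {})
        tool = tools.get(params.get("name"))
        if tool is None:
            return _error(request, -32602, f"Unknown tool: {params.get('name')}")
        text = tool["handler"](**params.get("arguments", {}))
        # Tool output is returned to the assistant as text content.
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return _error(request, -32601, f"Method not found: {method}")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
```

An AI assistant would first call {{tools/list}} to discover the available tools, then issue {{tools/call}} with the tool name and arguments chosen from the user's question.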

h2. Community Value
 * 10x faster troubleshooting workflows
 * Lower barrier to entry for Spark performance optimization
 * Positions Apache Spark as AI-ready for next-generation observability
 * Addresses growing demand for AI-powered developer tools

h2. Implementation Approach
 * Standalone MCP server consuming existing Spark History Server REST APIs
 * No changes to Spark core required
 * Kubernetes-native deployment with Helm charts or on any virtual machine
 * Built on the emerging MCP standard for AI-tool integration
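
To make the "consume the existing REST APIs" approach concrete, here is a minimal sketch of how a standalone process might read stage data from a History Server and rank stages by executor run time. It assumes the History Server is reachable on its default port (18080) and that stage records carry the {{stageId}}, {{name}}, {{status}}, and {{executorRunTime}} fields exposed by the {{/api/v1/applications/[app-id]/stages}} endpoint. The helper names are hypothetical and this is not the proposed project's implementation.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: History Server running locally on its default port.
BASE_URL = "http://localhost:18080/api/v1"


def stages_url(app_id, base_url=BASE_URL):
    """Build the URL of the stage list for one application."""
    return f"{base_url}/applications/{quote(app_id, safe='')}/stages"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body (requires a reachable server)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def slowest_stages(stages, top_n=3):
    """Return the top_n completed stages ordered by executor run time."""
    completed = [s for s in stages if s.get("status") == "COMPLETE"]
    ranked = sorted(
        completed,
        key=lambda s: s.get("executorRunTime", 0),
        reverse=True,
    )
    return [
        {
            "stageId": s.get("stageId"),
            "name": s.get("name"),
            "executorRunTime": s.get("executorRunTime", 0),
        }
        for s in ranked[:top_n]
    ]
```

Against a live server, {{slowest_stages(fetch_json(stages_url(app_id)))}} would produce a short summary that a tool could hand back to the AI assistant. Nothing here touches Spark itself, which is the point of the standalone design.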

h2. Related Work
 * We are not aware of any existing projects that address this problem
 * The project is currently hosted under a neutral GitHub org: 
[https://github.com/DeepDiagnostix-AI/spark-history-server-mcp]

h2. Who maintains
 - Currently, Vara Bonthu (AWS Open Source Specialist SA) and Manabu McCloskey 
(AWS, Open Source Engineer), along with the [AWS Data 
Processing|https://aws.amazon.com/sagemaker/data-processing/] team.

 

We have also submitted a proposal to Kubeflow: 
[https://github.com/kubeflow/community/issues/872]. We would like to hear from 
the Apache Spark community about this step forward for AI observability, and we 
are willing to support the project.

 

The full SPIP document, with detailed technical design, timeline, and success 
metrics, will be attached as a comment.

This proposal aligns with Apache Spark's mission to make big data processing 
accessible while positioning the project at the forefront of AI-native tooling.

 

*NOTE: We would be happy to demo this solution to the community if given the 
opportunity to present.*

  was:
This SPIP proposes adding AI-native observability capabilities to Apache Spark 
through a Model Context Protocol (MCP) server that enables natural language 
querying and analysis of Spark History Server data.
h2. Summary

We propose creating a bridge between AI assistants (Claude, GPT, Amazon Q) and 
Apache Spark History Server data, enabling users to ask questions like "Why is 
my Spark job slow?" and receive AI-powered analysis with actionable 
recommendations.
h2. Key Features
 * Natural language interface for Spark diagnostics
 * 17+ pre-built diagnostic tools for common performance scenarios
 * AI-powered root cause analysis and optimization recommendations
 * Zero modifications required to existing Spark installations
 * Compatible with multiple AI assistants via Model Context Protocol

h2. Community Value
 * 10x faster troubleshooting workflows
 * Lower barrier to entry for Spark performance optimization
 * Positions Apache Spark as AI-ready for next-generation observability
 * Addresses growing demand for AI-powered developer tools

h2. Implementation Approach
 * Standalone MCP server consuming existing Spark History Server REST APIs
 * No changes to Spark core required
 * Kubernetes-native deployment with Helm charts or on any virtual machine
 * Built on the emerging MCP standard for AI-tool integration

h2. Related Work
 * No related projects are available for this problem
 * This project is currently under a neutral org 
[https://github.com/DeepDiagnostix-AI/spark-history-server-mcp]

h2. Who maintains

- Currently, Vara Bonthu (AWS Open Source Specialist SA), Manabu McCloskey 
(AWS, Open Source Engineer), along with Amazon EMR service teams until we build 
the community. 

 

We have also submitted a proposal to Kubeflow: 
[https://github.com/kubeflow/community/issues/872]. We would like to hear from 
the Apache Spark community about this step forward for AI observability, and we 
are willing to support the project.

 

Full SPIP document with detailed technical design, timeline, and success 
metrics will be attached as a comment.

This proposal aligns with Apache Spark's mission to make big data processing 
accessible while positioning the project at the forefront of AI-native tooling.

 

*NOTE: We would be happy to demo this solution to the community if given the 
opportunity to present.*


> SPIP: AI-Native Observability for Apache Spark History Server via Model 
> Context Protocol
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-52806
>                 URL: https://issues.apache.org/jira/browse/SPARK-52806
>             Project: Spark
>          Issue Type: New Feature
>          Components: Documentation, Web UI
>    Affects Versions: 3.5.6, 4.0.0
>         Environment: This solution works with any Spark History server 
> deployment, irrespective of the cloud provider
>            Reporter: Vara Bonthu
>            Priority: Major
>              Labels: AI, MCP, SPIP, historyserver, observability
>   Original Estimate: 24h
>  Remaining Estimate: 24h