Yeah, FastMCP is nice. I didn't choose FastMCP for one specific reason:
- The large number of tools generated from the OpenAPI spec doesn't need to be passed to the AI with every single message.
- Instead, we can do hierarchical tool discovery based on categories: let the AI select a particular category, and then fetch the tools only for that category.
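The category-first discovery described above could be sketched roughly as follows. This is a minimal illustration in plain Python, not FastMCP's API: the function names `build_category_index`, `list_categories`, and `list_tools`, the "Uncategorized" fallback, and the tiny inline spec are all assumptions made for the example. The idea is that the model first sees only category names and counts, and then asks for the tools of a single category.

```python
from collections import defaultdict

# HTTP verbs that may appear as keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def build_category_index(spec):
    """Group OpenAPI operations by tag: {tag: [tool descriptors]}."""
    index = defaultdict(list)
    for path, methods in spec.get("paths", {}).items():
        for method, details in methods.items():
            # Skip path-level keys such as "parameters" that are not operations.
            if method.lower() not in HTTP_METHODS or not isinstance(details, dict):
                continue
            for tag in details.get("tags", ["Uncategorized"]):
                index[tag].append({
                    "name": details.get("operationId", f"{method}_{path}"),
                    "method": method.upper(),
                    "path": path,
                    "summary": details.get("summary", ""),
                })
    return dict(index)


def list_categories(index):
    """First level: category names with tool counts, largest first."""
    return sorted(((tag, len(tools)) for tag, tools in index.items()),
                  key=lambda item: (-item[1], item[0]))


def list_tools(index, category):
    """Second level: only the tools that belong to the chosen category."""
    return index.get(category, [])


# Hypothetical three-path spec, standing in for the real openapi.json.
SPEC = {
    "paths": {
        "/dags": {"get": {"tags": ["DAG"], "operationId": "get_dags"}},
        "/dags/{dag_id}": {
            "parameters": [],
            "get": {"tags": ["DAG"], "operationId": "get_dag"},
            "delete": {"tags": ["DAG"], "operationId": "delete_dag"},
        },
        "/pools": {"get": {"tags": ["Pool"], "operationId": "get_pools"}},
    }
}

index = build_category_index(SPEC)
print(list_categories(index))                         # [('DAG', 3), ('Pool', 1)]
print([t["name"] for t in list_tools(index, "Pool")])  # ['get_pools']
```

With a layout like this, only two small discovery tools would need to be advertised up front, and the per-category tool list would be sent only after the model picks a category.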
python3 -c "
import json
with open('path/to/openapi.json') as f:
    spec = json.load(f)

tags = {}
for path, methods in spec['paths'].items():
    for method, details in methods.items():
        if 'tags' in details:
            for tag in details['tags']:
                tags[tag] = tags.get(tag, 0) + 1

print('Tags and their counts:')
for tag, count in sorted(tags.items(), key=lambda x: x[1], reverse=True):
    print(f'{tag}: {count}')
"
Tags and their counts:
Task Instance: 19
Asset: 13
Connection: 8
DagRun: 8
Backfill: 7
DAG: 7
Pool: 6
Variable: 6
XCom: 4
Config: 2
Event Log: 2
Import Error: 2
Plugin: 2
Task: 2
DagVersion: 2
Login: 2
DagSource: 1
DagStats: 1
DagReport: 1
DagWarning: 1
Extra Links: 1
Job: 1
Provider: 1
DAG Parsing: 1
Monitor: 1
Version: 1

My last attempt to do a hierarchical discovery with FastMCP didn't go as expected. But this could be short term. There is something cooking in the model context protocol repo for search of a tool.
Ref: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/322

I'll give this a try with FastMCP to see if I can get the hierarchical discovery working.

- Avi

On Fri, May 30, 2025 at 1:33 AM Bryan Corder <bryancor...@gmail.com> wrote:

> In order to bring value, we might want to think beyond just wrapping the API. As Kaxil just showed, it's easy to create something with 10 lines of code and FastMCP.
>
> However, the Airflow API was made for Airflow operators' consumption, not necessarily for LLM consumption. When you have an endpoint called "Delete DAG" with a description "Delete a specific DAG" that's very easy for any user who has already navigated to the Airflow API spec to understand, but maybe not the best tool description for an LLM. I think we'd want to either exclude that or add additional context for the LLM to know it's destructive.
>
> In addition, LLMs can struggle with tool selection when you give it 80 tools to work with. Things in the middle sometimes get lost in the context.
> There are ways to customize the FastMCP (https://gofastmcp.com/servers/openapi#custom-route-maps) to cut down the list of options, should you choose.
>
> However, it may be better to create something more tailored to LLMs. Thinking about the use case of getting LLM assistance with debugging a failed run, one of the things my teams do is put the "run book" for prod support in the doc_md notes right with the DAG, so if a file never shows up they know exactly what to do in that situation (potentially, do nothing). We also include other information like, "xx task can be flaky. If you get this error, rerunning it will usually resolve it." The goal is for any engineer armed with the stack trace and the run book to be able to solve any error. My team has all that information right in the UI. To get that information, the LLM would need to know to hit the DAG Details endpoint for one minor attribute amongst several for the doc_md and get the correct dag id, run id, task id and try number to grab the stack trace from the failed run. It would then need to go elsewhere to find the DAG code to debug. I think it would be better to just create a "debug_failed_task" tool an LLM could call from an MCP server that would string those calls together and serve them up to the LLM on a silver platter. The LLM could focus all its "reasoning" efforts on solving the problem instead of figuring out how to get the information it needs to even begin.
>
> Again, if we just want to wrap the API in FastMCP, we can share Kaxil's 10 lines of code in a Medium article and be done. I think the real value is in providing an implementation of a limited set of more complex base tools like debug_failed_task (described above), pause_all_active_DAGs (because I'm about to upgrade!), describe_DAG (grabs only the description, dependencies, converts cron schedule to human readable if applicable, etc) and giving people a way to extend the server.
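A composite tool along the lines of the "debug_failed_task" idea above might look roughly like the sketch below. It is only an illustration under stated assumptions: the `fetch` callable is a hypothetical stand-in for an authenticated HTTP client that returns already-decoded payloads, the endpoint paths are modeled on the Airflow REST API and should be checked against the actual OpenAPI spec, and the shape of the returned bundle is invented for the example.

```python
from typing import Any, Callable
from urllib.parse import quote


def debug_failed_task(fetch: Callable[[str], Any], dag_id: str, run_id: str,
                      task_id: str, try_number: int) -> dict:
    """Bundle run book, task log and DAG source into a single tool result."""
    # Run ids often contain ':' and '+', so every path segment is URL-quoted.
    dag, run, task = (quote(part, safe="") for part in (dag_id, run_id, task_id))

    # Assumed paths; verify them against the server's OpenAPI spec.
    details = fetch(f"/api/v2/dags/{dag}/details")
    log = fetch(f"/api/v2/dags/{dag}/dagRuns/{run}/taskInstances/{task}/logs/{try_number}")
    source = fetch(f"/api/v2/dagSources/{dag}")

    return {
        "dag_id": dag_id,
        "run_id": run_id,
        "task_id": task_id,
        "try_number": try_number,
        "run_book": details.get("doc_md") or "",  # the doc_md notes, if any
        "log": log,
        "dag_source": source,
    }


# Canned responses standing in for a live Airflow instance (hypothetical data).
CANNED = {
    "/api/v2/dags/etl/details": {"dag_id": "etl", "doc_md": "If the file never shows up, do nothing."},
    "/api/v2/dags/etl/dagRuns/run_1/taskInstances/load/logs/2": "FileNotFoundError: input.csv",
    "/api/v2/dagSources/etl": "with DAG('etl'): ...",
}


def fake_fetch(path: str) -> Any:
    """Look up a canned payload; a real client would issue an HTTP GET here."""
    return CANNED[path]


bundle = debug_failed_task(fake_fetch, "etl", "run_1", "load", 2)
print(bundle["run_book"])
print(bundle["log"])
```

Passing the client in as `fetch` keeps the composition logic testable without a running Airflow instance; the same function could then be registered as one tool on whichever MCP server framework is chosen.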
>
> The above is tool focused. As Avi pointed out, there are also resources and prompts, but I've only personally worked with tools and have nothing to add there.
>
> With all the LLM tools quickly advancing on the development side (e.g. code generation/review), it's great to see the community working on building tools to help with the operational side.
>
> Bryan
>
> On Thu, May 29, 2025, 4:50 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>
> > One more comment: MCP SDKs have advanced quite a bit and I was able to get an Airflow MCP Server working with just the following code block. I was successfully able to pause/unpause a DAG from Claude and other MCP clients as an example. So as much as possible we should utilize higher-level abstractions like FastMCP, which allows creating a server from an OpenAPI spec <https://gofastmcp.com/servers/openapi#openapi-integration>:
> >
> > import os
> >
> > import httpx
> > from fastmcp import FastMCP
> >
> > token = os.environ.get("AF_ACCESS_TOKEN")
> > client = httpx.AsyncClient(
> >     base_url="http://localhost:28080",
> >     headers={"Authorization": f"Bearer {token}"},
> > )
> >
> > openapi_spec = httpx.get("http://localhost:28080/openapi.json").json()
> >
> > mcp = FastMCP.from_openapi(
> >     openapi_spec=openapi_spec,
> >     client=client,
> >     name="Airflow 3.0 API Server"
> > )
> >
> > if __name__ == "__main__":
> >     mcp.run()
> >
> > On Thu, 29 May 2025 at 20:32, Avi <a...@astronomer.io.invalid> wrote:
> >
> > > @Shahar -- Yes. Definitely. Feel free to reach out if you need anything.
> > >
> > > I totally agree that it should live in a separate repo.
> > >
> > > - Avi
> > >
> > > On Thu, May 29, 2025 at 12:50 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> > >
> > > > @Shahar -- Absolutely, I think you are driving it with this email. So I think you can lead it from here and whoever wants to join can co-lead or join in development.
> > > >
> > > > Please feel free to drive :)
> > > >
> > > > On Thu, 29 May 2025 at 17:07, Aaron Dantley <aarondant...@gmail.com> wrote:
> > > >
> > > > > Hey All!
> > > > >
> > > > > I’d be grateful to be included in the AIP discussions to help if possible too! Like Shahar, I’ve never worked on any of these items so it’d be great to see how work gets assigned and goes through a whole development cycle!
> > > > >
> > > > > Looking forward to it!
> > > > > Aaron
> > > > >
> > > > > On Thu, May 29, 2025 at 7:32 AM Shahar Epstein <sha...@apache.org> wrote:
> > > > >
> > > > > > If it's ok, I would like to lead the AIP effort (or at least co-lead), as I've never written an AIP before. I could start drafting it during the next week.
> > > > > > Avi - please let me know if it works for you.
> > > > > >
> > > > > > Shahar
> > > > > >
> > > > > > On Thu, May 29, 2025, 13:09 Kaxil Naik <kaxiln...@gmail.com> wrote:
> > > > > >
> > > > > > > Yes separate repo, please and we would need someone to lead this effort on the proposal & development too. Avi - you are probably well equipped to lead it and I am sure more folks like Aaron would be eager to work on its development and on-going maintenance.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Kaxil
> > > > > > >
> > > > > > > On Thu, 29 May 2025 at 15:25, Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > > > >
> > > > > > > > Yep.
> > > > > > > > Having MCP is cool and drawing our implementation from experiences and usage of other MCP servers out there is even cooler (especially that we can have some insights how people already use them with Airflow) - if we can bring together a few of those, put some nice, relevant Airflow prompts. Ideally we could have some examples of how MCP can be used taken from those who are using airflow (the debugging example by Avi is cool)
> > > > > > > >
> > > > > > > > I am not sure implementing it as provider is really "the way" though - I would rather see `apache-airflow-mcp` separate repo - it's so different and distinct from airflow it does not really require any of Airflow internals and code to be implemented - it makes very little sense to be the part of airflow "workspace" where we would develop it together with airflow - because if it will talk over the REST api, all we need is the `client` that might be just a dependency. And there is even no reason for MCP and airflow to be installed and developed together (that's the main reason why we want providers to be kept in monorepo).
> > > > > > > >
> > > > > > > > J.
> > > > > > > >
> > > > > > > > On Thu, May 29, 2025 at 8:37 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Seems like a promising area to invest in given the benefits it can provide to the users as mentioned by Shahar and Abhishek.
> > > > > > > > >
> > > > > > > > > Abhishek also has a promising talk submitted which i am looking forward to this year at the summit.
> > > > > > > > >
> > > > > > > > > In any case, this seems to be one of the first of the very few implementations of trying to integrate Airflow officially / unofficially with an MCP server.
> > > > > > > > >
> > > > > > > > > Thanks & Regards,
> > > > > > > > > Amogh Desai
> > > > > > > > >
> > > > > > > > > On Thu, May 29, 2025 at 2:56 AM Aaron Dantley <aarondant...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hey!
> > > > > > > > > >
> > > > > > > > > > I also think this is a great idea!
> > > > > > > > > >
> > > > > > > > > > Would it be possible to be included in the development process?
> > > > > > > > > >
> > > > > > > > > > Sorry I’m new to this group, but would appreciate any suggestions on how to contribute to the MCP server development!
> > > > > > > > > >
> > > > > > > > > > Regards!
> > > > > > > > > > Aaron
> > > > > > > > > >
> > > > > > > > > > On Wed, May 28, 2025 at 2:57 PM Avi <a...@astronomer.io.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Nice to see the idea to incorporate an official MCP server for Airflow.
> > > > > > > > > > > It's been really magical to see what a simple LLM can do with an Airflow MCP server built just from APIs.
> > > > > > > > > > >
> > > > > > > > > > > A few things that I noticed in my experience:
> > > > > > > > > > > - The number of tools that the OpenAPI spec generates is quite huge. Most tools (*Claude, VS Code with GitHub Copilot, Cursor, Windsurf*) that use an MCP client limit it to 100 tools. (*The read-only mode creates fewer tools in comparison*.)
> > > > > > > > > > > - MCP servers are not just tools. There are other things as well, like resources and prompts. Prompts are super helpful for debugging, for example. They are a way of teaching the LLM about Airflow. Say I want to have a failing task investigated. A prompt can be helpful in letting the LLM know a step-by-step process for carrying out the investigation.
> > > > > > > > > > > - Where do you run the MCP server? I wouldn't want my laptop to do the heavy processing, which would push us toward SSE instead of stdio.
> > > > > > > > > > >
> > > > > > > > > > > This is why I chose two different paths for using an MCP server with Airflow, which I intend to talk about at the summit.
> > > > > > > > > > >
> > > > > > > > > > > 1. AI-Augmented Airflow - This helped me add a chat interface inside Airflow using a plugin to talk to an Airflow instance (read-only mode).
> > > > > > > > > > >
> > > > > > > > > > > 2. Airflow-Powered AI - Experimenting with this has been totally magical, seeing how powerful AI can become when it has access to Airflow. Also, given a directory structure to maintain the DAGs, it can write DAGs on the fly. I totally see a need where LLMs eventually will need a scheduler, although a complete Airflow just for an LLM might seem a bit overkill to the rest of the community.
> > > > > > > > > > >
> > > > > > > > > > > I chose to build this on top of OpenAPI because that was the only way to get proper RBAC enabled.
> > > > > > > > > > >
> > > > > > > > > > > I have so many points to discuss. Would love to hear from the community and then take it forward.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Avi
> > > > > > > > > > >
> > > > > > > > > > > On Wed, May 28, 2025 at 6:32 PM Aritra Basu <aritrabasu1...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I definitely think there's potential to interact with an airflow MCP server.
> > > > > > > > > > > > Though I think I'd be interested to see how many and how frequently people are making use of MCP servers in the wild before investing effort in building and maintaining one for airflow. I'm sure the data is available out there, just needs finding.
> > > > > > > > > > > > --
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > Aritra Basu
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, 28 May 2025, 11:18 pm Julian LaNeve, <jul...@astronomer.io.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > I think this would be interesting now that the Streamable HTTP spec <https://modelcontextprotocol.io/specification/2025-03-26/basic/transports> is out. I think in theory we could publish this first as an Airflow provider that installs a plugin to expose an MCP endpoint, as a PoC - this becomes a much nicer experience than a local stdio one.
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Julian LaNeve
> > > > > > > > > > > > > CTO
> > > > > > > > > > > > >
> > > > > > > > > > > > > Email: jul...@astronomer.io
> > > > > > > > > > > > > Mobile: 330 509 5792
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On May 28, 2025, at 1:25 PM, Shahar Epstein <sha...@apache.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Dear community,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Following the thread on Slack [1], initiated by Jason Sebastian Kusuma, I'd like to start an effort to officially support MCP in Airflow's codebase.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *Some background*
> > > > > > > > > > > > > > Model Context Protocol (MCP) is an open standard, open-source framework that standardizes the way AI models like LLM integrate and share data with external tools, systems and data sources. Think of it as a "USB-C for AI" - a universal connector that simplifies and standardizes AI integrations.
> > > > > > > > > > > > > > A notable example of an MCP server is GitHub's official implementation [3], which allows LLMs such as Claude, Copilot, and OpenAI (or: "MCP clients") to fetch pull request details, analyze code changes, and generate review summaries.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *How could an MCP server be useful in Airflow?*
> > > > > > > > > > > > > > Imagine the possibilities when LLMs can seamlessly interact with Airflow’s API: triggering DAGs using natural language, retrieving DAG run history, enabling smart debugging, and more. This kind of integration opens the door to a more intuitive, conversational interface for workflow orchestration.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *Why do we need to support it officially?*
> > > > > > > > > > > > > > Quid pro quo - LLMs become an integral part of the modern development experience, while Airflow evolves into the go-to for orchestrating AI workflows. By officially supporting it, we’ll enable multiple users to interact with Airflow through their LLMs, streamlining automation and improving accessibility across diverse workflows.
> > > > > > > > > > > > > > All of that is viable with relatively small development effort (see next paragraph).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *How should it be implemented?*
> > > > > > > > > > > > > > As of today, there have been several implementations of MCP servers for Airflow API, the most visible one [4] made by Abhishek Bhakat from Astronomer.
> > > > > > > > > > > > > > The efforts of implementing it and maintaining it in our codebase shouldn't be too cumbersome (at least in theory), as we could utilize packages like fastmcp to auto-generate the server using the existing OpenAPI specs. I'd be very happy if Abhishek could share his experience in this thread.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *Where else could we utilize MCP?*
> > > > > > > > > > > > > > Beyond the scope of the public API, I could also imagine using it to communicate with Breeze.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *How do we proceed from here?*
> > > > > > > > > > > > > > Feel free to share your thoughts here in this discussion.
> > > > > > > > > > > > > > If there are no objections, I'll be happy to start working on an AIP.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sincerely,
> > > > > > > > > > > > > > Shahar Epstein
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > *References:*
> > > > > > > > > > > > > > [1] Slack discussion, https://apache-airflow.slack.com/archives/C06K9Q5G2UA/p1746121916951569
> > > > > > > > > > > > > > [2] Introducing the model context protocol, https://www.anthropic.com/news/model-context-protocol
> > > > > > > > > > > > > > [3] GitHub Official MCP server, https://github.com/github/github-mcp-server
> > > > > > > > > > > > > > [4] Unofficial MCP Server made by Abhishek Bhakat, https://github.com/abhishekbhakat/airflow-mcp-server