kaxil opened a new pull request, #67791:
URL: https://github.com/apache/airflow/pull/67791

   common.ai's curated toolsets (`SQLToolset`, `HookToolset`, `MCPToolset`) are 
pydantic-ai `AbstractToolset`s and already work natively with `AgentOperator`. 
This adds the reverse direction: `airflow_toolset_to_langchain_tools(toolset)` 
converts any of them into LangChain `StructuredTool` objects, so a LangChain 
agent or chain running inside an Airflow task can call Airflow's 
connection-managed, validated tools.
   
   ## Why this lives in common.ai, not a separate langchain provider
   
   A toolset bridge is tool interop, not an agent runtime. common.ai already 
ships the `langchain` optional extra, a `LangChainHook` for model access, and 
LangChain example DAGs, so the dependency boundary is already here (langchain 
is imported lazily and gated by the extra). The forward direction (LangChain 
tools into `AgentOperator`) is already covered by pydantic-ai's upstream 
[`pydantic_ai.ext.langchain.LangChainToolset`](https://ai.pydantic.dev/toolsets/),
 so keeping only the reverse bridge in a separate provider would split the two 
halves of one feature. A dedicated provider for a single converter function is 
disproportionate overhead.
   
   This PR does not add a LangChain agent backend or a `LangChainOperator`. 
Frameworks that want to be the agent runtime (LangGraph Platform, LangChain 
Runnable operators) still belong in their own provider.
   
   ## Usage
   
   ```python
   from langchain.agents import create_agent
   
   from airflow.providers.common.ai.hooks.langchain import LangChainHook
   from airflow.providers.common.ai.toolsets import airflow_toolset_to_langchain_tools
   from airflow.providers.common.ai.toolsets.sql import SQLToolset

   # Convert the Airflow toolset into LangChain StructuredTool objects.
   tools = airflow_toolset_to_langchain_tools(SQLToolset(db_conn_id="sql_default"))

   # Build the chat model from an Airflow connection, then hand both to the agent.
   model = LangChainHook(
       llm_conn_id="langchain_default", llm_model="openai:gpt-4o"
   ).get_chat_model()
   agent = create_agent(model, tools=tools, system_prompt="You are a SQL analyst.")
   ```
   
   For the forward direction, no Airflow code is needed: put 
`LangChainToolset([my_tool])` into `AgentOperator(toolsets=[...])`.
   
   ## Notes and tradeoffs
   
   - Outside an agent run there is no live `RunContext`, so the bridge builds a 
minimal one with an inert placeholder model. The bundled toolsets ignore the 
context; a custom toolset that reads `ctx.model`, `ctx.messages`, or 
`ctx.usage` will not behave correctly when bridged standalone. This is 
documented on the function and in the toolsets guide.
   - `get_tools` is invoked eagerly at conversion time, so an `MCPToolset` opens 
its connection during conversion, not on the first tool call.
   - The toolset's own args validator runs before each call, so argument 
coercion (for example a string into an int) matches what the tool would get 
inside `AgentOperator`.
   - Requires the `langchain` extra 
(`apache-airflow-providers-common-ai[langchain]`).
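
The eager `get_tools` call and the validate-before-call behaviour noted above can be illustrated with a small standard-library sketch. It is not the implementation in this PR: `ToyTool`, `ToyToolset`, `toy_bridge`, and `first_rows` are hypothetical stand-ins invented for illustration, and the type-casting `validate` method is only an assumed simplification of a real args validator.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToyTool:
    """Hypothetical stand-in for one tool exposed by a toolset."""

    name: str
    description: str
    arg_types: dict[str, type]  # simplified stand-in for an args validator
    func: Callable[..., Any]

    def validate(self, raw_args: dict[str, Any]) -> dict[str, Any]:
        # Reject unknown arguments, then coerce each value to its declared type.
        unknown = set(raw_args) - set(self.arg_types)
        if unknown:
            raise ValueError(f"unexpected arguments: {sorted(unknown)}")
        return {key: self.arg_types[key](value) for key, value in raw_args.items()}


class ToyToolset:
    """Hypothetical toolset that counts how often its tools are enumerated."""

    def __init__(self, tools: list[ToyTool]) -> None:
        self._tools = tools
        self.get_tools_calls = 0

    def get_tools(self) -> dict[str, ToyTool]:
        self.get_tools_calls += 1
        return {tool.name: tool for tool in self._tools}


def toy_bridge(toolset: ToyToolset) -> dict[str, Callable[[dict[str, Any]], Any]]:
    """Wrap every tool in a plain callable that validates before calling."""
    tools = toolset.get_tools()  # eager: enumerated once, at conversion time

    def wrap(tool: ToyTool) -> Callable[[dict[str, Any]], Any]:
        def call(raw_args: dict[str, Any]) -> Any:
            # The tool's own validator runs before every invocation.
            return tool.func(**tool.validate(raw_args))

        return call

    return {name: wrap(tool) for name, tool in tools.items()}


def first_rows(limit: int) -> list[int]:
    """Hypothetical tool body: return the first `limit` row ids."""
    return list(range(limit))


toolset = ToyToolset(
    [ToyTool("first_rows", "Return row ids.", {"limit": int}, first_rows)]
)
bridged = toy_bridge(toolset)

# The string "3" is coerced to the int 3 before the tool body runs.
result = bridged["first_rows"]({"limit": "3"})
```

In this sketch any setup cost in `get_tools` is paid inside `toy_bridge`, not inside the returned callables, which mirrors why a bridged `MCPToolset` connects at conversion time.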
   

