Evals will be part of it, since this will be built on top of PydanticAI, which supports them.
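
To make the eval idea a bit more concrete, here is a rough sketch in plain Python of how a judge score could decide whether the human approval step is needed at all. It does not use PydanticAI's evals API; `JudgeVerdict`, `needs_human_approval`, `fake_judge` and the 0.8 threshold are made-up names and values for illustration only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JudgeVerdict:
    """Result of judging one generated query (hypothetical structure)."""
    score: float  # 0.0 (bad) to 1.0 (good)
    reason: str


def needs_human_approval(
    sql: str,
    judge: Callable[[str], JudgeVerdict],
    threshold: float = 0.8,  # assumed cut-off, not from the AIP
) -> bool:
    """Ask for human approval only when the judge is not confident enough."""
    verdict = judge(sql)
    return verdict.score < threshold


def fake_judge(sql: str) -> JudgeVerdict:
    """Stand-in for an LLM judge; a real one would call a model."""
    if "where" in sql.lower():
        return JudgeVerdict(0.9, "query is filtered")
    return JudgeVerdict(0.3, "unfiltered full-table scan")
```

With something like this, `require_approval` could become conditional: high-scoring queries run unattended, low-scoring ones still go to a human.
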
On Mon, 29 Dec 2025 at 19:03, Giorgio Zoppi <[email protected]> wrote:

> Hey Pavan.
> If you are going to introduce this, have you thought about the evaluation
> framework?
> How do you evaluate the LLM operator?
>
> On Mon, Dec 29, 2025, 09:40 Pavankumar Gopidesu <[email protected]>
> wrote:
>
> > Thanks Jens and Jarek, agree on both points raised in comments.
> >
> > I am happy to defer the embedding of the HITL to a separate AIP.
> >
> > To Jens:
> > Yes, it's planned in phases; our plan starts with only provider
> > changes.
> >
> > Regards
> > Pavan
> >
> > On Sun, Dec 28, 2025 at 2:03 PM Jarek Potiuk <[email protected]> wrote:
> >
> > > I also looked at it and I love it as well. I think of it as a missing
> > > abstraction between current Airflow users and current LLM app
> > > developers. I also proposed something a little bit bolder there,
> > > which I think shows the true potential of that approach.
> > > I added a comment in the doc, but I will copy it here for better
> > > visibility.
> > >
> > > ---
> > >
> > > After thinking quite a bit about the proposal, I actually love it and
> > > I think that should be the next frontier of making Airflow
> > > abstractions more approachable and usable by those who want to
> > > implement various patterns of interacting with LLMs.
> > >
> > > And I have a slightly different opinion than Jens regarding HITL. I
> > > see those common LLM operators as slightly "higher" level operators
> > > that might implement a set of common LLM-related patterns that are
> > > currently either difficult or impossible to express by putting things
> > > together via Dag and individual tasks. In this sense, I think such an
> > > operator should have the capability of making HITL call-outs for
> > > approval or selection from within the operator, without completing
> > > the operator, and even of running those "call-outs" more than once,
> > > actually even an unbounded number of times, during a single
> > > operator's execution.
> > >
> > > Actually it's a great way for us to implement some "cyclicness"
> > > without breaking the "acyclic" property of our Dags (for now at
> > > least). Making a Dag "cyclic" is quite a dramatic change, and possibly
> > > we do not even have to do it, because the "cyclic" part can likely be
> > > encompassed within the specialized LLM operators. I can imagine an
> > > operator that performs LLM querying and refines the result via
> > > additional interactions with LLMs "internally", during a single
> > > operator's execution. And some of those iterations might result in a
> > > HITL "call-out", even multiple times during one execution.
> > >
> > > Also, one more proposal I have here is to use an API similar to HITL
> > > (or maybe repurpose HITL for that) to report PROGRESS of such a task.
> > > It is a typical property of a good LLM task that it provides some
> > > feedback to the user: it might be HITL when it asks for something,
> > > but it might also be HOOTL (Human Outside Of The Loop), where the
> > > task simply reports its progress and allows the user to perform
> > > asynchronous actions based on that progress, for example abort the
> > > execution (to stop the Dag), mark it as "skipped" (to trigger the
> > > skip processing path), or mark it as "success" to simulate things
> > > being completed when they are not. While we already have the three
> > > "async" operations, we do not currently have "progress" targeted at
> > > the kind of actor who is also a HITL "actor": someone who is not
> > > interested in detailed logs, but rather wants to monitor progress and
> > > assess the quality of the output, even if it is just a partial output
> > > in the iterative process.
> > >
> > > I think that it will be easier and much more "surgical" (and applied
> > > in the right place) to embed this "iterative" feedback / progress
> > > than to modify the "acyclic" property of our Dags.
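
A rough sketch of the loop Jarek describes, in plain Python with no Airflow APIs: the iteration lives inside a single task, each round reports progress, and a reviewer (human or automated) can send the draft back more than once. `refine_until_accepted` and all of its callables are hypothetical names, only meant to show the shape of the idea.

```python
from typing import Callable, Optional


def refine_until_accepted(
    generate: Callable[[str, Optional[str]], str],
    review: Callable[[str], Optional[str]],
    report_progress: Callable[[int, str], None],
    prompt: str,
    max_rounds: int = 3,  # assumed bound so the task cannot loop forever
) -> str:
    """Run generate -> report -> review rounds inside one task execution.

    `review` returns None to accept the draft, or feedback text that is
    passed to the next `generate` call (the HITL "call-out").
    """
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        draft = generate(prompt, feedback)
        report_progress(round_no, draft)  # HOOTL-style progress update
        feedback = review(draft)
        if feedback is None:
            return draft
    raise RuntimeError(f"no accepted output after {max_rounds} rounds")
```

The Dag itself stays acyclic here; the "cycle" is just a bounded loop inside the operator's execution.
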
> > >
> > > Also - this kind of Progress interface can also be used to publish
> > > the "async" tasks' progress as the next step of [WIP] AIP-98: Add
> > > async support for PythonOperator in Airflow 3:
> > >
> > > https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-98%3A+Add+async+support+for+PythonOperator+in+Airflow+3
> > > that we discussed with David.
> > >
> > > J.
> > >
> > > On Sun, Dec 28, 2025 at 2:16 PM Jens Scheffler <[email protected]>
> > > wrote:
> > >
> > > > I like the AIP very much, and in my view it can be made completely
> > > > in a Provider package... with some comments (I assume non-blocking),
> > > > and I would propose to really start in increments and then adjust
> > > > by learning along the way.
> > > >
> > > > On 12/27/25 22:00, Pavankumar Gopidesu wrote:
> > > > > Thanks Giorgio Zoppi for reviewing the AIP. Yes, it's already
> > > > > planned as part of this AIP; see the example in [1], where you
> > > > > can disable or enable the HITL step. So it's an integrated part
> > > > > of the Operator, with the help of the HITL operator.
> > > > >
> > > > > ```
> > > > > from datetime import timedelta
> > > > >
> > > > > LLMDataQualityOperator(
> > > > >     task_id="customer_quality_analysis",
> > > > >     data_sources=[customer_s3],
> > > > >     prompt="Generate data quality validation queries",
> > > > >     require_approval=True,  # Built-in HITL
> > > > >     approval_timeout=timedelta(hours=2),
> > > > > )
> > > > > ```
> > > > >
> > > > > [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406618285
> > > > >
> > > > > Regards,
> > > > > Pavan
> > > > >
> > > > > On Sat, Dec 27, 2025 at 9:16 AM Giorgio Zoppi <[email protected]>
> > > > > wrote:
> > > > >> Hello,
> > > > >> Just 1c, skimming the AIP:
> > > > >> You might want to explore how to avoid human approval for
> > > > >> generated queries by using an LLM as a judge to evaluate their
> > > > >> quality. The nice thing about data pipelines is automation.
> > > > >>
> > > > >> On Wed, Dec 24, 2025, 10:23 Pavankumar Gopidesu
> > > > >> <[email protected]> wrote:
> > > > >>
> > > > >>> Hello everyone,
> > > > >>>
> > > > >>> The thread has been quiet for some time, and I would like to
> > > > >>> restart the discussion with the AIP.
> > > > >>>
> > > > >>> First, a sincere thank you to Kaxil for presenting the idea at
> > > > >>> Airflow Summit 2025. The session was very well received, and
> > > > >>> many attendees expressed strong interest in the proposal.
> > > > >>> Unfortunately, I was unable to attend the summit due to visa
> > > > >>> issues, but I am hopeful I will be able to join next year.
> > > > >>>
> > > > >>> The demo included well-structured prototypes. For those who
> > > > >>> were unable to attend the session, please refer to the recorded
> > > > >>> talk here [1].
> > > > >>>
> > > > >>> I have also drafted the complete AIP proposal, which is
> > > > >>> available here [2]. I would greatly appreciate your reviews and
> > > > >>> look forward to feedback and further discussion.
> > > > >>>
> > > > >>> Finally, to those celebrating Christmas, I wish you a very
> > > > >>> happy Christmas and a wonderful holiday season.
> > > > >>>
> > > > >>> Regards
> > > > >>> Pavan
> > > > >>>
> > > > >>> [1] https://www.youtube.com/watch?v=XSAzSDVUi2o
> > > > >>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406618285
> > > > >>>
> > > > >>> On Wed, Oct 15, 2025 at 6:13 AM Amogh Desai <[email protected]>
> > > > >>> wrote:
> > > > >>>> Thanks Pavan and Kaxil, seems like an interesting idea and a
> > > > >>>> pretty reasonable problem to solve.
> > > > >>>>
> > > > >>>> I also like the idea of starting with
> > > > >>>> `apache-airflow-providers-common-ai` and expanding as / when
> > > > >>>> needed.
> > > > >>>>
> > > > >>>> Looking forward to when the recording will be out; I missed
> > > > >>>> attending this session at the Airflow Summit.
> > > > >>>>
> > > > >>>> Thanks & Regards,
> > > > >>>> Amogh Desai
> > > > >>>>
> > > > >>>>
> > > > >>>> On Thu, Oct 9, 2025 at 10:49 AM Kaxil Naik <[email protected]>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Yea I think it should be apache-airflow-providers-common-ai
> > > > >>>>>
> > > > >>>>> On Wed, 8 Oct 2025 at 02:04, Pavankumar Gopidesu
> > > > >>>>> <[email protected]> wrote:
> > > > >>>>>
> > > > >>>>>> Yes, it's a new provider, starting out completely
> > > > >>>>>> experimental; we don't want to break functionality in
> > > > >>>>>> existing providers :)
> > > > >>>>>>
> > > > >>>>>> Mostly it's SQL-based operators, so I named it sql-ai, but I
> > > > >>>>>> agree we can make it generic without specifying sql in it :)
> > > > >>>>>>
> > > > >>>>>> Pavan
> > > > >>>>>>
> > > > >>>>>> On Tue, Oct 7, 2025 at 3:48 PM Ryan Hatter via dev
> > > > >>>>>> <[email protected]> wrote:
> > > > >>>>>>> Would this really necessitate a new provider? Should this
> > > > >>>>>>> just be baked into the common SQL provider?
> > > > >>>>>>>
> > > > >>>>>>> Alternatively, instead of a narrow `sql-ai` provider, why
> > > > >>>>>>> not have a generic common ai provider with a SQL package,
> > > > >>>>>>> which would allow us to build AI-based subpackages into the
> > > > >>>>>>> provider other than just SQL?
> > > > >>>>>>>
> > > > >>>>>>> On Mon, Oct 6, 2025 at 4:31 PM Pavankumar Gopidesu
> > > > >>>>>>> <[email protected]> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> @Giorgio Yes indeed, that's also a good thought to
> > > > >>>>>>>> integrate. I will keep it in mind when I draft the AIP and
> > > > >>>>>>>> will message about this a bit more :)
> > > > >>>>>>>> Yes, please join. We have great demos packed on this
> > > > >>>>>>>> topic :)
> > > > >>>>>>>>
> > > > >>>>>>>> @kaxil, Yes, that's a great blog post from Wren AI,
> > > > >>>>>>>> leveraging Apache DataFusion as a query engine to connect
> > > > >>>>>>>> to different data sources.
> > > > >>>>>>>> Pavan
> > > > >>>>>>>>
> > > > >>>>>>>> On Tue, Sep 30, 2025 at 7:37 PM Giorgio Zoppi
> > > > >>>>>>>> <[email protected]> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hey Pavan,
> > > > >>>>>>>>> Some notes:
> > > > >>>>>>>>> 1. LLMs can also be very useful in detecting the root
> > > > >>>>>>>>> causes of errors while developing and designing a
> > > > >>>>>>>>> pipeline. Let me explain: in the past we had several
> > > > >>>>>>>>> Spark processes; when everything is green it is fine, but
> > > > >>>>>>>>> when one fails, it would be nice to have an integrated
> > > > >>>>>>>>> tool to ask why.
> > > > >>>>>>>>> 2. Ideally such an operator could be a
> > > > >>>>>>>>> ModelContextProtocolOperator, and you would not need
> > > > >>>>>>>>> anything else than to pass an LLM as a parameter to that
> > > > >>>>>>>>> operator and just call tools, execute queries, and so on.
> > > > >>>>>>>>> This would be more powerful, because you create an
> > > > >>>>>>>>> abstraction between devices, databases, servers and so
> > > > >>>>>>>>> on, so each source of data can be injected into the
> > > > >>>>>>>>> pipeline.
> > > > >>>>>>>>> 3. Good job! Looking forward to seeing the presentation.
> > > > >>>>>>>>> Best Regards,
> > > > >>>>>>>>> Giorgio
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Tue, 30 Sep 2025 at 14:51, Pavankumar Gopidesu
> > > > >>>>>>>>> <[email protected]> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Hi everyone,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> We're exploring adding LLM-powered SQL operators to
> > > > >>>>>>>>>> Airflow and would love community input before writing
> > > > >>>>>>>>>> an AIP.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> The idea: Let users write natural language prompts like
> > > > >>>>>>>>>> "find customers with missing emails" and have Airflow
> > > > >>>>>>>>>> generate safe SQL queries with full context about your
> > > > >>>>>>>>>> database schema, connections, and data sensitivity.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Why this matters:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Most of us spend too much time on schema drift detection
> > > > >>>>>>>>>> and manual data quality checks. Meanwhile, AI agents are
> > > > >>>>>>>>>> getting powerful but lack production-ready data
> > > > >>>>>>>>>> integrations. Airflow could bridge this gap.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Here's what we're dealing with at Tavant:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Our team works with multiple data domain teams producing
> > > > >>>>>>>>>> data in different formats and storage across S3,
> > > > >>>>>>>>>> PostgreSQL, Iceberg, and Aurora.
> > > > >>>>>>>>>> When data assets become available for consumption, we
> > > > >>>>>>>>>> need:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> - Detection of breaking schema changes between systems
> > > > >>>>>>>>>> - Data quality assessments between snapshots
> > > > >>>>>>>>>> - Validation that assets meet mandatory metadata
> > > > >>>>>>>>>>   requirements
> > > > >>>>>>>>>> - Lookup validation against existing data (comparing
> > > > >>>>>>>>>>   file feeds with different formats to existing data in
> > > > >>>>>>>>>>   Iceberg/Aurora)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> This is exactly the type of work that LLMs could
> > > > >>>>>>>>>> automate while maintaining governance.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> What we're thinking:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ```python
> > > > >>>>>>>>>> # Instead of writing complex SQL by hand...
> > > > >>>>>>>>>> quality_check = LLMSQLQueryOperator(
> > > > >>>>>>>>>>     task_id="find_data_issues",
> > > > >>>>>>>>>>     prompt="Find customers with invalid email formats and missing phone numbers",
> > > > >>>>>>>>>>     data_sources=[customer_asset],  # Airflow knows the schema automatically
> > > > >>>>>>>>>>     # Built-in safety: won't generate DROP/DELETE statements
> > > > >>>>>>>>>> )
> > > > >>>>>>>>>> ```
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> The operator would:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> - Auto-inject database schema, sample data, and
> > > > >>>>>>>>>>   connection details
> > > > >>>>>>>>>> - Generate safe SQL (blocks dangerous operations)
> > > > >>>>>>>>>> - Work across PostgreSQL, Snowflake, BigQuery with
> > > > >>>>>>>>>>   dialect awareness
> > > > >>>>>>>>>> - Support schema drift detection between systems
> > > > >>>>>>>>>> - Handle multi-cloud data via Apache DataFusion [1] (I
> > > > >>>>>>>>>>   did some experiments with 50M+ records, and results
> > > > >>>>>>>>>>   come back in 10-15 seconds for common aggregations)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> For more info on benchmarks, see [2].
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Key benefit: Assets become smarter with structured
> > > > >>>>>>>>>> metadata (schema, sensitivity, format) instead of just
> > > > >>>>>>>>>> throwing everything in `extra`.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Implementation plan:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Start with a separate provider
> > > > >>>>>>>>>> (`apache-airflow-providers-sql-ai`) so we can iterate
> > > > >>>>>>>>>> without touching the Airflow core. No breaking changes;
> > > > >>>>>>>>>> it works with existing connections and hooks.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I am presenting this at Airflow Summit 2025 in Seattle
> > > > >>>>>>>>>> with Kaxil - come see the live demo!
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Next steps:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> If this resonates after the Summit, we'll write a proper
> > > > >>>>>>>>>> AIP with technical details and further build a working
> > > > >>>>>>>>>> prototype.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thoughts? Concerns? Better ideas?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> [1]: https://datafusion.apache.org/
> > > > >>>>>>>>>> [2]: https://datafusion.apache.org/blog/2024/11/18/datafusion-fastest-single-node-parquet-clickbench/
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thanks,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Pavan
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> P.S. - Happy to share more technical details with anyone
> > > > >>>>>>>>>> interested.
> > > > >>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> Life is a chess game - Anonymous.
> > > > >>>>>>>>>
> > > > >>>>>> ---------------------------------------------------------------------
> > > > >>>>>> To unsubscribe, e-mail: [email protected]
> > > > >>>>>> For additional commands, e-mail: [email protected]
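
On the "generate safe SQL (blocks dangerous operations)" point from the original proposal, a minimal sketch of what a keyword-level guard could look like. This is not the provider's implementation and not a real SQL parser; `is_read_only` and the `FORBIDDEN` list are assumptions for illustration, and a production check would more likely rely on a proper parser plus read-only database credentials.

```python
import re

# Assumed deny-list; the thread only names DROP and DELETE explicitly.
FORBIDDEN = {"DROP", "DELETE", "TRUNCATE", "ALTER", "UPDATE", "INSERT"}

# Matches string literals and comments in one left-to-right pass, so a
# "--" inside a string is not mistaken for the start of a comment.
_LITERALS_AND_COMMENTS = re.compile(
    r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.DOTALL
)


def is_read_only(sql: str) -> bool:
    """Conservatively accept only a single SELECT/WITH statement."""
    stripped = _LITERALS_AND_COMMENTS.sub(" ", sql)
    statements = [s for s in stripped.split(";") if s.strip()]
    if len(statements) != 1:
        return False  # empty input or stacked statements
    words = re.findall(r"[A-Za-z_]+", statements[0].upper())
    if not words or words[0] not in {"SELECT", "WITH"}:
        return False
    return not FORBIDDEN.intersection(words)
```

The check errs on the side of rejecting: for example, a quoted identifier named "update" would be refused even though the query is harmless.
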
