Hi Jason,

This is a great project idea. The scenarios you describe (static checks
pass, unit tests pass in Breeze, system behavior verified) are exactly the
kind of testable user stories that would benefit from a shared,
reproducible evaluation format, one that is intentional about what
"smart enough" means.

We're discussing this space in an email thread on the proposed AIP-102 [1],
which proposes a benchmark and conformance format for AI capabilities in
Airflow (based on feedback, it might work better as part of the ecosystem).
A Breeze contribution skill would be a great candidate for an exam: can the
agent correctly distinguish host vs container context, run the right
commands in the right environment, and verify outcomes?
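
To make that concrete, here is a rough sketch of what one such check could
look like. Everything in it is an assumption on my side, not an existing
Breeze or Airflow API: the `Context` enum, the `EXPECTED_CONTEXT` mapping
(taken from the command examples in Jason's mail), and the `/.dockerenv`
marker, which is a generic Docker heuristic rather than anything
Breeze-specific.

```python
from enum import Enum
from pathlib import Path


class Context(Enum):
    HOST = "host"
    CONTAINER = "container"


# Hypothetical mapping, based only on the examples in this thread:
# `breeze ...` runs on the host, `pytest` and `airflow ...` run in the container.
EXPECTED_CONTEXT = {
    "breeze": Context.HOST,
    "pytest": Context.CONTAINER,
    "airflow": Context.CONTAINER,
}


def detect_context(marker: Path = Path("/.dockerenv")) -> Context:
    """Guess the current context (assumption: Docker leaves a marker file)."""
    return Context.CONTAINER if marker.exists() else Context.HOST


def grade(command: str, ran_in: Context) -> bool:
    """Pass only if the command's executable ran in its expected context."""
    parts = command.split()
    if not parts:
        return False
    expected = EXPECTED_CONTEXT.get(parts[0])
    return expected is not None and expected == ran_in
```

An exam could then record where the agent actually ran each command and
call `grade()` on it, which gives a binary pass/fail instead of a "looks
right" judgement.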

Have you already defined pass/fail criteria for those scenarios you
mention, or is it more of a manual "looks right" check today?

Would love to somehow be involved from the evals and system test side.

[1] https://lists.apache.org/thread/sxnjv27cpm9yr5d0rbqobgvcgmhn7yfd

Alex Guglielmone Nemi

On Tue, Feb 24, 2026, 16:58 Jarek Potiuk <[email protected]> wrote:

> Jason,
>
> I am 100% for that. I've been thinking about this very thing but never had
> time to act on it. I totally agree that it makes perfect sense. Having some
> of the GSoC people work on it would be great because they might come with
> new perspectives - and even improve or change Breeze if adapting to it
> proves too difficult for agents.
>
> I would even love to co-mentor it with you if you take the lead. Having a
> co-mentor is always great for coverage when people are unavailable or very
> busy.
>
> J.
>
>
>
> On Tue, Feb 24, 2026 at 10:02 AM Zhe-You Liu <[email protected]> wrote:
>
> > Hi Jarek,
> >
> > I have a small project idea similar to the recent “airflow-translation”
> > agent skill: an “airflow-breeze-contribution” /
> > “airflow-contribution-verification” agent skill (maybe a better name would
> > also mention “prek”).
> >
> > Breeze is definitely one of the most powerful CI and developer tools we
> > have. However, in my experience, these agents (Claude Code, Gemini CLI,
> > GitHub Copilot–like IDE or CLI tooling) aren’t smart enough to use Breeze
> > as an environment that matches the correct, reproducible GitHub CI
> > environment. Even though we have added `AGENTS.md` and mention the
> > contribution docs in it, it doesn’t really seem to work. It mostly serves
> > as extra context and just increases the context window IMHO.
> >
> > The expected results of the project would be:
> >
> > 1. The AI tools should be smart enough to leverage Breeze.
> > 2. The AI tools should **respect the Breeze environment** and **be able to
> > distinguish whether the current session is inside Breeze or not**, so they
> > can decide whether to run host commands (e.g. `breeze start-airflow`),
> > commands inside the container (e.g. `pytest` or `airflow ...`), or even
> > jump out of the Breeze container to run some host commands and then jump
> > back into the Breeze container.
> > 3. Ensure consistency between the new skills and the Breeze CLI via
> > automated static checks (maybe using the “prek” mechanism to automatically
> > sync Breeze CLI docstrings to the correct paths for the agent skills), so
> > that the Breeze CLI remains the single source of truth.
> >
> > Here’s the typical workflow of my development journey after making all the
> > changes in a PR, which might be helpful when drafting the agent skills:
> >
> > Scenario 1) Make sure all the static checks pass
> >
> > 1. Stage all the changes with `git`.
> > 2. Run `prek`, then fix all the static check errors.
> >
> > Scenario 2) Make sure all the relevant unit tests in the current PR pass
> >
> > 1. Run `breeze shell` to start the Breeze container as a clean testing
> > environment.
> > 2. Run `pytest` with a partial path to the modules/classes instead of
> > running the full test suite in the same terminal session.
> >
> > Scenario 3) Verify the system behavior
> >
> > 1. Add a new Dag related to the new feature or bug.
> > 2. Run `breeze start-airflow` (possibly with third-party system
> > integration via the `--integration` flag).
> > 3. Trigger the DagRun in the UI (although for the agent mode we should use
> > a CLI trigger instead, for simplicity).
> > 4. Verify whether there are any errors across the components.
> >
> > I’m not sure whether adding this agent skill and making our AI tools
> > respect the Breeze environment would be a suitable project for GSoC or
> > not. I would appreciate any suggestions on this project idea and whether
> > the overall direction makes sense to everyone.
> >
> > Thanks!
> >
> > Best,
> > Jason
> >
> > On Mon, Feb 23, 2026 at 4:22 PM Jarek Potiuk <[email protected]> wrote:
> >
> > > Hello dear Airflow community,
> > >
> > > The Apache Software Foundation has been officially accepted as a Google
> > > Summer of Code organisation, and if you have an idea for a
> > > project that could be done by GSOC participants, there is
> > > still time to volunteer and add a project that you would like to
> > > run.
> > >
> > > Mentoring in GSOC is really something that is best suited for
> > > committers who have some small-ish projects in mind, with clear ideas
> > > of what needs to be done. These projects should not require extensive
> > > Airflow knowledge from those participants, and failure to complete
> > > them should not be critical, although completion would be beneficial.
> > >
> > > Mentoring usually requires some time, but not much - and I personally
> > > would say - this is a very rewarding experience. I've personally
> > > gained many friendships from mentorships I've done, people grew when I
> > > was mentoring them and I have tear-shedding stories about some of the
> > > mentorships I ran. This includes a talk at Community Over Code where
> > > my mentee from Peru (and a few other PMC members' mentees) described
> > > her story: she went from being low and depressed while supporting her
> > > mother to becoming an experienced developer advocate with a good job and
> > > great stability—on a UK talent visa. At the end of the talk she
> > > thanked her mother for supporting her—she brought her mother to the
> > > conference and her mother witnessed the talk in person.
> > >
> > > Those are things you can't buy with money, or learn, you need to
> > > experience them and let them happen. And for that you need to give it
> > > a chance.
> > >
> > > So if you would like to participate, submit your project here and read
> > > more about GSOC:
> > >
> > > https://community.apache.org/gsoc/guide-to-being-a-mentor.html
> > >
> > > Also, for those who would like to be mentors, I offer something
> > > myself. Since I've been a mentor quite a few times, I am super happy
> > > to help new mentors. I volunteer to "mentor the mentors" and am happy
> > > to privately discuss and meet with those who want to take on
> > > mentorship and help them become great mentors.
> > >
> > > Also maybe other past mentors would join me in that. We had quite a
> > > few mentors in various past programs, and I am sure their experience
> > > is similar to mine.
> > >
> > > J.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> > >
> >
>
