Hi Jason,

This is a great project idea. The scenarios you describe (static checks pass, unit tests pass in Breeze, system behavior verified) are exactly the kind of testable user stories that would benefit from a shared, reproducible evaluation format, one that is intentional about what "smart enough" means.
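To make "testable" a bit more concrete, here is a minimal sketch of how an agent's run could be graded against the host vs container split described in the thread. Everything in it is hypothetical: `Step`, `expected_context`, and `grade` are illustrative names, not part of Breeze, prek, or the AIP-102 proposal, and the command-to-context mapping only encodes the examples quoted below (`breeze ...` on the host, `pytest` and `airflow ...` inside the container).

```python
import shlex
from dataclasses import dataclass
from typing import Iterable, Optional

HOST = "host"
CONTAINER = "container"

# Assumption: only the examples given in the thread are encoded here.
# Any other command is treated as "no expectation".
EXPECTED_CONTEXT = {
    "breeze": HOST,
    "pytest": CONTAINER,
    "airflow": CONTAINER,
}


def expected_context(command: str) -> Optional[str]:
    """Return where a command is expected to run, or None if undecided."""
    parts = shlex.split(command)
    if not parts:
        return None
    return EXPECTED_CONTEXT.get(parts[0])


@dataclass(frozen=True)
class Step:
    """One command the agent ran (hypothetical record format)."""
    command: str
    ran_in: str      # HOST or CONTAINER, as observed by the harness
    exit_code: int


def grade(steps: Iterable[Step]) -> bool:
    """Pass only if every step ran in the expected context and exited 0."""
    for step in steps:
        want = expected_context(step.command)
        if want is not None and step.ran_in != want:
            return False
        if step.exit_code != 0:
            return False
    return True
```

A harness could record one `Step` per command the agent issues and call `grade` at the end of each scenario; how `ran_in` is actually detected is left open here on purpose.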
We're discussing this space in an email thread on the proposed AIP-102 [1], which proposes a benchmark and conformance format for AI capabilities in Airflow (based on feedback, it might work better as part of the ecosystem).

A Breeze contribution skill would be a great candidate for an exam: can the agent correctly distinguish host vs container context, run the right commands in the right environment, and verify outcomes?

Have you already defined pass/fail criteria for the scenarios you mention, or is it more of a manual "looks right" check today? I would love to be involved from the evals and system test side.

[1] https://lists.apache.org/thread/sxnjv27cpm9yr5d0rbqobgvcgmhn7yfd

Alex Guglielmone Nemi

On Tue, Feb 24, 2026, 16:58 Jarek Potiuk <[email protected]> wrote:

> Jason,
>
> I am 100% for that. I've been thinking about this very thing but never
> had time to act on it. I totally agree that it makes perfect sense.
> Having some of the GSoC people work on it would be great because they
> might come with new perspectives - and even improve or change Breeze if
> adapting to it proves too difficult for agents.
>
> I would even love to co-mentor it with you if you take the lead. Having
> a co-mentor is always great for coverage when people are unavailable or
> very busy.
>
> J.
>
> On Tue, Feb 24, 2026 at 10:02 AM Zhe-You Liu <[email protected]> wrote:
>
> > Hi Jarek,
> >
> > I have a small project idea similar to the recent “airflow-translation”
> > agent skill: an “airflow-breeze-contribution” /
> > “airflow-contribution-verification” agent skill (maybe a better name
> > would also mention “prek”).
> >
> > Breeze is definitely one of the most powerful CI and developer tools we
> > have. However, in my experience, these agents (Claude Code, Gemini CLI,
> > GitHub Copilot-like IDE or CLI tooling) aren’t smart enough to use
> > Breeze as an environment that matches the correct, reproducible GitHub
> > CI environment. Even though we have added `AGENTS.md` and mention the
> > contribution docs in it, it doesn’t really seem to work. It mostly
> > serves as extra context and just consumes more of the context window
> > IMHO.
> >
> > The expected results of the project would be:
> >
> > 1. The AI tools should be smart enough to leverage Breeze.
> > 2. The AI tools should **respect the Breeze environment** and **be able
> > to distinguish whether the current session is inside Breeze or not**,
> > so they can decide whether to run host commands (e.g.
> > `breeze start-airflow`), commands inside the container (e.g. `pytest`
> > or `airflow ...`), or even jump out of the Breeze container to run some
> > host commands and then jump back into the Breeze container.
> > 3. Ensure consistency between the new skills and the Breeze CLI via
> > automated static checks (maybe using the “prek” mechanism to
> > automatically sync Breeze CLI docstrings to the correct paths for the
> > agent skills), so that the Breeze CLI remains the single source of
> > truth.
> >
> > Here’s my typical development workflow after making all the changes in
> > a PR, which might be helpful when drafting the agent skills:
> >
> > Scenario 1) Make sure all the static checks pass
> >
> > 1. Stage all the changes with `git`.
> > 2. Run `prek`, then fix all the static check errors.
> >
> > Scenario 2) Make sure all the relevant unit tests in the current PR pass
> >
> > 1. Run `breeze shell` to start the Breeze container as a clean testing
> > environment.
> > 2. In the same terminal session, run `pytest` with a partial path to
> > the relevant modules/classes instead of running the full test suite.
> >
> > Scenario 3) Verify the system behavior
> >
> > 1. Add a new Dag related to the new feature or bug.
> > 2. Run `breeze start-airflow` (possibly with third-party system
> > integration via the `--integration` flag).
> > 3. Trigger the DagRun in the UI (although in agent mode we should use a
> > CLI trigger instead, for simplicity).
> > 4. Verify whether there are any errors across the components.
> >
> > I’m not sure whether adding this agent skill and making our AI tools
> > respect the Breeze environment would be a suitable project for GSoC or
> > not. I would appreciate any suggestions on this project idea and on
> > whether the overall direction makes sense to everyone.
> >
> > Thanks!
> >
> > Best,
> > Jason
> >
> > On Mon, Feb 23, 2026 at 4:22 PM Jarek Potiuk <[email protected]> wrote:
> >
> > > Hello dear Airflow community,
> > >
> > > The Apache Software Foundation has been officially accepted as a
> > > Google Summer of Code organisation. If you have an idea for a project
> > > that could be done by GSoC participants, there is still time to
> > > volunteer and add a project that you would like to run.
> > >
> > > Mentoring in GSoC is really something that is best suited for
> > > committers who have some small-ish projects in mind, with clear ideas
> > > of what needs to be done. These projects should not require extensive
> > > Airflow knowledge from those participants, and failure to complete
> > > them should not be critical, although completion would be beneficial.
> > >
> > > Mentoring usually requires some time, but not much - and I personally
> > > would say this is a very rewarding experience. I've personally gained
> > > many friendships from mentorships I've done, people grew when I was
> > > mentoring them, and I have tear-shedding stories about some of the
> > > mentorships I ran. This includes a talk at Community Over Code where
> > > my mentee from Peru (and a few other PMC members' mentees) described
> > > her story: she went from being low and depressed while supporting her
> > > mother to becoming an experienced developer advocate with a good job
> > > and great stability, on a UK talent visa. At the end of the talk she
> > > thanked her mother for supporting her; she had brought her mother to
> > > the conference, and her mother witnessed the talk in person.
> > >
> > > Those are things you can't buy with money or learn; you need to
> > > experience them and let them happen. And for that you need to give it
> > > a chance.
> > >
> > > So if you would like to participate, submit your project here and
> > > read more about GSoC:
> > >
> > > https://community.apache.org/gsoc/guide-to-being-a-mentor.html
> > >
> > > Also, for those who would like to be mentors, I offer something
> > > myself. Since I've been a mentor quite a few times, I am super happy
> > > to help new mentors. I volunteer to "mentor the mentors" and am happy
> > > to privately discuss and meet with those who want to take on
> > > mentorship and help them become great mentors.
> > >
> > > Also, maybe other past mentors would join me in that. We had quite a
> > > few mentors in various past programs, and I am sure their experience
> > > is similar to mine.
> > >
> > > J.
