Ah ... and one good thing about the auto-mapping idea. You know that
saying: *The world is a slightly better place with every single line of
YAML removed or not even created in the first place.* This is almost
literally the quote from our "Monorepo" talk with Amogh in Talk Python To
Me :).
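
To make the auto-mapping from point 3 of the quoted message concrete, here is
a minimal Python sketch. It is only an illustration under assumptions: the
`error_id_for` helper, the stand-in `AirflowException` base class, and the
"strip `Airflow` and `Exception`, then split CamelCase" rule are hypothetical
and not taken from the PR. It needs Python 3.9+ for `str.removeprefix`.

```python
import re


def error_id_for(exc_class: type) -> str:
    """Derive a log-friendly ID from an exception class name."""
    # Assumed convention: Airflow<Words>Exception. Prefix and suffix are dropped.
    core = exc_class.__name__.removeprefix("Airflow").removesuffix("Exception")
    # Split CamelCase into words; acronyms such as "DAG" would need a smarter regex.
    words = re.findall(r"[A-Z][a-z0-9]*", core)
    return "-".join(["AERR", *(word.upper() for word in words)])


class AirflowException(Exception):
    """Stand-in base class for this sketch, not the real Airflow class."""

    def __str__(self) -> str:
        # Prefix the message with the derived ID so it shows up in logs.
        return f"[{error_id_for(type(self))}] {super().__str__()}"


class AirflowDagNotFoundException(AirflowException):
    """Abstract parent: useful in `except`, never raised directly."""

    def __init__(self, *args: object) -> None:
        if type(self) is AirflowDagNotFoundException:
            raise TypeError("raise a concrete subclass instead")
        super().__init__(*args)


class AirflowDagNotFoundBackfillException(AirflowDagNotFoundException):
    pass


class AirflowDagNotFoundParsingException(AirflowDagNotFoundException):
    pass


try:
    raise AirflowDagNotFoundBackfillException("dag_id=example")
except AirflowDagNotFoundException as err:  # the parent catches both children
    print(err)  # [AERR-DAG-NOT-FOUND-BACKFILL] dag_id=example
```

The class name is the only source of truth here, so there is no YAML to keep
in sync; a pre-commit hook would only need to check that every exception name
follows the convention.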

On Mon, May 11, 2026 at 4:33 PM Jarek Potiuk <[email protected]> wrote:

> 1. Agree with the grouping idea. I think even originally when you
> discussed it Omkar - there were some "groups" of exceptions.
> AERR-DAG-NOTFOUND-BACKFILL seems like a more suitable short name than 0001,
> provided it is descriptive enough for you to easily understand what each
> error means. I would hate always having to look up the error code in a
> table or YAML file. We could have such a table generated and in the docs, but
> essentially after seeing enough logs you should know what the short code
> means without memorizing the number. It's almost inhuman to force people to
> associate numeric values with meaning.
>
> 2. I think a 1:1 mapping from exception to code would be good. While a short
> error code is useful in logs, seeing the short name in the code when you
> "raise" them is counterproductive because it adds noise to something we
> already have: the Exception Class name. On the other hand, such a class
> name looks way worse in the logs.
>
> 3. *Idea:* Why don't we just keep the correct naming convention for our
> Exceptions and map them into IDs automatically (e.g.,
> AirflowDagNotFoundBackfillException -> AERR-DAG-NOT-FOUND-BACKFILL). I
> think it ticks all the boxes:
>
> * 0 maintenance (just a hook to check if all exceptions follow the right
> conventions)
> * 0 mapping
> * Code friendly
> * Log friendly
> * You see what you get by looking at either the exception class or ID
> * We can build an exception hierarchy that allows us to catch several
> exceptions (e.g., `AirflowDagNotFoundException` being an abstract
> (non-instantiable) parent of `AirflowDagNotFoundBackfillException` and
> `AirflowDagNotFoundParsingException`)
> * Grouping works naturally and without conscious thought—in both exception
> classes and IDs
>
> Essentially, no SKILL is needed for that.
>
> And BTW, I think none of our "coding" should really "require" using
> SKILLS and "impair" those who do not use agents. Even though I'm known as
> an AI and Agent enthusiast, we should avoid making standard code parts or
> development workflows inaccessible to those who don't want to use agents,
> especially if it's easy.
>
> It's one thing to empower maintainers and contributors with SKILLS to
> review or triage PRs if they want to or for someone doing translation to
> add a new phrase in a language. However, it's a different story when
> discussing basic "code" tasks, like adding new exceptions. Ideally, those
> tasks should not **require** you to use Agents or be "difficult" without
> them. We should totally respect people who choose not to use agents
> themselves and ensure they do not feel like "lesser" people. Promoting
> something and giving people new tools is one thing; making it a mandatory
> part of the regular workflow when it isn't truly required is another.
>
> J.
>
>
>
> On Mon, May 11, 2026 at 3:30 PM Ash Berlin-Taylor <[email protected]> wrote:
>
>> Maybe we should not have sequential IDs at all and do something similar
>> to what SQLA does: https://sqlalche.me/e/20/xd2s for example (That’s
>> `/e/<major><minor>/<code>` which redirects)
>>
>> Some of the example(?) errors are internal to a single component and
>> never exposed to users, so shouldn’t be in the registry -
>> AERR009/DagCodeNotFound for instance, is likely thrown by the ORM layer and
>> caught by the API server, which is to say it is entirely invisible to the
>> user? I imagine there are many more in this category.
>>
>>
>> AERR010 and AERR011 are both DagNotFound, but 11 is specifically for
>> "Requested DAG could not be found for backfill operation" - that seems very
>> odd to have a different error code for that.
>>
>> We also have provider specific error codes in the main registry which
>> isn’t a pattern that will work (`user_facing_error_message: Google Ads link
>> not found for the specified property`) etc.
>>
>> -ash
>>
>>
>> > On 11 May 2026, at 14:20, Ash Berlin-Taylor <[email protected]> wrote:
>> >
>> > If we do this (and I’m still not sure what I think overall) +1 to some
>> > kind of grouping. Right now for instance the registry has AERR002 for
>> > connection not found, but no space to add Variable not found, or State not
>> > found in the future.
>> >
>> >> On 11 May 2026, at 12:25, Dev-iL <[email protected]> wrote:
>> >>
>> >> (please assume there's a "In my opinion, " prefix to every sentence)
>> >>
>> >> 0. Since the dev workflow is very structured, it can/should be made into a
>> >> SKILL.
>> >> 1. Long term yes, but while we refactor the existing code we should allow
>> >> it (assuming it trips hooks or CI)
>> >> 2. YAML seems suitable at first glance
>> >> 3. One code per exception makes sense to me. Depending on how we want the
>> >> exception taxonomy to evolve, perhaps we want to have codes like ###.###
>> >> for "parent" and "subclass" exceptions, or Ruff-style #00 will be a family
>> >> of similar exceptions.
>> >>
>> >>
>> >> On Mon, 11 May 2026, 12:15 Omkar P, <[email protected]> wrote:
>> >>
>> >>> Hi team,
>> >>>
>> >>> Starting this thread to discuss the design of Airflow error codes. These
>> >>> are LLM-friendly strings starting with AERR, which airflow devs can use
>> >>> when raising exceptions, to convey the error context to dag users in a
>> >>> succinct way. Providing current design details below.
>> >>>
>> >>> PR: https://github.com/apache/airflow/pull/65423
>> >>>
>> >>> Feature flow:
>> >>> 1. airflow dev identifies error case and defines a new error code in the
>> >>> error mapping yaml (say AERR002).
>> >>> 2. dev then adds AirflowErrorCodeMixin to respective exception class
>> >>> that they'd want to raise with an error_code.
>> >>> 3. dev then specifies the error_code in raise in code (e.g. raise
>> >>> AirflowNotFoundException(..., error_code="AERR002")).
>> >>> 4. dev runs breeze build-docs that generates a new docs page AERR002.rst
>> >>> 5. breeze static check takes care of validating if error code is mapped
>> >>> to correct exception class.
>> >>>
>> >>> User side:
>> >>> On airflow users' side, they now see airflow error code as
>> >>> part of the stack trace, which they can use for communicating problems
>> >>> instead of pasting verbose stack traces. Error codes also improve
>> >>> LLM-based discovery of airflow errors as codes are much more
>> >>> deterministic/well-defined than plain stack traces.
>> >>>
>> >>> Open questions:
>> >>> 1. Should the error code be mandatory for all raises of an exception
>> >>> class that uses them?
>> >>> 2. Where should the error code info be stored? Is a YAML-based registry
>> >>> good enough?
>> >>> 3. Shall we have a 1:1 mapping between an error code and exception
>> >>> class? e.g. AirflowNotFoundException mapped only to AERR002 i.e. only one
>> >>> error code. (The current implementation in the PR supports a many-to-one
>> >>> mapping: one exception class <-> multiple error codes, based on the
>> >>> respective context.)
>> >>>
>> >>> Look forward to your thoughts on above open questions or any other
>> >>> design suggestions you'd like to add, thanks!
>> >>>
>> >>> Regards,
>> >>> Omkar
>> >>>
>> >
>>
>>
