Started a lazy consensus on this one.

Thanks & Regards,
Amogh Desai


On Tue, Nov 11, 2025 at 10:37 AM Amogh Desai <[email protected]> wrote:

> Short reminder: about 10 hours left until I wrap this discussion up and
> start a lazy consensus on it.
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Fri, Nov 7, 2025 at 12:58 PM Amogh Desai <[email protected]> wrote:
>
>> I will be waiting for responses on this discussion until *Tue, Nov 11,
>> 3:00 PM UTC* before starting a lazy consensus.
>>
>> So, if you have thoughts, feel free to chime in now :)
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>>
>> On Fri, Nov 7, 2025 at 4:57 AM Buğra Öztürk <[email protected]>
>> wrote:
>>
>>> Great initiative Amogh, thanks! I agree with the others on 1, and on
>>> not encouraging 2 as well.
>>>
>>> The idea of filling the gaps by adding more endpoints would enable more
>>> automation in a secure environment in the long run. In addition, we
>>> could consider providing some more granular cleanup/DB functionality in
>>> the CLI too, where those operations could be automated on the server
>>> side with admin commands and not from Dags. Just an idea.
>>>
>>> I hope we will add airflowctl there soon, of course with limited
>>> operations. 🤞
>>>
>>> Bugra Ozturk
>>>
>>> On Thu, 6 Nov 2025, 14:32 Amogh Desai, <[email protected]> wrote:
>>>
>>> > Looking for some more eyes on this one.
>>> >
>>> > Thanks & Regards,
>>> > Amogh Desai
>>> >
>>> >
>>> > On Thu, Nov 6, 2025 at 12:55 PM Amogh Desai <[email protected]> wrote:
>>> >
>>> > > > Yes, the API could do this with 5 times more code, including
>>> > > > handling the limits per response, where you need to loop over all
>>> > > > pages until you have a full list (e.g. the API is limited to 100
>>> > > > results per page). Not impossible, but a lot of re-implementation.
>>> > >
>>> > > Just wondering, why not vanilla task mapping?
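>>> > >
>>> > > (i.e. something like this sketch - illustrative only, with made-up
>>> > > dag/task names:)
>>> > >
>>> > > from airflow.decorators import dag, task
>>> > >
>>> > > @dag(schedule=None)
>>> > > def fan_out_example():
>>> > >     @task
>>> > >     def process(item: str) -> str:
>>> > >         # One mapped task instance is created per input item.
>>> > >         return item.upper()
>>> > >
>>> > >     # Expands at runtime into one task instance per element.
>>> > >     process.expand(item=["a", "b", "c"])
>>> > >
>>> > > fan_out_example()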
>>> > >
>>> > > > Might be something that could be a potential contribution to
>>> > > > "airflow db clean".
>>> > >
>>> > > Maybe, yes.
>>> > >
>>> > > Thanks & Regards,
>>> > > Amogh Desai
>>> > >
>>> > >
>>> > > On Thu, Nov 6, 2025 at 12:53 PM Amogh Desai <[email protected]> wrote:
>>> > >
>>> > >> > I think our efforts should be way more focused on adding some
>>> > >> > missing API calls in Task SDK that our users miss, rather than on
>>> > >> > allowing them to use "old ways". Every time someone says "I cannot
>>> > >> > migrate because I did this", our first thought should be:
>>> > >> >
>>> > >> > * is it a valid way?
>>> > >> > * is it acceptable to have an API call for it in SDK?
>>> > >> > * should we do it?
>>> > >>
>>> > >>
>>> > >> That is currently a grey zone we need to define better, I think.
>>> > >> Certain use cases might be general enough that we need an execution
>>> > >> API endpoint for them, and we can certainly add that. But there will
>>> > >> also be cases where the use case is niche and we will NOT want to
>>> > >> have execution API endpoints for it, for various reasons. The harder
>>> > >> problem to solve is the latter.
>>> > >>
>>> > >> But you make a fair point here.
>>> > >>
>>> > >>
>>> > >>
>>> > >> Thanks & Regards,
>>> > >> Amogh Desai
>>> > >>
>>> > >>
>>> > >> On Thu, Nov 6, 2025 at 2:33 AM Jens Scheffler <[email protected]>
>>> > >> wrote:
>>> > >>
>>> > >>> > Thanks for your comments too, Jens.
>>> > >>> >
>>> > >>> >>    * Aggregate status of tasks in the upstream of same Dag
>>> > >>> >>      (pass, fail, listing)
>>> > >>> >>
>>> > >>> >> Does the DAG run page not show that?
>>> > >>> Partly yes, but in our environment it is a bit more complex than
>>> > >>> "pass/fail". We want to know more details of the failed tasks and
>>> > >>> aggregate those details: at a high level, get the XCom from the
>>> > >>> failed tasks and then aggregate the details. Imagine all tasks have
>>> > >>> an owner and we want to send a notification to each owner, but if
>>> > >>> 10 tasks from one owner fail, we want to send 1 notification
>>> > >>> listing all 10 failures in the text. And, yes, this can be done via
>>> > >>> the API.
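>>> > >>>
>>> > >>> (Roughly like this sketch against the stable REST API - the base
>>> > >>> URL, token, dag/run ids and the "owner" XCom key are placeholders:)
>>> > >>>
>>> > >>> import collections
>>> > >>> import requests
>>> > >>>
>>> > >>> BASE = "http://localhost:8080/api/v1"  # placeholder API server
>>> > >>> HEADERS = {"Authorization": "Bearer <token>"}  # placeholder auth
>>> > >>>
>>> > >>> # List the failed task instances of one dag run.
>>> > >>> tis = requests.get(
>>> > >>>     f"{BASE}/dags/my_dag/dagRuns/my_run/taskInstances",
>>> > >>>     params={"state": "failed"}, headers=HEADERS,
>>> > >>> ).json()["task_instances"]
>>> > >>>
>>> > >>> # Group failed tasks by the owner recorded in each task's XCom.
>>> > >>> by_owner = collections.defaultdict(list)
>>> > >>> for ti in tis:
>>> > >>>     xcom = requests.get(
>>> > >>>         f"{BASE}/dags/my_dag/dagRuns/my_run/taskInstances/"
>>> > >>>         f"{ti['task_id']}/xcomEntries/owner", headers=HEADERS,
>>> > >>>     ).json()
>>> > >>>     by_owner[xcom["value"]].append(ti["task_id"])
>>> > >>>
>>> > >>> # One notification per owner, listing all failed tasks in the text.
>>> > >>> for owner, failed in by_owner.items():
>>> > >>>     print(f"notify {owner}: {len(failed)} failed: {failed}")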
>>> > >>> >>    * Custom mass-triggering of other dags and collection of
>>> > >>> >>      results from triggered dags as scale-out option for dynamic
>>> > >>> >>      task mapping
>>> > >>> >>
>>> > >>> >> Can't an API do that?
>>> > >>> Yes, the API could do this with 5 times more code, including
>>> > >>> handling the limits per response, where you need to loop over all
>>> > >>> pages until you have a full list (e.g. the API is limited to 100
>>> > >>> results per page). Not impossible, but a lot of re-implementation.
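>>> > >>>
>>> > >>> (The pagination part of that re-implementation, as a sketch
>>> > >>> against the stable REST API's limit/offset parameters - URL and
>>> > >>> auth are placeholders:)
>>> > >>>
>>> > >>> import requests
>>> > >>>
>>> > >>> BASE = "http://localhost:8080/api/v1"  # placeholder API server
>>> > >>> HEADERS = {"Authorization": "Bearer <token>"}  # placeholder auth
>>> > >>>
>>> > >>> def all_dag_runs(dag_id, page_size=100):
>>> > >>>     """Collect all dag runs, one capped page at a time."""
>>> > >>>     runs, offset = [], 0
>>> > >>>     while True:
>>> > >>>         page = requests.get(
>>> > >>>             f"{BASE}/dags/{dag_id}/dagRuns",
>>> > >>>             params={"limit": page_size, "offset": offset},
>>> > >>>             headers=HEADERS,
>>> > >>>         ).json()
>>> > >>>         runs.extend(page["dag_runs"])
>>> > >>>         offset += page_size
>>> > >>>         if offset >= page["total_entries"]:
>>> > >>>             return runs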
>>> > >>> >>    * And the famous: Partial database clean on a per Dag level
>>> > >>> >>      with different retention
>>> > >>> >>
>>> > >>> >> Can you elaborate this one a bit :D
>>> > >>>
>>> > >>> Yes. We have a Dag that is called 50k-100k times per day and others
>>> > >>> that are called 12 times a day, and a lot of others in-between,
>>> > >>> like 25k runs per month. The Dag with 100k runs per day we want to
>>> > >>> archive ASAP, probably after 3 days, for all non-failed runs, to
>>> > >>> reduce DB overhead. The failed ones we keep for 14 days for
>>> > >>> potential re-processing if there was an outage.
>>> > >>>
>>> > >>> Most other Dag Runs we keep for a month. And some we cap: we
>>> > >>> archive if there are more than 25k runs.
>>> > >>>
>>> > >>> Might be something that could be a potential contribution to
>>> > >>> "airflow db clean".
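>>> > >>>
>>> > >>> (For reference, today's "airflow db clean" only supports a global
>>> > >>> cutoff, e.g.:
>>> > >>>
>>> > >>>     airflow db clean --clean-before-timestamp '2025-11-01' \
>>> > >>>         --tables dag_run,task_instance --dry-run
>>> > >>>
>>> > >>> so the per-Dag retention described above is exactly the missing
>>> > >>> piece such a contribution would add.)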
>>> > >>>
>>> > >>> >>
>>> > >>> >> Thanks & Regards,
>>> > >>> >> Amogh Desai
>>> > >>> >>
>>> > >>> >>
>>> > >>> >> On Wed, Nov 5, 2025 at 3:12 AM Jens Scheffler <[email protected]> wrote:
>>> > >>> >>
>>> > >>> >> Thanks Amogh for adding docs with migration hints.
>>> > >>> >>
>>> > >>> >> We actually have a lot of integrations that were built in the
>>> > >>> >> past, which now makes it a hard and serious effort to migrate to
>>> > >>> >> version 3. So most probably we ourselves need to take option 2,
>>> > >>> >> knowing (like in the past) that we cannot ask for support. But
>>> > >>> >> at least this un-blocks us from staying on 2.x.
>>> > >>> >>
>>> > >>> >> I'd love to take route 1 as well, but then a lot of code needs
>>> > >>> >> to be rewritten. This will take time, and in the mid term we
>>> > >>> >> will migrate to (1).
>>> > >>> >>
>>> > >>> >> As said in the dev call, I'd love it if in Airflow 3.2 we could
>>> > >>> >> have option 1 supported out-of-the-box - knowing that some
>>> > >>> >> security discussion is implied, so it may need to be turned on
>>> > >>> >> explicitly and not be enabled by default.
>>> > >>> >>
>>> > >>> >> The use cases we have which require some kind of DB access,
>>> > >>> >> where the Task SDK does not help:
>>> > >>> >>
>>> > >>> >>    * Adding task and dag run notes to tasks as a better readable
>>> > >>> >>      status during and after execution
>>> > >>> >>    * Aggregate status of tasks in the upstream of the same Dag
>>> > >>> >>      (pass, fail, listing)
>>> > >>> >>    * Custom mass-triggering of other dags and collection of
>>> > >>> >>      results from triggered dags as a scale-out option for
>>> > >>> >>      dynamic task mapping
>>> > >>> >>    * Adjusting Pools based on available workers
>>> > >>> >>    * Checking results of pass/fail per edge worker and,
>>> > >>> >>      depending on stability, adjusting Queues on Edge workers
>>> > >>> >>      based on status and errors of workers
>>> > >>> >>    * Adjusting Pools based on time of day
>>> > >>> >>    * And the famous: Partial database clean on a per Dag level
>>> > >>> >>      with different retention
>>> > >>> >>
>>> > >>> >> I would be okay with removing option 3, and a clear warning on
>>> > >>> >> option 2 is also okay.
>>> > >>> >>
>>> > >>> >> Jens
>>> > >>> >>
>>> > >>> >> On 11/4/25 13:06, Jarek Potiuk wrote:
>>> > >>> >>> My take (and details can be found in the discussion):
>>> > >>> >>>
>>> > >>> >>> 2. Don't give the impression it is something that we will
>>> > >>> >>> support - explain to the users that it **WILL** break in the
>>> > >>> >>> future and it's on **THEM** to fix it when it breaks.
>>> > >>> >>>
>>> > >>> >>> Option 2 is **kinda** possible, but we should strongly
>>> > >>> >>> discourage it and say "this will break at any time and it's you
>>> > >>> >>> who have to adapt to any future changes in the schema" - we had
>>> > >>> >>> a lot of similar cases in the past where our users felt
>>> > >>> >>> entitled when **something** they saw as a "valid way of using
>>> > >>> >>> things" was broken by our changes. If we say "recommended" they
>>> > >>> >>> will take it as "all the usage there is expected to keep
>>> > >>> >>> working when Airflow gets a new version, so I am fully entitled
>>> > >>> >>> to open a valid issue when things change". I think
>>> > >>> >>> "recommended" in this case is far too strong from our side.
>>> > >>> >>>
>>> > >>> >>> 3. Absolutely remove.
>>> > >>> >>>
>>> > >>> >>> Sounds like we are going back to Airflow 2 behaviour, and
>>> > >>> >>> we've made all the effort to break out of that. Various things
>>> > >>> >>> will start breaking in Airflow 3.2 and beyond. Once we complete
>>> > >>> >>> the task isolation work, Airflow workers will NOT have the
>>> > >>> >>> sqlalchemy package installed by default - it simply will not be
>>> > >>> >>> a task-sdk dependency. The fact that you **can** use sqlalchemy
>>> > >>> >>> now is mostly a by-product of the fact that we have not
>>> > >>> >>> completed the split yet - it was not even **SUPPOSED** to work.
>>> > >>> >>>
>>> > >>> >>> J.
>>> > >>> >>>
>>> > >>> >>>
>>> > >>> >>>
>>> > >>> >>> On Tue, Nov 4, 2025 at 10:03 AM Amogh Desai <[email protected]> wrote:
>>> > >>> >>>> Hi All,
>>> > >>> >>>>
>>> > >>> >>>> I'm working on expanding the Airflow 3 upgrade documentation
>>> > >>> >>>> to address a frequently asked question from users migrating
>>> > >>> >>>> from Airflow 2.x: "How do I access the metadata database from
>>> > >>> >>>> my tasks now that direct database access is blocked?"
>>> > >>> >>>>
>>> > >>> >>>> Currently, Step 5 of the upgrade guide [1] only mentions that
>>> > >>> >>>> direct DB access is blocked and points to a GitHub issue.
>>> > >>> >>>> However, users need concrete guidance on migration options.
>>> > >>> >>>>
>>> > >>> >>>> I've drafted documentation via [2] describing three
>>> > >>> >>>> approaches, but before finalising it, I'd like to get
>>> > >>> >>>> community consensus on how we should present these options,
>>> > >>> >>>> especially given the architectural principles we've
>>> > >>> >>>> established with Airflow 3.
>>> > >>> >>>>
>>> > >>> >>>> ## Proposed Approaches
>>> > >>> >>>>
>>> > >>> >>>> Approach 1: Airflow Python Client (REST API)
>>> > >>> >>>> - Uses `apache-airflow-client` [3] to interact via the REST API
>>> > >>> >>>> - Pros: No DB drivers needed, aligned with the Airflow 3
>>> > >>> >>>>   architecture, API-first
>>> > >>> >>>> - Cons: Requires package installation, API server dependency,
>>> > >>> >>>>   auth token management, limited operations possible
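>>> > >>> >>>>
>>> > >>> >>>> (For the docs, a minimal sketch of Approach 1 could look like
>>> > >>> >>>> this - module/class names follow the generated
>>> > >>> >>>> `apache-airflow-client` package and may differ per client
>>> > >>> >>>> version; host, token and dag id are placeholders:)
>>> > >>> >>>>
>>> > >>> >>>> import airflow_client.client
>>> > >>> >>>> from airflow_client.client.api import dag_run_api
>>> > >>> >>>>
>>> > >>> >>>> # Talk to the API server instead of the metadata database.
>>> > >>> >>>> configuration = airflow_client.client.Configuration(
>>> > >>> >>>>     host="http://localhost:8080/api/v1",  # placeholder host
>>> > >>> >>>> )
>>> > >>> >>>> configuration.access_token = "<token>"  # placeholder token
>>> > >>> >>>>
>>> > >>> >>>> with airflow_client.client.ApiClient(configuration) as api_client:
>>> > >>> >>>>     api = dag_run_api.DAGRunApi(api_client)
>>> > >>> >>>>     # Same information previously read from the dag_run table.
>>> > >>> >>>>     dag_runs = api.get_dag_runs("my_dag", limit=100)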
>>> > >>> >>>>
>>> > >>> >>>> Approach 2: Database Hooks (PostgresHook/MySqlHook)
>>> > >>> >>>> - Create a connection to the metadata DB and use DB hooks to
>>> > >>> >>>>   execute SQL directly
>>> > >>> >>>> - Pros: Uses Airflow connection management, simple SQL interface
>>> > >>> >>>> - Cons: Requires DB drivers, direct network access to the DB,
>>> > >>> >>>>   bypasses the Airflow API server and connects to the DB directly
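>>> > >>> >>>>
>>> > >>> >>>> (For illustration, the workaround would look like this - the
>>> > >>> >>>> connection id "airflow_metadata_db" is one the user would
>>> > >>> >>>> create themselves, and the schema is NOT a public API and may
>>> > >>> >>>> change in any release:)
>>> > >>> >>>>
>>> > >>> >>>> from airflow.providers.postgres.hooks.postgres import PostgresHook
>>> > >>> >>>>
>>> > >>> >>>> # User-managed connection to the metadata DB (placeholder id).
>>> > >>> >>>> hook = PostgresHook(postgres_conn_id="airflow_metadata_db")
>>> > >>> >>>> # WARNING: queries internal tables - can break on any upgrade.
>>> > >>> >>>> rows = hook.get_records(
>>> > >>> >>>>     "SELECT dag_id, state, count(*) FROM dag_run GROUP BY dag_id, state"
>>> > >>> >>>> )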
>>> > >>> >>>>
>>> > >>> >>>> Approach 3: Direct SQLAlchemy Access (last resort)
>>> > >>> >>>> - Use an environment variable with the DB connection string
>>> > >>> >>>>   and create a SQLAlchemy session directly
>>> > >>> >>>> - Pros: Maximum flexibility
>>> > >>> >>>> - Cons: Bypasses all Airflow protections, schema coupling,
>>> > >>> >>>>   manual connection management, worst possible option
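>>> > >>> >>>>
>>> > >>> >>>> (For completeness, Approach 3 would amount to something like
>>> > >>> >>>> the following - the env var name is a user-chosen placeholder,
>>> > >>> >>>> and this bypasses every protection listed above:)
>>> > >>> >>>>
>>> > >>> >>>> import os
>>> > >>> >>>> from sqlalchemy import create_engine, text
>>> > >>> >>>>
>>> > >>> >>>> # LAST RESORT: raw engine against the metadata DB, no Airflow layer.
>>> > >>> >>>> engine = create_engine(os.environ["AIRFLOW_META_DB_URI"])  # placeholder
>>> > >>> >>>> with engine.connect() as conn:
>>> > >>> >>>>     # Internal schema - can change without notice between releases.
>>> > >>> >>>>     failed = conn.execute(
>>> > >>> >>>>         text("SELECT count(*) FROM dag_run WHERE state = 'failed'")
>>> > >>> >>>>     ).scalar()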
>>> > >>> >>>>
>>> > >>> >>>> I was expecting some pushback regarding these approaches, and
>>> > >>> >>>> there were (rightly) some important concerns raised by Jarek
>>> > >>> >>>> about Approaches 2 and 3:
>>> > >>> >>>>
>>> > >>> >>>> 1. Breaks Task Isolation - Contradicts Airflow 3's core promise
>>> > >>> >>>> 2. DB as Public Interface - Schema changes would require
>>> > >>> >>>>    release notes and break user code
>>> > >>> >>>> 3. Performance Impact - Approach 2 creates direct DB access
>>> > >>> >>>>    and can bring back Airflow 2's connection-per-task overhead
>>> > >>> >>>> 4. Security Model Violation - Contradicts documented isolation
>>> > >>> >>>>    principles
>>> > >>> >>>>
>>> > >>> >>>> Considering these comments, this is what I want to document now:
>>> > >>> >>>>
>>> > >>> >>>> 1. Approach 1 - Keep as the primary/recommended solution
>>> > >>> >>>>    (aligns with the Airflow 3 architecture)
>>> > >>> >>>> 2. Approach 2 - Present as a "known workaround" (not a
>>> > >>> >>>>    recommendation) with explicit warnings about breaking
>>> > >>> >>>>    isolation, the schema not being a public API, performance
>>> > >>> >>>>    implications, and no support guarantees
>>> > >>> >>>> 3. Approach 3 - Remove entirely, or keep with the strongest
>>> > >>> >>>>    possible warnings (would love to hear what others think on
>>> > >>> >>>>    this one particularly)
>>> > >>> >>>>
>>> > >>> >>>> Once we converge on the discussion points here, I would like
>>> > >>> >>>> to call for a lazy consensus, for posterity and for visibility
>>> > >>> >>>> in the community.
>>> > >>> >>>>
>>> > >>> >>>> Looking forward to your feedback!
>>> > >>> >>>>
>>> > >>> >>>> [1] https://github.com/apache/airflow/blob/main/airflow-core/docs/installation/upgrading_to_airflow3.rst#step-5-review-custom-operators-for-direct-db-access
>>> > >>> >>>> [2] https://github.com/apache/airflow/pull/57479
>>> > >>> >>>> [3] https://github.com/apache/airflow-client-python