Great initiative Amogh, thanks! I agree with the others on 1, and on not
encouraging 2 as well.

The idea of filling the gaps by adding more endpoints would enable more
automation in a secure environment in the long run. In addition, we could
consider providing more granular clean-up/DB functionality on the CLI,
where those operations could be automated on the server side with admin
commands rather than from Dags. Just an idea.

I hope we will add airflowctl there soon, of course with limited
operations. 🤞

Bugra Ozturk

On Thu, 6 Nov 2025, 14:32 Amogh Desai, <[email protected]> wrote:

> Looking for some more eyes on this one.
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Thu, Nov 6, 2025 at 12:55 PM Amogh Desai <[email protected]> wrote:
>
> > > Yes, the API could do this with 5 times more code, including handling
> > > the per-response limits where you need to loop over all pages until you
> > > have the full list (e.g. the API is limited to 100 results per page).
> > > Not impossible, but a lot of re-implementation.
> >
> > Just wondering, why not vanilla task mapping?
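> >
> > For what it's worth, a minimal sketch of what "vanilla task mapping" could
> > look like for the mass-triggering case. The import paths are the Airflow 3
> > / standard-provider ones (in Airflow 2 the operator lives in
> > airflow.operators.trigger_dagrun), and "child_dag" plus the conf payloads
> > are hypothetical:
> >
> >     from airflow.sdk import dag, task
> >     from airflow.providers.standard.operators.trigger_dagrun import TriggerDagRunOperator
> >
> >     @dag(schedule=None)
> >     def fan_out_parent():
> >         @task
> >         def build_confs():
> >             # One conf dict per child run we want to trigger.
> >             return [{"batch_id": i} for i in range(20)]
> >
> >         # Dynamic task mapping creates one mapped trigger task per conf,
> >         # so retries, UI visibility and concurrency limits come from
> >         # Airflow rather than hand-rolled mass-triggering code.
> >         TriggerDagRunOperator.partial(
> >             task_id="trigger_child",
> >             trigger_dag_id="child_dag",
> >             wait_for_completion=True,
> >             poke_interval=30,
> >         ).expand(conf=build_confs())
> >
> >     fan_out_parent()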
> >
> > > Might be something that could be a potential contribution to "airflow
> > > db clean"
> >
> > Maybe, yes.
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> >
> > On Thu, Nov 6, 2025 at 12:53 PM Amogh Desai <[email protected]>
> wrote:
> >
> >> > I think our efforts should be way more focused on adding some missing
> >> API calls in Task SDK that our users miss, rather than in allowing them
> >> to use "old ways". Every time someone says "I cannot migrate because I
> >> did this", our first thought should be:
> >>
> >> * is it a valid way?
> >> * is it acceptable to have an API call for it in SDK?
> >> * should we do it?
> >>
> >>
> >> That is currently a grey zone we need to define better, I think. Certain
> >> use cases might be general enough that we need an execution API endpoint
> >> for that, and we can certainly do that. But there will also be cases
> >> where the use case is niche and we will NOT want to have execution API
> >> endpoints for that, for various reasons. The harder problem to solve is
> >> the latter.
> >>
> >> But you make a fair point here.
> >>
> >>
> >>
> >> Thanks & Regards,
> >> Amogh Desai
> >>
> >>
> >> On Thu, Nov 6, 2025 at 2:33 AM Jens Scheffler <[email protected]>
> >> wrote:
> >>
> >>> > Thanks for your comments too, Jens.
> >>> >
> >>> >>    * Aggregate status of tasks in the upstream of same Dag (pass,
> >>> fail,
> >>> >>      listing)
> >>> >>
> >>> >> Does the DAG run page not show that?
> >>> Partly yes, but in our environment it is a bit more complex than
> >>> "pass/fail". We want to know more details of the failed tasks and
> >>> aggregate them: at a high level, get the XCom from the failed tasks and
> >>> then aggregate the details. Imagine all tasks have an owner and we want
> >>> to send a notification to each owner, but if 10 tasks from one owner
> >>> fail we want to send 1 notification listing the 10 failures. And, yes,
> >>> this can be done via the API.
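> >>>
> >>> To make the "aggregate and notify per owner" idea concrete, a minimal
> >>> sketch against the stable REST API using plain requests. The host, the
> >>> bearer-token handling, the task_id -> owner mapping and the
> >>> send_notification stand-in are all assumptions, and pagination is
> >>> omitted for brevity:
> >>>
> >>>     import requests
> >>>     from collections import defaultdict
> >>>
> >>>     BASE = "http://airflow-api-server:8080/api/v2"  # hypothetical host; /api/v1 on Airflow 2
> >>>     AUTH = {"Authorization": "Bearer <token>"}      # however you obtain the token
> >>>
> >>>     # Hypothetical task_id -> owner mapping; the aggregation is the point here.
> >>>     OWNERS = {"extract_a": "team-a", "extract_b": "team-a", "load_c": "team-c"}
> >>>
> >>>     def send_notification(owner: str, message: str) -> None:
> >>>         print(f"to {owner}: {message}")  # stand-in for mail/Slack/etc.
> >>>
> >>>     def notify_owners(dag_id: str, run_id: str) -> None:
> >>>         url = f"{BASE}/dags/{dag_id}/dagRuns/{run_id}/taskInstances"
> >>>         tis = requests.get(url, headers=AUTH, timeout=30).json()["task_instances"]
> >>>         failed_by_owner = defaultdict(list)
> >>>         for ti in tis:
> >>>             if ti["state"] == "failed":
> >>>                 failed_by_owner[OWNERS.get(ti["task_id"], "unknown")].append(ti["task_id"])
> >>>         # One message per owner, however many of their tasks failed.
> >>>         for owner, task_ids in failed_by_owner.items():
> >>>             send_notification(owner, f"{len(task_ids)} task(s) failed: {', '.join(task_ids)}")
> >>>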
> >>> >>    * Custom mass-triggering of other dags and collection of results
> >>> from
> >>> >>     triggered dags as scale-out option for dynamic task mapping
> >>> >>
> >>> >> Can't an API do that?
> >>> Yes, the API could do this with 5 times more code, including handling
> >>> the per-response limits where you need to loop over all pages until you
> >>> have the full list (e.g. the API is limited to 100 results per page).
> >>> Not impossible, but a lot of re-implementation.
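> >>>
> >>> For reference, a minimal sketch of the page-looping this implies, using
> >>> plain requests against the stable REST API and its limit/offset and
> >>> total_entries fields; the host and auth are placeholders:
> >>>
> >>>     import requests
> >>>
> >>>     BASE = "http://airflow-api-server:8080/api/v2"  # hypothetical host; /api/v1 on Airflow 2
> >>>     AUTH = {"Authorization": "Bearer <token>"}
> >>>
> >>>     def all_dag_runs(dag_id: str, page_size: int = 100) -> list[dict]:
> >>>         """Collect every DagRun for a dag, looping until total_entries is reached."""
> >>>         runs, offset = [], 0
> >>>         while True:
> >>>             resp = requests.get(
> >>>                 f"{BASE}/dags/{dag_id}/dagRuns",
> >>>                 params={"limit": page_size, "offset": offset},
> >>>                 headers=AUTH,
> >>>                 timeout=30,
> >>>             ).json()
> >>>             runs.extend(resp["dag_runs"])
> >>>             offset += page_size
> >>>             if offset >= resp["total_entries"]:
> >>>                 return runs
> >>>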
> >>> >>    * And the famous: Partial database clean on a per Dag level with
> >>> >>      different retention
> >>> >>
> >>> >> Can you elaborate this one a bit :D
> >>>
> >>> Yes. We have a Dag that is called 50k-100k times per day and others
> >>> that are called 12 times a day, and a lot of others in between, like
> >>> 25k runs per month. The Dag with 100k runs per day we want to archive
> >>> ASAP, probably after 3 days, for all non-failed runs, to reduce DB
> >>> overhead. The failed ones we keep for 14 days for potential
> >>> re-processing if there was an outage.
> >>>
> >>> Most other Dag Runs we keep for a month. And for some we cap it: we
> >>> archive once there are more than 25k runs.
> >>>
> >>> Might be something that could be a potential contribution to "airflow
> >>> db clean".
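> >>>
> >>> A minimal sketch of what per-Dag retention could look like via the
> >>> stable REST API today (deleting rather than archiving, which is part of
> >>> why a per-Dag option in "airflow db clean" would be nicer). The host,
> >>> auth, the retention-policy dict and the exact filter parameter names
> >>> are assumptions:
> >>>
> >>>     import requests
> >>>     from datetime import datetime, timedelta, timezone
> >>>
> >>>     BASE = "http://airflow-api-server:8080/api/v2"  # hypothetical host; /api/v1 on Airflow 2
> >>>     AUTH = {"Authorization": "Bearer <token>"}
> >>>
> >>>     # Hypothetical per-dag retention policy: days to keep non-failed runs.
> >>>     RETENTION_DAYS = {"high_volume_dag": 3, "medium_volume_dag": 30}
> >>>
> >>>     def clean_old_runs(dag_id: str, keep_days: int) -> None:
> >>>         cutoff = (datetime.now(timezone.utc) - timedelta(days=keep_days)).isoformat()
> >>>         params = {"state": "success", "end_date_lte": cutoff, "limit": 100}
> >>>         while True:
> >>>             runs = requests.get(f"{BASE}/dags/{dag_id}/dagRuns",
> >>>                                 params=params, headers=AUTH, timeout=30).json()["dag_runs"]
> >>>             if not runs:
> >>>                 return
> >>>             # Delete the old, successful runs page by page.
> >>>             for run in runs:
> >>>                 requests.delete(f"{BASE}/dags/{dag_id}/dagRuns/{run['dag_run_id']}",
> >>>                                 headers=AUTH, timeout=30)
> >>>
> >>>     for dag_id, days in RETENTION_DAYS.items():
> >>>         clean_old_runs(dag_id, days)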
> >>>
> >>> >>
> >>> >> Thanks & Regards,
> >>> >> Amogh Desai
> >>> >>
> >>> >>
> >>> >> On Wed, Nov 5, 2025 at 3:12 AM Jens Scheffler <[email protected]>
> >>> wrote:
> >>> >>
> >>> >> Thanks Amogh for adding docs with migration hints.
> >>> >>
> >>> >> We actually suffer from a lot of integrations that were built in the
> >>> >> past, which now makes it a hard and serious effort to migrate to
> >>> >> version 3. So most probably we ourselves need to take option 2,
> >>> >> knowing (as in the past) that we cannot ask for support. But at
> >>> >> least this unblocks us from staying on 2.x.
> >>> >>
> >>> >> I'd love to take route 1 as well, but then a lot of code needs to be
> >>> >> rewritten. This will take time, and in the mid term we will migrate
> >>> >> to (1).
> >>> >>
> >>> >> As in the dev call, I'd love it if in Airflow 3.2 we could have
> >>> >> option 1 supported out of the box - knowing that some security
> >>> >> discussion is implied, so maybe it needs to be turned on explicitly
> >>> >> and not be enabled by default.
> >>> >>
> >>> >> The use cases we have which require some kind of DB access, and where
> >>> >> the Task SDK does not help:
> >>> >>
> >>> >>    * Adding task and dag run notes to tasks as a better readable
> >>> >>      status during and after execution
> >>> >>    * Aggregate status of tasks in the upstream of the same Dag (pass,
> >>> >>      fail, listing)
> >>> >>    * Custom mass-triggering of other dags and collection of results
> >>> >>      from triggered dags as a scale-out option for dynamic task
> >>> >>      mapping
> >>> >>    * Adjusting Pools based on available workers
> >>> >>    * Checking results of pass/fail per edge worker and, depending on
> >>> >>      stability, adjusting Queues on Edge workers based on status and
> >>> >>      errors of workers
> >>> >>    * Adjusting Pools based on time of day
> >>> >>    * And the famous: partial database clean on a per-Dag level with
> >>> >>      different retention
> >>> >>
> >>> >> I would be okay with removing option 3, and a clear warning on
> >>> >> option 2 is also okay.
> >>> >>
> >>> >> Jens
> >>> >>
> >>> >> On 11/4/25 13:06, Jarek Potiuk wrote:
> >>> >>> My take (and details can be found in the discussion):
> >>> >>>
> >>> >>> 2. Don't give the impression that it is something we will support -
> >>> >>> and explain to the users that it **WILL** break in the future and
> >>> >>> it's on **THEM** to fix it when it breaks.
> >>> >>>
> >>> >>> The 2 is **kinda** possible, but we should strongly discourage it
> >>> >>> and say "this will break at any time and it's you who have to adapt
> >>> >>> to any future changes in the schema" - we had a lot of similar cases
> >>> >>> in the past where our users felt entitled because **something** they
> >>> >>> saw as a "valid way of using things" was broken by our changes. If
> >>> >>> we say "recommended", they will take it as "all the usage there is
> >>> >>> expected to work when Airflow gets a new version, so I should be
> >>> >>> fully entitled to open a valid issue when things change". I think
> >>> >>> "recommended" in this case is far too strong from our side.
> >>> >>>
> >>> >>> 3. Absolutely remove.
> >>> >>>
> >>> >>> Sounds like we are going back to Airflow 2 behaviour, and we've made
> >>> >>> all the effort to break out of that. Various things will start
> >>> >>> breaking in Airflow 3.2 and beyond. Once we complete the task
> >>> >>> isolation work, Airflow workers will NOT have the sqlalchemy package
> >>> >>> installed by default - it simply will not be a task-sdk dependency.
> >>> >>> The fact that you **can** use sqlalchemy now is mostly a by-product
> >>> >>> of the fact that we have not completed the split yet - but it was
> >>> >>> not even **SUPPOSED** to work.
> >>> >>>
> >>> >>> J.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> On Tue, Nov 4, 2025 at 10:03 AM Amogh Desai<[email protected]>
> >>> >> wrote:
> >>> >>>> Hi All,
> >>> >>>>
> >>> >>>> I'm working on expanding the Airflow 3 upgrade documentation to
> >>> address
> >>> >> a
> >>> >>>> frequently asked question from users
> >>> >>>> migrating from Airflow 2.x: "How do I access the metadata database
> >>> from
> >>> >> my
> >>> >>>> tasks now that direct database
> >>> >>>> access is blocked?"
> >>> >>>>
> >>> >>>> Currently, Step 5 of the upgrade guide[1] only mentions that
> direct
> >>> DB
> >>> >>>> access is blocked and points to a GitHub issue.
> >>> >>>> However, users need concrete guidance on migration options.
> >>> >>>>
> >>> >>>> I've drafted documentation via [2] describing three approaches, but
> >>> >>>> before finalising it, I'd like to get community consensus on how we
> >>> >>>> should present these options, especially given the architectural
> >>> >>>> principles we've established with Airflow 3.
> >>> >>>>
> >>> >>>> ## Proposed Approaches
> >>> >>>>
> >>> >>>> Approach 1: Airflow Python Client (REST API)
> >>> >>>> - Uses `apache-airflow-client` [3] to interact via REST API
> >>> >>>> - Pros: No DB drivers needed, aligned with Airflow 3 architecture,
> >>> >>>> API-first
> >>> >>>> - Cons: Requires package installation, API server dependency, auth
> >>> token
> >>> >>>> management, limited operations possible
> >>> >>>>
> >>> >>>> Approach 2: Database Hooks (PostgresHook/MySqlHook)
> >>> >>>> - Create a connection to metadata DB and use DB hooks to execute
> SQL
> >>> >>>> directly
> >>> >>>> - Pros: Uses Airflow connection management, simple SQL interface
> >>> >>>> - Cons: Requires DB drivers, direct network access, bypasses
> >>> Airflow API
> >>> >>>> server and connects to DB directly
> >>> >>>>
> >>> >>>> Approach 3: Direct SQLAlchemy Access (last resort)
> >>> >>>> - Use environment variable with DB connection string and create
> >>> >> SQLAlchemy
> >>> >>>> session directly
> >>> >>>> - Pros: Maximum flexibility
> >>> >>>> - Cons: Bypasses all Airflow protections, schema coupling, manual
> >>> >>>> connection management, worst possible option.
> >>> >>>>
> >>> >>>> I was expecting some pushback regarding these approaches and there
> >>> were
> >>> >>>> (rightly) some important concerns raised
> >>> >>>> by Jarek about Approaches 2 and 3:
> >>> >>>>
> >>> >>>> 1. Breaks Task Isolation - Contradicts Airflow 3's core promise
> >>> >>>> 2. DB as Public Interface - Schema changes would require release
> >>> notes
> >>> >> and
> >>> >>>> break user code
> >>> >>>> 3. Performance Impact - Using Approach 2 creates direct DB access
> >>> and
> >>> >> can
> >>> >>>> bring back Airflow 2's
> >>> >>>> connection-per-task overhead
> >>> >>>> 4. Security Model Violation - Contradicts documented isolation
> >>> >>>> principles
> >>> >>>>
> >>> >>>> Considering these comments, this is what I want to document now:
> >>> >>>>
> >>> >>>> 1. Approach 1 - Keep as primary/recommended solution (aligns with
> >>> >> Airflow 3
> >>> >>>> architecture)
> >>> >>>> 2. Approach 2 - Present as "known workaround" (not recommendation)
> >>> with
> >>> >>>> explicit warnings
> >>> >>>> about breaking isolation, schema not being public API, performance
> >>> >>>> implications, and no support guarantees
> >>> >>>> 3. Approach 3 - Remove entirely, or keep with strongest possible
> >>> >> warnings
> >>> >>>> (would love to hear what others think for
> >>> >>>> this one particularly)
> >>> >>>>
> >>> >>>> Once we arrive at some discussion points on this one, I would like
> >>> >>>> to call for a lazy consensus, for posterity and for visibility to
> >>> >>>> the community.
> >>> >>>>
> >>> >>>> Looking forward to your feedback!
> >>> >>>>
> >>> >>>> [1] https://github.com/apache/airflow/blob/main/airflow-core/docs/installation/upgrading_to_airflow3.rst#step-5-review-custom-operators-for-direct-db-access
> >>> >>>> [2] https://github.com/apache/airflow/pull/57479
> >>> >>>> [3] https://github.com/apache/airflow-client-python
> >>> >>>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
> >>>
> >>>
>
