Started a lazy consensus on this one.

Thanks & Regards,
Amogh Desai
On Tue, Nov 11, 2025 at 10:37 AM Amogh Desai <[email protected]> wrote:

> Short reminder: About 10 hours left till I wind this discussion up and
> start a lazy consensus for the same.
>
> Thanks & Regards,
> Amogh Desai
>
> On Fri, Nov 7, 2025 at 12:58 PM Amogh Desai <[email protected]> wrote:
>
>> I will be waiting for responses on this discussion before creating a
>> lazy consensus till *Tue, Nov 11, 3:00 PM UTC*.
>>
>> So, if you have thoughts, feel free to chime in now :)
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>> On Fri, Nov 7, 2025 at 4:57 AM Buğra Öztürk <[email protected]> wrote:
>>
>>> Great initiative Amogh, thanks! I agree with others on 1, and on not
>>> encouraging 2 as well.
>>>
>>> The idea of filling the gaps by adding more endpoints would enable
>>> more automation in a secure environment in the long run. In addition,
>>> we could consider providing more granular clean-up/DB functionality on
>>> the CLI too, where it could be automated on the server side with admin
>>> commands and not from Dags. Just an idea.
>>>
>>> I hope we will add airflowctl there soon, of course with limited
>>> operations. 🤞
>>>
>>> Bugra Ozturk
>>>
>>> On Thu, 6 Nov 2025, 14:32 Amogh Desai, <[email protected]> wrote:
>>>
>>> > Looking for some more eyes on this one.
>>> >
>>> > Thanks & Regards,
>>> > Amogh Desai
>>> >
>>> > On Thu, Nov 6, 2025 at 12:55 PM Amogh Desai <[email protected]> wrote:
>>> >
>>> > > > Yes, the API could do this with 5 times more code, including the
>>> > > > limits per response, where you need to loop over all pages until
>>> > > > you have a full list (e.g. the API is limited to 100 results).
>>> > > > Not impossible, but a lot of re-implementation.
>>> > >
>>> > > Just wondering, why not vanilla task mapping?
>>> > >
>>> > > > Might be something that could be a potential contribution to
>>> > > > "airflow db clean"
>>> > >
>>> > > Maybe, yes.
>>> > >
>>> > > Thanks & Regards,
>>> > > Amogh Desai
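For reference, on the vanilla task mapping question above: a minimal
sketch of dynamic task mapping, assuming Airflow 3's `airflow.sdk` import
path (the Dag and task names here are illustrative):

    # Dynamic task mapping: one mapped task instance per item at run time,
    # instead of mass-triggering separate Dags and collecting results by hand.
    from airflow.sdk import dag, task


    @dag
    def fan_out_example():
        @task
        def get_items() -> list[str]:
            # In practice this list could come from an upstream system.
            return ["a", "b", "c"]

        @task
        def process(item: str) -> str:
            return item.upper()

        # expand() maps process() over whatever get_items() returns.
        process.expand(item=get_items())


    fan_out_example()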
>>> > > On Thu, Nov 6, 2025 at 12:53 PM Amogh Desai <[email protected]> wrote:
>>> > >
>>> > >> > I think our efforts should be way more focused on adding some
>>> > >> > missing API calls in Task SDK that our users miss, rather than
>>> > >> > on allowing them to use "old ways". Every time someone says "I
>>> > >> > cannot migrate because I did this", our first thought should be:
>>> > >> >
>>> > >> > * is it a valid way?
>>> > >> > * is it acceptable to have an API call for it in SDK?
>>> > >> > * should we do it?
>>> > >>
>>> > >> That is currently a grey zone we need to define better, I think.
>>> > >> Certain use cases might be general enough that we need an
>>> > >> execution API endpoint, and we can certainly do that. But there
>>> > >> will also be cases where the use case is niche and we will NOT
>>> > >> want to have execution API endpoints for it, for various reasons.
>>> > >> The harder problem to solve is the latter.
>>> > >>
>>> > >> But you make a fair point here.
>>> > >>
>>> > >> Thanks & Regards,
>>> > >> Amogh Desai
>>> > >>
>>> > >> On Thu, Nov 6, 2025 at 2:33 AM Jens Scheffler <[email protected]> wrote:
>>> > >>
>>> > >>> > Thanks for your comments too, Jens.
>>> > >>> >
>>> > >>> >> * Aggregate status of tasks in the upstream of the same Dag
>>> > >>> >> (pass, fail, listing)
>>> > >>> >
>>> > >>> > Does the DAG run page not show that?
>>> > >>>
>>> > >>> Partly yes, but in our environment it is a bit more complex than
>>> > >>> "pass/fail". It is a slightly longer story: we want to know more
>>> > >>> details of the failed tasks and aggregate those details. So, at a
>>> > >>> high level: get the XCom from the failed tasks and then aggregate
>>> > >>> the details. Imagine all tasks have an owner and we want to send
>>> > >>> a notification to each owner, but if 10 tasks from one owner
>>> > >>> fail, we want to send 1 notification with the 10 failures in the
>>> > >>> text. And yes, this can be done via the API.
>>> > >>>
>>> > >>> >> * Custom mass-triggering of other dags and collection of
>>> > >>> >> results from triggered dags as a scale-out option for dynamic
>>> > >>> >> task mapping
>>> > >>> >
>>> > >>> > Can't an API do that?
>>> > >>>
>>> > >>> Yes, the API could do this with 5 times more code, including the
>>> > >>> limits per response, where you need to loop over all pages until
>>> > >>> you have a full list (e.g. the API is limited to 100 results).
>>> > >>> Not impossible, but a lot of re-implementation.
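The pagination loop Jens describes could look roughly like this: a sketch
against the stable REST API, assuming Airflow 3's `/api/v2` base path, a
pre-issued bearer token, and the `requests` library (base URL and token
are placeholders):

    # Sketch: collect ALL DagRuns of a dag by looping over pages,
    # since each API response is capped (e.g. at 100 results).
    import requests

    BASE_URL = "http://localhost:8080/api/v2"  # placeholder
    HEADERS = {"Authorization": "Bearer <token>"}  # placeholder token


    def list_all_dag_runs(dag_id: str, page_size: int = 100) -> list[dict]:
        runs: list[dict] = []
        offset = 0
        while True:
            resp = requests.get(
                f"{BASE_URL}/dags/{dag_id}/dagRuns",
                params={"limit": page_size, "offset": offset},
                headers=HEADERS,
                timeout=30,
            )
            resp.raise_for_status()
            page = resp.json().get("dag_runs", [])
            runs.extend(page)
            if len(page) < page_size:  # last (partial) page reached
                return runs
            offset += page_size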
>>> > >>> >> * And the famous: Partial database clean on a per Dag level
>>> > >>> >> with different retention
>>> > >>> >
>>> > >>> > Can you elaborate this one a bit :D
>>> > >>>
>>> > >>> Yes. We have one Dag that is called 50k-100k times per day and
>>> > >>> others that are called 12 times a day, and a lot of others
>>> > >>> in-between, like 25k runs per month. The Dag with 100k runs per
>>> > >>> day we want to archive ASAP, probably after 3 days for all
>>> > >>> non-failed runs, to reduce DB overhead. The failed ones we keep
>>> > >>> for 14 days for potential re-processing in case there was an
>>> > >>> outage.
>>> > >>>
>>> > >>> Most other Dag runs we keep for a month. And some we cap,
>>> > >>> archiving once there are more than 25k runs.
>>> > >>>
>>> > >>> Might be something that could be a potential contribution to
>>> > >>> "airflow db clean".
>>> > >>>
>>> > >>> >> Thanks & Regards,
>>> > >>> >> Amogh Desai
>>> > >>> >>
>>> > >>> >> On Wed, Nov 5, 2025 at 3:12 AM Jens Scheffler <[email protected]> wrote:
>>> > >>> >>
>>> > >>> >> Thanks Amogh for adding docs for migration hints.
>>> > >>> >>
>>> > >>> >> We actually suffer from a lot of integrations that were built
>>> > >>> >> in the past, which now makes it a hard and serious effort to
>>> > >>> >> migrate to version 3. So most probably we ourselves need to
>>> > >>> >> take option 2, knowing (like in the past) that you cannot ask
>>> > >>> >> for support. But at least this un-blocks us from staying on
>>> > >>> >> 2.x.
>>> > >>> >>
>>> > >>> >> I'd love to take route 1 as well, but then a lot of code needs
>>> > >>> >> to be rewritten. This will take time, and in the mid term we
>>> > >>> >> will migrate to (1).
>>> > >>> >>
>>> > >>> >> As in the dev call, I'd love it if in Airflow 3.2 we could
>>> > >>> >> have option 1 supported out-of-the-box - knowing that some
>>> > >>> >> security discussion is implied, so maybe it needs to be turned
>>> > >>> >> on explicitly and not be enabled by default.
>>> > >>> >>
>>> > >>> >> The use cases we have which require some kind of DB access,
>>> > >>> >> and where the Task SDK is not helping with support:
>>> > >>> >>
>>> > >>> >> * Adding task and dag run notes to tasks as a better readable
>>> > >>> >> status during and after execution
>>> > >>> >> * Aggregate status of tasks in the upstream of the same Dag
>>> > >>> >> (pass, fail, listing)
>>> > >>> >> * Custom mass-triggering of other dags and collection of
>>> > >>> >> results from triggered dags as a scale-out option for dynamic
>>> > >>> >> task mapping
>>> > >>> >> * Adjusting Pools based on available workers (see the sketch
>>> > >>> >> further below)
>>> > >>> >> * Checking results of pass/fail per edge worker and, depending
>>> > >>> >> on stability, adjusting Queues on Edge workers based on status
>>> > >>> >> and errors of workers
>>> > >>> >> * Adjusting Pools based on time of day
>>> > >>> >> * And the famous: Partial database clean on a per Dag level
>>> > >>> >> with different retention
>>> > >>> >>
>>> > >>> >> I would be okay with removing option 3, and a clear warning on
>>> > >>> >> option 2 is also okay.
>>> > >>> >>
>>> > >>> >> Jens
>>> > >>> >>
>>> > >>> >> On 11/4/25 13:06, Jarek Potiuk wrote:
>>> > >>> >>> My take (and details can be found in the discussion):
>>> > >>> >>>
>>> > >>> >>> 2. Don't give the impression that it is something we will
>>> > >>> >>> support - and explain to the users that it **WILL** break in
>>> > >>> >>> the future and it's on **THEM** to fix it when it breaks.
>>> > >>> >>>
>>> > >>> >>> Option 2 is **kinda** possible, but we should strongly
>>> > >>> >>> discourage it and say "this can break at any time and it's
>>> > >>> >>> you who have to adapt to any future changes in the schema" -
>>> > >>> >>> we had a lot of similar cases in the past where our users
>>> > >>> >>> felt entitled when **something** they saw as a "valid way of
>>> > >>> >>> using things" got broken by our changes. If we say
>>> > >>> >>> "recommended", they will take it as "all the usage there is
>>> > >>> >>> expected to keep working when Airflow gets a new version, so
>>> > >>> >>> I should be fully entitled to open a valid issue when things
>>> > >>> >>> change". I think "recommended" in this case is far too strong
>>> > >>> >>> from our side.
>>> > >>> >>>
>>> > >>> >>> 3. Absolutely remove.
>>> > >>> >>>
>>> > >>> >>> Sounds like we are going back to Airflow 2 behaviour, and
>>> > >>> >>> we've made all this effort to break out of that. Various
>>> > >>> >>> things will start breaking in Airflow 3.2 and beyond. Once we
>>> > >>> >>> complete the task isolation work, Airflow workers will NOT
>>> > >>> >>> have the sqlalchemy package installed by default - it simply
>>> > >>> >>> will not be a task-sdk dependency. The fact that you **can**
>>> > >>> >>> use sqlalchemy now is mostly a by-product of the fact that we
>>> > >>> >>> have not completed the split yet - but it was not even
>>> > >>> >>> **SUPPOSED** to work.
>>> > >>> >>>
>>> > >>> >>> J.
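On the Pool-adjustment items in Jens' list above: a sketch of how that
could be scripted against the stable REST API instead of the database,
assuming a `PATCH /api/v2/pools/{pool_name}` endpoint as in the published
API, with placeholder base URL and auth:

    # Sketch: resize a pool via the REST API instead of touching the
    # metadata DB directly.
    import requests

    BASE_URL = "http://localhost:8080/api/v2"  # placeholder
    HEADERS = {"Authorization": "Bearer <token>"}  # placeholder token


    def resize_pool(pool_name: str, slots: int) -> dict:
        resp = requests.patch(
            f"{BASE_URL}/pools/{pool_name}",
            json={"slots": slots},
            headers=HEADERS,
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()


    # e.g. scale the default pool up during the day, down at night
    resize_pool("default_pool", 128)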
>>> > >>> >>>
>>> > >>> >>> On Tue, Nov 4, 2025 at 10:03 AM Amogh Desai <[email protected]> wrote:
>>> > >>> >>>
>>> > >>> >>>> Hi All,
>>> > >>> >>>>
>>> > >>> >>>> I'm working on expanding the Airflow 3 upgrade documentation
>>> > >>> >>>> to address a frequently asked question from users migrating
>>> > >>> >>>> from Airflow 2.x: "How do I access the metadata database
>>> > >>> >>>> from my tasks now that direct database access is blocked?"
>>> > >>> >>>>
>>> > >>> >>>> Currently, Step 5 of the upgrade guide [1] only mentions
>>> > >>> >>>> that direct DB access is blocked and points to a GitHub
>>> > >>> >>>> issue. However, users need concrete guidance on migration
>>> > >>> >>>> options.
>>> > >>> >>>>
>>> > >>> >>>> I've drafted documentation via [2] describing three
>>> > >>> >>>> approaches, but before finalising this, I'd like to get
>>> > >>> >>>> community consensus on how we should present these options,
>>> > >>> >>>> especially given the architectural principles we've
>>> > >>> >>>> established with Airflow 3.
>>> > >>> >>>>
>>> > >>> >>>> ## Proposed Approaches
>>> > >>> >>>>
>>> > >>> >>>> Approach 1: Airflow Python Client (REST API)
>>> > >>> >>>> - Uses `apache-airflow-client` [3] to interact via REST API
>>> > >>> >>>> - Pros: No DB drivers needed, aligned with Airflow 3
>>> > >>> >>>> architecture, API-first
>>> > >>> >>>> - Cons: Requires package installation, API server
>>> > >>> >>>> dependency, auth token management, limited operations
>>> > >>> >>>> possible
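A hedged illustration of Approach 1, assuming the generated
`apache-airflow-client` package; exact module paths, base URL handling,
and constructor arguments vary between client versions, so treat this as
a sketch and check the client's README:

    # Sketch of Approach 1: talk to the API server with the generated
    # Python client instead of querying the metadata DB.
    import airflow_client.client
    from airflow_client.client.api import dag_run_api

    configuration = airflow_client.client.Configuration(
        host="http://localhost:8080",  # placeholder API server
        access_token="<token>",  # placeholder; obtain via your auth setup
    )

    with airflow_client.client.ApiClient(configuration) as api_client:
        api = dag_run_api.DAGRunApi(api_client)
        # e.g. inspect recent runs of another dag instead of reading dag_run
        runs = api.get_dag_runs(dag_id="example_dag", limit=100)
        print(runs)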
>>> > >>> >>>>
>>> > >>> >>>> Approach 2: Database Hooks (PostgresHook/MySqlHook)
>>> > >>> >>>> - Create a connection to the metadata DB and use DB hooks to
>>> > >>> >>>> execute SQL directly
>>> > >>> >>>> - Pros: Uses Airflow connection management, simple SQL
>>> > >>> >>>> interface
>>> > >>> >>>> - Cons: Requires DB drivers and direct network access,
>>> > >>> >>>> bypasses the Airflow API server and connects to the DB
>>> > >>> >>>> directly
>>> > >>> >>>>
>>> > >>> >>>> Approach 3: Direct SQLAlchemy Access (last resort)
>>> > >>> >>>> - Use an environment variable with the DB connection string
>>> > >>> >>>> and create a SQLAlchemy session directly
>>> > >>> >>>> - Pros: Maximum flexibility
>>> > >>> >>>> - Cons: Bypasses all Airflow protections, schema coupling,
>>> > >>> >>>> manual connection management; the worst possible option
>>> > >>> >>>>
>>> > >>> >>>> I was expecting some pushback regarding these approaches,
>>> > >>> >>>> and there were (rightly) some important concerns raised by
>>> > >>> >>>> Jarek about Approaches 2 and 3:
>>> > >>> >>>>
>>> > >>> >>>> 1. Breaks Task Isolation - Contradicts Airflow 3's core
>>> > >>> >>>> promise
>>> > >>> >>>> 2. DB as Public Interface - Schema changes would require
>>> > >>> >>>> release notes and break user code
>>> > >>> >>>> 3. Performance Impact - Approach 2 creates direct DB access
>>> > >>> >>>> and can bring back Airflow 2's connection-per-task overhead
>>> > >>> >>>> 4. Security Model Violation - Contradicts documented
>>> > >>> >>>> isolation principles
>>> > >>> >>>>
>>> > >>> >>>> Considering these comments, this is what I want to document
>>> > >>> >>>> now:
>>> > >>> >>>>
>>> > >>> >>>> 1. Approach 1 - Keep as the primary/recommended solution
>>> > >>> >>>> (aligns with Airflow 3 architecture)
>>> > >>> >>>> 2. Approach 2 - Present as a "known workaround" (not a
>>> > >>> >>>> recommendation) with explicit warnings about breaking
>>> > >>> >>>> isolation, the schema not being a public API, performance
>>> > >>> >>>> implications, and no support guarantees
>>> > >>> >>>> 3. Approach 3 - Remove entirely, or keep with the strongest
>>> > >>> >>>> possible warnings (would love to hear what others think
>>> > >>> >>>> about this one particularly)
>>> > >>> >>>>
>>> > >>> >>>> Once we arrive at some discussion points on this one, I
>>> > >>> >>>> would like to call for a lazy consensus for posterity and
>>> > >>> >>>> visibility in the community.
>>> > >>> >>>>
>>> > >>> >>>> Looking forward to your feedback!
>>> > >>> >>>>
>>> > >>> >>>> [1] https://github.com/apache/airflow/blob/main/airflow-core/docs/installation/upgrading_to_airflow3.rst#step-5-review-custom-operators-for-direct-db-access
>>> > >>> >>>> [2] https://github.com/apache/airflow/pull/57479
>>> > >>> >>>> [3] https://github.com/apache/airflow-client-python
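For completeness, a sketch of what the Approach 2 "known workaround"
amounts to, with the caveats discussed in this thread: the metadata
schema is not a public interface and may change in any release, and this
bypasses the API server entirely. The connection id below is hypothetical
and would have to point at the metadata DB:

    # Sketch of Approach 2 (known workaround, NOT a recommendation):
    # read-only SQL against the metadata DB through a provider hook.
    from airflow.providers.postgres.hooks.postgres import PostgresHook


    def count_failed_runs(dag_id: str) -> int:
        # "airflow_metadata_db" is a hypothetical connection you would
        # have to create yourself.
        hook = PostgresHook(postgres_conn_id="airflow_metadata_db")
        # WARNING: table/column names below are internal and may change
        # between Airflow releases without notice.
        row = hook.get_first(
            "SELECT COUNT(*) FROM dag_run WHERE dag_id = %s AND state = 'failed'",
            parameters=(dag_id,),
        )
        return row[0]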
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]