Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]

2025-02-06 Thread via GitHub


GitHub user akomisarek added a comment to the discussion: Implementing a 
Locking Mechanism in Airflow Variables to Prevent Race Conditions

Nice, interesting! Thanks for sharing. Yes, as mentioned above, I don't think 
Airflow offers anything out of the box. It seems external locking/queueing is 
required :( 

GitHub link: 
https://github.com/apache/airflow/discussions/46482#discussioncomment-12080639


This is an automatically sent email for commits@airflow.apache.org.
To unsubscribe, please send an email to: commits-unsubscr...@airflow.apache.org



Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]

2025-02-05 Thread via GitHub


GitHub user PraveenKumarM21 added a comment to the discussion: Implementing a 
Locking Mechanism in Airflow Variables to Prevent Race Conditions

I have multiple DAGs processing different customer accounts, and I need to 
ensure that:
✅ DAGs can run concurrently for different customers (i.e., if Customer A and 
Customer B both have pending jobs, they can process in parallel).
❌ Only one DAG run should be active per customer at any given time (i.e., if a 
DAG is already running for Customer A, another instance of the DAG should not 
start for Customer A until the first one finishes).

Why Not Use a Single DAG?

Each customer’s processing is independent and triggered dynamically.

A single DAG would require complex branching logic to check and handle multiple 
customers, making it harder to maintain, debug, and monitor.

Dataset-aware scheduling might help, but it doesn’t inherently prevent multiple 
DAG runs for the same customer at the same time.


Why a Locking Mechanism?

I originally tried using Airflow Variables to track running customers, but 
since Airflow Variables lack atomic operations, there is a race condition where 
multiple DAGs may check the variable simultaneously and both start running.
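
To illustrate the race described above, here is a minimal sketch. It does not 
use Airflow at all: `VariableStore`, `naive_acquire`, `atomic_acquire`, and 
`race` are hypothetical names, and the in-memory store only stands in for any 
key/value store that exposes separate get and set calls. `set_if_absent` is an 
assumed atomic operation that Airflow Variables do not provide. The barrier 
forces the unlucky interleaving so the outcome is repeatable.

```python
import threading


class VariableStore:
    """In-memory stand-in for a key/value store (hypothetical, not Airflow)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def set_if_absent(self, key, value):
        """Atomic check-and-set: True only for the caller that created the key."""
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True


def naive_acquire(store, customer, run_id, barrier):
    # Check, then act: another run can slip in between these two steps.
    free = store.get(f"lock_{customer}") is None
    barrier.wait()  # force both runs to finish the check before either writes
    if free:
        store.set(f"lock_{customer}", run_id)
    return free


def atomic_acquire(store, customer, run_id, barrier):
    barrier.wait()  # release both runs at the same moment
    return store.set_if_absent(f"lock_{customer}", run_id)


def race(acquire):
    """Run two competing 'DAG runs' for one customer; return who got the lock."""
    store = VariableStore()
    barrier = threading.Barrier(2)
    results = {}

    def worker(run_id):
        results[run_id] = acquire(store, "customer_a", run_id, barrier)

    threads = [threading.Thread(target=worker, args=(r,)) for r in ("run_1", "run_2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results.values())


if __name__ == "__main__":
    print("naive :", race(naive_acquire))   # both runs believe they hold the lock
    print("atomic:", race(atomic_acquire))  # exactly one run holds the lock
```

With the separate check and set, both runs see the lock as free and both 
proceed. With a single atomic check-and-set, only one of them wins, which is 
the guarantee the Variable-based approach is missing.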

I want to enforce this locking within Airflow itself without relying on 
external systems like Redis, etcd, or DynamoDB.

GitHub link: 
https://github.com/apache/airflow/discussions/46482#discussioncomment-12076246





Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]

2025-02-05 Thread via GitHub


GitHub user PraveenKumarM21 added a comment to the discussion: Implementing a 
Locking Mechanism in Airflow Variables to Prevent Race Conditions

I was hoping to keep the locking mechanism within Airflow itself to avoid 
adding external dependencies.

GitHub link: 
https://github.com/apache/airflow/discussions/46482#discussioncomment-12076187





Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]

2025-02-05 Thread via GitHub


GitHub user akomisarek added a comment to the discussion: Implementing a 
Locking Mechanism in Airflow Variables to Prevent Race Conditions

Can you elaborate on your business use case? Why not design them as part of a 
single DAG, or use Dataset-aware scheduling? 

Alternatively, maybe you could use ACID-compliant data storage (Iceberg/Delta) 
so that you can run them in parallel? I would rather consider designing the 
pipelines differently than trying to force the tool to do something it is not 
designed to do! 

GitHub link: 
https://github.com/apache/airflow/discussions/46482#discussioncomment-12072638





Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]

2025-02-05 Thread via GitHub


GitHub user potiuk added a comment to the discussion: Implementing a Locking 
Mechanism in Airflow Variables to Prevent Race Conditions

No such mechanism AFAIK - you should likely use an external system for that. 
Depending on where you run Airflow, you might have different options (for 
example, DynamoDB on AWS) - that's probably the best advice you can get.

GitHub link: 
https://github.com/apache/airflow/discussions/46482#discussioncomment-12072637

