GitHub user PraveenKumarM21 added a comment to the discussion: Implementing a 
Locking Mechanism in Airflow Variables to Prevent Race Conditions

I have multiple DAGs processing different customer accounts, and I need to 
ensure that:
✅ DAGs can run concurrently for different customers (i.e., if Customer A and 
Customer B both have pending jobs, they can process in parallel).
❌ Only one DAG run should be active per customer at any given time (i.e., if a 
DAG is already running for Customer A, another instance of the DAG should not 
start for Customer A until the first one finishes).

Why Not Use a Single DAG?

Each customer’s processing is independent and triggered dynamically.

A single DAG would require complex branching logic to check and handle multiple 
customers, making it harder to maintain, debug, and monitor.

Dataset-aware scheduling might help, but it doesn’t inherently prevent multiple 
DAG runs for the same customer at the same time.


Why a Locking Mechanism?

I originally tried using Airflow Variables to track which customers have a run 
in progress. Because reading and then writing a Variable is not an atomic 
operation, there is a race condition: two DAG runs can check the variable at 
the same time, both see no lock, and both start running.
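
To make the race concrete, here is a minimal sketch of the check-then-set pattern. It uses a plain dict as a stand-in for Airflow Variables, and the helper names (`is_locked`, `mark_running`, `simulate_race`) are hypothetical. It does not call the Airflow API; it only replays by hand the interleaving that causes the problem.

```python
# Stand-in for Airflow Variables: a plain dict with separate read and write steps.
# This is NOT the Airflow API; it only mimics the check-then-set pattern.
variables = {}


def is_locked(customer):
    """Read step: report whether a run is already recorded for this customer."""
    return variables.get(f"lock_{customer}") == "running"


def mark_running(customer):
    """Write step: record that a run is in progress for this customer."""
    variables[f"lock_{customer}"] = "running"


def simulate_race(customer):
    """Interleave two DAG runs by hand: both check before either one writes."""
    started = []
    run_1_sees_lock = is_locked(customer)  # run 1 checks: no lock yet
    run_2_sees_lock = is_locked(customer)  # run 2 checks: still no lock
    if not run_1_sees_lock:
        mark_running(customer)
        started.append("run_1")
    if not run_2_sees_lock:
        mark_running(customer)
        started.append("run_2")
    return started


started = simulate_race("customer_a")
print(started)  # both runs appear in the list
```

Both runs start because the check and the write are two separate steps, so nothing stops a second reader from slipping in between them.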

I want to enforce this locking within Airflow itself without relying on 
external systems like Redis, etcd, or DynamoDB.
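
One direction I am considering, sketched here under assumptions: since Airflow already depends on a SQL metadata database, a lock could be a row in a table with a primary key on the customer ID, so that the database rejects a second insert atomically. The example below uses an in-memory SQLite database as a stand-in. The `customer_locks` table and the `try_acquire` and `release` helpers are hypothetical names, not part of Airflow.

```python
import sqlite3


def make_lock_table(conn):
    # Hypothetical table: one row per customer that currently has a run in progress.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_locks ("
        "customer_id TEXT PRIMARY KEY, run_id TEXT NOT NULL)"
    )
    conn.commit()


def try_acquire(conn, customer_id, run_id):
    """Try to take the lock; the primary key makes check-and-set a single step."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer_locks (customer_id, run_id) VALUES (?, ?)",
                (customer_id, run_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another run already holds the lock for this customer.
        return False


def release(conn, customer_id, run_id):
    """Release the lock, but only if this run is the one holding it."""
    with conn:
        conn.execute(
            "DELETE FROM customer_locks WHERE customer_id = ? AND run_id = ?",
            (customer_id, run_id),
        )


conn = sqlite3.connect(":memory:")
make_lock_table(conn)
results = [
    try_acquire(conn, "customer_a", "run_1"),  # first run for customer A
    try_acquire(conn, "customer_a", "run_2"),  # second run for customer A is rejected
    try_acquire(conn, "customer_b", "run_3"),  # customer B is independent
]
release(conn, "customer_a", "run_1")
results.append(try_acquire(conn, "customer_a", "run_2"))  # lock is free again
print(results)
```

In a DAG, the acquire step would run as the first task and the release step as a final task that runs even when earlier tasks fail. A run that crashes before releasing would leave a stale row, so a real version would also need some expiry or cleanup rule.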

GitHub link: 
https://github.com/apache/airflow/discussions/46482#discussioncomment-12076246

----
This is an automatically sent email for commits@airflow.apache.org.
To unsubscribe, please send an email to: commits-unsubscr...@airflow.apache.org
