Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]
GitHub user akomisarek added a comment to the discussion: Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions Nice, interesting! Thanks for sharing. Yes, as mentioned above, I don't think Airflow offers anything out of the box. It seems external locking/queueing is required :( GitHub link: https://github.com/apache/airflow/discussions/46482#discussioncomment-12080639 This is an automatically sent email for commits@airflow.apache.org. To unsubscribe, please send an email to: commits-unsubscr...@airflow.apache.org
Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]
GitHub user PraveenKumarM21 added a comment to the discussion: Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions

I have multiple DAGs processing different customer accounts, and I need to ensure that:

✅ DAGs can run concurrently for different customers (i.e., if Customer A and Customer B both have pending jobs, they can be processed in parallel).
❌ Only one DAG run is active per customer at any given time (i.e., if a DAG is already running for Customer A, another instance of the DAG must not start for Customer A until the first one finishes).

Why Not Use a Single DAG?
Each customer's processing is independent and triggered dynamically. A single DAG would require complex branching logic to check and handle multiple customers, making it harder to maintain, debug, and monitor. Dataset-aware scheduling might help, but it doesn't inherently prevent multiple DAG runs for the same customer at the same time.

Why a Locking Mechanism?
I originally tried using Airflow Variables to track running customers, but Airflow Variables lack atomic operations, so there is a race condition: two DAG runs may check the variable at the same time, both see it unset, and both start running. I want to enforce this locking within Airflow itself without relying on external systems like Redis, etcd, or DynamoDB.

GitHub link: https://github.com/apache/airflow/discussions/46482#discussioncomment-12076246
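To make the race described above concrete, here is a minimal sketch of a check-then-set lock failing. It does not use Airflow at all: the `store` dict is a hypothetical stand-in for the Variable store, `try_start` and the key name are made up for illustration, and the barrier exists only to force the unlucky interleaving every time instead of leaving it to chance.

```python
import threading

# Hypothetical stand-in for a key-value store that offers separate get and
# set steps but no atomic "set only if absent" operation.
store = {}
started = []

# Test-only device: makes both runs finish their check before either writes.
both_checked = threading.Barrier(2)


def try_start(run_id, customer):
    key = f"lock_{customer}"
    is_free = store.get(key) is None  # step 1: check whether the lock is free
    both_checked.wait()               # both runs have now seen "free"
    if is_free:
        store[key] = run_id           # step 2: set the lock (too late)
        started.append(run_id)


threads = [
    threading.Thread(target=try_start, args=(run_id, "customer_a"))
    for run_id in ("run_1", "run_2")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(started))  # ['run_1', 'run_2']: both runs believe they hold the lock
```

Without the barrier the overlap would only happen occasionally, which is what makes this kind of bug hard to reproduce in a real deployment.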
Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]
GitHub user PraveenKumarM21 added a comment to the discussion: Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions I was hoping to keep the locking mechanism within Airflow itself to avoid adding external dependencies. GitHub link: https://github.com/apache/airflow/discussions/46482#discussioncomment-12076187
Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]
GitHub user akomisarek added a comment to the discussion: Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions Can you elaborate on your business use case? Why not design them as part of a single DAG, or use Dataset-aware scheduling? Alternatively, maybe you can use ACID-compliant data storage (Iceberg/Delta) so that you can run them in parallel? I would rather consider designing the pipelines differently than trying to force a tool to do what it is not designed to do! GitHub link: https://github.com/apache/airflow/discussions/46482#discussioncomment-12072638
Re: [D] Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions [airflow]
GitHub user potiuk added a comment to the discussion: Implementing a Locking Mechanism in Airflow Variables to Prevent Race Conditions No such mechanism AFAIK; you should likely use an external system for that. Depending on where you run it, you might have different options (for example DynamoDB on AWS). That's probably the best advice you can get. GitHub link: https://github.com/apache/airflow/discussions/46482#discussioncomment-12072637
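As a rough illustration of what an external lock store provides, the sketch below shows an atomic "acquire only if absent" operation. It uses an in-memory SQLite table purely as a stand-in so the example runs anywhere; the table name and the `make_lock_store`, `acquire`, and `release` helpers are assumptions made for this sketch and are not part of Airflow or any AWS SDK. An in-memory database is private to one process, so a real setup would need a store shared by all workers, such as the DynamoDB option mentioned above.

```python
import sqlite3


def make_lock_store():
    # Hypothetical lock table: one row per customer currently being processed.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_lock (customer TEXT PRIMARY KEY, owner TEXT NOT NULL)"
    )
    return conn


def acquire(conn, customer, owner):
    """Return True if the lock was taken, False if another owner holds it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer_lock (customer, owner) VALUES (?, ?)",
                (customer, owner),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists, so the check and the write
        # happened as one step and this caller lost.
        return False


def release(conn, customer, owner):
    # Only the current owner can remove its own lock row.
    with conn:
        conn.execute(
            "DELETE FROM customer_lock WHERE customer = ? AND owner = ?",
            (customer, owner),
        )


conn = make_lock_store()
print(acquire(conn, "customer_a", "run_1"))  # True
print(acquire(conn, "customer_a", "run_2"))  # False: customer_a is already locked
print(acquire(conn, "customer_b", "run_3"))  # True: other customers are unaffected
release(conn, "customer_a", "run_1")
print(acquire(conn, "customer_a", "run_2"))  # True: lock is free again
```

The point of the design is that the uniqueness check and the write are a single operation enforced by the store, which is the property the plain get-then-set approach lacks. A production version would also need some expiry or cleanup for locks left behind by failed runs.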