nookcreed opened a new issue, #36920:
URL: https://github.com/apache/airflow/issues/36920

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.7.3
   
   ### What happened?
   
   We are encountering an issue in our Apache Airflow setup where, after a few 
successful DagRuns, the scheduler stops scheduling new runs. The scheduler logs 
indicate:
   
   `{scheduler_job_runner.py:1426} INFO - DAG dag-test scheduling was skipped, 
probably because the DAG record was locked.`
   
   This problem persists even though we run only a single scheduler pod. Notably, reverting the changes from [PR #31414](https://github.com/apache/airflow/pull/31414) resolves the issue. A similar report has been discussed on Stack Overflow: [Airflow Kubernetes Executor Scheduling Skipped Because Dag Record Was Locked](https://stackoverflow.com/questions/77405009/airflow-kubernetes-executor-scheduling-skipped-because-dag-record-was-locked).
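   For context on that log line: as far as we understand, the scheduler guards each DAG record with a non-blocking database row lock (`SELECT ... FOR UPDATE SKIP LOCKED` on PostgreSQL) and skips the DAG, rather than waiting, when the lock is already held. A minimal stdlib analogy of that skip-instead-of-wait behavior (this is an illustration, not Airflow's actual code):

```python
import threading

# Stand-in for the row lock on the DAG record in the metadata database.
dag_record_lock = threading.Lock()

def try_schedule(dag_id: str) -> str:
    # acquire(blocking=False) mirrors SKIP LOCKED: return immediately
    # instead of waiting for whoever holds the lock.
    if not dag_record_lock.acquire(blocking=False):
        return (f"DAG {dag_id} scheduling was skipped, "
                "probably because the DAG record was locked.")
    try:
        return f"DAG {dag_id} scheduled"
    finally:
        dag_record_lock.release()

print(try_schedule("dag-test"))  # lock is free: the DAG gets scheduled
dag_record_lock.acquire()        # simulate a lock that is never released
print(try_schedule("dag-test"))  # lock is held: scheduling is skipped
```

   If the lock holder never releases the row (e.g. a session that is stuck or was never committed), every subsequent scheduler loop hits the skipped branch, which matches the behavior we observe.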
   
   
   
   ### What you think should happen instead?
   
   The scheduler should keep scheduling new DagRuns according to each DAG's schedule, without being interrupted by a DAG record lock that is never released.
   
   ### How to reproduce
   
   Run Airflow 2.7.3 on Kubernetes with the KubernetesExecutor; HA is not required, a single scheduler is enough.
   Trigger multiple DagRuns (we have about 10 DAGs that run every minute).
   Observe the scheduler behavior and logs after a few successful runs; the error shows up after a few minutes.
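   For reference, this is roughly the shape of our every-minute DAGs (identifiers and the no-op task body are illustrative, not our production code):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Illustrative repro DAG: ~10 DAGs like this, each on an every-minute
# schedule, are enough to surface the "record was locked" message for us.
with DAG(
    dag_id="dag-test",
    start_date=datetime(2024, 1, 1),
    schedule="* * * * *",  # every minute
    catchup=False,
    max_active_runs=1,
) as dag:
    BashOperator(task_id="noop", bash_command="true")
```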
   
   ### Operating System
   
   centos7
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==8.10.0
   apache-airflow-providers-apache-hive==6.2.0
   apache-airflow-providers-apache-livy==3.6.0
   apache-airflow-providers-cncf-kubernetes==7.8.0
   apache-airflow-providers-common-sql==1.8.0
   apache-airflow-providers-ftp==3.6.0
   apache-airflow-providers-google==10.11.0
   apache-airflow-providers-http==4.6.0
   apache-airflow-providers-imap==3.4.0
   apache-airflow-providers-papermill==3.4.0
   apache-airflow-providers-postgres==5.7.1
   apache-airflow-providers-presto==5.2.1
   apache-airflow-providers-salesforce==5.5.0
   apache-airflow-providers-snowflake==5.1.0
   apache-airflow-providers-sqlite==3.5.0
   apache-airflow-providers-trino==5.4.0
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   We have wrappers around the official airflow helm chart and docker images.
   
   Environment:
   
       Airflow Version: 2.7.3
       Kubernetes Version: 1.24
       Executor: KubernetesExecutor
       Database: PostgreSQL (metadata database)
       Environment/Infrastructure: Kubernetes cluster running Airflow in Docker 
containers
   
   ### Anything else?
   
   Actual Behavior:
   The scheduler stops scheduling new runs after a few DagRuns, with log 
messages about the DAG record being locked.
   
   Workaround:
   Restarting the scheduler pod releases the lock and allows normal scheduling 
to resume, but this is not viable in production. Reverting the changes in [PR 
#31414](https://github.com/apache/airflow/pull/31414) also resolves the issue.
   
   
   Questions/Request for Information:
   
   1. Under what scenarios is the lock on a DAG record typically not released?
   2. Are there known issues in Airflow 2.7.3, or specific configurations, that 
might cause the DAG record to remain locked, thereby preventing new run 
scheduling?
   3. Could the changes made in [PR 
#31414](https://github.com/apache/airflow/pull/31414) be related to this issue? 
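   Regarding question 1, one thing we could check while the message is being logged is which database session actually holds a lock on the `dag` table. A diagnostic sketch, assuming the PostgreSQL metadata database from our deployment (the query text is illustrative; `pg_locks` only reports table-level locks such as the `RowShareLock` taken by `SELECT ... FOR UPDATE`, so it shows the holding session rather than the individual row):

```python
# Query to run against the Airflow metadata database to see which
# backends hold locks on the `dag` table and what they are executing.
DAG_LOCK_QUERY = """
SELECT a.pid, a.state, a.query, l.mode, l.granted
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'dag'::regclass;
"""

print(DAG_LOCK_QUERY)
```

   A session stuck in `idle in transaction` showing up here would point at a transaction that took the lock and never committed.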
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

