Abdulrehman-PIAIC80387 commented on issue #68699:
URL: https://github.com/apache/airflow/issues/68699#issuecomment-4738816856

   Root cause looks like an atomicity issue rather than anything 
SQLite-specific: `_create_backfill` commits the `Backfill` row 
([backfill.py#L678](https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/models/backfill.py#L678))
 **before** creating its runs. If run creation then fails (here with `database is 
locked`, but any exception would do), the committed row survives with no runs 
or only some of them, and the `num_active > 0` check at L652 blocks all future 
backfills for that dag.
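
To make the failure mode concrete, here is a minimal sketch using only the stdlib `sqlite3` module. It is not Airflow's code: the tables, the `completed_at IS NULL` definition of "active", and `create_backfill_early_commit` are hypothetical stand-ins for the pattern described above.

```python
import sqlite3


def make_db():
    # Hypothetical, simplified schema; not Airflow's real tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE backfill (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            completed_at TEXT
        );
        CREATE TABLE backfill_run (
            id INTEGER PRIMARY KEY,
            backfill_id INTEGER NOT NULL
        );
        """
    )
    return conn


def create_backfill_early_commit(conn, dag_id, fail_runs=False):
    # Guard: refuse if an unfinished backfill already exists for this dag.
    num_active = conn.execute(
        "SELECT COUNT(*) FROM backfill WHERE dag_id = ? AND completed_at IS NULL",
        (dag_id,),
    ).fetchone()[0]
    if num_active > 0:
        raise RuntimeError(f"dag {dag_id!r} already has an active backfill")

    cur = conn.execute("INSERT INTO backfill (dag_id) VALUES (?)", (dag_id,))
    conn.commit()  # the parent row is durable from this point on

    try:
        if fail_runs:
            # Simulate the error from the report; any exception has the same effect.
            raise sqlite3.OperationalError("database is locked")
        conn.execute(
            "INSERT INTO backfill_run (backfill_id) VALUES (?)", (cur.lastrowid,)
        )
        conn.commit()
    except Exception:
        conn.rollback()  # undoes the run insert only; the parent is already committed
        raise
    return cur.lastrowid
```

Calling it once with `fail_runs=True` leaves one `backfill` row and no `backfill_run` rows, and every later call for the same dag then hits the guard.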
   
   Two ways to fix it — which would you prefer?
   
   1. **Atomic creation** (`flush` instead of the early `commit` so it all 
rolls back together) **+ a real "one active backfill per dag" guard** (row lock 
/ unique index) — since that early commit currently doubles as a concurrency 
guard.
   2. **Cleanup-on-failure** — keep the commit, but delete the orphaned 
`Backfill` row if run creation fails.
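
Both options can be sketched roughly as follows, again with stdlib `sqlite3` rather than Airflow's SQLAlchemy session. The schema, the partial unique index, and both function names are assumptions for illustration only; a real change would go through the ORM and a migration.

```python
import sqlite3


def make_guarded_db():
    # Hypothetical schema. The partial unique index allows at most one
    # unfinished backfill per dag, independent of any early commit.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE backfill (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            completed_at TEXT
        );
        CREATE TABLE backfill_run (
            id INTEGER PRIMARY KEY,
            backfill_id INTEGER NOT NULL
        );
        CREATE UNIQUE INDEX one_active_backfill_per_dag
            ON backfill (dag_id) WHERE completed_at IS NULL;
        """
    )
    return conn


def create_backfill_atomic(conn, dag_id, fail_runs=False):
    # Option 1: parent and runs share one transaction.
    with conn:  # commits on success, rolls back on any exception
        cur = conn.execute("INSERT INTO backfill (dag_id) VALUES (?)", (dag_id,))
        if fail_runs:
            raise sqlite3.OperationalError("database is locked")  # simulated
        conn.execute(
            "INSERT INTO backfill_run (backfill_id) VALUES (?)", (cur.lastrowid,)
        )
    return cur.lastrowid


def create_backfill_with_cleanup(conn, dag_id, fail_runs=False):
    # Option 2: keep the early commit, delete the orphan if run creation fails.
    cur = conn.execute("INSERT INTO backfill (dag_id) VALUES (?)", (dag_id,))
    conn.commit()
    backfill_id = cur.lastrowid
    try:
        with conn:
            if fail_runs:
                raise sqlite3.OperationalError("database is locked")  # simulated
            conn.execute(
                "INSERT INTO backfill_run (backfill_id) VALUES (?)", (backfill_id,)
            )
    except Exception:
        with conn:
            conn.execute("DELETE FROM backfill WHERE id = ?", (backfill_id,))
        raise
    return backfill_id
```

In this sketch a second active backfill for the same dag fails with `sqlite3.IntegrityError` from the index. One trade-off to weigh: the cleanup in option 2 is itself a write, so it can fail for the same reason the run creation did and still leave the orphan behind.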
   
   Happy to open a PR for whichever direction you prefer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

