The GitHub Actions job "Tests (AMD)" on 
airflow.git/fix/sensor-reschedule-lock-timeout has failed.
Run started by GitHub user 1fanwang (triggered by 1fanwang).

Head commit for run:
6892618432640e2db6e487b509aede2f118aec6c / 1fanwang <[email protected]>
Avoid long row locks in sensor reschedule path under high concurrency

Under high sensor concurrency, the execution API's reschedule write contends
on the TaskInstance row lock. With MySQL's default `innodb_lock_wait_timeout` of
50 s, a blocked worker keeps its DB connection idle for nearly a minute, which
stacks up against the connection pool and cascades into 5xx responses for the
rest of the workload.

Route the UP_FOR_RESCHEDULE branch of `PATCH /task-instances/{id}/state` through
a retry-wrapped helper that, on MySQL, sets a short per-session
`innodb_lock_wait_timeout` for the duration of the lock-and-write. Blocked
writes now either succeed quickly, deadlock-retry through
`@retry_db_transaction`, or fail fast after a few seconds; they never block a
worker for 50 s. The timeout is configurable via the new
`[scheduler] reschedule_lock_timeout_seconds` setting (default 4 s).
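
For illustration only, the per-session timeout described above could be
sketched roughly as follows. This is not the code from the commit: the helper
name `short_lock_wait`, the `RecordingCursor` stand-in, and the example UPDATE
statement are all hypothetical, and the retry layer (`@retry_db_transaction`)
is left out. The sketch only assumes a DB-API style cursor and the MySQL
session variable `innodb_lock_wait_timeout`.

```python
from contextlib import contextmanager


@contextmanager
def short_lock_wait(cursor, dialect_name, timeout_seconds=4):
    """Temporarily lower innodb_lock_wait_timeout for the current session.

    Hypothetical helper: on any backend other than MySQL it does nothing.
    """
    if dialect_name != "mysql":
        yield
        return
    # Remember the session's current value so it can be restored afterwards.
    cursor.execute("SELECT @@SESSION.innodb_lock_wait_timeout")
    previous = int(cursor.fetchone()[0])
    cursor.execute(
        f"SET SESSION innodb_lock_wait_timeout = {int(timeout_seconds)}"
    )
    try:
        yield
    finally:
        # Restore even if the guarded write raised (e.g. a lock wait timeout).
        cursor.execute(f"SET SESSION innodb_lock_wait_timeout = {previous}")


class RecordingCursor:
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def __init__(self, current_timeout=50):
        self.statements = []
        self._current = current_timeout

    def execute(self, sql):
        self.statements.append(sql)

    def fetchone(self):
        return (self._current,)


cursor = RecordingCursor()
with short_lock_wait(cursor, "mysql", timeout_seconds=4):
    # Illustrative write only; the real lock-and-write lives in the commit.
    cursor.execute(
        "UPDATE task_instance SET state = 'up_for_reschedule' "
        "WHERE id = 'example-id'"
    )
```

Restoring the previous value in `finally` keeps a pooled connection from
carrying the short timeout into unrelated queries.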

Closes #66778

Signed-off-by: 1fanwang <[email protected]>

Report URL: https://github.com/apache/airflow/actions/runs/25744194950

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
