GitHub user hezeclark added a comment to the discussion: Why is
AIRFLOW__CELERY__TASK_ACKS_LATE True by default?
TASK_ACKS_LATE=True is the default because it provides better reliability
semantics for Airflow tasks.
Here is the reasoning:
With TASK_ACKS_LATE=False (acknowledge before execution):
- The message is acknowledged, and so removed from the queue, just before the
task starts executing
- If the worker crashes during execution, the task is lost as far as Celery is
concerned: nothing is left on the broker to redeliver, so there is no retry
and no failure signal
- Simpler and faster, but dangerous for production workflows
With TASK_ACKS_LATE=True (acknowledge after execution, the default):
- The message stays on the broker, unacknowledged, until the task finishes and
the worker acknowledges it
- If the worker crashes, the message is never acknowledged; on brokers that use
a visibility timeout (Redis, SQS) it goes back to the queue for another worker
once that timeout expires
- Combined with Celery's task_acks_on_failure_or_timeout=False, this provides
at-least-once execution
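To make the difference between the two modes concrete, here is a toy
simulation. ToyBroker and run_with_crash are hypothetical names made up for
this sketch; they are not Celery, kombu, or Airflow APIs, and the "visibility
timeout" is modelled as a single requeue step rather than a real timer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBroker:
    """In-memory stand-in for a broker queue (illustration only)."""

    queue: list = field(default_factory=list)
    unacked: list = field(default_factory=list)

    def deliver(self):
        """Hand the next message to a worker and track it as unacknowledged."""
        message = self.queue.pop(0)
        self.unacked.append(message)
        return message

    def ack(self, message):
        """The worker confirms the message; the broker forgets it for good."""
        self.unacked.remove(message)

    def restore_unacked(self):
        """Model the visibility timeout expiring: requeue unacked messages."""
        self.queue.extend(self.unacked)
        self.unacked.clear()


def run_with_crash(acks_late):
    """Deliver one task to a worker that crashes mid-execution.

    Returns the messages another worker could still pick up afterwards.
    """
    broker = ToyBroker(queue=["task-1"])
    message = broker.deliver()
    if not acks_late:
        broker.ack(message)  # early ack: acknowledged before execution
    # The worker crashes here, before the task finishes. With acks_late=True
    # the ack would only happen after this point, so it never runs.
    broker.restore_unacked()  # the visibility timeout expires
    return broker.queue


print(run_with_crash(acks_late=False))  # []
print(run_with_crash(acks_late=True))   # ['task-1']
```

With early ack the queue is empty after the crash, so nothing can rerun the
task. With late ack the message is back on the queue, which is also why the
same task can end up running twice.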
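The visibility timeout issue you are hitting: if a task runs longer than the
broker visibility timeout, the message becomes visible again while the first
worker is still running, and a second worker picks it up, causing duplicate
execution. The default varies by broker (Celery's default for Redis is 1 hour),
and Airflow's Celery configuration may apply its own value, so check the
setting that is actually in effect.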
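A quick way to reason about this is to compare your task runtimes against the
timeout. The sketch below is plain Python with made-up task names and
durations; tasks_at_risk and suggested_timeout are hypothetical helpers, and
the 1.5x safety margin is an assumption, not an Airflow or Celery
recommendation.

```python
def tasks_at_risk(task_durations, visibility_timeout):
    """Return the names of tasks whose runtime exceeds the visibility timeout."""
    return sorted(
        name
        for name, seconds in task_durations.items()
        if seconds > visibility_timeout
    )


def suggested_timeout(task_durations, margin=1.5):
    """Longest observed runtime multiplied by a safety margin, in whole seconds."""
    return int(max(task_durations.values()) * margin)


# Hypothetical runtimes in seconds, e.g. taken from past task durations.
durations = {"extract": 900, "train_model": 5400, "nightly_backfill": 28800}

print(tasks_at_risk(durations, visibility_timeout=3600))
# ['nightly_backfill', 'train_model']
print(suggested_timeout(durations))
# 43200
```

Any task in the first list would be redelivered while it is still running
under a 1 hour timeout.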
Fix:
1. Increase the visibility timeout so it is longer than your longest task. As
an environment variable (43200 seconds is 12 hours):
AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__VISIBILITY_TIMEOUT=43200
2. For Redis as broker, set this in airflow.cfg:
[celery_broker_transport_options]
visibility_timeout = 43200
3. Make tasks idempotent so duplicate execution does not cause data corruption.
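For point 3, one common pattern is to key every write on something stable such
as the run date, so a second delivery overwrites instead of appending. This is
a minimal sketch using sqlite3 from the standard library; the daily_totals
table and record_daily_total function are hypothetical, and a real task would
write to your own database or storage.

```python
import sqlite3


def record_daily_total(conn, run_date, total):
    """Idempotent write: rerunning for the same run_date replaces the row."""
    conn.execute(
        "INSERT OR REPLACE INTO daily_totals (run_date, total) VALUES (?, ?)",
        (run_date, total),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_totals (run_date TEXT PRIMARY KEY, total REAL)")

record_daily_total(conn, "2024-01-01", 125.0)
record_daily_total(conn, "2024-01-01", 125.0)  # duplicate delivery of the same task

rows = conn.execute("SELECT run_date, total FROM daily_totals").fetchall()
print(rows)  # [('2024-01-01', 125.0)]
```

Because run_date is the primary key, the duplicate run leaves exactly one row.
A plain INSERT into a table without that key would have produced two.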
GitHub link:
https://github.com/apache/airflow/discussions/57348#discussioncomment-16112340