venkatamandavilli-code commented on issue #63374: URL: https://github.com/apache/airflow/issues/63374#issuecomment-4723741557
Thanks for the detailed reproduction steps. From an operations perspective, DAG-level failure callbacks are very important in production environments because they are often used for centralized alerting, incident notifications, escalation workflows, and business-critical failure handling. The distinction between task-level callbacks working and DAG-level callbacks not firing is useful because it suggests the issue may be in the DagRun state transition or callback scheduling path rather than the callback function itself. The observation that the serialized DAG contains `has_on_failure_callback`, but the scheduler logs show `callback is empty`, seems like an important clue. It may be worth checking where `DagRun.update_state()` decides whether to return a callback request and whether the DAG-level callback is being reconstructed correctly from serialized DAG data in Airflow 3.1.x. I would be interested in helping investigate this further if maintainers think this is a suitable area for a first contribution. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
