karenbraganz commented on issue #51969:
URL: https://github.com/apache/airflow/issues/51969#issuecomment-3293721335

   I have also observed a similar issue. This eventually resulted in a 
successful task getting marked as failed. Below, I will explain what I believe 
happened:
   
   1. A single try of a task instance is detected as a zombie twice due to the 
zombie_detection_interval race condition described by Alan. This produces two 
callback requests.
   2. The first callback request is processed and executed a  few seconds later 
resulting in the task getting marked in the `up_for_retry` state.
   3. The task is retried and succeeds.
   4. Shortly after the success, the second callback request (from the second 
zombie detection of the first try) is processed and executed. This results in 
the task instance getting marked as failed. Even though this callback request 
corresponds to the first try, the most recent try gets marked as failed. 
Therefore, the second try, which was initially in the success state, got marked 
in the failed state.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to