[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475115#comment-17475115
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-----------------------------------------

ghostbody commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1011812804


   We reviewed the code and found that in `local_task_job.py`, the parent 
process has a `heartbeat_callback` that checks the state of the 
`task_instance` and the return code of the child process.
   
   However, these lines may contain a bug:
   
   
![image](https://user-images.githubusercontent.com/8371330/149270821-45da67da-186e-409b-8f3e-072fe8e0491c.png)
   
   
![image](https://user-images.githubusercontent.com/8371330/149271933-4ae6c8d1-defc-45c6-ba21-89a46016c3d2.png)
   
   
   **The raw task command writing back the task instance's state (such as 
success) doesn't mean the child process has finished (returned)?**
   
   So, in this heartbeat callback, there may be a race condition when the task 
state has been written back but the child process has not yet returned.
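
A minimal sketch of the suspected window, assuming the child writes its final state before it actually exits. It is not Airflow code: `FakeChild` and `heartbeat_observation` are hypothetical names, and a thread stands in for the raw task process so the ordering can be controlled deterministically.

```python
import threading


class FakeChild:
    """Hypothetical stand-in for the raw task process (not Airflow code)."""

    def __init__(self):
        self.state = "running"     # what the task has written back to the DB
        self.return_code = None    # stays None until the "process" has returned
        self.state_written = threading.Event()
        self.may_exit = threading.Event()

    def run(self):
        self.state = "success"     # step 1: final state is written back
        self.state_written.set()
        self.may_exit.wait()       # step 2: teardown is still in progress
        self.return_code = 0       # step 3: the "process" finally returns


def heartbeat_observation(child):
    """What a parent heartbeat would see at this instant."""
    return child.state, child.return_code


child = FakeChild()
worker = threading.Thread(target=child.run)
worker.start()

# Heartbeat fires after the state was written but before the child returned.
child.state_written.wait()
during_window = heartbeat_observation(child)

# Let the child finish, then observe again.
child.may_exit.set()
worker.join()
after_exit = heartbeat_observation(child)

print(during_window, after_exit)
```

In the first observation the state is already `success` while the return code is still `None`. A heartbeat that treats this combination as an externally changed state would misjudge a task that is simply still shutting down.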
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousands of Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5071
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, scheduler
>    Affects Versions: 1.10.3
>            Reporter: msempere
>            Priority: Critical
>             Fix For: 1.10.12
>
>         Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I updated to 1.10.3 I have been seeing 
> thousands of messages per day like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance <TaskInstance: X 
> 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance <TaskInstance: X 
> 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -It also looks like this is triggering thousands of daily emails, because the 
> flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery, with Redis as the backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
