[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467682#comment-17467682
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-----------------------------------------

derkuci commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-1003754230


   > In the 2.1.1 version, I tried to modify the 
airflow/utils/log/file_task_handler.py file to obtain the hostname information 
by reading the log table. I confirmed through debug that I could get the host 
information in this way, 
   
   @xuemengran could you kindly point to how this could be done?
   
   With Airflow 2.2.2 + Celery, I am seeing error messages like the one below 
because `TaskInstance.hostname` always holds the latest hostname and does not 
depend on `try_number`.
   ```
   "Failed to fetch log file from worker. Client error '404 NOT FOUND' for url 
..."
   ```
   If we try really hard, the logs can be found in the local storage of 
_some_ Celery workers.  But that is a huge burden for operations and/or 
debugging.
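
To illustrate the mismatch described above, here is a minimal sketch in plain Python. It is not Airflow code: `LatestHostOnly`, `HostPerTry`, `log_url` and `simulate` are hypothetical names, the worker names are made up, and the URL shape and port 8793 are copied from the log excerpt in the issue description. It only shows why a single hostname per task instance can send a request for try 1 to the worker that ran try 2.

```python
from typing import Dict

WORKER_LOG_PORT = 8793  # port that appears in the URLs quoted in this issue


def log_url(hostname: str, dag_id: str, task_id: str,
            execution_date: str, try_number: int) -> str:
    """Build a worker log URL in the shape shown in the issue description."""
    return (f"http://{hostname}:{WORKER_LOG_PORT}/log/"
            f"{dag_id}/{task_id}/{execution_date}/{try_number}.log")


class LatestHostOnly:
    """Hypothetical model of the reported behavior: one hostname, overwritten per try."""

    def __init__(self) -> None:
        self.hostname = ""

    def record(self, try_number: int, hostname: str) -> None:
        self.hostname = hostname  # the previous try's host is lost

    def host_for(self, try_number: int) -> str:
        return self.hostname  # ignores try_number


class HostPerTry:
    """Hypothetical alternative: remember the host separately for each try."""

    def __init__(self) -> None:
        self._hosts: Dict[int, str] = {}

    def record(self, try_number: int, hostname: str) -> None:
        self._hosts[try_number] = hostname

    def host_for(self, try_number: int) -> str:
        return self._hosts.get(try_number, "")


def simulate(store) -> str:
    """Run try 1 on worker-a and try 2 on worker-b, then ask for the try 1 log."""
    store.record(1, "worker-a")
    store.record(2, "worker-b")
    return log_url(store.host_for(1), "my_dag", "my_task",
                   "2019-07-07T09:00:00+00:00", 1)
```

With `LatestHostOnly`, `simulate` builds a URL pointing at `worker-b`, which never wrote `1.log`; with `HostPerTry` it points at `worker-a`. Whether a per-try lookup fits Airflow's actual data model is an open question that this sketch does not answer.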




> If a task crashes, host name is not committed to the database so logs aren't 
> able to be seen in the UI
> ------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4922
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging
>    Affects Versions: 1.10.3
>            Reporter: Andrew Harmon
>            Assignee: wanghong-T
>            Priority: Major
>
> Sometimes when a task fails, the log shows the following
> {code}
> *** Log file does not exist: 
> /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Fetching from: 
> http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Failed to fetch log file from worker. Invalid URL 
> 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host 
> supplied
> {code}
> I believe this is due to the fact that the row is not committed to the 
> database until after the task finishes. 
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857
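
The "No host supplied" error in the excerpt above is consistent with an empty hostname being substituted into the log URL. The following is a small sketch, using only the Python standard library, of how such a URL could be recognized before a fetch is attempted. `has_host` and `fetch_target` are hypothetical helpers for illustration and are not part of Airflow.

```python
from typing import Optional
from urllib.parse import urlsplit


def has_host(url: str) -> bool:
    """Return True only if the URL carries a non-empty host component."""
    return urlsplit(url).hostname is not None


def fetch_target(hostname: str, log_path: str, port: int = 8793) -> Optional[str]:
    """Return the worker log URL, or None when no hostname has been recorded.

    The port default and the /log/ prefix mirror the URLs quoted in the issue.
    """
    url = f"http://{hostname}:{port}/log/{log_path}"
    return url if has_host(url) else None
```

For example, `fetch_target("", "my_dag/my_task/2019-07-07T09:00:00+00:00/1.log")` returns `None`, so a caller could report that the hostname was never recorded instead of failing on an invalid URL.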



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
