[ https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467682#comment-17467682 ]
ASF GitHub Bot commented on AIRFLOW-4922:
-----------------------------------------

derkuci commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-1003754230

   > In the 2.1.1 version, I tried to modify the airflow/utils/log/file_task_handler.py file to obtain the hostname information by reading the log table. I confirmed through debug that I could get the host information in this way,

   @xuemengran could you kindly point to how this could be done? With airflow 2.2.2 + Celery, I am seeing error messages like the one below, because `TaskInstance.hostname` always holds the latest value and does not depend on the `try_number`.
   ```
   "Failed to fetch log file from worker. Client error '404 NOT FOUND' for url ..."
   ```
   If we try really hard, the logs can be found in the local storage of _some_ Celery workers. But that is a huge burden for operations and/or debugging.

> If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI
> ------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4922
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging
>    Affects Versions: 1.10.3
>            Reporter: Andrew Harmon
>            Assignee: wanghong-T
>            Priority: Major
>
> Sometimes when a task fails, the log shows the following
> {code}
> *** Log file does not exist: /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log
> *** Fetching from: http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log
> *** Failed to fetch log file from worker. Invalid URL 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host supplied
> {code}
> I believe this is due to the fact that the row is not committed to the database until after the task finishes.
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857
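
The two symptoms reported in this thread (the empty host in `http://:8793/...` and the 404 when an earlier try ran on a different worker) can be illustrated with a small Python sketch. This is not Airflow code: `build_log_url`, `hostname_for_try`, and the `hosts_by_try` mapping are hypothetical names introduced only for illustration, and the URL layout is copied from the log output quoted in the issue. It assumes a hostname is recorded per try, which is what the comment asks for, not what Airflow is confirmed to do.

```python
from typing import Dict, Optional

WORKER_LOG_PORT = 8793  # port seen in the URLs quoted in the issue


def build_log_url(hostname: Optional[str], dag_id: str, task_id: str,
                  execution_date: str, try_number: int,
                  port: int = WORKER_LOG_PORT) -> str:
    """Build a worker log URL in the layout shown in the issue.

    Hypothetical helper: fails early instead of producing 'http://:8793/...'
    when no hostname was recorded for the task.
    """
    if not hostname:
        raise ValueError("no hostname recorded for this task instance")
    return (f"http://{hostname}:{port}/log/"
            f"{dag_id}/{task_id}/{execution_date}/{try_number}.log")


def hostname_for_try(hosts_by_try: Dict[int, str],
                     try_number: int) -> Optional[str]:
    """Look up the worker that ran a given try (hypothetical mapping).

    Keeping one hostname per try avoids asking the worker of the latest
    try for the log file of an earlier try.
    """
    return hosts_by_try.get(try_number)


# Example: try 1 ran on worker-a, try 2 on worker-b.
hosts = {1: "worker-a", 2: "worker-b"}
url_try_1 = build_log_url(hostname_for_try(hosts, 1), "my_dag", "my_task",
                          "2019-07-07T09:00:00+00:00", 1)
```

With a single "latest" hostname, the request for try 1 would go to worker-b, which does not hold `1.log`. With no hostname at all, the sketch raises instead of building a URL with an empty host.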