kobethuwis commented on issue #29393: URL: https://github.com/apache/airflow/issues/29393#issuecomment-1438402879
Due to this issue, I tried to understand how Airflow logs tasks (remotely). I am using S3 for remote log storage, but would like to access live logs for running tasks without having to use a persistent volume. From the documentation:

> In the Airflow UI, remote logs take precedence over local logs when remote logging is enabled. If remote logs can not be found or accessed, local logs will be displayed. Note that logs are only sent to remote storage once a task is complete (including failure).

In other words, remote logs for running tasks are unavailable (but local logs are available). However, the first part of the error tells us that log retrieval at the pod level is not possible either: `*** Log file does not exist: /opt/airflow/logs/.../attempt=1.log ***`

This is because I am using Celery workers in my deployment, as elaborated in the documentation [here](https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/logging-tasks.html#serving-logs-from-workers):

> Airflow automatically starts an HTTP server to serve the logs ... The server is running on the port specified by worker_log_server_port option in [logging] section. By default, it is 8793 ...

This behaviour is confirmed by the second part of the error, where the webserver tries to fetch the log from that server: `Fetching from: http://airflow-worker-46838b34-rq7r8:8793/log/.../attempt=1.log`

I can thus reduce my issue to not being able to fetch the logs from the worker's log server. Using a shell inside the webserver, I tried resolving/pinging `http://airflow-worker-46838b34-rq7r8:8793`, without result. We're actually looking at a 'simple' DNS resolution error, which led me to this [solution](https://stackoverflow.com/questions/62905221/dns-for-kubernetes-pods-and-airflow-worker-logs). The suggested resolution works! This way I'm able to fall back to live local logs served by the worker for running tasks. Specify the following in your helm chart:

```
extraEnv: |
  - name: AIRFLOW__CORE__HOSTNAME_CALLABLE
    value: 'airflow.utils.net:get_host_ip_address'
```

To be clear, this does not resolve @pdebelak's original issue, but it did enable me to view live local logs for running tasks while still storing finished tasks' logs remotely.
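For anyone who wants to reproduce the DNS half of this from inside the webserver pod without shell tools like `nslookup`, here is a minimal Python sketch. The worker hostname below is the hypothetical one from the error above; substitute the hostname from your own `Fetching from:` line:

```python
import socket

# Hypothetical worker pod hostname taken from the error message above;
# replace it with the hostname from your own "Fetching from:" line.
worker_host = "airflow-worker-46838b34-rq7r8"

try:
    # This mirrors the resolution step the webserver must perform before it
    # can fetch http://<worker_host>:8793/log/... from the worker.
    ip = socket.gethostbyname(worker_host)
    print(f"{worker_host} resolves to {ip}")
except socket.gaierror as exc:
    # The failure mode described above: bare pod hostnames are not registered
    # in cluster DNS, so resolution fails before any HTTP request is made.
    print(f"DNS resolution failed: {exc}")
```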
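And for context on why the hostname-callable override helps: as far as I can tell, `airflow.utils.net:get_host_ip_address` boils down to something like the sketch below (a simplified approximation; check `airflow/utils/net.py` in your Airflow version for the real implementation). The worker then registers its IP address instead of its pod hostname, so the URL the webserver builds needs no DNS lookup at all:

```python
import socket

def get_host_ip_address_sketch() -> str:
    # Simplified approximation of airflow.utils.net.get_host_ip_address:
    # resolve this machine's own fully qualified name to an IP address.
    return socket.gethostbyname(socket.getfqdn())

# With AIRFLOW__CORE__HOSTNAME_CALLABLE pointing at the real callable, the
# worker records e.g. "10.42.0.17" (a made-up example IP) as its hostname,
# so the webserver fetches http://10.42.0.17:8793/log/... directly and never
# needs cluster DNS to reach the pod.
print(get_host_ip_address_sketch())
```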