dstandish commented on issue #31059: URL: https://github.com/apache/airflow/issues/31059#issuecomment-1535892947
@juraa14 let me know if I understand this correctly: you have Airflow configured with Celery, each Celery worker runs on a distinct VM, and the webserver runs on its own VM. While a task is running, the webserver reads its logs through the worker's log server; once the task is done, the webserver no longer reads through the log server. Correct?

This makes sense. The current logic is to attempt to read from the worker (over the log server running on the Celery worker) only while the task is running, and otherwise to just read local or remote logs. In general, the assumption is that Celery workers cannot be relied upon to stick around forever, so you should either write your logs to a folder shared by all of your services (Celery workers, webserver, scheduler) or set up remote logging. Doing otherwise is not a good practice because your logs can easily be lost. And since it's not a good practice, it's not expected that people will run it that way, which lets us optimize webserver performance by not attempting to read from the log server in this case.

So my recommendation would be to set up a shared drive for logs or enable remote logging (it's very easy to do). Let me know your thoughts.
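For reference, enabling remote logging is a small change in `airflow.cfg` (or the equivalent `AIRFLOW__LOGGING__*` environment variables). A minimal sketch, assuming S3 storage via the Amazon provider; the bucket name and connection id below are placeholders you would replace with your own:

```ini
# airflow.cfg -- illustrative values only; adjust to your environment.
# Requires the relevant provider package to be installed on all components
# (e.g. apache-airflow-providers-amazon for S3).
[logging]
remote_logging = True
# Hypothetical bucket path -- substitute your own storage location
remote_base_log_folder = s3://my-airflow-logs/logs
# Airflow connection id with credentials for the storage backend
remote_log_conn_id = aws_default
```

With this in place, task logs are uploaded when the task finishes, so the webserver can fetch them from remote storage even after the worker VM is gone.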
