potiuk commented on issue #36963: URL: https://github.com/apache/airflow/issues/36963#issuecomment-1985742361
So maybe just to explain how logging works when REMOTE logging uses GCS, because you might not be aware of it. GCS is an object storage and streaming to it is impossible, as mentioned. So the way the Airflow "GCS logging handler" works is:

1) While the task is running, logs are produced locally on the worker where the task is running.
2) The Airflow UI knows the host name of that worker, and the worker exposes an API from which the UI can stream the logs directly (while the task is running this is what the Airflow UI does; it does not use GCS whatsoever).
3) After the task is completed, the complete log is uploaded from the worker to the GCS bucket (this is the first time there is any interaction with GCS for the task). This is (again) because streaming to GCS is not possible: you can only upload a complete object to GCS, and once you upload it you can't append to it, you can only replace it with a new complete object.
4) Then the Airflow UI, knowing that the task is completed, will attempt to download the log from GCS. Here it actually does use partial retrieval (to efficiently read only the parts of the log that it displays). This is possible with object storage, but it's not live streaming. It's merely retrieving parts of an object that is already there, knowing its complete size, and it's impossible to read parts of the object until it is fully uploaded to GCS.

BTW. This is not jargon. You need to understand how object storage works and how Airflow works in all those different cases when object storage is used as the logging backend.

If we want Cloud Run logs to be available "live", the only good way to make this work "properly" for all the different configurations of remote logging is to be able to stream Cloud Run logs to the worker and let the worker stream them back to the UI.
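To make the four steps described in this comment concrete, here is a minimal toy sketch in Python. It is not Airflow's actual handler code and does not call the real GCS client. `ObjectStore`, `Worker`, `ui_read_log` and the key `task.log` are hypothetical names invented for illustration. The only assumptions it models are the ones stated in the comment: an object store accepts whole-object uploads (no append) and allows ranged reads of a complete object, while a running task's log is served by the worker itself.

```python
from dataclasses import dataclass, field


class ObjectStore:
    """Toy stand-in for a GCS bucket (hypothetical, in-memory only)."""

    def __init__(self):
        self._objects = {}

    def upload(self, key, data):
        # An upload always replaces the whole object; there is no append.
        self._objects[key] = bytes(data)

    def exists(self, key):
        return key in self._objects

    def read_range(self, key, start=0, end=None):
        # Ranged reads only work on an object that has been fully uploaded.
        return self._objects[key][start:end]


@dataclass
class Worker:
    """Toy worker: keeps the log locally until the task completes."""

    store: ObjectStore
    local_log: bytearray = field(default_factory=bytearray)
    finished: bool = False

    def emit(self, line):
        # Step 1: logs are produced locally while the task runs.
        self.local_log += line.encode() + b"\n"

    def serve_log(self, offset=0):
        # Step 2: the worker serves its local log to the UI.
        return bytes(self.local_log[offset:])

    def finish(self, key):
        # Step 3: the complete log is uploaded once, as a single object.
        self.store.upload(key, self.local_log)
        self.finished = True


def ui_read_log(worker, key, start=0, end=None):
    """Pick the log source the way the comment describes the UI doing it."""
    if not worker.finished:
        return worker.serve_log(start)  # running: ask the worker
    return worker.store.read_range(key, start, end)  # done: ranged read


store = ObjectStore()
worker = Worker(store)
worker.emit("step 1")
print(store.exists("task.log"))        # False: nothing uploaded yet
print(ui_read_log(worker, "task.log"))  # served by the worker
worker.emit("step 2")
worker.finish("task.log")
print(ui_read_log(worker, "task.log", 0, 6))  # partial read from the store
```

In this sketch, "live" logs are only possible because the worker holds them. The store never sees a partial log, which is why streaming Cloud Run logs to the worker is the approach suggested above.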