Hi community,

The following is the latest progress on this feature:
[Feature-13331](https://github.com/apache/dolphinscheduler/pull/13332) adds support for writing task logs to OSS. Any comments or suggestions are welcome.

[image: 截屏2023-01-18 14.20.57.png]

Here is a brief summary of the changes:

### Task log writing
* If `remote.logging.enable=true` (false by default), the master / worker sends the task log to the remote storage asynchronously after the task finishes.

### Task log reading
* The master / worker first reads the task log directly if the log file exists on the local file system.
* If the local log file does not exist, the master / worker downloads the task log file from the remote storage to the local file system and then reads it.

### Log Retention
* Log retention can be configured directly through the retention policy provided by the remote storage.
* E.g., [log retention on OSS](https://help.aliyun.com/document_detail/326319.html)

Best Regards,
Rick Cheng

Rick Cheng <[email protected]> wrote on Thu, Dec 8, 2022 at 10:31:

> Hi community,
>
> Here are some discussions from the weekly meeting about this feature:
>
> **Q1: In the k8s environment, users can choose to mount persistent volumes
> (e.g., [OSS](https://help.aliyun.com/document_detail/130911.html)) to
> synchronize task logs to remote storage.**
> R1: This is indeed a way to synchronize logs to remote storage, and it
> only requires mounting persistent volumes (PV). But it still has some
> shortcomings:
> * **Efficiency**: Since the PV is backed by remote storage, log writing
> becomes slower, which in turn affects task execution on the worker. In
> contrast, uploading the task log to the remote storage asynchronously
> through the remote logging mechanism does not affect task execution.
> * **Generality**: PV is not suitable for some remote storage targets,
> such as Elasticsearch, and it is not applicable to DS deployed in
> non-k8s environments.
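To make the efficiency point concrete, the asynchronous write path could look roughly like the sketch below. All class and method names here are illustrative assumptions, not the actual DolphinScheduler API:

```java
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only: upload a finished task's log in the background,
// so remote-storage I/O never blocks the worker's task execution.
public class RemoteLogSketch {

    // corresponds to the `remote.logging.enable` property (false by default)
    private final boolean remoteLoggingEnabled;

    // single daemon thread for uploads, so slow remote storage cannot
    // stall task execution or keep the JVM alive on shutdown
    private final ExecutorService uploadPool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "remote-log-upload");
        t.setDaemon(true);
        return t;
    });

    public RemoteLogSketch(boolean remoteLoggingEnabled) {
        this.remoteLoggingEnabled = remoteLoggingEnabled;
    }

    /**
     * Called by the master/worker after a task reaches a final state.
     * Returns true if an upload was scheduled.
     */
    public boolean onTaskFinished(Path localLogFile) {
        if (!remoteLoggingEnabled) {
            return false; // remote logging disabled: local behaviour unchanged
        }
        // hand the upload off to the background thread; a failed upload
        // must not affect the already-finished task
        uploadPool.submit(() -> uploadToRemoteStorage(localLogFile));
        return true;
    }

    private void uploadToRemoteStorage(Path localLogFile) {
        // a hypothetical call into an OSS/S3 client would go here
    }
}
```

The key point is that the upload runs after the task has already finished and on a separate thread, so a slow or failing remote storage never delays task execution.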
> **Q2: Users can configure whether to use remote storage for task logs.**
> R2: Yes, users can decide whether to enable remote log storage through
> configuration, and specify the corresponding remote logging settings.
>
> **Q3: The master-server also has task logs, which need to be uploaded to
> remote storage in a unified manner.**
> R3: Yes, users can set the remote storage configuration for the master's
> task logs in the master's configuration.
>
> **Q4: Is it possible to set the task log retention policy through the
> configuration supported by the remote storage itself?**
> R4: This is a good idea and it can simplify the design of remote logging.
> I'll look into it.
>
> Related issue: https://github.com/apache/dolphinscheduler/issues/13017
>
> Thanks again for all the suggestions at the weekly meeting; please
> correct me if I'm wrong.
>
> Best Regards,
> Rick Cheng
>
>
> Rick Cheng <[email protected]> wrote on Mon, Nov 28, 2022 at 13:24:
>
>> Hi community,
>>
>> Related issue: https://github.com/apache/dolphinscheduler/issues/13017
>>
>> Currently, DS only supports writing task logs to the worker's local
>> file system, so this issue discusses the feature design of remote
>> logging.
>>
>> # Why remote logging?
>> * Avoid task log loss after a worker is torn down
>> * Easier log access and troubleshooting once logs are aggregated in
>> remote storage
>> * Enhanced cloud-native support for DS
>>
>> # Feature Design
>>
>> ## Connect to different remote targets
>> DS can support a variety of common remote storage targets, and is
>> extensible to support other types of remote storage:
>> * S3
>> * OSS
>> * Elasticsearch
>> * Azure Blob Storage
>> * Google Cloud Storage
>> * ...
>>
>> ## When to write logs to remote storage
>> Like Airflow, DS writes the task logs to remote storage after the task
>> completes (success or failure).
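The list of remote targets above suggests a small pluggable abstraction, with one implementation per storage backend. A rough sketch, where the interface and method names are assumptions rather than the real DS code:

```java
import java.nio.file.Path;

// Illustrative abstraction over remote log storage backends
// (S3, OSS, Elasticsearch, ...); names are assumptions for the sketch.
public interface RemoteLogHandler {

    /** Upload a finished task's local log file to the remote path. */
    boolean sendRemoteLog(Path localLogFile, String remotePath);

    /** Download a task log from the remote path to the local file. */
    boolean getRemoteLog(Path localLogFile, String remotePath);
}

// A stub OSS-style backend showing the shape of one concrete
// implementation; a real one would call the OSS SDK's put/get operations.
class OssRemoteLogHandler implements RemoteLogHandler {

    @Override
    public boolean sendRemoteLog(Path localLogFile, String remotePath) {
        System.out.println("put " + localLogFile + " -> oss://" + remotePath);
        return true;
    }

    @Override
    public boolean getRemoteLog(Path localLogFile, String remotePath) {
        System.out.println("get oss://" + remotePath + " -> " + localLogFile);
        return true;
    }
}
```

Adding support for a new target (e.g. Azure Blob Storage) would then only require a new implementation of the interface, without touching the master/worker logic.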
>>
>> ## How to read logs
>> Since the task log is stored both on the worker's local file system and
>> in remote storage, when the `api-server` needs to read the log of a
>> task instance, it needs a reading strategy.
>>
>> Airflow first tries to read the remotely stored logs, and falls back to
>> the local logs if that fails. I prefer the opposite: try to read the
>> local log first, and read the remote log only if the local log file
>> does not exist.
>>
>> We could discuss this further.
>>
>> ## Log retention strategy
>> For example, a maximum capacity can be set for the remote storage, and
>> old logs can be deleted on a rolling basis.
>>
>> # Sub-tasks
>> WIP
>>
>> Any comments or suggestions are welcome.
>>
>> Best Regards,
>> Rick Cheng
>>
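The local-first reading strategy preferred above can be sketched as follows. Again, the class and method names are illustrative assumptions, not the actual DS API:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of the preferred read strategy: read the local log
// file if it exists; otherwise restore it from remote storage first.
public class TaskLogReader {

    public String readTaskLog(Path localLogFile) {
        try {
            if (!Files.exists(localLogFile)) {
                // local copy missing (e.g. the worker pod was recreated):
                // download it from remote storage before reading
                downloadFromRemoteStorage(localLogFile);
            }
            return Files.readString(localLogFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // hypothetical download from OSS/S3; this stub just writes a marker file
    private void downloadFromRemoteStorage(Path localLogFile) throws IOException {
        Files.writeString(localLogFile, "restored from remote storage\n");
    }
}
```

Reading locally first keeps the common case (log still on the worker) fast, and only pays the remote round trip when the local file is gone.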
