Zhiting Guo created KYLIN-5636:
----------------------------------
Summary: automatically clean up dependency files after the build
task finishes
Key: KYLIN-5636
URL: https://issues.apache.org/jira/browse/KYLIN-5636
Project: Kylin
Issue Type: Improvement
Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
Fix For: 5.0-alpha
*Problem:*
Files uploaded under the path spark.kubernetes.file.upload.path are not
automatically deleted.
1: When Spark creates a driver pod, it uploads dependencies to the configured
path. The build task runs in cluster mode and therefore creates a driver pod
each time. Running the build task repeatedly accumulates a large number of
files under this path.
2: At present, the upload.path we configure (s3a://kylin/spark-on-k8s) is a
fixed path. Spark creates a spark-upload-uuid subdirectory under it and stores
the dependencies there.
*Dev design:*
Core idea: add a dynamic subdirectory under the original upload.path and
delete the entire subdirectory when the task finishes.
Build task: upload.path + jobId (e.g. s3a://kylin/spark-on-k8s/uuid)
The dependency directory is deleted when the build task finishes.
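The per-job upload path described above can be sketched as follows; this is a minimal illustration, not Kylin's actual code, and the class and method names are hypothetical:

```java
// Hypothetical sketch of the proposed per-job upload directory.
public class UploadPathUtil {

    // Append the build job id as a dynamic subdirectory under the configured
    // spark.kubernetes.file.upload.path,
    // e.g. s3a://kylin/spark-on-k8s + uuid -> s3a://kylin/spark-on-k8s/uuid.
    public static String jobUploadPath(String basePath, String jobId) {
        return basePath.endsWith("/") ? basePath + jobId : basePath + "/" + jobId;
    }

    public static void main(String[] args) {
        // Once the build task finishes, the whole subdirectory can be removed
        // recursively, e.g. with Hadoop's FileSystem.delete(path, true).
        System.out.println(jobUploadPath("s3a://kylin/spark-on-k8s", "uuid"));
    }
}
```

Because each build writes under its own jobId subdirectory, deleting it cannot affect the uploads of a concurrently running build.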
The automatic deletion runs when the task completes normally; if the process
is killed with kill -9, the deletion is never called. A fallback
garbage-collection policy is therefore needed, e.g. automatically delete
subdirectories older than three months.
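The fallback age check could look like the sketch below; the 90-day retention constant and the names are illustrative assumptions, not the actual implementation:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the fallback garbage-collection age check.
public class UploadPathGc {

    // Roughly three months, per the proposal above.
    static final Duration RETENTION = Duration.ofDays(90);

    // A leftover job subdirectory qualifies for cleanup once its modification
    // time is older than the retention threshold. This covers the kill -9
    // case where the normal post-build deletion never ran.
    public static boolean isExpired(Instant lastModified, Instant now) {
        return Duration.between(lastModified, now).compareTo(RETENTION) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2023-08-01T00:00:00Z");
        // A directory untouched for seven months is expired ...
        System.out.println(isExpired(Instant.parse("2023-01-01T00:00:00Z"), now));
        // ... while one from twelve days ago is kept.
        System.out.println(isExpired(Instant.parse("2023-07-20T00:00:00Z"), now));
    }
}
```

In practice the GC job would list the subdirectories under upload.path, read each one's modification time, and recursively delete those that pass this check.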
--
This message was sent by Atlassian Jira
(v8.20.10#820010)