[ https://issues.apache.org/jira/browse/SPARK-52124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
huangsheng updated SPARK-52124:
-------------------------------
    Description:
When submitting applications using Spark in standalone mode, a folder is generated under the {{work}} directory on each node every time an application is submitted. The naming convention for these folders is, for example, {{app-20250212191730-0249}}. These folders contain the resource files that each node downloads from the master node when the application is submitted. Although there is a scheduled cleanup mechanism ({{spark.worker.cleanup.enabled}}), it is not immediate. {color:#ff0000}If a large number of tasks are submitted in a short period of time, and each application depends on a significant amount of external resources, the disk space can be quickly exhausted.{color}

Therefore, I suggest actively deleting each application's folder under the {{work}} directory as soon as the application completes, so that the disk space it occupies is released.

  was:
When submitting applications using Spark in standalone mode, a folder is generated under the {{work}} directory on each node every time an application is submitted. The naming convention for these folders is, for example, {{app-20250212191730-0249}}. These folders contain the resource files that each node downloads from the master node when the task is submitted. Although there is a scheduled cleanup mechanism ({{spark.worker.cleanup.enabled}}), it is not immediate. {color:#ff0000}If a large number of tasks are submitted in a short period of time, and each application depends on a significant amount of external resources, the disk space can be quickly exhausted.{color}

Therefore, I suggest actively deleting the disk space occupied under the {{work}} directory after each task is completed.

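To make the folder naming convention and the disk-usage concern concrete, here is a minimal Python sketch that lists the per-application folders under a worker's {{work}} directory together with their sizes. It is only an illustration: the regular expression is inferred from the single example name in this issue ({{app-20250212191730-0249}}), and the names `APP_DIR_PATTERN`, `dir_size_bytes` and `app_dir_usage` are hypothetical, not part of Spark.

```python
import re
from pathlib import Path
from typing import Dict

# Assumption: "app-" + 14-digit timestamp + "-" + 4-digit sequence number,
# inferred from the one example folder name quoted in this issue.
APP_DIR_PATTERN = re.compile(r"^app-\d{14}-\d{4}$")


def dir_size_bytes(path: Path) -> int:
    """Return the total size of all regular files below path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def app_dir_usage(work_dir: Path) -> Dict[str, int]:
    """Map each application folder name under work_dir to its size in bytes."""
    return {
        entry.name: dir_size_bytes(entry)
        for entry in sorted(work_dir.iterdir())
        if entry.is_dir() and APP_DIR_PATTERN.match(entry.name)
    }
```

Calling `app_dir_usage(Path("/path/to/spark/work"))` on a worker node (the path is a placeholder) would show how much space each submitted application is still holding.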
> Actively Releasing Disk Space After Application Completion in Spark
> Standalone Mode
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-52124
>                 URL: https://issues.apache.org/jira/browse/SPARK-52124
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.5.5
>            Reporter: huangsheng
>            Priority: Minor
>
> When submitting applications using Spark in standalone mode, a folder is
> generated under the {{work}} directory on each node every time an application
> is submitted. The naming convention for these folders is, for example,
> {{app-20250212191730-0249}}. These folders contain the resource files
> that each node downloads from the master node when the application is
> submitted. Although there is a scheduled cleanup mechanism
> ({{spark.worker.cleanup.enabled}}), it is not immediate.
> {color:#ff0000}If a large number of tasks are submitted in a short period of
> time, and each application depends on a significant amount of external
> resources, the disk space can be quickly exhausted.{color}
>
> Therefore, I suggest actively deleting each application's folder under the
> {{work}} directory as soon as the application completes, so that the disk
> space it occupies is released.
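
The suggestion above, deleting an application's folder as soon as the application finishes, can be sketched roughly as follows. This is an illustrative Python sketch of the idea only, not how the Spark worker implements cleanup. The helper name `release_app_dir` is hypothetical, and the assumption that something would call it once per completed application is mine, not the issue's.

```python
import shutil
from pathlib import Path


def release_app_dir(work_dir: Path, app_id: str) -> bool:
    """Delete work_dir/app_id and return True if a directory was removed.

    Hypothetical helper: assumed to be called once an application with
    the given ID has completed on this node.
    """
    root = work_dir.resolve()
    target = (root / app_id).resolve()
    # Refuse anything that is not a direct child of work_dir (e.g. "..")
    # and anything that does not exist or is not a directory.
    if target.parent != root or not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

For example, `release_app_dir(Path("/path/to/spark/work"), "app-20250212191730-0249")` would remove that one application's folder and leave the others untouched (the path is a placeholder). Until immediate cleanup exists, the scheduled cleaner can be made more aggressive through the existing `spark.worker.cleanup.interval` and `spark.worker.cleanup.appDataTtl` worker properties.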