Hi All,

I have a Spark Streaming process running 24x7, which operates on 2-hour windowed data.

The issue I am facing is that my worker machines are running out of disk space.

I checked and found that the shuffle files are not getting cleaned up, e.g.:

/log/spark-2b875d98-1101-4e61-86b4-67c9e71954cc/executor-5bbb53c1-cee9-4438-87a2-b0f2becfac6f/blockmgr-c905b93b-c817-4124-a774-be1e706768c1//00/shuffle_2739_5_0.data

Ultimately the machines run out of disk space.


I read about the *spark.cleaner.ttl* config param, which, as far as I can understand from the documentation, cleans up all metadata older than the time limit.
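For reference, this is roughly how I have been trying to set it (a sketch; the app class and jar names are placeholders, and the 7200-second value is just an example matching my 2-hour window):

```shell
# Hypothetical spark-submit invocation (class/jar names are placeholders).
# spark.cleaner.ttl takes a duration in seconds; 7200 = 2 hours, matching
# the window size. Whether this also cleans the on-disk shuffle files is
# exactly what I am unsure about.
spark-submit \
  --conf spark.cleaner.ttl=7200 \
  --class com.example.MyStreamingApp \
  my-streaming-app.jar
```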

I went through https://issues.apache.org/jira/browse/SPARK-5836.
It is marked as resolved, but there is no code commit linked to it.

Can anyone please throw some light on this issue?
