Hi All,

I have a Spark Streaming process running 24x7, which operates on 2-hour windowed data.

The issue I am facing is that my worker machines are running out of disk space.

I checked and found that the shuffle files are not getting cleaned up, e.g.:

/log/spark-2b875d98-1101-4e61-86b4-67c9e71954cc/executor-5bbb53c1-cee9-4438-87a2-b0f2becfac6f/blockmgr-c905b93b-c817-4124-a774-be1e706768c1//00/shuffle_2739_5_0.data

Ultimately the machines run out of disk space.


I read about the *spark.cleaner.ttl* config param, which, as far as I can understand from the documentation, cleans up all metadata older than the time limit.
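For reference, this is roughly how I have been trying to set it (a sketch; the app class and jar names are placeholders, and the 7200-second value is just an example matching my 2-hour window):

```shell
# Hypothetical spark-submit invocation (class/jar names are placeholders).
# spark.cleaner.ttl takes a duration in seconds; 7200 = 2 hours, matching
# the window size. Whether this also cleans the on-disk shuffle files is
# exactly what I am unsure about.
spark-submit \
  --conf spark.cleaner.ttl=7200 \
  --class com.example.MyStreamingApp \
  my-streaming-app.jar
```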

I went through https://issues.apache.org/jira/browse/SPARK-5836.
It is marked as resolved, but there is no code commit linked to it.

Can anyone please throw some light on this issue?
