Hi,

I am using Spark 1.0.0 with the Ooyala Job Server for a low-latency query
system. Basically, a long-running context is created, which makes it possible
to run multiple jobs under the same context and hence share data between them.

This was working fine in 0.9.1. However, with the Spark 1.0 release, the RDDs
created and cached by Job-1 get cleaned up by the BlockManager (I can see log
statements about cleaning up RDDs), so the cached RDDs are no longer available
to Job-2, even though both Job-1 and Job-2 run under the same Spark context.
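
Here is a minimal sketch of what I think is happening (local mode, made-up
object/app names, just for illustration; the GC and sleep are only there to
give the cleaner a chance to run):

import org.apache.spark.{SparkConf, SparkContext}

object RddCleanupRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rdd-cleanup-repro").setMaster("local[2]"))

    // "Job-1": build and cache an RDD, then let the only driver-side
    // reference to it go out of scope when the method returns.
    def jobOne(): Long = {
      val shared = sc.parallelize(1 to 1000000).cache()
      shared.count()  // materializes the cached blocks
    }
    jobOne()

    // Once the RDD object is unreachable on the driver, a GC gives the
    // cleaner (reference tracking is on by default in 1.0) the chance to
    // drop its cached blocks, so a later "Job-2" no longer finds them.
    System.gc()
    Thread.sleep(5000)
    println("still persisted: " + sc.getPersistentRDDs.size)

    sc.stop()
  }
}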

I tried setting spark.cleaner.referenceTracking = false; however, that causes
a different issue: unpersisted RDDs are not cleaned up properly and keep
occupying Spark's memory.
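
For reference, this is how I am setting it when the context is created (just a
sketch; the app name is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// Turning off reference tracking keeps the cached RDDs alive across jobs,
// but then stale RDDs pile up in memory because nothing cleans them up.
val conf = new SparkConf()
  .setAppName("shared-context")  // placeholder
  .set("spark.cleaner.referenceTracking", "false")
val sc = new SparkContext(conf)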


Has anybody faced an issue like this before? If so, any advice would be
greatly appreciated.


Also, is there any way to mark an RDD as still in use under a context, even
though the job that created it has finished (so that subsequent jobs can use
that RDD)?
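
The only workaround I can think of is to keep a driver-side map of strong
references, so the cleaner never sees the RDD as garbage. This is just a
sketch, and the RddRegistry object and all names are made up; I believe the
job server's NamedRddSupport does something along these lines:

import scala.collection.mutable
import org.apache.spark.rdd.RDD

// Hypothetical driver-side registry. As long as an RDD is referenced here,
// reference tracking should not treat it as garbage and drop its cached blocks.
object RddRegistry {
  private val rdds = mutable.Map.empty[String, RDD[_]]

  // Job-1 registers the RDD it cached.
  def put(name: String, rdd: RDD[_]): Unit = synchronized { rdds(name) = rdd }

  // Job-2 and later jobs look it up by name instead of recomputing.
  def get[T](name: String): Option[RDD[T]] =
    synchronized { rdds.get(name).map(_.asInstanceOf[RDD[T]]) }

  // Explicitly drop and unpersist once the data is no longer needed.
  def remove(name: String): Unit = synchronized {
    rdds.remove(name).foreach(_.unpersist(blocking = false))
  }
}

So Job-1 would call RddRegistry.put("sharedData", cachedRdd) after caching, and
Job-2 would call RddRegistry.get[Int]("sharedData"). Is something like this the
recommended approach, or is there a built-in way?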


Thanks,
Prem


