Hi, I am using Spark 1.0.0 with the Ooyala Spark Job Server to build a low-latency query system. A long-running context is created, which allows multiple jobs to run under the same context and hence share data.
It was working fine in 0.9.1. In the Spark 1.0 release, however, the RDDs created and cached by Job-1 are cleaned up by the BlockManager (I can see log statements about cleaning up RDDs), so the cached RDDs are not available to Job-2, even though both Job-1 and Job-2 run under the same SparkContext. I tried setting spark.cleaner.referenceTracking = false; however, this causes a different problem: unpersisted RDDs are no longer cleaned up properly and keep occupying Spark's memory.

Has anybody faced an issue like this before? If so, any advice would be greatly appreciated. Also, is there a way to mark an RDD as still in use under a context even though the job that created it has finished, so that subsequent jobs can use it?

Thanks,
Prem

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-Cleanup-tp9182.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
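P.S. To illustrate the sharing pattern I have in mind, here is a minimal sketch. The `RddRegistry` object and its method names are hypothetical, not part of any real API; the idea is that, since Spark 1.0's ContextCleaner reclaims an RDD's cached blocks once the driver-side RDD object is garbage collected, holding a strong driver-side reference between jobs might keep the cached data alive:

```scala
import org.apache.spark.rdd.RDD
import scala.collection.concurrent.TrieMap

// Hypothetical registry held by the long-running job server process.
// Keeping a strong reference to each cached RDD on the driver should
// prevent the weak-reference-based ContextCleaner from reclaiming its
// cached blocks between jobs that share the same SparkContext.
object RddRegistry {
  private val cached = TrieMap.empty[String, RDD[_]]

  // Job-1 caches an RDD and registers it under a name.
  def put(name: String, rdd: RDD[_]): Unit = {
    rdd.persist()          // keep the blocks in memory
    cached.put(name, rdd)  // strong reference outlives the job
  }

  // Job-2 looks the RDD up by name instead of recomputing it.
  def get[T](name: String): Option[RDD[T]] =
    cached.get(name).map(_.asInstanceOf[RDD[T]])

  // Explicitly release when the data is no longer needed.
  def release(name: String): Unit =
    cached.remove(name).foreach(_.unpersist())
}
```

Would something along these lines work, or is there a supported way to pin an RDD in a shared context?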