Do we have any guarantees on the maximum duration? I've seen RDDs kept around for 7-10 minutes on 20-second batches with a 100-second checkpoint interval. No windows, just updateStateByKey.
It's not a memory issue, but on checkpoint recovery it goes back to Kafka for 10 minutes of data. Any idea why?

-adrian

Sent from my iPhone

On 06 Nov 2015, at 09:45, Tathagata Das <t...@databricks.com> wrote:

Spark Streaming automatically takes care of unpersisting any RDDs generated by a DStream. You can call StreamingContext.remember() to set the minimum persistence duration. Any persisted RDD older than that will be automatically unpersisted.

On Thu, Nov 5, 2015 at 9:12 AM, swetha kasireddy <swethakasire...@gmail.com> wrote:

It's just in the same thread. For a particular RDD, I need to uncache it every 2 minutes to clear out the data that is present in a Map inside it.

On Wed, Nov 4, 2015 at 11:54 PM, Saisai Shao <sai.sai.s...@gmail.com> wrote:

Hi Swetha,

Would you mind elaborating on your usage scenario for DStream unpersisting? From my understanding:

1. Spark Streaming will automatically unpersist outdated data (you already mentioned the configurations).
2. Once the streaming job is started, I think you lose control of the job; when would you call this unpersist, and how would you call it (from another thread)?

Thanks
Saisai

On Thu, Nov 5, 2015 at 3:13 PM, swetha kasireddy <swethakasire...@gmail.com> wrote:

Other than setting the following?

sparkConf.set("spark.streaming.unpersist", "true")
sparkConf.set("spark.cleaner.ttl", "7200s")

On Wed, Nov 4, 2015 at 5:03 PM, swetha <swethakasire...@gmail.com> wrote:

Hi,

How to unpersist a DStream in Spark Streaming? I know that we can persist a DStream using dStream.persist() or dStream.cache(), but I don't see any method to unpersist.
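TDS's suggestion above (let Spark Streaming unpersist automatically, and control retention via remember()) can be sketched as follows. This is a minimal, illustrative sketch, not code from the thread: the app name, 20-second batch interval, and 2-minute retention are assumptions chosen to match the durations being discussed.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("UnpersistSketch") // illustrative name
  // Ask Spark Streaming to unpersist the RDDs it generates once they
  // are no longer needed (this is already the default behaviour).
  .set("spark.streaming.unpersist", "true")

val ssc = new StreamingContext(conf, Seconds(20))

// Keep generated RDDs around for at least 2 minutes; anything older
// becomes eligible for automatic unpersisting. There is no public
// dstream.unpersist() -- retention is controlled here instead.
ssc.remember(Minutes(2))
```

Note there is intentionally no manual unpersist call: as Saisai points out, once the job is started you don't drive it from your own thread, so retention is expressed declaratively up front.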
Thanks,
Swetha

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-unpersist-a-DStream-in-Spark-Streaming-tp25281.html
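On the checkpoint-recovery behaviour Adrian describes at the top of the thread: updateStateByKey requires the state DStream to be checkpointed, and on recovery a Kafka direct stream re-reads the data received since the last completed checkpoint, which may explain why several minutes of data are replayed. The checkpoint interval can be set explicitly on the state DStream; the docs suggest roughly 5-10x the batch interval, which is where a 100-second checkpoint on 20-second batches would come from. A hypothetical sketch (the function name, `ssc`, and `pairs` are illustrative; the Spark calls are shown commented out since they assume a running streaming context):

```scala
// Pure state-update function: sums new values into the running count.
// Returning None here would drop the key's state entirely.
def updateCount(newValues: Seq[Int], state: Option[Int]): Option[Int] =
  Some(newValues.sum + state.getOrElse(0))

// Sketch only: `ssc` and `pairs` (a DStream[(String, Int)] from your
// Kafka setup) are assumed to exist.
// import org.apache.spark.streaming.Seconds
// ssc.checkpoint("hdfs:///checkpoints/app")     // illustrative path
// val counts = pairs.updateStateByKey(updateCount _)
// counts.checkpoint(Seconds(100))               // 5x a 20s batch
```

The update function itself is plain Scala, so it can be unit-tested without a cluster, e.g. `updateCount(Seq(1, 2), Some(3))` yields `Some(6)`.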