Re: How to unpersist a DStream in Spark Streaming

2015-11-06 Thread Adrian Tanase
Do we have any guarantees on the maximum duration?

I've seen RDDs kept around for 7-10 minutes with 20-second batches and a
100-second checkpoint interval. No windows, just updateStateByKey.

It's not a memory issue, but on checkpoint recovery it goes back to Kafka for
10 minutes of data. Any idea why?
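
For reference, here is a minimal sketch of the kind of setup described above
(a socket source stands in for the Kafka input, and every name and value is an
illustrative assumption, not taken from the actual job):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("StatefulApp")
val ssc = new StreamingContext(conf, Seconds(20))    // 20-second batches
ssc.checkpoint("hdfs:///checkpoints/stateful-app")   // required by updateStateByKey

val lines = ssc.socketTextStream("localhost", 9999)  // stand-in for the Kafka source

val counts = lines
  .flatMap(_.split(" "))
  .map(word => (word, 1L))
  .updateStateByKey[Long]((newValues, state) => Some(newValues.sum + state.getOrElse(0L)))

counts.checkpoint(Seconds(100))  // 100-second checkpoint interval for the state stream

counts.print()
ssc.start()
ssc.awaitTermination()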

-adrian

Sent from my iPhone

On 06 Nov 2015, at 09:45, Tathagata Das wrote:

Spark Streaming automatically takes care of unpersisting any RDDs generated by
a DStream. You can use StreamingContext.remember() to set the minimum
persistence duration; any persisted RDD older than that will be automatically
unpersisted.

On Thu, Nov 5, 2015 at 9:12 AM, swetha kasireddy wrote:
It's just in the same thread: for a particular RDD, I need to uncache it every
2 minutes to clear out the data that is present in a Map inside it.

On Wed, Nov 4, 2015 at 11:54 PM, Saisai Shao wrote:
Hi Swetha,

Would you mind elaborating on your usage scenario of DStream unpersisting?

From my understanding:

1. Spark Streaming will automatically unpersist outdated data (you already
mentioned the relevant configurations).
2. Once the streaming job is started, I think you may lose control of the
job: when would you call this unpersist, and how would you call it (from
another thread)?

Thanks
Saisai


On Thu, Nov 5, 2015 at 3:13 PM, swetha kasireddy wrote:
Other than setting the following?


sparkConf.set("spark.streaming.unpersist", "true")
sparkConf.set("spark.cleaner.ttl", "7200s")

On Wed, Nov 4, 2015 at 5:03 PM, swetha wrote:
Hi,

How do you unpersist a DStream in Spark Streaming? I know that we can persist
using dStream.persist() or dStream.cache(), but I don't see any method to
unpersist.

Thanks,
Swetha



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-unpersist-a-DStream-in-Spark-Streaming-tp25281.html
Sent from the Apache Spark User List mailing list archive at 
Nabble.com.








Re: How to unpersist a DStream in Spark Streaming

2015-11-05 Thread swetha kasireddy
It's just in the same thread: for a particular RDD, I need to uncache it
every 2 minutes to clear out the data that is present in a Map inside it.

On Wed, Nov 4, 2015 at 11:54 PM, Saisai Shao  wrote:

> Hi Swetha,
>
> Would you mind elaborating on your usage scenario of DStream unpersisting?
>
> From my understanding:
>
> 1. Spark Streaming will automatically unpersist outdated data (you already
> mentioned the relevant configurations).
> 2. Once the streaming job is started, I think you may lose control of the
> job: when would you call this unpersist, and how would you call it (from
> another thread)?
>
> Thanks
> Saisai
>
>
> On Thu, Nov 5, 2015 at 3:13 PM, swetha kasireddy <
> swethakasire...@gmail.com> wrote:
>
>> Other than setting the following?
>>
>> sparkConf.set("spark.streaming.unpersist", "true")
>> sparkConf.set("spark.cleaner.ttl", "7200s")
>>
>>
>> On Wed, Nov 4, 2015 at 5:03 PM, swetha  wrote:
>>
>>> Hi,
>>>
>>> How do you unpersist a DStream in Spark Streaming? I know that we can persist
>>> using dStream.persist() or dStream.cache(), but I don't see any method to
>>> unpersist.
>>>
>>> Thanks,
>>> Swetha
>>>
>>>
>>>


Re: How to unpersist a DStream in Spark Streaming

2015-11-05 Thread Tathagata Das
Spark Streaming automatically takes care of unpersisting any RDDs generated
by a DStream. You can use StreamingContext.remember() to set the minimum
persistence duration; any persisted RDD older than that will be automatically
unpersisted.
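
A minimal sketch of that knob (the batch interval and remember duration here
are illustrative assumptions):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

val conf = new SparkConf().setAppName("RememberExample")
val ssc = new StreamingContext(conf, Seconds(20))

// Keep (and do not unpersist) the RDDs generated over the last 10 minutes,
// rather than only as long as Spark Streaming's own bookkeeping requires.
ssc.remember(Minutes(10))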

On Thu, Nov 5, 2015 at 9:12 AM, swetha kasireddy 
wrote:

> It's just in the same thread: for a particular RDD, I need to uncache it
> every 2 minutes to clear out the data that is present in a Map inside it.
>
> On Wed, Nov 4, 2015 at 11:54 PM, Saisai Shao 
> wrote:
>
>> Hi Swetha,
>>
>> Would you mind elaborating on your usage scenario of DStream unpersisting?
>>
>> From my understanding:
>>
>> 1. Spark Streaming will automatically unpersist outdated data (you already
>> mentioned the relevant configurations).
>> 2. Once the streaming job is started, I think you may lose control of the
>> job: when would you call this unpersist, and how would you call it (from
>> another thread)?
>>
>> Thanks
>> Saisai
>>
>>
>> On Thu, Nov 5, 2015 at 3:13 PM, swetha kasireddy <
>> swethakasire...@gmail.com> wrote:
>>
>>> Other than setting the following?
>>>
>>> sparkConf.set("spark.streaming.unpersist", "true")
>>> sparkConf.set("spark.cleaner.ttl", "7200s")
>>>
>>>
>>> On Wed, Nov 4, 2015 at 5:03 PM, swetha 
>>> wrote:
>>>
 Hi,

 How do you unpersist a DStream in Spark Streaming? I know that we can
 persist using dStream.persist() or dStream.cache(), but I don't see any
 method to unpersist.

 Thanks,
 Swetha





Re: How to unpersist a DStream in Spark Streaming

2015-11-04 Thread Saisai Shao
Hi Swetha,

Would you mind elaborating on your usage scenario of DStream unpersisting?

From my understanding:

1. Spark Streaming will automatically unpersist outdated data (you already
mentioned the relevant configurations).
2. Once the streaming job is started, I think you may lose control of the
job: when would you call this unpersist, and how would you call it (from
another thread)?

Thanks
Saisai
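
On point 2, one way to avoid a separate thread is to do the cache management
on the driver inside foreachRDD, which runs once per batch interval. A rough
sketch, assuming a DStream[(String, Long)] named counts (the name and the
persist-then-release pattern are illustrative, not from this thread):

import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Keep a handle to the previously persisted batch and release it when the
// next batch arrives, so at most one batch stays cached at a time.
var previousBatch: Option[RDD[(String, Long)]] = None

counts.foreachRDD { rdd =>
  rdd.persist(StorageLevel.MEMORY_ONLY)
  rdd.count()                           // force materialization (illustrative)
  previousBatch.foreach(_.unpersist())  // uncache the prior batch
  previousBatch = Some(rdd)
}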


On Thu, Nov 5, 2015 at 3:13 PM, swetha kasireddy 
wrote:

> Other than setting the following?
>
> sparkConf.set("spark.streaming.unpersist", "true")
> sparkConf.set("spark.cleaner.ttl", "7200s")
>
>
> On Wed, Nov 4, 2015 at 5:03 PM, swetha  wrote:
>
>> Hi,
>>
>> How do you unpersist a DStream in Spark Streaming? I know that we can persist
>> using dStream.persist() or dStream.cache(), but I don't see any method to
>> unpersist.
>>
>> Thanks,
>> Swetha
>>
>>
>>


Re: How to unpersist a DStream in Spark Streaming

2015-11-04 Thread Yashwanth Kumar
Hi,

A DStream (Discretized Stream) is made up of multiple RDDs. You can
unpersist each RDD by accessing the individual RDDs with foreachRDD:

dstream.foreachRDD { rdd =>
  // Drop this batch's RDD from the cache once you are done with it.
  rdd.unpersist()
}
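
Worth noting: rdd.unpersist() is harmless on an RDD that was never persisted,
and it takes a blocking flag, so rdd.unpersist(blocking = false) returns
immediately instead of waiting for the cached blocks to be removed.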



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-unpersist-a-DStream-in-Spark-Streaming-tp25281p25284.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: How to unpersist a DStream in Spark Streaming

2015-11-04 Thread swetha kasireddy
Other than setting the following?

sparkConf.set("spark.streaming.unpersist", "true")
sparkConf.set("spark.cleaner.ttl", "7200s")


On Wed, Nov 4, 2015 at 5:03 PM, swetha  wrote:

> Hi,
>
> How do you unpersist a DStream in Spark Streaming? I know that we can persist
> using dStream.persist() or dStream.cache(), but I don't see any method to
> unpersist.
>
> Thanks,
> Swetha
>
>
>