Re: Sessionization using updateStateByKey

2015-07-15 Thread algermissen1971
), and output. Some more discussion in my talk - https://www.youtube.com/watch?v=d5UJonrruHk On Tue, Jul 14, 2015 at 4:13 PM, swetha swethakasire...@gmail.com wrote: Hi, I have a question regarding sessionization using updateStateByKey. If near real time state needs to be maintained

Re: Sessionization using updateStateByKey

2015-07-15 Thread Silvio Fiorito
An in-memory hash key data structure of some kind so that you're close to linear on the number of items in a batch, not the number of outstanding keys. That's more complex, because you have to deal with expiration for keys that never get hit

Re: Sessionization using updateStateByKey

2015-07-15 Thread Sean McNamara
An in-memory hash key data structure of some kind so that you're close to linear on the number of items in a batch, not the number of outstanding keys. That's more complex, because you have to deal with expiration for keys that never get hit, and for unusually
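
The in-memory hash structure described in the snippet above could look roughly like the sketch below. It is only an illustration in plain Python, not code from the thread and not a Spark API: the class name `SessionStore`, its methods, and the `(last_seen, event_count)` state layout are all hypothetical. It assumes that `now` never decreases between calls, which is what lets expiration stop at the first session that is still fresh.

```python
from collections import OrderedDict


class SessionStore:
    """Hypothetical in-memory session table (illustration only, not part of Spark)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        # key -> (last_seen, event_count), ordered from least to most recently touched
        self._sessions = OrderedDict()

    def apply_batch(self, batch, now):
        """Fold one batch of (key, event) pairs into the table.

        Work is proportional to len(batch), not to the number of open sessions.
        """
        for key, _event in batch:
            _, count = self._sessions.get(key, (now, 0))
            self._sessions[key] = (now, count + 1)
            self._sessions.move_to_end(key)  # most recently touched goes last

    def expire(self, now):
        """Remove sessions idle for longer than the timeout and return their counts."""
        expired = {}
        while self._sessions:
            key, (last_seen, count) = next(iter(self._sessions.items()))
            if now - last_seen <= self.timeout_seconds:
                break  # every later entry was touched at least as recently
            self._sessions.popitem(last=False)
            expired[key] = count
        return expired

    def __len__(self):
        return len(self._sessions)
```

Because the table is kept in least-recently-touched order, `expire` only visits keys that are actually timing out, so keys that never get hit again are still cleaned up without scanning every outstanding key.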

Re: Sessionization using updateStateByKey

2015-07-15 Thread Cody Koeninger
talk - https://www.youtube.com/watch?v=d5UJonrruHk On Tue, Jul 14, 2015 at 4:13 PM, swetha swethakasire...@gmail.com wrote: Hi, I have a question regarding sessionization using updateStateByKey. If near real time state needs to be maintained in a Streaming application, what

Re: Sessionization using updateStateByKey

2015-07-15 Thread algermissen1971
at every stage - receiving, shuffles (updateStateByKey included), and output. Some more discussion in my talk - https://www.youtube.com/watch?v=d5UJonrruHk On Tue, Jul 14, 2015 at 4:13 PM, swetha swethakasire...@gmail.com wrote: Hi, I have a question regarding sessionization using

Re: Sessionization using updateStateByKey

2015-07-15 Thread Cody Koeninger
regarding sessionization using updateStateByKey. If near real time state needs to be maintained in a Streaming application, what happens when the number of RDDs to maintain the state becomes very large? Does it automatically get saved to HDFS and reload when needed or do I have to use any code like

Re: Sessionization using updateStateByKey

2015-07-15 Thread Cody Koeninger
://www.youtube.com/watch?v=d5UJonrruHk On Tue, Jul 14, 2015 at 4:13 PM, swetha swethakasire...@gmail.com wrote: Hi, I have a question regarding sessionization using updateStateByKey. If near real time state needs to be maintained in a Streaming application, what happens when the number

Re: Sessionization using updateStateByKey

2015-07-14 Thread Tathagata Das
...@gmail.com wrote: Hi, I have a question regarding sessionization using updateStateByKey. If near real time state needs to be maintained in a Streaming application, what happens when the number of RDDs to maintain the state becomes very large? Does it automatically get saved to HDFS

Sessionization using updateStateByKey

2015-07-14 Thread swetha
Hi, I have a question regarding sessionization using updateStateByKey. If near real time state needs to be maintained in a Streaming application, what happens when the number of RDDs to maintain the state becomes very large? Does it automatically get saved to HDFS and reload when needed or do I
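
For readers unfamiliar with the operation being asked about, here is a minimal sketch of a session update function in the shape PySpark's `updateStateByKey` expects: it receives the new values for a key plus the previous state, and returning `None` removes the key from the state. The factory name `make_session_updater`, the `(last_seen, event_count)` state layout, and the injectable `clock` parameter are assumptions made for illustration; none of them come from the thread.

```python
import time


def make_session_updater(timeout_seconds, clock=time.time):
    """Build an update function with the (new_values, previous_state) shape.

    Hypothetical helper: state is a (last_seen, event_count) tuple.
    """

    def update(new_events, state):
        now = clock()
        if new_events:
            # Key was hit in this batch: refresh the timestamp, extend the count.
            count = state[1] if state is not None else 0
            return (now, count + len(new_events))
        if state is not None and now - state[0] > timeout_seconds:
            # Idle too long: returning None drops the key from the state.
            return None
        return state

    return update
```

In a PySpark streaming job this would be passed as `pairs.updateStateByKey(make_session_updater(1800))` on a DStream of key/value pairs, with checkpointing enabled on the streaming context. The update function is invoked for every key held in the state on each batch, including keys with no new events, which is why the replies in this thread discuss cost in terms of the number of outstanding keys.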