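You can use reduceByKeyAndWindow with your time window (here a 2-hour
window sliding every 5 minutes). It also accepts an inverse reduce function,
which is applied to the data leaving the window, so the windowed result is
updated incrementally instead of being recomputed from scratch.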
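
A minimal sketch of the idea in plain Python (not Spark code), assuming the
per-page state is a map of user to visit count as in the question below. The
names add_counts, subtract_counts, slide and top_k, and the sample counts, are
hypothetical and only illustrate the reduce / inverse-reduce pair:

```python
from collections import Counter

def add_counts(a, b):
    """Reduce function: merge in counts from batches entering the window."""
    out = Counter(a)
    out.update(b)
    return out

def subtract_counts(a, b):
    """Inverse function: remove counts from batches leaving the window."""
    out = Counter(a)
    out.subtract(b)
    # Drop users whose count fell to zero so the map does not grow forever.
    return Counter({user: n for user, n in out.items() if n > 0})

def slide(window, entering, leaving):
    """Advance one page's window state by one slide interval."""
    return subtract_counts(add_counts(window, entering), leaving)

def top_k(window, k):
    """Return the k most frequent visitors as (user, count) pairs."""
    return window.most_common(k)

# Hypothetical per-user counts for one page (assumed sample data).
window = Counter({"user_01": 32, "user_02": 3, "user_03": 7})
entering = Counter({"user_02": 5, "user_04": 1})   # newest 5 minutes
leaving = Counter({"user_01": 2, "user_03": 7})    # oldest 5 minutes
window = slide(window, entering, leaving)
print(top_k(window, 2))
```

In Spark Streaming, functions of this shape would be passed as the reduce and
inverse-reduce arguments of reduceByKeyAndWindow, keyed by page. Note that the
inverse-function variant requires checkpointing to be enabled on the
streaming context.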

On Tue, Feb 24, 2015 at 1:36 PM, Ashish Sharma <ashishonl...@gmail.com>
wrote:

> So say I want to calculate top K users visiting a page in the past 2 hours
> updated every 5 mins.
>
> so here I want to maintain something like this
>
> Page_01 => {user_01:32, user_02:3, user_03:7...}
> ...
>
> Basically a count of number of times a user visited a page. Here my key is
> page name/id and state is the hashmap.
>
> Now in updateStateByKey I get the previous state and new events coming
> *in* the window. Is there a way to also get the events going *out* of the
> window? This way I can incrementally update the state over a rolling window.
>
> What is an efficient way to do this in Spark Streaming?
>
> Thanks
> Ashish
>



-- 

*Arush Kharbanda* || Technical Teamlead

ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
