For this sort of long-term aggregation you should use a dedicated data storage system, like a database or a key-value store. Spark Streaming would just compute the aggregates and push the necessary data to that store.
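In that design, the streaming job computes per-window aggregates keyed by granularity and writes them out. A minimal sketch of the bucketing logic (plain Python, not Spark; `bucket_keys` and the dict-backed `store` are illustrative stand-ins for a real key-value store):

```python
from datetime import datetime, timezone

def bucket_keys(ts: float) -> dict:
    """Return one aggregation key per granularity for a Unix timestamp."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return {
        "second": dt.strftime("%Y-%m-%dT%H:%M:%S"),
        "minute": dt.strftime("%Y-%m-%dT%H:%M"),
        "hour":   dt.strftime("%Y-%m-%dT%H"),
        "day":    dt.strftime("%Y-%m-%d"),
    }

def aggregate(events, store):
    """Fold (timestamp, value) events into per-granularity sums in `store`,
    which stands in for the external database or key-value store."""
    for ts, value in events:
        for gran, key in bucket_keys(ts).items():
            store[(gran, key)] = store.get((gran, key), 0) + value
    return store

store = {}
aggregate([(0.0, 1), (1.0, 2), (61.0, 3)], store)
# Events at t=0s and t=1s share a minute bucket; t=61s falls in the next one.
```

Because each event contributes to one key per granularity, there is no long-lived streaming state to expire: old rows simply age out (or get TTL'd) in the external store.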
TD

On Sat, Nov 14, 2015 at 9:32 PM, Sandip Mehta <sandip.mehta....@gmail.com> wrote:
> Hi,
>
> I am working on a requirement to calculate real-time metrics and am
> building a prototype on Spark Streaming. I need to build aggregates at the
> second, minute, hour, and day level.
>
> I am not sure whether I should calculate all of these aggregates as
> different windowed functions on the input DStream, or whether I should use
> the updateStateByKey function instead. If I have to use updateStateByKey
> for this time-series aggregation, how can I remove keys from the state
> after the corresponding time has elapsed?
>
> Please suggest.
>
> Regards
> SM