For this sort of long-term aggregation you should use a dedicated data storage system, like a database or a key-value store. Spark Streaming would just compute the aggregates and push the necessary data to that store.
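In that design, the streaming job computes per-window aggregates keyed by granularity and writes them out. A minimal sketch of the bucketing logic (plain Python, not Spark; `bucket_keys` and the dict-backed `store` are illustrative stand-ins for a real key-value store):

```python
from datetime import datetime, timezone

def bucket_keys(ts: float) -> dict:
    """Return one aggregation key per granularity for a Unix timestamp."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return {
        "second": dt.strftime("%Y-%m-%dT%H:%M:%S"),
        "minute": dt.strftime("%Y-%m-%dT%H:%M"),
        "hour":   dt.strftime("%Y-%m-%dT%H"),
        "day":    dt.strftime("%Y-%m-%d"),
    }

def aggregate(events, store):
    """Fold (timestamp, value) events into per-granularity sums in `store`,
    which stands in for the external database or key-value store."""
    for ts, value in events:
        for gran, key in bucket_keys(ts).items():
            store[(gran, key)] = store.get((gran, key), 0) + value
    return store

store = {}
aggregate([(0.0, 1), (1.0, 2), (61.0, 3)], store)
# Events at t=0s and t=1s share a minute bucket; t=61s falls in the next one.
```

Because each event contributes to one key per granularity, there is no long-lived streaming state to expire: old rows simply age out (or get TTL'd) in the external store.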
TD

On Sat, Nov 14, 2015 at 9:32 PM, Sandip Mehta <sandip.mehta....@gmail.com> wrote:
> Hi,
>
> I am working on a requirement to calculate real-time metrics and am
> building a prototype on Spark Streaming. I need to build aggregates at the
> second, minute, hour, and day level.
>
> I am not sure whether I should calculate all of these aggregates as
> different windowed functions on the input DStream, or whether I should use
> the updateStateByKey function instead. If I have to use updateStateByKey
> for this time-series aggregation, how can I remove keys from the state
> after the corresponding time has elapsed?
>
> Please suggest.
>
> Regards
> SM