Are you talking about reduceByKeyAndWindow with or without inverse reduce?

TD

On Fri, Jul 10, 2015 at 2:07 AM, Imran Alam <im...@newscred.com> wrote:

> We have a streaming job that makes use of reduceByKeyAndWindow
> <https://github.com/apache/spark/blob/v1.4.0/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala#L334-L341>.
> We want this to work with an initial state. The idea is to avoid losing
> state if the streaming job is restarted, also to take historical data into
> account for the windows. But reduceByKeyAndWindow doesn't accept any
> "initialRDD" parameter like updateStateByKey
> <https://github.com/apache/spark/blob/v1.4.0/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala#L435-L445>
> does.
>
> The plan is to extend reduceByKeyAndWindow to accept an "initalRDDs"
> parameter, so that the DStream starts with those RDDs as initial value of
> "generatedRDD" rather than an empty map. But the "generatedRDD" is a
> private variable, so I'm bit confused on how to proceed with the plan.
>
>

Reply via email to