setting initial state for mapGroupsWithState

2020-02-24 Thread dpristin
Hi, I'm in the process of migrating our DStream jobs to Structured Streaming and I'm looking for advice on how to provide initial state for mapGroupsWithState, similar to what DStream's mapWithState does: stream.mapWithState(StateSpec.function(updateState _

RE: Initial State

2015-11-22 Thread Bryan
. Is there an alternative method to initialize state? InputQueueStream joined to window would seem to work, but InputQueueStream does not allow checkpointing.

Re: Initial State

2015-11-22 Thread Tathagata Das
There is a way. Please see the Scala docs: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.dstream.PairDStreamFunctions The first version of updateStateByKey has the parameter "initialRDD".
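To make the idea in this reply concrete, here is a minimal sketch in plain Python (not Spark code) of what seeding a keyed state update with initial state amounts to. It assumes the update function has the same shape as the one updateStateByKey expects: it receives the new values for a key and the previous state, if any, and returns the new state or nothing to drop the key. The names run_batches and update_count are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Batch = Iterable[Tuple[str, int]]
UpdateFunc = Callable[[List[int], Optional[int]], Optional[int]]


def update_count(new_values: List[int], state: Optional[int]) -> Optional[int]:
    """Add this batch's values to the running count (0 when there is no prior state)."""
    return (state or 0) + sum(new_values)


def run_batches(
    initial_state: Dict[str, int],
    batches: Iterable[Batch],
    update_func: UpdateFunc,
) -> List[Dict[str, int]]:
    """Plain-Python model of a keyed state update that starts from initial_state.

    This is an illustration of the semantics only, not the Spark implementation.
    Returns a snapshot of the state after each batch.
    """
    state = dict(initial_state)  # the seed plays the role of the initial RDD
    snapshots: List[Dict[str, int]] = []
    for batch in batches:
        # Group this batch's values by key.
        grouped: Dict[str, List[int]] = defaultdict(list)
        for key, value in batch:
            grouped[key].append(value)

        next_state: Dict[str, int] = {}
        # Visit every key that has prior state or new data in this batch.
        for key in set(state) | set(grouped):
            updated = update_func(grouped.get(key, []), state.get(key))
            if updated is not None:  # returning None drops the key
                next_state[key] = updated
        state = next_state
        snapshots.append(dict(state))
    return snapshots
```

For example, calling run_batches({"a": 10, "b": 5}, [[("a", 1), ("c", 2)]], update_count) starts key "a" from 10 rather than 0, which is the effect the initial state is meant to provide. In the Spark Scala API, the corresponding updateStateByKey overload takes the update function, a partitioner, and the initial RDD; check the linked docs for the exact signature in your Spark version.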

Initial State

2015-11-20 Thread Bryan
All, Is there a way to introduce an initial RDD without doing updateStateByKey? I have an initial set of counts, and the algorithm I am using requires that I accumulate additional counts from streaming data, age off older counts, and make some calculations on them. The accumulation of counts us

Re: reduceByKeyAndWindow with initial state

2015-07-12 Thread Imran Alam
0, 2015 at 2:07 AM, Imran Alam wrote: We have a streaming job that makes use of reduceByKeyAndWindow <https://github.com/apache/spark/blob/v1.4.0/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala#L334-L341>. We want this t

Re: reduceByKeyAndWindow with initial state

2015-07-10 Thread Tathagata Das
streaming/dstream/PairDStreamFunctions.scala#L334-L341>. We want this to work with an initial state. The idea is to avoid losing state if the streaming job is restarted, also to take historical data into account for the windows. But reduceByKeyAndWindow doesn't accept any

reduceByKeyAndWindow with initial state

2015-07-10 Thread Imran Alam
We have a streaming job that makes use of reduceByKeyAndWindow <https://github.com/apache/spark/blob/v1.4.0/streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala#L334-L341>. We want this to work with an initial state. The idea is to avoid losing state

Initial State of updateStateByKey

2015-01-08 Thread Asim Jalis
In Spark Streaming, is there a way to initialize the state of updateStateByKey before it starts processing RDDs? I noticed that there is an overload of updateStateByKey that takes an initialRDD in the latest sources (although not in the 1.2.0 release). Is there another way to do this until this fea