Hi Yana,

Thanks for your kind response. My question was indeed unclear.

What I want to do is join incoming data against a state stream, that is,
the *updateStateByKey* output of the previous run.

*updateStateByKey* is useful if the application logic doesn't (heavily) rely
on state. In that case you can run the application without knowing the
current state, and update the state at the end with *updateStateByKey*.
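
To make that first pattern concrete, here is a minimal plain-Scala model of
it (no Spark dependency). The update function has the same shape as the one
passed to updateStateByKey; the object name and the runBatch helper are
hypothetical and only imitate how the function is applied once per key per
batch.

```scala
object UpdateAtEndModel {
  // Same shape as an updateStateByKey update function: the new values for
  // one key in this batch, plus that key's previous state (if any).
  val updateFunc: (Seq[Int], Option[Int]) => Option[Int] =
    (values, state) => Some(values.sum + state.getOrElse(0))

  // Hypothetical helper: applies updateFunc to every key that appears in
  // the batch or already has state. Keys with no new values get an empty Seq.
  def runBatch(state: Map[String, Int], batch: Seq[(String, Int)]): Map[String, Int] = {
    val grouped = batch.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    val keys = state.keySet ++ grouped.keySet
    keys.flatMap { k =>
      updateFunc(grouped.getOrElse(k, Seq.empty), state.get(k)).map(k -> _)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val afterFirst = runBatch(Map.empty, Seq("a" -> 1, "b" -> 1, "a" -> 1))
    val afterSecond = runBatch(afterFirst, Seq("a" -> 1))
    println(afterSecond)
  }
}
```

Note that the batch logic itself never looks at the state here; the state
only appears inside the update function.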

However, if the application logic relies on state, it is better to treat
the state as an input and join against it at the beginning of the
application.
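
A minimal plain-Scala sketch of this second pattern (again no Spark
dependency): the previous state is joined with the incoming records first,
and the logic then uses the joined values. The alerting rule, the limit
parameter, and all names are hypothetical, chosen only to give the logic
something state-dependent to do. In Spark Streaming this might be expressed
with a per-batch join against the state, but that is an assumption and is
not shown here.

```scala
object StateAsInputModel {
  // Returns (alerts, newState) for one batch.
  def processBatch(
      state: Map[String, Int],
      batch: Seq[(String, Int)],
      limit: Int): (Seq[String], Map[String, Int]) = {
    // Step 1: join each incoming record with the state of the previous run
    // (a left outer join: keys without state get None).
    val joined = batch.map { case (k, v) => (k, v, state.get(k)) }

    // Step 2: state-dependent logic. Hypothetical rule: alert on a key when
    // its previous total plus the new value exceeds the limit.
    val alerts = joined.collect {
      case (k, v, prev) if prev.getOrElse(0) + v > limit => k
    }.distinct

    // Step 3: produce the state that the next run will take as input.
    val newState = batch.foldLeft(state) { case (acc, (k, v)) =>
      acc.updated(k, acc.getOrElse(k, 0) + v)
    }
    (alerts, newState)
  }

  def main(args: Array[String]): Unit = {
    val (alerts, next) = processBatch(Map("a" -> 9), Seq("a" -> 2, "b" -> 1), 10)
    println(alerts)
    println(next)
  }
}
```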

I am unsure if Spark Streaming supports this functionality.

Thanks,
Chia-Chun

2014-10-01 21:56 GMT+08:00 Yana Kadiyska <yana.kadiy...@gmail.com>:

> I don't think your question is very clear -- *updateStateByKey* usually
> updates the previous state.
>
> For example, the StatefulNetworkWordCount example that ships with Spark
> shows the following snippet:
>
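> val updateFunc = (values: Seq[Int], state: Option[Int]) => {
>   // Sum of the new values for this key in the current batch
>   val currentCount = values.sum
>   // Previous state for this key, or 0 if the key is new
>   val previousCount = state.getOrElse(0)
>   Some(currentCount + previousCount)
> }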
>
> So if you have a state (K,V), the latest iteration will produce (K,V+V1),
> where V1 is the update from the new batch. I'm using + since the
> example shows simple addition/counting, but your state update could really
> be any operation (e.g. append). The assignment of previousCount shows
> how you retrieve or initialize the state for a key.
>
> So I think what you seek is what happens "out of the box" (unless I'm
> misunderstanding the question)
>
> On Wed, Oct 1, 2014 at 4:13 AM, Chia-Chun Shih <chiachun.s...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Are there any code examples demonstrating Spark Streaming applications
>> which depend on state? That is, the *updateStateByKey* results of the
>> previous run are used as inputs.
>>
>> Thanks.
>
