Re: [Spark 1.6][Streaming] About the behavior of mapWithState

2016-01-17 Thread Terry Hoo
Hi Ryan, Thanks for your comments! Using reduceByKey() before the mapWithState can get the expected result. Do we ever consider that mapWithState only outputs the changed key one time in every batch interval, just like the updateStateByKey. For some cases, user may only care about the final

Re: [Spark 1.6][Streaming] About the behavior of mapWithState

2016-01-15 Thread Shixiong(Ryan) Zhu
Hey Terry, That's expected. If you want to only output (1, 3), you can use "reduceByKey" before "mapWithState" like this: dstream.reduceByKey(_ + _).mapWithState(spec) On Fri, Jan 15, 2016 at 1:21 AM, Terry Hoo wrote: > Hi, > I am doing a simple test with mapWithState,

[Spark 1.6][Streaming] About the behavior of mapWithState

2016-01-15 Thread Terry Hoo
Hi, I am doing a simple test with mapWithState, and get some events unexpected, is this correct? The test is very simple: sum the value of each key val mappingFunction = (key: Int, value: Option[Int], state: State[Int]) => { state.update(state.getOption().getOrElse(0) + value.getOrElse(0))