Re: Spark Streaming: Does mapWithState implicitly partition the DStream?

2016-01-18 Thread Shixiong(Ryan) Zhu
mapWithState uses HashPartitioner by default. You can use "StateSpec.partitioner" to set your custom partitioner.

On Sun, Jan 17, 2016 at 11:00 AM, Lin Zhao wrote:
> When the state is passed to the task that handles a mapWithState for a
> particular key, if the key is
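To illustrate the reply above, here is a minimal sketch in plain Python (not Spark code) of why hash partitioning by key removes the need to coordinate state across tasks. Every record for a given key is routed to the same partition, so that partition alone owns the key's state. The names `partition_for` and `run_batch` are hypothetical, and `zlib.crc32` is only a stand-in for the key's hash code; Spark's actual implementation may differ in its details.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Hypothetical stand-in for a hash partitioner:
    # a stable hash of the key, modulo the number of partitions.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def run_batch(records, states, num_partitions):
    # "Shuffle" step: route every (key, value) record to its partition.
    by_partition = defaultdict(list)
    for key, value in records:
        by_partition[partition_for(key, num_partitions)].append((key, value))

    # Each partition updates only its own local state map (a running sum here),
    # so no two partitions ever touch the state of the same key.
    for pid, recs in by_partition.items():
        local_state = states[pid]
        for key, value in recs:
            local_state[key] = local_state.get(key, 0) + value
    return states


NUM_PARTITIONS = 4
states = [dict() for _ in range(NUM_PARTITIONS)]

# Two consecutive micro-batches; key "a" appears in both.
run_batch([("a", 1), ("b", 2), ("a", 3)], states, NUM_PARTITIONS)
run_batch([("a", 10), ("c", 5)], states, NUM_PARTITIONS)

# Which partition owns each key, and the accumulated state per key.
owners = {key: pid for pid, s in enumerate(states) for key in s}
totals = {key: value for s in states for key, value in s.items()}
```

Because the partition is a pure function of the key, the state for "a" accumulates across both batches in a single place. A custom partitioner changes which partition owns a key, but not the property that each key has exactly one owner.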

Spark Streaming: Does mapWithState implicitly partition the DStream?

2016-01-17 Thread Lin Zhao
When the state is passed to the task that handles a mapWithState for a particular key, if the key is distributed, it seems extremely difficult to coordinate and synchronise the state. Is there a partition by key before a mapWithState? If not, what exactly is the execution model?

Thanks,
Lin