Re: Spark Streaming: Does mapWithState implicitly partition the DStream?

2016-01-18 Thread Shixiong(Ryan) Zhu
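Yes, mapWithState partitions the stream by key. It uses HashPartitioner by
default, so all records for a given key, and that key's state, are handled
in the same partition. You can use "StateSpec.partitioner" to set a custom
partitioner.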

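To make the execution model concrete, here is a minimal, standalone sketch of the key-to-partition rule that a hash partitioner applies. It does not use Spark at all: the class name HashPartitionSketch and its methods are hypothetical and exist only for illustration. The rule is meant to mirror Spark's HashPartitioner (null keys go to partition 0, otherwise the key's hashCode modulo the partition count, kept non-negative). Treat it as an approximation of the idea, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Spark.
public class HashPartitionSketch {

    // Assumed to mirror HashPartitioner: null keys map to partition 0,
    // any other key maps to hashCode modulo numPartitions, never negative.
    public static int partitionFor(Object key, int numPartitions) {
        if (key == null) {
            return 0;
        }
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Groups keys by the partition they hash to. Repeated keys always land
    // in the same partition, which is why per-key state needs no cross-task
    // coordination.
    public static Map<Integer, List<String>> assign(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String key : keys) {
            int partition = partitionFor(key, numPartitions);
            byPartition.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user-1", "user-2", "user-1", "user-3", "user-2");
        // Prints each partition index with the keys routed to it.
        System.out.println(assign(keys, 4));
    }
}
```

Because the partition depends only on the key and the partition count, every occurrence of the same key is routed to the same partition in every batch. A custom partitioner passed through StateSpec.partitioner changes the rule but keeps that property, as long as it is deterministic per key.
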
On Sun, Jan 17, 2016 at 11:00 AM, Lin Zhao wrote:

> When the state is passed to the task that handles a mapWithState for a
> particular key, if the key is distributed, it seems extremely difficult to
> coordinate and synchronise the state. Is there a partition by key before a
> mapWithState? If not, what exactly is the execution model?
>
> Thanks,
>
> Lin


Spark Streaming: Does mapWithState implicitly partition the DStream?

2016-01-17 Thread Lin Zhao
When the state is passed to the task that handles a mapWithState for a
particular key, if the key is distributed, it seems extremely difficult to
coordinate and synchronise the state. Is there a partition by key before a
mapWithState? If not, what exactly is the execution model?

Thanks,

Lin