On Sun, Jul 13, 2014 at 4:22 PM, Ted Dunning <[email protected]> wrote:

> I have a program that I am trying to build that has this pattern:
>
>     broadcast state to all blocks
>     block map to do a bit of computation, create local state
>     merge all of the local states back to the global state
>     repeat
>
> What is the suggestion for merging the local state back to the global
> state?
>

I don't think there is an operator designed to do this. One way you could
do this, is it run a "thin" (ncol=1) mapBlock() output followed by a
collect(), hoping the result fits in memory, and do the reduction in-core.
I think some kind of a reduce operator needs to be introduced for doing
even simple things like scalable kmeans. Haven't thought of how it would
look yet.

Reply via email to