Hello,
I was wondering what the Spark accumulator does under the covers. I’ve 
implemented my own associative addInPlace function for the accumulator; where 
is this function actually run? Say I call something like myRdd.map(x => 
sum += x): is “sum” accumulated locally in any way, per element, per 
partition, or per node? Is “sum” a broadcast variable, or does it exist only 
on the driver node? How does the driver node get access to “sum”?
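For concreteness, here is a minimal sketch of the pattern I mean, using the classic Spark 1.x accumulator API (sc.accumulator plus a custom AccumulatorParam); the MaxParam object and variable names are just illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext, AccumulatorParam}

object AccumulatorSketch {
  // Illustrative custom AccumulatorParam: merges values with max
  // instead of addition. addInPlace must be associative.
  object MaxParam extends AccumulatorParam[Int] {
    def zero(initial: Int): Int = Int.MinValue
    def addInPlace(a: Int, b: Int): Int = math.max(a, b)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("acc-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // "sum" is created on the driver; tasks on the workers only add to it
    val sum = sc.accumulator(0)
    val maxAcc = sc.accumulator(Int.MinValue)(MaxParam)

    val myRdd = sc.parallelize(1 to 100)
    myRdd.foreach { x =>
      sum += x       // where does this addition happen, and when is it merged?
      maxAcc += x
    }

    // only the driver may read .value; tasks cannot
    println(sum.value)
    println(maxAcc.value)

    sc.stop()
  }
}
```

(I use foreach rather than map here only so the action actually forces the side effect; the question is the same either way.)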
Thanks,
Areg
