I'm not sure this will work, but it makes sense to me: you write the initialization in a static block in a class and broadcast that class, so the static block runs when the class is first loaded on each worker JVM. I'm not sure what your use case is, but I need to load a native library and want to avoid re-running the init inside mapPartitions when it isn't necessary (it only needs to happen once per JVM).
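For what it's worth, here is a rough sketch of the idea in Scala. An object's body plays the role of a Java static block, so it runs at most once per JVM; the library name "mynative" and the ensureLoaded helper are just placeholders I made up for illustration:

    import org.apache.spark.{SparkConf, SparkContext}

    // Singleton whose body acts like a Java static block: it runs at most
    // once per JVM, the first time the object is referenced on a worker.
    object NativeInit {
      System.loadLibrary("mynative")  // placeholder library name
      def ensureLoaded(): Unit = ()   // no-op; calling it forces the init
    }

    val sc = new SparkContext(new SparkConf().setAppName("native-init"))
    sc.parallelize(1 to 100, 4).mapPartitions { iter =>
      NativeInit.ensureLoaded()  // cheap after the first call in a JVM
      iter.map(_ * 2)
    }.count()

The nice part is that you can still reference the init from every partition; the JVM's class-loading guarantees make every call after the first a no-op.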
Hello,
I am running a 4-node Cloudera cluster with 1 Master and 3 Slaves. I am connecting to the Spark Master from Scala using SparkContext. I am trying to execute a simple Java function from the distributed jar on every Spark Worker, but I haven't found a way to communicate with each worker or with a Spark Worker directly.
One thing you could do is create an RDD of [1, 2, 3] with a partitioner (or simply three partitions) so that each of the three values lands on its own node. Then .foreach() over the RDD and call your function, which will then run once on each node.
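Something like this, assuming three workers. runOnNode is a placeholder for whatever function you need to call, and note that without a custom partitioner Spark only tends to spread the three tasks across the nodes rather than strictly guaranteeing one per node:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("per-node-setup"))

    def runOnNode(): Unit = {
      // placeholder: call your Java function from the distributed jar here
    }

    // Three elements in three partitions, so the scheduler hands one task
    // to each of the three workers (likely, though not strictly guaranteed).
    sc.parallelize(Seq(1, 2, 3), numSlices = 3).foreach { _ =>
      runOnNode()
    }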
Why do you need to run the function on every node? Is it some sort of setup code that needs to run once on each worker before the actual job starts?