One thing you could do is create an RDD of [1, 2, 3] split into three
partitions, so that each value lands on its own node. Then .foreach() over the
RDD and call your function inside the closure; it will execute on whichever
worker holds each partition.
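
Something along these lines should work (a rough sketch, untested; the app
name and the println body are placeholders for your own function, and note
that Spark schedules tasks rather than nodes, so with three idle workers and
three single-element partitions the tasks will usually, but not strictly
guaranteed, spread across all three nodes):

    import org.apache.spark.{SparkConf, SparkContext}

    object RunOnEachWorker {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("run-on-each-worker"))

        // Three elements in three partitions: ideally one task per worker.
        sc.parallelize(Seq(1, 2, 3), numSlices = 3).foreach { _ =>
          // This closure runs in the executor JVM on a worker, not on the
          // driver. Replace the println with the call into your jar.
          println(s"setup ran on ${java.net.InetAddress.getLocalHost.getHostName}")
        }

        sc.stop()
      }
    }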

Why do you need to run the function on every node?  Is it some sort of
setup code that needs to be run before other RDD operations?


On Tue, Apr 8, 2014 at 3:05 AM, Adnan <nsyaq...@gmail.com> wrote:

> Hello,
>
> I am running a Cloudera 4-node cluster with 1 Master and 3 Slaves. I am
> connecting to the Spark Master from Scala using a SparkContext. I am trying
> to execute a simple Java function from the distributed jar on every Spark
> Worker, but I haven't found a way to communicate with each worker, or a
> Spark API function to do it.
>
> Can somebody help me with it or point me in the right direction?
>
> Regards,
> Adnan