How many elements do you have in total? If there are fairly few (say, fewer than a
few thousand), call collect() to bring them to the driver, then call
sc.parallelize(elements, numElements) to get an RDD with exactly one element
per partition.
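
For example, here is a minimal sketch in Scala; `args` (your RDD of
command-line arguments) and the binary path are hypothetical names:

    // Assumes `sc` is your SparkContext and `args` is an RDD[String]
    // of command-line arguments (placeholder names).
    val elements = args.collect()                        // bring all elements to the driver
    val onePerPartition = sc.parallelize(elements, elements.length)
    // With as many partitions as elements, each partition holds exactly
    // one element, so pipe() runs the external program once per element,
    // feeding that element to the program's stdin.
    val results = onePerPartition.pipe("/path/to/your/binary").collect()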

Matei

On May 12, 2014, at 10:29 AM, NevinLi158 <nevinli...@gmail.com> wrote:

> Hi all,
> 
> I'm currently trying to use pipe to run C++ code on each worker node, and I
> have an RDD of essentially command-line arguments that I'm passing to each
> node. I want to send exactly one element to each node, but when I run my
> code, Spark ends up sending multiple elements to a node. Is there any way to
> force Spark to send only one? I've tried coalescing and repartitioning the
> RDD so that its partition count equals the number of elements in it, but
> that hasn't worked.
> 
> Thanks!
> 
