In my application, the data parts within an RDD partition are related, so I need to perform operations between them.
For example, RDD T1 has several partitions, and each partition has three parts: A, B, and C. I then transform T1 into T2. After the transform, T2 should also have three parts per partition, D, E, and F, where D = A+B, E = A+C, and F = B+C. As far as I know, Spark only supports operations that traverse the RDD and call a function on each element. How can I do such a transform?

In Hadoop, I copy the data in each partition into a user-defined buffer, perform whatever operations I like in the buffer, and finally call output.collect() to emit the data. But how can I construct a new RDD with distributed partitions in Spark? makeRDD only distributes a local Scala collection to form an RDD.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Only-TraversableOnce-tp3873.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
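A minimal sketch of the kind of thing being asked about, using Spark's `mapPartitions`, which hands each partition's elements to your function as an `Iterator` so you can buffer them and emit derived records (much like the Hadoop buffer pattern described above). The part type (`Int`), the example data, and the assumption that every partition holds exactly three parts A, B, C are hypothetical stand-ins for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PartitionCombine {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("partition-combine").setMaster("local[2]"))

    // Two partitions, each assumed to contain its three parts A, B, C in order.
    val t1 = sc.parallelize(Seq(1, 2, 3, 10, 20, 30), numSlices = 2)

    // Buffer the whole partition, then emit the derived parts:
    // D = A+B, E = A+C, F = B+C. The result stays a distributed RDD
    // with the same partitioning; no data is pulled to the driver.
    val t2 = t1.mapPartitions { iter =>
      val Seq(a, b, c) = iter.toSeq
      Iterator(a + b, a + c, b + c)
    }

    println(t2.collect().mkString(","))
    sc.stop()
  }
}
```

Because `mapPartitions` returns an `Iterator`, the number and shape of output elements per partition is entirely up to the function, which is what makes this transform expressible without `makeRDD` or a driver-side collection.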