Hey Spark gurus! Sorry for the confusing title. I do not know the exact name for my problem; if you know it, please tell me so I can change the title :-)
Say I have two RDDs right now:

    val rdd1 = sc.parallelize(List((1,(3)), (2,(5)), (3,(6))))
    val rdd2 = sc.parallelize(List((2,(1)), (2,(3)), (3,(9))))

I want to combine rdd1 and rdd2 to get rdd3, which looks like:

    List((1,(3)), (2,(5,1)), (2,(5,3)), (3,(6,9)))

The order within _._2 does not matter, so you can treat it as a Set.

I tried to use zip, but since there is no guarantee that rdd1 and rdd2 have the same length, I do not know if that is doable. I also looked into pair RDDs: some people use the union operation on two RDDs and then apply a map function to the result. Since I want all combinations for each key (_._1), I do not know how to achieve that with union and map.

Thanks in advance!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/New-combination-like-RDD-based-on-two-RDDs-tp21508.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
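For reference, the rdd3 described above matches what a left outer join on the key produces, so one possible approach is leftOuterJoin followed by mapValues. The following is only a minimal sketch, not a confirmed answer from the thread: the object name CombineByKey, the helper name combine, and the local master setting are made up for illustration, and it assumes the values are plain Ints (in Scala, (3) is just 3). On Spark versions before 1.3 you would probably also need import org.apache.spark.SparkContext._ to get the pair RDD operations.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CombineByKey {
  // Hypothetical helper: keeps every key of `left` and emits one record
  // per matching value in `right` (or a single record if there is no match).
  def combine(left: RDD[(Int, Int)], right: RDD[(Int, Int)]): RDD[(Int, Set[Int])] =
    left.leftOuterJoin(right).mapValues {
      // w is None when the key exists only in `left`
      case (v, w) => Set(v) ++ w.toList
    }

  def main(args: Array[String]): Unit = {
    // Local context for illustration only; in spark-shell `sc` already exists.
    val conf = new SparkConf().setAppName("combine-by-key").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(List((1, 3), (2, 5), (3, 6)))
      val rdd2 = sc.parallelize(List((2, 1), (2, 3), (3, 9)))
      // Record order after collect() is not guaranteed.
      combine(rdd1, rdd2).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Because the result uses a Set, a pair whose two values are equal collapses to a single element; if that matters, keeping the (Int, Option[Int]) tuple from leftOuterJoin avoids it. Keys that appear only in rdd2 are dropped by this sketch; fullOuterJoin (available in newer Spark releases) would keep them.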