An RDD union will result in 1 2 3 4 5 6 7 8 9 10 11 12
What you are trying to do is a join. A join needs some logic or key to match records on. I think in your case you want the order (index) of the elements to be the joining key. An RDD is a distributed data structure and is not well suited to that. If the amount of data is small, you can call rdd.collect on both RDDs, iterate over the two resulting lists, and produce the desired result.
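The collect-and-iterate idea could look roughly like the sketch below. It is plain Python with no Spark dependency: the two lists stand in for what rdd.collect would return on the driver, their values are made-up sample data, and the helper name pair_by_index is hypothetical. It also assumes the desired result is simply the i-th element of one list paired with the i-th element of the other, which may not be exactly what you need.

```python
# Stand-ins for the results of calling collect() on each RDD.
# The values are hypothetical sample data.
first = [1, 2, 3, 4, 5, 6]
second = [7, 8, 9, 10, 11, 12]


def pair_by_index(left, right):
    """Pair elements that share the same position (the implicit join key)."""
    if len(left) != len(right):
        raise ValueError("both lists must have the same length")
    # zip walks both lists in lockstep, so position acts as the key.
    return [(a, b) for a, b in zip(left, right)]


pairs = pair_by_index(first, second)
print(pairs)
```

This only makes sense when both collected lists fit comfortably in driver memory. For larger data, one option is to give each element an explicit index (for example with zipWithIndex) and join on that index instead.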