Hello all,

I tried a couple of operations using ++ and union on RDDs and found that
the end results are the same: in both cases the shell reports a UnionRDD and
the counts match. Does anyone know of any difference between the two?

val odd_partA  = List(1,3,5,7,9,11,1,3,5,7,9,11,1,3,5,7,9,11)
odd_partA: List[Int] = List(1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11, 1, 3, 5,
7, 9, 11)

val odd_partB  = List(1,3,13,15,9)
odd_partB: List[Int] = List(1, 3, 13, 15, 9)

val odd_partC  = List(15,9,1,3,13)
odd_partC: List[Int] = List(15, 9, 1, 3, 13)

val odd_partA_RDD = sc.parallelize(odd_partA)
odd_partA_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[9] at
parallelize at <console>:17

val odd_partB_RDD = sc.parallelize(odd_partB)
odd_partB_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[10] at
parallelize at <console>:17

val odd_partC_RDD = sc.parallelize(odd_partC)
odd_partC_RDD: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[11] at
parallelize at <console>:17

val odd_PARTAB_pp = odd_partA_RDD ++ odd_partB_RDD
odd_PARTAB_pp: org.apache.spark.rdd.RDD[Int] = UnionRDD[12] at $plus$plus at
<console>:23

val odd_PARTAB_union = odd_partA_RDD.union(odd_partB_RDD)
odd_PARTAB_union: org.apache.spark.rdd.RDD[Int] = UnionRDD[13] at union at
<console>:23

odd_PARTAB_pp.count
res8: Long = 23

odd_PARTAB_union.count
res9: Long = 23

val odd_PARTABC_pp = odd_partA_RDD ++ odd_partB_RDD ++ odd_partC_RDD
odd_PARTABC_pp: org.apache.spark.rdd.RDD[Int] = UnionRDD[15] at $plus$plus
at <console>:27

val odd_PARTABC_union =
odd_partA_RDD.union(odd_partB_RDD).union(odd_partC_RDD)
odd_PARTABC_union: org.apache.spark.rdd.RDD[Int] = UnionRDD[17] at union at
<console>:27

odd_PARTABC_pp.count
res10: Long = 28

odd_PARTABC_union.count
res11: Long = 28
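
The counts only show that both results have the same size. To compare the
actual contents, something like the following could be run in the same
spark-shell session. This is only a sketch: it assumes sc and the RDDs
defined above are still in scope, and the names ppSorted, unionSorted,
sameContents and distinctCount are made up here for illustration.

```scala
// Assumes the spark-shell session above: sc, odd_PARTABC_pp and
// odd_PARTABC_union are already defined there.

// Collect both results to the driver and sort them, so that any
// difference in element order across partitions is ignored.
val ppSorted    = odd_PARTABC_pp.collect().sorted.toSeq
val unionSorted = odd_PARTABC_union.collect().sorted.toSeq

// True when both operations produced the same elements with the
// same multiplicities.
val sameContents = ppSorted == unionSorted

// Both results keep duplicates; distinct() drops them if needed.
val distinctCount = odd_PARTABC_union.distinct().count()
```

Collecting to the driver is fine for lists this small, but would not be a
good idea for large RDDs.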

Thanks
Gokul



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/difference-between-and-Union-of-a-RDD-tp25830.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
