This is as much a Scala question as a Spark question.
I have an RDD:
val rdd1: RDD[(Long, Array[Long])]
This RDD has duplicate keys that I can collapse like so:

val rdd2: RDD[(Long, Array[Long])] = rdd1.reduceByKey((a, b) => a ++ b)
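For reference, the reduceByKey collapse above has the same per-key semantics as a plain-Scala groupBy followed by a reduce, which can be run locally without a Spark cluster. This is a minimal sketch with hypothetical sample data, not the actual RDD:

```scala
// Stand-in for rdd1's contents: duplicate keys, each with an Array[Long] value
val pairs: Seq[(Long, Array[Long])] = Seq(
  (1L, Array[Long](1, 2)),
  (1L, Array[Long](3)),
  (2L, Array[Long](4))
)

// Same per-key semantics as rdd1.reduceByKey((a, b) => a ++ b):
// concatenate all arrays that share a key
val collapsed: Map[Long, Array[Long]] =
  pairs.groupBy(_._1).map { case (k, vs) =>
    k -> vs.map(_._2).reduce((a, b) => a ++ b)
  }
```

Unlike reduceByKey, groupBy materializes each group before reducing, so this is only a model of the semantics, not of Spark's execution.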
If I start with an Array of primitive longs in rdd1, will rdd2 also contain arrays of primitive longs, or will the elements end up boxed?
It seems that ++ does the right thing on arrays of primitive longs and gives you back another one:
scala> val a = Array[Long](1,2,3)
a: Array[Long] = Array(1, 2, 3)

scala> val b = Array[Long](1,2,3)
b: Array[Long] = Array(1, 2, 3)

scala> a ++ b
res0: Array[Long] = Array(1, 2, 3, 1, 2, 3)
scala> res0.getClass
res1: Class[_ <: Array[Long]] = class [J
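The `[J` above is the JVM's name for a primitive long[] array. A more explicit way to confirm the array stays primitive is to compare its component type against java.lang.Long.TYPE (the Class object for the primitive long), which would differ from classOf[java.lang.Long] if the elements were boxed. A minimal check, plain Scala with no Spark needed:

```scala
object ArrayConcatCheck {
  def main(args: Array[String]): Unit = {
    val a = Array[Long](1, 2, 3)
    val b = Array[Long](1, 2, 3)

    // ++ on two Array[Long] values yields another Array[Long]
    val c = a ++ b

    // JVM name for a primitive long[] is "[J"
    println(c.getClass.getName) // [J

    // Component type is the primitive long, not boxed java.lang.Long
    println(c.getClass.getComponentType == java.lang.Long.TYPE) // true
    println(c.getClass.getComponentType == classOf[java.lang.Long]) // false
  }
}
```

This works because ++ on arrays has an implicit ClassTag in scope for Long, so the result is allocated as a real long[] rather than an Array[AnyRef] of boxed values.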