Ah somehow after all this time I've never seen that!
On Wed, Jan 22, 2014 at 4:45 PM, Aureliano Buendia wrote:
>
> On Thu, Jan 23, 2014 at 12:37 AM, Patrick Wendell
> wrote:
>>
>> What is the ++ operator here? Is this something you defined?
>
>
> No, it's an alias for union defined in RDD.sc
On Thu, Jan 23, 2014 at 12:37 AM, Patrick Wendell wrote:
> What is the ++ operator here? Is this something you defined?
>
No, it's an alias for union defined in RDD.scala:
def ++(other: RDD[T]): RDD[T] = this.union(other)
>
> Another issue is that RDDs are not ordered, so when you union two
>
What is the ++ operator here? Is this something you defined?
Another issue is that RDDs are not ordered, so when you union two
together the result doesn't have a well-defined ordering.
If you do want to do this you could coalesce into one partition, then
call mapPartitions and return an iterator that fi
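
[The message above is cut off. One possible reading of the suggestion, assuming the returned iterator is meant to emit the header line before the partition's rows, is sketched below in plain Scala. The object name CsvHeader and the function withHeader are hypothetical helpers for illustration, not part of Spark, and the sample rows are made up.]

```scala
// Hypothetical helper (not part of Spark): emit the header line first,
// then every row of the partition, without materializing the rows.
object CsvHeader {
  def withHeader(header: String, rows: Iterator[String]): Iterator[String] =
    Iterator.single(header) ++ rows

  def main(args: Array[String]): Unit = {
    // Stand-in for the contents of the single partition left after coalesce(1).
    val rows = Iterator(Seq("a", "b", "c"), Seq("d", "e", "f")).map(_.mkString(","))
    withHeader("col1,col2,col3", rows).foreach(println)
  }
}
```

[With Spark, the same function could presumably be passed to mapPartitions, along the lines of myRdd.coalesce(1).map(_.mkString(",")).mapPartitions(rows => CsvHeader.withHeader("col1,col2,col3", rows)).saveAsTextFile("out.csv"), where myRdd is the RDD from the original question. Because everything sits in one partition, the header is written once, ahead of the rows. Note that saveAsTextFile creates a directory named out.csv containing part files rather than a single file.]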
Hi,
I'm trying to find a way to create a csv header when using saveAsTextFile,
and I came up with this:
(sc.makeRDD(Array("col1,col2,col3"), 1) ++
myRdd.coalesce(1).map(_.mkString(",")))
.saveAsTextFile("out.csv")
But it only saves the header part. Why is it that the union method does not
ret