* unionAll preserves duplicates vs. union, which does not
This is true. If you want to eliminate duplicate items, you should follow
the union with a distinct().
* SQL union and unionAll result in the same output format (i.e. another SQL
result) vs. different RDD types here.
* Understand the existing union
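
As a minimal sketch of the "union followed by distinct()" suggestion above, the snippet below uses plain RDDs in local mode. The object name UnionDistinctSketch, the helper counts, and the sample values are made up for illustration and are not from the original thread. It assumes Spark is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical illustration: names and data are assumptions, not from the thread.
object UnionDistinctSketch {
  // Returns (row count after union, row count after union + distinct).
  def counts(sc: SparkContext): (Long, Long) = {
    val a = sc.parallelize(Seq(1, 2, 3))
    val b = sc.parallelize(Seq(3, 4))
    val withDuplicates = a.union(b)              // keeps both copies of 3
    val deduplicated = withDuplicates.distinct() // drops the repeated 3
    (withDuplicates.count(), deduplicated.count())
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("union-distinct-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (unionCount, distinctCount) = counts(sc)
      println(s"union: $unionCount, union + distinct: $distinctCount") // union: 5, union + distinct: 4
    } finally {
      sc.stop()
    }
  }
}
```

Note that distinct() triggers a shuffle, so it is worth applying only when duplicates actually need to be removed.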
Hi,
I am trying Spark SQL based on the example in the docs ...
val people = sc.textFile("/data/spark/examples/src/main/resources/people.txt")
  .map(_.split(","))
  .map(p => Person(p(0), p(1).trim.toInt))
val olderThanTeans = people.where('age > 19)
val youngerThanTeans = people.where('age < 13)
val