I have two Parquet files with the same format, e.g. name, age, town.
I read them and then join them to get all the names that appear in both
towns.
The resulting dataset is:

res4: Array[org.apache.spark.sql.Row] = Array([name1, age1,
town1,name2,age2,town2]....)

name1 and name2 are the same, since I am joining on the name.
Now I want to reduce this to the format (name, age1, age2).

But I can't seem to manipulate the org.apache.spark.sql.Row objects.

Trying something like map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
does not work.

Can you suggest a way?

Thanks
-R
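
A note on the attempt above: a Row is not a String, so split is not available on it; its fields are read by position instead. The following is only a minimal sketch of one possible approach. It assumes the columns come back in the order (name1, age1, town1, name2, age2, town2) and that age is stored as an integer; the object name RowProjection, the helper toNameAges, and the sample rows are made up for illustration. If age were stored as a string, row.getString(1).trim.toInt would be the equivalent read.

```scala
import org.apache.spark.sql.Row

object RowProjection {
  // Hypothetical helper. Assumes the column order
  // (name1, age1, town1, name2, age2, town2) and Int-typed ages.
  def toNameAges(row: Row): (String, Int, Int) =
    (row.getString(0), row.getInt(1), row.getInt(4))

  def main(args: Array[String]): Unit = {
    // Made-up sample rows standing in for the collected join result.
    val joined: Array[Row] = Array(
      Row("alice", 30, "town1", "alice", 41, "town2"),
      Row("bob", 25, "town1", "bob", 52, "town2")
    )

    // Project each Row down to (name, age1, age2) and print the tuples.
    joined.map(toNameAges).foreach(println)
  }
}
```

The same toNameAges function could be passed to map on the joined result before it is collected, so that only the three needed fields are brought back to the driver.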
