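Try as below:
results.map(row => row(1)).collect
Try:
// assumes column 1 of each row holds the array of hobby strings
val hobbies = results.flatMap(row => row(1).asInstanceOf[Seq[String]])
It will now put all the hobbies into one simple, flat collection.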
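import org.apache.spark.SparkContext._ // needed for reduceByKey on Spark versions before 1.3
val hbmap = hobbies.map(hobby => (hobby, 1)).reduceByKey((hobcnt1, hobcnt2) => hobcnt1 + hobcnt2)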
It will aggregate the hobbies into (hobby, count) pairs as below:
(swimming,2), (hiking,1)
Now
hbmap.map { case (hobby, count) => (count, hobby) }.sortByKey(ascending = false).collect
will give you the hobbies sorted in descending order of their count.
This is pseudocode, but it should help you.
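
If you want to check the logic without a cluster, the same flatten, count and sort steps can be sketched on plain Scala collections. This is only a rough stand-in, not the Spark code itself: the object HobbyCounts, the method ranked and the sample rows are made up for illustration, and groupBy plus sum takes the place of reduceByKey.

```scala
// Local stand-in for the RDD pipeline: plain Scala collections, no Spark needed.
// HobbyCounts, ranked and the sample rows are hypothetical names and data.
object HobbyCounts {

  // rows: (name, hobbies) pairs standing in for the query results
  def ranked(rows: Seq[(String, Seq[String])]): Seq[(Int, String)] = {
    // Flatten the nested hobby arrays into one collection (the flatMap step).
    val hobbies = rows.flatMap { case (_, hs) => hs }

    // Pair each hobby with 1 and add the ones up per hobby
    // (groupBy + sum plays the role of reduceByKey here).
    val counts = hobbies
      .map(hobby => (hobby, 1))
      .groupBy { case (hobby, _) => hobby }
      .map { case (hobby, pairs) => (hobby, pairs.map(_._2).sum) }

    // Swap to (count, hobby) and sort by count, highest first;
    // ties are broken alphabetically so the result is deterministic.
    counts.toSeq
      .map { case (hobby, count) => (count, hobby) }
      .sortBy { case (count, hobby) => (-count, hobby) }
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      ("alice", Seq("swimming", "hiking")),
      ("bob", Seq("swimming"))
    )
    ranked(rows).foreach(println) // prints (2,swimming) then (1,hiking)
  }
}
```

The first element of the result is the most frequent hobby, so ranked(rows).headOption gives you the top one directly.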
Regards
Pankaj