Try as below:

    results.map(row => row(1)).collect()
Try

    val hobbies = results.flatMap(row => row(1).asInstanceOf[Seq[String]])

This flattens all the hobbies into one simple collection. (The cast is an assumption that column 1 holds the hobbies array as a Seq[String]; row(1) on its own is untyped, so flatMap cannot use it directly.) Now

    val hbmap = hobbies.map(hobby => (hobby, 1)).reduceByKey((hobcnt1, hobcnt2) => hobcnt1 + hobcnt2)

This aggregates the hobbies as below:

    (swimming,2), (hiking,1)

Now

    hbmap.map { case (hobby, count) => (count, hobby) }.sortByKey(ascending = false).collect()

will give you the hobbies sorted in descending order by their count.

This is pseudo code, but it should help you.

Regards
Pankaj

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Finding-most-occurrences-in-a-JSON-Nested-Array-tp20971p20975.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
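
For anyone who wants to try the steps above end to end, here is a minimal local-mode sketch. The object name HobbyCounts, the helper topHobbies, and the sample rows are made up for illustration; the real results in this thread come from a JSON file, so the (name, hobbies) tuple shape used here is an assumption and not the original schema.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on Spark releases before 1.3

object HobbyCounts {
  // Counts every hobby across all rows and returns (count, hobby) pairs,
  // most frequent first. The (name, hobbies) row shape is hypothetical.
  def topHobbies(sc: SparkContext, rows: Seq[(String, Seq[String])]): Array[(Int, String)] = {
    sc.parallelize(rows)
      .flatMap { case (_, hobbies) => hobbies }        // one element per hobby
      .map(hobby => (hobby, 1))                        // pair each hobby with a count of 1
      .reduceByKey(_ + _)                              // sum the counts per hobby
      .map { case (hobby, count) => (count, hobby) }   // swap so the count becomes the key
      .sortByKey(ascending = false)                    // highest count first
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HobbyCounts").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data standing in for the parsed JSON rows.
      val sample = Seq(
        ("alice", Seq("swimming", "hiking")),
        ("bob", Seq("swimming"))
      )
      topHobbies(sc, sample).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

With the sample rows above this prints (2,swimming) followed by (1,hiking). Swapping to (count, hobby) before sortByKey follows the suggestion in this message; on newer Spark versions sortBy(_._2, ascending = false) would avoid the swap.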