I am exploring Spark SQL and DataFrames and am trying to aggregate by
column and generate a single JSON row per user that holds the aggregated
counts. Any input on the right approach would be helpful.

Here is my sample data:
user,sports,major,league,count

[test1,Ice Hockey,Switzerland,NLA,6]
[test1,Football,Australia,A-League,6]
[test1,Ice Hockey,Sweden,SHL,3]
[test1,Ice Hockey,Switzerland,NLB,2]
[test1,Football,Romania,Liga I,1]

I want to aggregate by user and create a single JSON row per user, with the
counts summed per sport, per major, and per league, plus an overall total:

{ "user": "test1",
  "sports": [ { "Ice Hockey": 11, "Football": 7 } ],
  "major": [ { "Switzerland": 8, "Australia": 6, "Sweden": 3, "Romania": 1 } ],
  "league": [ { "NLA": 6, "A-League": 6, "SHL": 3, "NLB": 2, "Liga I": 1 } ],
  "total": 18 }
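
To pin down the result I am after, here is a minimal sketch of the same
aggregation in plain Python. It is not Spark code, only an illustration of
the target shape. The function name aggregate_by_user is made up for this
example, and the first row's sport is assumed to be "Ice Hockey", since that
is what the expected totals imply.

```python
import json
from collections import Counter, defaultdict

# Sample rows as (user, sports, major, league, count).
# Assumption: the first row's sport is "Ice Hockey", as the expected totals imply.
rows = [
    ("test1", "Ice Hockey", "Switzerland", "NLA", 6),
    ("test1", "Football", "Australia", "A-League", 6),
    ("test1", "Ice Hockey", "Sweden", "SHL", 3),
    ("test1", "Ice Hockey", "Switzerland", "NLB", 2),
    ("test1", "Football", "Romania", "Liga I", 1),
]


def aggregate_by_user(rows):
    """Return one dict per user with counts summed per sport, major and league."""
    per_user = defaultdict(
        lambda: {"sports": Counter(), "major": Counter(), "league": Counter(), "total": 0}
    )
    for user, sport, major, league, count in rows:
        acc = per_user[user]
        acc["sports"][sport] += count
        acc["major"][major] += count
        acc["league"][league] += count
        acc["total"] += count

    result = []
    for user, acc in per_user.items():
        result.append({
            "user": user,
            # most_common() orders keys by descending count; each map is
            # wrapped in a one-element list to mirror the desired JSON.
            "sports": [dict(acc["sports"].most_common())],
            "major": [dict(acc["major"].most_common())],
            "league": [dict(acc["league"].most_common())],
            "total": acc["total"],
        })
    return result


# One JSON document per user, each on a single line.
for record in aggregate_by_user(rows):
    print(json.dumps(record))
```

What I am looking for is the Spark SQL / DataFrame way to get the same
per-user record, ideally without collecting all rows to the driver.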
