RE: How to Parse & flatten JSON object in a text file using Spark&Scala into Dataframe

Bharathi Raja Thu, 24 Dec 2015 04:08:06 -0800

Thanks Eran, I'll check the solution.

Regards,
Raja

-----Original Message-----
From: "Eran Witkon" <eranwit...@gmail.com>
Sent: 12/24/2015 4:07 PM
To: "Bharathi Raja" <raja...@yahoo.com>; "Gokula Krishnan D" 
<email2...@gmail.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: How to Parse & flatten JSON object in a text file using 
Spark&Scala into Dataframe

Raja, I found the answer to your question!
Look at
http://stackoverflow.com/questions/34069282/how-to-query-json-data-column-using-spark-dataframes
This is what you (and I) were looking for.
The general idea: read the file as text, treating ProjectDetails as just a
string field, then build a JSON string representation of the whole line. That
gives you a nested JSON schema which Spark SQL can read.
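
A minimal sketch of the "build the JSON string of the whole line" step, in
plain Scala with no Spark dependency. It assumes each record sits on one
logical line in the form (id, name, {json}), that the name contains no commas
or quotes, and that the embedded JSON is valid (straight quotes, commas between
array elements). The object name RecordToJson and the helper toJson are
hypothetical, made up for this illustration.

```scala
object RecordToJson {
  // Matches "(<id>, <name>, <json object>" with an optional closing parenthesis.
  // This layout is an assumption based on the sample record in this thread.
  private val Record =
    """(?s)\(\s*(\d+)\s*,\s*([^,]+?)\s*,\s*(\{.*\})\s*\)?\s*""".r

  /** Wraps one raw record in a single JSON object; None if the line does not match. */
  def toJson(record: String): Option[String] = record match {
    case Record(id, name, details) =>
      // Drop the outer braces of the embedded object so its fields
      // (here: ProjectDetails) become fields of the new top-level object.
      val inner = details.trim.stripPrefix("{").stripSuffix("}")
      Some(s"""{"employeeID":$id,"Name":"$name",$inner}""")
    case _ => None
  }
}
```

The resulting strings are ordinary JSON documents, so they could then be handed
to Spark's JSON reader (for example spark.read.json on a Dataset[String] in
recent Spark versions) to get a DataFrame with a nested ProjectDetails column.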

Eran

On Thu, Dec 24, 2015 at 10:26 AM Eran Witkon <eranwit...@gmail.com> wrote:

I don't have the exact answer for you, but I would look for something using
the explode method on DataFrame.
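
A rough sketch of what the explode approach could look like, assuming Spark 2.2
or later (SparkSession and spark.read.json on a Dataset[String]) and assuming
each record has already been turned into one JSON object with employeeID, Name
and a ProjectDetails array. The object name FlattenProjects and the sample
record are illustrative only, and the sample has commas between array elements,
which the record in the original question does not.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object FlattenProjects {
  /** Produces one output row per element of the ProjectDetails array. */
  def flatten(employees: DataFrame): DataFrame =
    employees
      // explode turns each array element into its own row, in a struct column "p".
      .select(col("employeeID"), col("Name"), explode(col("ProjectDetails")).as("p"))
      // Pull the struct fields up to top-level columns; absent fields come out as null.
      .select(
        col("employeeID"), col("Name"),
        col("p.ProjectName"), col("p.Description"), col("p.Duration"), col("p.Role"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-json")
      .getOrCreate()
    import spark.implicits._

    // One sample record in single-object form (an assumption, see the note above).
    val record =
      """{"employeeID":123456,"Name":"Employee1","ProjectDetails":[""" +
      """{"ProjectName":"Web Develoement","Description":"Online Sales website",""" +
      """"Duration":"6 Months","Role":"Developer"},""" +
      """{"ProjectName":"Scala Training","Description":"Training","Duration":"1 Month"}]}"""

    // Schema is inferred from the JSON; Role is missing in the second project.
    flatten(spark.read.json(Seq(record).toDS())).show(false)
    spark.stop()
  }
}
```

Because the schema is inferred, a field such as Role only exists as a column if
at least one record in the input carries it; projects without it get null.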

On Thu, Dec 24, 2015 at 7:34 AM Bharathi Raja <raja...@yahoo.com> wrote:

Thanks Gokul, but the file I have is in the format I described.
The first two columns are not in JSON format.

Thanks,
Raja

From: Gokula Krishnan D
Sent: 12/24/2015 2:44 AM
To: Eran Witkon
Cc: raja kbv; user@spark.apache.org

Subject: Re: How to Parse & flatten JSON object in a text file using Spark 
&Scala into Dataframe

You can try this. Note that I slightly modified the input structure, since the
first two columns were not in JSON format.

Thanks & Regards, 
Gokula Krishnan (Gokul)

On Wed, Dec 23, 2015 at 9:46 AM, Eran Witkon <eranwit...@gmail.com> wrote:

Did you get a solution for this?

On Tue, 22 Dec 2015 at 20:24 raja kbv <raja...@yahoo.com.invalid> wrote:

Hi,

I am new to Spark.

I have a text file with the structure below.

(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, 
Description, Duration, Role}]})
E.g.:
(123456, Employee1, {"ProjectDetails":[
    { "ProjectName": "Web Develoement", "Description": "Online Sales website",
      "Duration": "6 Months", "Role": "Developer"}
    { "ProjectName": "Spark Develoement", "Description": "Online Sales Analysis",
      "Duration": "6 Months", "Role": "Data Engineer"}
    { "ProjectName": "Scala Training", "Description": "Training",
      "Duration": "1 Month" }
  ]
}

Could someone help me parse and flatten the record into the DataFrame below
using Scala?

employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Develoement, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Develoement, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null

Thank you in advance.

Regards,
Raja
