The simplest way is probably to use the sc.binaryFiles or sc.wholeTextFiles API 
to create an RDD containing the JSON files (you may need 
sc.wholeTextFiles(…).map(x => x._2) to drop the filename from each pair), then do a 
sqlContext.read.json(rddName)

That way, you don’t need to worry about combining lines.
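A minimal sketch of this approach, assuming Spark 1.x-style APIs (SparkContext / SQLContext, as in this thread) and a hypothetical input path; each file is read whole, so multi-line JSON documents stay intact:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object MultiLineJsonSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MultiLineJson"))
    val sqlContext = new SQLContext(sc)

    // wholeTextFiles returns (filename, fileContent) pairs;
    // map(_._2) keeps only the file content (one whole JSON document per element)
    val jsonRdd = sc.wholeTextFiles("hdfs:///data/json/*.json").map(_._2)

    // read.json accepts an RDD[String] where each element is a complete JSON document,
    // so pretty-printed / multi-line JSON works without any line combining
    val df = sqlContext.read.json(jsonRdd)
    df.printSchema()
    sc.stop()
  }
}
```

Note that wholeTextFiles loads each file entirely into memory on a single executor, so this suits many small-to-medium files rather than a few very large ones.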

Ewan

From: KhajaAsmath Mohammed [mailto:mdkhajaasm...@gmail.com]
Sent: 08 May 2016 23:20
To: user @spark <user@spark.apache.org>
Subject: Parse Json in Spark

Hi,

I am working on parsing JSON in Spark, but most of the information available 
online states that I need to have the entire JSON record on a single line.

In my case, the JSON file is delivered in a complex, multi-line structure rather 
than on a single line. Does anyone know how to process this in Spark?

I used the Jackson jar to process JSON and was able to do it when the record is on a 
single line. Any ideas?

Thanks,
Asmath
