The simplest way is probably to use the sc.binaryFiles or sc.wholeTextFiles API to create an RDD containing the JSON files (you may need sc.wholeTextFiles(…).map(x => x._2) to drop the filename column), then call sqlContext.read.json(rddName).
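A minimal sketch of that approach (assuming an existing SparkContext `sc` and SQLContext `sqlContext`; the input path is hypothetical):

```scala
import org.apache.spark.rdd.RDD

// Read each file whole, so multi-line JSON records stay intact.
// wholeTextFiles returns (filename, content) pairs; keep only the content.
val jsonRdd: RDD[String] = sc
  .wholeTextFiles("hdfs:///path/to/json/dir") // hypothetical path
  .map(_._2)

// Let Spark SQL parse the JSON strings into a DataFrame.
val df = sqlContext.read.json(jsonRdd)
df.printSchema()
```

Note this assumes each file holds one JSON document (or one document per RDD element); if a single file contains many multi-line records you would need to split them first.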
That way, you don’t need to worry about combining lines.

Ewan

From: KhajaAsmath Mohammed [mailto:mdkhajaasm...@gmail.com]
Sent: 08 May 2016 23:20
To: user @spark <user@spark.apache.org>
Subject: Parse Json in Spark

Hi,

I am working on parsing JSON in Spark, but most of the information available online states that each JSON record must fit on a single line. In my case, the JSON file is delivered as a complex, multi-line structure. Does anyone know how to process this in Spark? I used the Jackson jar to process JSON and was able to do it when the record is on a single line. Any ideas?

Thanks,
Asmath