No, I meant that each JSON document should be on a single line, but an array is also supported as the root wrapper of JSON objects.
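To illustrate the one-document-per-line layout, here is a minimal sketch in plain Python (standard library only, not Spark code) that rewrites a multi-line JSON document so that each object ends up on its own line. It assumes the whole document fits in memory. The names `to_json_lines` and `wrap_key` are made up for this example, and wrapping a bare array under a key is just one possible workaround, not something Spark does for you.

```python
import json


def to_json_lines(text, wrap_key="value"):
    """Return `text` re-serialized with one JSON object per line.

    `to_json_lines` and `wrap_key` are hypothetical names for this sketch.
    """
    doc = json.loads(text)
    if isinstance(doc, dict):
        # A single (possibly multi-line) object becomes one line.
        records = [doc]
    elif isinstance(doc, list) and all(isinstance(item, dict) for item in doc):
        # A root array of objects becomes one line per object.
        records = doc
    else:
        # Anything else (e.g. a bare array of numbers) is wrapped in an
        # object under `wrap_key`; this is an assumption of the sketch.
        records = [{wrap_key: doc}]
    # json.dumps without `indent` never emits raw newlines.
    return "\n".join(json.dumps(record) for record in records)


if __name__ == "__main__":
    multi_line = """
    {
      "vals": [100, 500, 600]
    }
    """
    print(to_json_lines(multi_line))
    print(to_json_lines('[{"a": 1},\n {"a": 2}]'))
    print(to_json_lines("[100,500,600]"))
```

The resulting text can be written to a file and then loaded with Spark's JSON reader, which expects one self-contained JSON document per line.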
If you need to parse multi-line JSON, here is a reference:
http://searchdatascience.com/spark-adventures-1-processing-multi-line-json-files/

2016-10-12 15:04 GMT+09:00 Kappaganthu, Sivaram (ES) <sivaram.kappagan...@adp.com>:

> Hi,
>
> Does this mean that Spark is not a good fit for handling JSON with a
> schema like the one below? I have a requirement to parse the JSON below,
> which spans multiple lines. What is the best way to parse JSON of this
> kind? Please suggest.
>
> root
>  |-- maindate: struct (nullable = true)
>  |    |-- mainidnId: string (nullable = true)
>  |-- Entity: array (nullable = true)
>  |    |-- element: struct (containsNull = true)
>  |    |    |-- Profile: struct (nullable = true)
>  |    |    |    |-- Kind: string (nullable = true)
>  |    |    |-- Identifier: string (nullable = true)
>  |    |    |-- Group: array (nullable = true)
>  |    |    |    |-- element: struct (containsNull = true)
>  |    |    |    |    |-- Period: struct (nullable = true)
>  |    |    |    |    |    |-- pid: string (nullable = true)
>  |    |    |    |    |    |-- pDate: string (nullable = true)
>  |    |    |    |    |    |-- quarter: long (nullable = true)
>  |    |    |    |    |    |-- labour: array (nullable = true)
>  |    |    |    |    |    |    |-- element: struct (containsNull = true)
>  |    |    |    |    |    |    |    |-- category: string (nullable = true)
>  |    |    |    |    |    |    |    |-- id: string (nullable = true)
>  |    |    |    |    |    |    |    |-- person: struct (nullable = true)
>  |    |    |    |    |    |    |    |    |-- address: array (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- element: struct (containsNull = true)
>  |    |    |    |    |    |    |    |    |    |    |-- city: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |    |-- line1: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |    |-- line2: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |    |-- postalCode: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |    |-- state: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |    |-- type: string (nullable = true)
>  |    |    |    |    |    |    |    |    |-- familyName: string (nullable = true)
>  |    |    |    |    |    |    |    |-- tax: array (nullable = true)
>  |    |    |    |    |    |    |    |    |-- element: struct (containsNull = true)
>  |    |    |    |    |    |    |    |    |    |-- code: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- qwage: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- qvalue: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- qSubjectvalue: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- qfinalvalue: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- ywage: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- yalue: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- ySubjectvalue: double (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- yfinalvalue: double (nullable = true)
>  |    |    |    |    |    |    |    |-- tProfile: array (nullable = true)
>  |    |    |    |    |    |    |    |    |-- element: struct (containsNull = true)
>  |    |    |    |    |    |    |    |    |    |-- isExempt: boolean (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- jurisdiction: struct (nullable = true)
>  |    |    |    |    |    |    |    |    |    |    |-- code: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- maritalStatus: string (nullable = true)
>  |    |    |    |    |    |    |    |    |    |-- numberOfDeductions: long (nullable = true)
>  |    |    |    |    |    |    |    |-- wDate: struct (nullable = true)
>  |    |    |    |    |    |    |    |    |-- originalHireDate: string (nullable = true)
>  |    |    |    |    |    |-- year: long (nullable = true)
>
> *From:* Luciano Resende [mailto:luckbr1...@gmail.com]
> *Sent:* Monday, October 10, 2016 11:39 PM
> *To:* Jean Georges Perrin
> *Cc:* user @spark
> *Subject:* Re: JSON Arrays and Spark
>
> Please take a look at
> http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
>
> Particularly the note on the required format:
>
> Note that the file that is offered as *a json file* is not a typical JSON
> file. Each line must contain a separate, self-contained valid JSON object.
> As a consequence, a regular multi-line JSON file will most often fail.
>
> On Mon, Oct 10, 2016 at 9:57 AM, Jean Georges Perrin <j...@jgp.net> wrote:
>
> Hi folks,
>
> I am trying to parse JSON arrays and it’s getting a little crazy (for me
> at least)…
>
> 1)
> If my JSON is:
> {"vals":[100,500,600,700,800,200,900,300]}
>
> I get:
> +--------------------+
> |                vals|
> +--------------------+
> |[100, 500, 600, 7...|
> +--------------------+
>
> root
>  |-- vals: array (nullable = true)
>  |    |-- element: long (containsNull = true)
>
> and I am :)
>
> 2)
> If my JSON is:
> [100,500,600,700,800,200,900,300]
>
> I get:
> +--------------------+
> |     _corrupt_record|
> +--------------------+
> |[100,500,600,700,...|
> +--------------------+
>
> root
>  |-- _corrupt_record: string (nullable = true)
>
> Both are legit JSON structures… Do you think that #2 is a bug?
>
> jg
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/