Please take a look at http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
Particularly the note about the required format: "Note that the file that is offered as *a json file* is not a typical JSON file. Each line must contain a separate, self-contained valid JSON object. As a consequence, a regular multi-line JSON file will most often fail."

On Mon, Oct 10, 2016 at 9:57 AM, Jean Georges Perrin <j...@jgp.net> wrote:
> Hi folks,
>
> I am trying to parse JSON arrays and it’s getting a little crazy (for me
> at least)…
>
> 1)
> If my JSON is:
> {"vals":[100,500,600,700,800,200,900,300]}
>
> I get:
> +--------------------+
> |                vals|
> +--------------------+
> |[100, 500, 600, 7...|
> +--------------------+
>
> root
>  |-- vals: array (nullable = true)
>  |    |-- element: long (containsNull = true)
>
> and I am :)
>
> 2)
> If my JSON is:
> [100,500,600,700,800,200,900,300]
>
> I get:
> +--------------------+
> |     _corrupt_record|
> +--------------------+
> |[100,500,600,700,...|
> +--------------------+
>
> root
>  |-- _corrupt_record: string (nullable = true)
>
> Both are legit JSON structures… Do you think that #2 is a bug?
>
> jg

--
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/
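To make the difference concrete, here is a minimal sketch in plain Python (not Spark itself) of why case #2 fails: a top-level JSON array has no field name, so Spark has nothing to map it to as a column and files the raw text under _corrupt_record. Wrapping the array in an object with a key (here "vals", matching case #1 — the key name is just an illustrative choice) turns each line into the kind of self-contained record Spark's JSON reader expects:

```python
import json

# Case #2: a bare top-level array -- valid JSON, but there is no field
# name for Spark to turn into a column.
bare_array = "[100,500,600,700,800,200,900,300]"

# Workaround: wrap the array in an object so each line of the file is a
# self-contained JSON object with a named field, like case #1.
wrapped_line = json.dumps({"vals": json.loads(bare_array)})

# Each such line now parses to a record with a "vals" field, which is
# the per-line shape the Spark JSON reader expects.
record = json.loads(wrapped_line)
print(record["vals"])
```

So #2 is arguably not a bug but a consequence of the line-delimited format the guide describes: every line must stand alone as an object.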