Hi Eran,
Can you try 1.6? With the change in
https://github.com/apache/spark/pull/10288, JSON data source will not throw
a runtime exception if there is any record that it cannot parse. Instead,
it will put the entire record into the "_corrupt_record" column.
Thanks,
Yin
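To illustrate what that change means in practice, here is a minimal Python sketch of the permissive behavior Yin describes: each input line is one record, and a line that fails to parse is kept whole under a `_corrupt_record` field instead of raising an exception. This is only an illustration of the semantics, not Spark's implementation; the function name is made up.

```python
import json

def parse_records(lines):
    """Sketch of permissive JSON parsing: each line is one record;
    lines that fail to parse are kept whole in `_corrupt_record`
    instead of raising a runtime exception."""
    rows = []
    for line in lines:
        try:
            rows.append(json.loads(line))
        except ValueError:
            # keep the unparseable record whole, like Spark 1.6 does
            rows.append({"_corrupt_record": line})
    return rows

rows = parse_records(['{"a": 1}', '{"a": broken}'])
```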
On Sun, Dec 20, 2015
Once I removed the CR/LF line breaks from the file, it worked OK.
eran
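For anyone hitting the same symptom, the CR/LF fix can be scripted; a small sketch (the function name is illustrative):

```python
def collapse_json(text):
    """Remove CR and LF characters so a pretty-printed JSON document
    collapses onto a single line, which a line-oriented JSON reader
    can then parse as one record."""
    return text.replace("\r", "").replace("\n", "")

one_line = collapse_json('{\r\n  "a": 1\r\n}')
```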
On Mon, 21 Dec 2015 at 06:29 Yin Huai wrote:
> Hi Eran,
>
> Can you try 1.6? With the change in
> https://github.com/apache/spark/pull/10288, JSON data source will not
> throw a runtime exception if there is any record that it cannot parse.
> Instead, it will put the entire record into the "_corrupt_record" column.
hey Eran, I run into this all the time with Json.
the problem is likely that your Json is "too pretty" and extends beyond a
single line, which trips up the Json reader.
my solution is usually to de-pretty the Json - either manually or through an
ETL step - by stripping all white space before
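The ETL step Chris mentions can be sketched as re-serializing each record compactly onto its own line. An illustrative Python sketch, assuming the input file holds a JSON array of records (the function name is made up):

```python
import json

def depretty(pretty_text):
    """Re-serialize pretty-printed JSON as one compact record per
    line (JSON Lines), the shape a line-oriented reader expects."""
    records = json.loads(pretty_text)
    if not isinstance(records, list):
        records = [records]  # single object -> single line
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)

jsonl = depretty('[\n  {"a": 1},\n  {"b": 2}\n]')
```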
Thanks for this!
This was the problem...
On Sun, 20 Dec 2015 at 18:49 Chris Fregly wrote:
> hey Eran, I run into this all the time with Json.
>
> the problem is likely that your Json is "too pretty" and extending beyond
> a single line which trips up the Json reader.
>
> my solution is usually to de-pretty the Json - either manually or through
> an ETL step - by stripping all white space before
Hi,
I tried the following code in spark-shell on Spark 1.5.2:
val df = sqlContext.read.json("/home/eranw/Workspace/JSON/sample/sample2.json")
df.count()
15/12/19 23:49:40 ERROR Executor: Managed memory leak detected; size =
67108864 bytes, TID = 3
15/12/19 23:49:40 ERROR Executor: Exception
The 'Failed to parse a value' error was the cause of the execution failure.
Can you disclose the structure of your json file?
Maybe try the latest 1.6.0 RC to see if the problem goes away.
Thanks
On Sat, Dec 19, 2015 at 1:55 PM, Eran Witkon wrote:
> Hi,
> I tried the following code in spark-shell on Spark 1.5.2:
> val df = sqlContext.read.json("/home/eranw/Workspace/JSON/sample/sample2.json")
> df.count()