I am using SQLContext.jsonFile. If a valid JSON document contains newlines,
Spark 1.1.1 dumps the trace below. If the same JSON is written on one line, it
works fine. Is this known?
14/12/10 11:44:02 ERROR Executor: Exception in task 0.0 in stage 14.0 (TID 28)
com.fasterxml.jackson.core.JsonParseException:
Yep, that is the expected semantic: sc.textFile only guarantees that lines are
preserved across splits, so each JSON record must fit on a single line. It
would be possible to write a custom input format that handles multi-line
records, but that hasn't been done yet. From the documentation:
http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
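The semantic can be illustrated outside Spark with a minimal plain-Python sketch (this is an analogy, not Spark's actual code path): parsing each line of a pretty-printed JSON record on its own fails, exactly as jsonFile does on top of sc.textFile, while the same record collapsed onto one line parses fine.

```python
import json

# A valid JSON record that spans multiple lines (pretty-printed).
pretty = '{\n  "name": "alice",\n  "age": 30\n}'

def parse_per_line(text):
    """Mimic the one-record-per-line assumption: parse each line
    independently and record a failure for any incomplete line."""
    results = []
    for line in text.splitlines():
        try:
            results.append(json.loads(line))
        except json.JSONDecodeError as e:
            results.append(("error", line, str(e)))
    return results

# Every line of the pretty-printed record fails on its own,
# which corresponds to the JsonParseException in the trace above.
assert all(r[0] == "error" for r in parse_per_line(pretty))

# The same record rewritten onto a single line parses cleanly.
one_line = json.dumps(json.loads(pretty))
assert parse_per_line(one_line) == [{"name": "alice", "age": 30}]
```

Pre-flattening each record to one line (or reading whole files and parsing them yourself) is the usual way to feed such data to jsonFile in 1.1.1.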