I am using SQLContext.jsonFile. If a valid JSON document contains newlines,
Spark 1.1.1 dumps the trace below. If the same JSON is written on one line, it
works fine. Is this known?
14/12/10 11:44:02 ERROR Executor: Exception in task 0.0 in stage 14.0 (TID 28)
com.fasterxml.jackson.core.JsonParseException:
Yep, that is the expected semantic: sc.textFile only guarantees that lines are
preserved across splits, so each JSON record must fit on a single line. It
would be possible to write a custom input format that handles multi-line
records, but that hasn't been done yet. From the documentation:
http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
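The semantic can be illustrated outside Spark with a minimal plain-Python sketch (this is an analogy, not Spark's actual code path): parsing each line of a pretty-printed JSON record on its own fails, exactly as jsonFile does on top of sc.textFile, while the same record collapsed onto one line parses fine.

```python
import json

# A valid JSON record that spans multiple lines (pretty-printed).
pretty = '{\n  "name": "alice",\n  "age": 30\n}'

def parse_per_line(text):
    """Mimic the one-record-per-line assumption: parse each line
    independently and record a failure for any incomplete line."""
    results = []
    for line in text.splitlines():
        try:
            results.append(json.loads(line))
        except json.JSONDecodeError as e:
            results.append(("error", line, str(e)))
    return results

# Every line of the pretty-printed record fails on its own,
# which corresponds to the JsonParseException in the trace above.
assert all(r[0] == "error" for r in parse_per_line(pretty))

# The same record rewritten onto a single line parses cleanly.
one_line = json.dumps(json.loads(pretty))
assert parse_per_line(one_line) == [{"name": "alice", "age": 30}]
```

Pre-flattening each record to one line (or reading whole files and parsing them yourself) is the usual way to feed such data to jsonFile in 1.1.1.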