[ 
https://issues.apache.org/jira/browse/SPARK-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523384#comment-14523384
 ] 

Frederick Reiss commented on SPARK-7273:
----------------------------------------

The error in the description indicates that there is a character in the middle 
of the first line of the JSON file that TextInputFormat treats as a line 
separator. Spark therefore sees the JSON content as two separate records 
rather than one object, and the second record ("age" : "20"}) is not valid 
JSON on its own, which is what the "Failed to parse record" message reports.

I can think of two potential causes:
a) Steven's JSON content has been run through a pretty-printing function, so 
there is a newline character between the two parts of the JSON object, or
b) Steven's local Hadoop/YARN configuration has a nonstandard setting for 
"textinputformat.record.delimiter".
A quick check that tells the two cases apart is sketched below.

[~jiege]: Can you share a copy of your JSON file?

Technical details:
SQLContext.jsonFile() makes a call to org.apache.spark.sql.json.DefaultSource, 
which delegates the task to org.apache.spark.sql.json.JSONRelation, which uses 
SparkContext.textFile() to open the JSON file. SparkContext.textFile() uses 
TextInputFormat to read the file.
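
Until the file is reformatted so that each JSON object sits on a single line, 
one possible workaround (just a sketch, assuming each input file contains 
exactly one JSON document, and a spark-shell session with sc and sqlContext) 
is to read whole files instead of lines and pass the resulting strings to 
SQLContext.jsonRDD():

  // Read each file as one (path, content) pair so that newlines inside the
  // JSON object no longer split it into multiple records.
  val wholeFiles = sc.wholeTextFiles("test.json")
  // Keep only the file contents; jsonRDD() parses each string as a JSON document.
  val jsonStrings = wholeFiles.map { case (_, content) => content }
  val df = sqlContext.jsonRDD(jsonStrings)
  df.printSchema()

Of course, the simplest fix is to put each JSON object on a single line before 
calling jsonFile().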

> The SQLContext.jsonFile() API has a problem when loading a formatted (pretty-printed) JSON file
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-7273
>                 URL: https://issues.apache.org/jira/browse/SPARK-7273
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.1
>            Reporter: steven
>            Priority: Minor
>
> My code is as follows:
>  val df = sqlContext.jsonFile("test.json");
> The content of test.json is:
>  { "name": "steven",
>     "age" : "20"
> }
> The jsonFile call throws an exception as follows:
>       java.lang.RuntimeException: Failed to parse record     "age" : "20"}. 
> Please make sure that each line of the file (or each string in the RDD) is a 
> valid JSON object or an array of JSON objects.
>       at scala.sys.package$.error(package.scala:27)
>       at 
> org.apache.spark.sql.json.JsonRDD$$anonfun$parseJson$1$$anonfun$apply$2.apply(JsonRDD.scala:313)
>       at 
> org.apache.spark.sql.json.JsonRDD$$anonfun$parseJson$1$$anonfun$apply$2.apply(JsonRDD.scala:307)
> Is this a bug?



