[ https://issues.apache.org/jira/browse/SPARK-17695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622415#comment-15622415 ]
Miguel Cabrera commented on SPARK-17695:
----------------------------------------

Hi, is there a way to prevent this, besides not using the {{json}} method? I am currently mapping the underlying RDD and transforming it into a {{StringRDD}} of already-serialized JSON. I am using PySpark, though, with the default JSON serializer.

> Deserialization error when using DataFrameReader.json on JSON line that
> contains an empty JSON object
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17695
>                 URL: https://issues.apache.org/jira/browse/SPARK-17695
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: Scala 2.11.7
>            Reporter: Jonathan Simozar
>
> When using the {{DataFrameReader}} method {{json}} on the JSON
> {noformat}{"field1":{},"field2":"a"}{noformat}
> {{field1}} is removed at deserialization.
> This can be reproduced in the example below.
> {code:java}// create spark context
> val sc: SparkContext = new SparkContext("local[*]", "My App")
> // create spark session
> val sparkSession: SparkSession =
>   SparkSession.builder().config(sc.getConf).getOrCreate()
> // create rdd
> val strings = sc.parallelize(Seq(
>   """{"field1":{},"field2":"a"}"""
> ))
> // create json DataSet[Row], convert back to RDD, and print lines to stdout
> sparkSession.read.json(strings)
>   .toJSON.collect().foreach(println)
> {code}
> *stdout*
> {noformat}
> {"field2":"a"}
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)