Heedo Lee created SPARK-35912:
---------------------------------

             Summary: [SQL] JSON read behavior is different depending on the cache setting when nullable is false.
                 Key: SPARK-35912
                 URL: https://issues.apache.org/jira/browse/SPARK-35912
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.1
            Reporter: Heedo Lee

The following code reproduces the issue.

{code:java}
import org.apache.spark.sql.Encoders

case class TestSchema(x: Int, y: Int)
case class BaseSchema(value: TestSchema)

val schema = Encoders.product[BaseSchema].schema
val testDS = Seq("""{"value":{"x":1}}""", """{"value":{"x":2}}""").toDS
val jsonDS = spark.read.schema(schema).json(testDS)

jsonDS.show
+---------+
|    value|
+---------+
|{1, null}|
|{2, null}|
+---------+

jsonDS.cache.show
+------+
| value|
+------+
|{1, 0}|
|{2, 0}|
+------+
{code}

This result occurs when the schema contains a nested StructType whose StructField has nullable set to false: the missing field y is read as null without caching but as 0 after caching.
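
A possible workaround, sketched here under assumptions and not proposed as a fix for the underlying bug, is to relax the nullability of the derived schema before passing it to the JSON reader. The helper name relaxNullability is hypothetical and not part of the Spark API. The hand-built strictSchema is assumed to mirror what Encoders.product[BaseSchema].schema produces for the case classes above (primitive Int fields marked non-nullable).

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not a Spark API): returns a copy of the given type in which
// every struct field, array element, and map value is marked nullable.
def relaxNullability(dt: DataType): DataType = dt match {
  case StructType(fields) =>
    StructType(fields.map { f =>
      f.copy(dataType = relaxNullability(f.dataType), nullable = true)
    })
  case ArrayType(elementType, _) =>
    ArrayType(relaxNullability(elementType), containsNull = true)
  case MapType(keyType, valueType, _) =>
    MapType(relaxNullability(keyType), relaxNullability(valueType), valueContainsNull = true)
  case other => other // atomic types carry no nullability flag of their own
}

// Assumed to mirror the schema from the report: a nested struct whose
// Int fields x and y are non-nullable.
val strictSchema = StructType(Seq(
  StructField("value", StructType(Seq(
    StructField("x", IntegerType, nullable = false),
    StructField("y", IntegerType, nullable = false)
  )), nullable = true)
))

val relaxedSchema = relaxNullability(strictSchema).asInstanceOf[StructType]
```

Reading with spark.read.schema(relaxedSchema).json(testDS) should then yield null for the missing field y both with and without cache, since no field claims to be non-nullable. This has not been verified against 3.1.1 here.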