[ https://issues.apache.org/jira/browse/SPARK-38826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean R. Owen resolved SPARK-38826.
----------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 36111
[https://github.com/apache/spark/pull/36111]

> dropFieldIfAllNull option does not work for empty JSON struct
> -------------------------------------------------------------
>
>                 Key: SPARK-38826
>                 URL: https://issues.apache.org/jira/browse/SPARK-38826
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 3.2.1
>            Reporter: morvenhuang
>            Assignee: morvenhuang
>            Priority: Trivial
>             Fix For: 3.4.0
>
>
> As stated in the doc,
> {quote}dropFieldIfAllNull
> Whether to ignore column of all null values or empty array/struct during
> schema inference.
> {quote}
> But when I try this,
>
> {code:java}
> String json = "{\"field1\":\"value1\", \"field2\":{}}";
> JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
> JavaRDD<String> jrdd = jsc.parallelize(Arrays.asList(json));
> Dataset<Row> df = spark.read().option("dropFieldIfAllNull", "false").json(jrdd);
> df.printSchema();
> {code}
>
> I get this,
> {code:java}
> root
>  |-- field1: string (nullable = true){code}
> Notice field2 is still missing even when dropFieldIfAllNull is set to false,
> so apparently, this option does not work for empty struct.
> This is due to SPARK-8093, the empty struct will be dropped anyway.
> I think we should update the doc, otherwise it would be confusing.
> I can make a patch for this.