Mengran Lan created SPARK-45334:
-----------------------------------

             Summary: Remove misleading comment in parquetSchemaConverter
                 Key: SPARK-45334
                 URL: https://issues.apache.org/jira/browse/SPARK-45334
             Project: Spark
          Issue Type: Documentation
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Mengran Lan

I'm debugging a Parquet issue and reading the Spark code as a reference. I happened to find a misleading comment, which is still present in the latest version.

{code:java}
Types
  .buildGroup(repetition).as(LogicalTypeAnnotation.listType())
  .addField(Types
    .buildGroup(REPEATED)
    // "array" is the name chosen by parquet-hive (1.7.0 and prior version)
    .addField(convertField(StructField("array", elementType, nullable)))
    .named("bag"))
  .named(field.name)
{code}

The comment above is misleading, since Hive always uses "array_element" as the name.

The comment was introduced by this PR [https://github.com/apache/spark/pull/14399] and relates to this issue: https://issues.apache.org/jira/browse/SPARK-16777

Furthermore, the parquet-hive module has been removed from the parquet-mr project: https://issues.apache.org/jira/browse/PARQUET-1676

I suggest removing this comment and will submit a PR later.