sunchao commented on a change in pull request #33888:
URL: https://github.com/apache/spark/pull/33888#discussion_r702038063

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
##########

@@ -200,19 +200,28 @@ private[parquet] class ParquetRowConverter(
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with HasParentContainerUpdater] = {
-    // (SPARK-31116) Use case insensitive map if spark.sql.caseSensitive is false
-    // to prevent throwing IllegalArgumentException when searching catalyst type's field index
-    val catalystFieldNameToIndex = if (SQLConf.get.caseSensitiveAnalysis) {
-      catalystType.fieldNames.zipWithIndex.toMap
+    if (SQLConf.get.parquetAccessByIndex) {
+      // SPARK-36634: When accessing a Parquet file by column index, we cannot
+      // guarantee that the two types match.
+      parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {

Review comment:
   What if `len(parquetType.getFields) != len(catalystType)`? I think this is not always guaranteed because of `ParquetReadSupport.intersectParquetGroups` -- e.g., some fields in the Catalyst requested schema may not appear in the Parquet file schema.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
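
The reviewer's concern can be illustrated with a minimal standalone sketch: Scala's `zip` silently truncates to the shorter collection, so if the Catalyst requested schema has more fields than the Parquet file schema, the extra fields would get no converter at all instead of failing fast. The field names below are purely illustrative, not taken from the PR:

```scala
// Stand-ins for parquetType.getFields and catalystType.fieldNames;
// the Catalyst requested schema has one field ("c") missing from the file.
val parquetFields = Seq("a", "b")
val catalystFields = Seq("a", "b", "c")

// zip truncates to the shorter side: the pair for "c" is silently dropped,
// which is why a length mismatch here would go unnoticed.
val zipped = parquetFields.zip(catalystFields).zipWithIndex
// zipped == Seq((("a", "a"), 0), (("b", "b"), 1))
```

This suggests the by-index path would need an explicit length check (or to fall back to name-based matching) rather than relying on `zip` when the two schemas can diverge.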
########## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ########## @@ -200,19 +200,28 @@ private[parquet] class ParquetRowConverter( // Converters for each field. private[this] val fieldConverters: Array[Converter with HasParentContainerUpdater] = { - // (SPARK-31116) Use case insensitive map if spark.sql.caseSensitive is false - // to prevent throwing IllegalArgumentException when searching catalyst type's field index - val catalystFieldNameToIndex = if (SQLConf.get.caseSensitiveAnalysis) { - catalystType.fieldNames.zipWithIndex.toMap + if (SQLConf.get.parquetAccessByIndex) { + // SPARK-36634: When access parquet file by the idx of columns, we can not ensure 2 types + // matched + parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map { Review comment: what if `len(parquetType.getFields != len(catalystType)`? I think it's not always guaranteed because of `ParquetReadSupport.intersectParquetGroups`? e.g., some fields in catalyst requested schema do not appear in Parquet file schema. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org