[ https://issues.apache.org/jira/browse/SPARK-48308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868716#comment-17868716 ]
Dongjoon Hyun commented on SPARK-48308:
---------------------------------------

Since this causes an RC2 failure, I updated the Jira `Priority` field from `Trivial` to `Blocker`. It's great to resolve it during the RC period. Thank you!

> Unify getting data schema without partition columns in FileSourceStrategy
> -------------------------------------------------------------------------
>
>                 Key: SPARK-48308
>                 URL: https://issues.apache.org/jira/browse/SPARK-48308
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.5.1
>            Reporter: Johan Lasperas
>            Assignee: Johan Lasperas
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.5.2
>
>
> In [FileSourceStrategy|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala#L191],
> the schema of the data excluding partition columns is computed twice, in
> slightly different ways:
>
> {code:java}
> val dataColumnsWithoutPartitionCols =
>   dataColumns.filterNot(partitionSet.contains)
> {code}
> vs
> {code:java}
> val readDataColumns = dataColumns
>   .filterNot(partitionColumns.contains)
> {code}
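To illustrate why two near-identical `filterNot` calls can disagree, here is a minimal, Spark-free sketch. It rests on an assumption the issue text does not spell out: that one lookup matches columns by expression ID only (set-like), while the other uses full structural equality (`Seq.contains`). `Attr`, `SchemaFilterSketch`, and both helper names are hypothetical stand-ins, not Spark classes or APIs.

```scala
// Hypothetical, simplified stand-in for a column attribute (not Spark's class).
final case class Attr(name: String, exprId: Long)

object SchemaFilterSketch {
  // Variant 1 (assumed set-like semantics): a column counts as a partition
  // column if its expression ID matches, regardless of its other fields.
  def withoutPartitionColsById(
      dataColumns: Seq[Attr],
      partitionColumns: Seq[Attr]): Seq[Attr] = {
    val partitionIds = partitionColumns.map(_.exprId).toSet
    dataColumns.filterNot(c => partitionIds.contains(c.exprId))
  }

  // Variant 2 (sequence semantics): Seq.contains uses case-class equality,
  // so every field, including the name, must match.
  def withoutPartitionColsByEquality(
      dataColumns: Seq[Attr],
      partitionColumns: Seq[Attr]): Seq[Attr] =
    dataColumns.filterNot(partitionColumns.contains)

  def main(args: Array[String]): Unit = {
    val dataColumns = Seq(Attr("id", 1L), Attr("value", 2L), Attr("date", 3L))
    // Same expression ID as the "date" data column, but different name casing.
    val partitionColumns = Seq(Attr("DATE", 3L))

    println(withoutPartitionColsById(dataColumns, partitionColumns).map(_.name))
    println(withoutPartitionColsByEquality(dataColumns, partitionColumns).map(_.name))
  }
}
```

In this toy model the ID-based variant drops `date` while the equality-based variant keeps it, so the two "data schema without partition columns" results diverge. Computing the value once and reusing it, as the issue title suggests, removes that possibility.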