[ 
https://issues.apache.org/jira/browse/SPARK-36269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-36269.
---------------------------------
    Fix Version/s: 3.1.3
                   3.2.0
                   3.0.4
       Resolution: Fixed

Issue resolved by pull request 33489
[https://github.com/apache/spark/pull/33489]

> Fix only set data columns to Hive column names config
> -----------------------------------------------------
>
>                 Key: SPARK-36269
>                 URL: https://issues.apache.org/jira/browse/SPARK-36269
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Cheng Su
>            Assignee: Cheng Su
>            Priority: Minor
>             Fix For: 3.0.4, 3.2.0, 3.1.3
>
>
> When reading a Hive table, Spark sets the Hive column id and column name 
> configs (`hive.io.file.readcolumn.ids` and `hive.io.file.readcolumn.names`). 
> Both configs should contain only non-partition columns (data columns), 
> because Spark always appends partition columns in its own reader (see 
> [https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala#L240]
> ). The column id config has only non-partition columns, but the column name 
> config has both partition and non-partition columns. We should keep the two 
> consistent, with only non-partition columns in each. This does not cause an 
> issue for the public OSS Hive file formats, but it does for customized 
> internal Hive file formats, where we expect the two configs to be consistent.
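
To illustrate the consistency requirement described above, here is a minimal sketch in plain Python. It is not Spark's actual implementation: the helper `read_column_confs`, the example schema, and the use of a dict in place of a Hadoop configuration are all assumptions made for illustration. Only the two config keys come from the issue text.

```python
# Hypothetical sketch, not Spark's code: the two Hive read-column configs are
# modeled as entries in a plain dict instead of a Hadoop Configuration.
READ_COLUMN_IDS = "hive.io.file.readcolumn.ids"
READ_COLUMN_NAMES = "hive.io.file.readcolumn.names"


def read_column_confs(data_columns, partition_columns, required_columns):
    """Build both configs from the required non-partition (data) columns only."""
    partition_set = set(partition_columns)
    # Keep (id, name) pairs together so the two configs cannot drift apart.
    selected = [
        (idx, name)
        for idx, name in enumerate(data_columns)
        if name in required_columns and name not in partition_set
    ]
    return {
        READ_COLUMN_IDS: ",".join(str(idx) for idx, _ in selected),
        READ_COLUMN_NAMES: ",".join(name for _, name in selected),
    }


# Example table (assumed): data columns id, name, price; partition column ds.
conf = read_column_confs(
    data_columns=["id", "name", "price"],
    partition_columns=["ds"],
    required_columns={"id", "price", "ds"},
)
```

Deriving both values from the same list of (id, name) pairs means the partition column `ds` is left out of both configs, so the id list and the name list always have the same length and order.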



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
