[jira] [Assigned] (SPARK-25313) Fix regression in FileFormatWriter output schema

Apache Spark (JIRA) Mon, 03 Sep 2018 00:29:06 -0700


     [ https://issues.apache.org/jira/browse/SPARK-25313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Apache Spark reassigned SPARK-25313:
------------------------------------

    Assignee: Apache Spark

> Fix regression in FileFormatWriter output schema
> ------------------------------------------------
>
>                 Key: SPARK-25313
>                 URL: https://issues.apache.org/jira/browse/SPARK-25313
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Gengliang Wang
>            Assignee: Apache Spark
>            Priority: Major
>
> In the following example:
>         val location = "/tmp/t"
>         val df = spark.range(10).toDF("id")
>         df.write.format("parquet").saveAsTable("tbl")
>         spark.sql("CREATE VIEW view1 AS SELECT id FROM tbl")
>         spark.sql(s"CREATE TABLE tbl2(ID long) USING parquet location '$location'")
>         spark.sql("INSERT OVERWRITE TABLE tbl2 SELECT ID FROM view1")
>         println(spark.read.parquet(location).schema)
>         spark.table("tbl2").show()
> The column name in the written Parquet schema is id instead of ID, so the 
> last query shows nothing from tbl2.
> Enabling debug logging shows that the output name is changed from ID to id: 
> the outputColumns of InsertIntoHadoopFsRelationCommand are rewritten by the 
> RemoveRedundantAliases optimizer rule.
> To guarantee correctness, we should change the output columns from 
> `Seq[Attribute]` to `Seq[String]` so that the optimizer cannot replace 
> their names.
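
A toy sketch of why holding names as strings helps, in plain Scala with no Spark dependency. `Attr`, `rewrite`, and `OutputNamesSketch` are hypothetical stand-ins, not Catalyst classes and not the actual patch. The assumption is only that an optimizer rule may swap one attribute for another with a differently cased name.

```scala
// Toy model only: Attr is NOT Spark's Attribute; it just carries a name and an id.
final case class Attr(name: String, exprId: Long)

object OutputNamesSketch {
  // Stand-in for an optimizer rewrite: replace attributes according to a
  // map keyed by expression id. The replacement may carry a different name.
  def rewrite(attrs: Seq[Attr], replacements: Map[Long, Attr]): Seq[Attr] =
    attrs.map(a => replacements.getOrElse(a.exprId, a))

  def main(args: Array[String]): Unit = {
    val declared   = Attr("ID", 2L) // name as declared in the target table
    val underlying = Attr("id", 1L) // name produced by the source relation

    // Held as attributes: visible to the rewrite, so the name can change.
    val outputColumns: Seq[Attr] = Seq(declared)
    // Held as strings: captured up front, out of the rewrite's reach.
    val outputColumnNames: Seq[String] = outputColumns.map(_.name)

    val rewritten = rewrite(outputColumns, Map(declared.exprId -> underlying))

    assert(rewritten.map(_.name) == Seq("id")) // attribute name was replaced
    assert(outputColumnNames == Seq("ID"))     // captured name is unchanged
  }
}
```

In this model the attribute-based list ends up with the lower-case name after the rewrite, while the string list keeps the declared name, which is the property the proposed change relies on.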



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
