[ 
https://issues.apache.org/jira/browse/SPARK-23273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346148#comment-16346148
 ] 

Liang-Chi Hsieh commented on SPARK-23273:
-----------------------------------------

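{{withColumn}} appends the new {{name}} column after {{age}}, so the columns of 
{{ds2}} are in the order ({{age}}, {{name}}), while {{ds1}} has ({{name}}, {{age}}). 
{{union}} resolves columns by position, not by name, so the two schemas do not 
line up. You can reorder the columns with a projection before the union: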
{code:java}
scala> ds1.union(ds2.select("name", "age").as[NameAge]).show
+-------------+---+
|         name|age|
+-------------+---+
|henriquedsg89|  1|
+-------------+---+
{code}
Since 2.3.0, the {{unionByName}} API can be used for this kind of case; it resolves columns by name instead of by position:
{code:java}
scala> ds1.unionByName(ds2).show
+-------------+---+
|         name|age|
+-------------+---+
|henriquedsg89|  1|
+-------------+---+
{code}
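
To make the difference between the two calls concrete, here is a small plain-Scala sketch of the column matching rule. It is only a toy model and not how Spark implements it: {{Table}}, {{unionByPosition}} and the local {{unionByName}} are hypothetical names made up for illustration, and a "table" is just a list of column names plus rows of values.

```scala
// Toy model only: a "table" is a list of column names plus rows of values.
// This is NOT Spark's implementation; it just mimics the column matching rule.
object UnionSketch {
  final case class Table(columns: Seq[String], rows: Seq[Seq[Any]])

  // Like Dataset.union: keep the left column names and append the right rows
  // unchanged, so values are matched purely by position.
  def unionByPosition(left: Table, right: Table): Table = {
    require(left.columns.size == right.columns.size, "column counts differ")
    Table(left.columns, left.rows ++ right.rows)
  }

  // Like Dataset.unionByName: reorder each right row to the left column order
  // before appending, so values are matched by column name.
  def unionByName(left: Table, right: Table): Table = {
    require(left.columns.toSet == right.columns.toSet, "column names differ")
    val indexOf = right.columns.zipWithIndex.toMap
    val reordered = right.rows.map(row => left.columns.map(c => row(indexOf(c))))
    Table(left.columns, left.rows ++ reordered)
  }

  def main(args: Array[String]): Unit = {
    // Column order of NameAge (ds1) versus the order produced by withColumn (ds2).
    val ds1 = Table(Seq("name", "age"), Seq.empty)
    val ds2 = Table(Seq("age", "name"), Seq(Seq(1, "henriquedsg89")))

    println(unionByPosition(ds1, ds2).rows) // the Int ends up under "name"
    println(unionByName(ds1, ds2).rows)     // values follow their column names
  }
}
```

In this model the positional union places the age value under {{name}} and the name string under {{age}}, which is consistent with the cast error in the report, while the by-name union keeps each value under its own column.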

> Spark Dataset withColumn - schema column order isn't the same as case class 
> parameter order
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23273
>                 URL: https://issues.apache.org/jira/browse/SPARK-23273
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: Henrique dos Santos Goulart
>            Priority: Major
>
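> {code:java}
> import org.apache.spark.sql.functions.lit
> import spark.implicits._ // encoders for the case classes (pre-imported in spark-shell)
>
> case class OnlyAge(age: Int)
> case class NameAge(name: String, age: Int)
> val ds1 = spark.emptyDataset[NameAge]
> val ds2 = spark
>   .createDataset(Seq(OnlyAge(1)))
>   .withColumn("name", lit("henriquedsg89")) // appended after age
>   .as[NameAge]
> ds1.show()
> ds2.show()
> ds1.union(ds2)
> {code}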
>  
> It's going to raise this error:
> {noformat}
> Cannot up cast `age` from string to int as it may truncate
> The type path of the target object is:
> - field (class: "scala.Int", name: "age")
> - root class: "dw.NameAge"{noformat}
> It seems that .as[CaseClass] doesn't keep the order of the parameters as 
> declared in the case class.
> If I change the case class parameter order, it works, like: 
> {code:java}
> case class NameAge(age: Int, name: String){code}


