[ https://issues.apache.org/jira/browse/SPARK-35756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365549#comment-17365549 ]
Saurabh Chawla commented on SPARK-35756:
----------------------------------------

This also works for structs if allowMissingColumns is set to true:

ds1.unionByName(ds2.as[(Int,Struct1)], true)

+---+------+
| _1|    _2|
+---+------+
|  1|{1, 2}|
|  1|{2, 1}|
+---+------+

> unionByName should support nested struct also
> ---------------------------------------------
>
>                 Key: SPARK-35756
>                 URL: https://issues.apache.org/jira/browse/SPARK-35756
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Wassim Almaaoui
>            Priority: Major
>
> It would be nice if `unionByName` also supported nested structs. I don't know
> whether this is already the expected behaviour, so I am not sure if it's a bug
> or an improvement proposal.
> {code:java}
> case class Struct1(c1: Int, c2: Int)
> case class Struct2(c2: Int, c1: Int)
> val ds1 = Seq((1, Struct1(1,2))).toDS
> val ds2 = Seq((1, Struct2(1,2))).toDS
> ds1.unionByName(ds2.as[(Int,Struct1)])
> {code}
> gives
> {code:java}
> org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. struct<c2:int,c1:int> <> struct<c1:int,c2:int> at the second column of the second table;
> 'Union false, false
> :- LocalRelation [_1#38, _2#39]
> +- LocalRelation [_1#45, _2#46]
> {code}
> The code documentation of the function `unionByName` says `Note that
> allowMissingColumns supports nested column in struct types` but doesn't say
> whether the function itself resolves nested struct fields by name regardless
> of their order.
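
For anyone who wants to try this outside the shell, here is a minimal sketch of the same call as a standalone program. It assumes Spark 3.1 or later on the classpath (the `allowMissingColumns` overload of `unionByName` was added in 3.1.0). The object name `UnionByNameNestedStruct`, the helper `unionNested`, and the `local[1]` master are hypothetical choices made for illustration and are not part of the issue. It calls `toDF()` on both sides instead of `ds2.as[(Int,Struct1)]`, which should be equivalent for this purpose since the columns are matched by name.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object; the case classes mirror the ones in the issue.
object UnionByNameNestedStruct {
  case class Struct1(c1: Int, c2: Int)
  case class Struct2(c2: Int, c1: Int)

  // Unions two single-row datasets whose nested struct fields are declared
  // in a different order, matching the fields by name.
  def unionNested(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val ds1 = Seq((1, Struct1(1, 2))).toDS()
    val ds2 = Seq((1, Struct2(1, 2))).toDS()

    // Without allowMissingColumns = true, the issue reports an
    // AnalysisException for this pair of schemas on Spark 3.1.1.
    ds1.toDF().unionByName(ds2.toDF(), allowMissingColumns = true)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for a quick standalone run.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-35756")
      .getOrCreate()

    try {
      val result = unionNested(spark)
      result.printSchema()
      result.show()
    } finally {
      spark.stop()
    }
  }
}
```

The second row should come back with the struct fields matched by name (c1 = 2, c2 = 1), in line with the `{2, 1}` row shown in the comment above.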