It turned out that Col1 appeared twice in the select :-)
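
For anyone who hits the same error, here is a minimal sketch of a check that would have flagged the problem before the select ran. It is plain Scala with no Spark dependency, and the helper name `duplicateColumns` is made up for illustration; it is not part of the DataFrame API. It compares names exactly, so it treats "col7" and "Col7" as different.

```scala
// Hypothetical helper (not a Spark API): return the column names
// that occur more than once in a list intended for select(...).
def duplicateColumns(cols: Seq[String]): Seq[String] =
  cols
    .groupBy(identity)                                      // name -> all of its occurrences
    .collect { case (name, hits) if hits.size > 1 => name } // keep only repeated names
    .toSeq
    .sorted

// The column list from the quoted code: "Col1" is listed twice and "Col2" is missing.
val selected = Seq("Col1", "Col1", "Col3", "Col4", "Col5", "Col6", "col7", "col8", "col9")

val dups = duplicateColumns(selected)
if (dups.nonEmpty)
  println("Duplicate columns in select: " + dups.mkString(", "))

// Assumed usage with a DataFrame, once the list is clean:
//   dfAcStamp.select(cols.head, cols.tail: _*)
```

Keeping the column names in one Seq and reusing it for both selects also avoids retyping the list, which is where the repeated Col1 crept in.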

> On Mar 16, 2016, at 7:29 PM, Divya Gehlot <divya.htco...@gmail.com> wrote:
>
> Hi,
> I am dynamically doing a union all and adding a new column too:
>
>> val dfresult =
>>   dfAcStamp.select("Col1","Col1","Col3","Col4","Col5","Col6","col7","col8","col9")
>> val schemaL = dfresult.schema
>> var dffiltered = sqlContext.createDataFrame(sc.emptyRDD[Row], schemaL)
>> for ((key,values) <- lcrMap) {
>>   if(values(4) != null){
>>     println("Condition============="+values(4))
>>     val renameRepId = values(0)+"REP_ID"
>>     dffiltered.printSchema
>>     dfresult.printSchema
>>     dffiltered =
>>       dffiltered.unionAll(dfresult.withColumn(renameRepId,lit(values(3))).drop("Col9").select("Col1","Col1","Col3","Col4","Col5","Col6","Col7","Col8","Col9").where(values(4))).distinct()
>>
>>   }
>> }
>
> When I print the schemas, I get:
>
> dfresult
> root
>  |-- Col1: date (nullable = true)
>  |-- Col2: date (nullable = true)
>  |-- Col3: string (nullable = false)
>  |-- Col4: string (nullable = false)
>  |-- Col5: string (nullable = false)
>  |-- Col6: string (nullable = true)
>  |-- Col7: string (nullable = true)
>  |-- Col8: string (nullable = true)
>  |-- Col9: null (nullable = true)
>
> dffiltered Schema
> root
>  |-- Col1: date (nullable = true)
>  |-- Col2: date (nullable = true)
>  |-- Col3: string (nullable = false)
>  |-- Col4: string (nullable = false)
>  |-- Col5: string (nullable = false)
>  |-- Col6: string (nullable = true)
>  |-- Col7: string (nullable = true)
>  |-- Col8: string (nullable = true)
>  |-- Col9: null (nullable = true)
>
> Both print the same schema, but when I do the unionAll it gives me the
> error below:
>
> org.apache.spark.sql.AnalysisException: Union can only be performed on tables
> with the same number of columns, but the left table has 9 columns and the
> right has 8;
>
> Could somebody help me by pointing out my mistake?
>
> Thanks,