Note that Spark does not guarantee the ordering of columns. Nothing in the
Spark documentation says that columns will be ordered in a particular way. The
proposed solution relies on an implementation detail that might change in a
future version of Spark.

Ideally, you shouldn't rely on a DataFrame to maintain the order of its
columns. The question is: why do you care about the ordering of columns? If
the order of the data is important, then you should put it in an array.

From: Vikas Garg <sperry...@gmail.com>
Date: Thursday, November 12, 2020 at 12:40 PM
To: Subash Prabakar <subashpraba...@gmail.com>
Cc: German Schiavon <gschiavonsp...@gmail.com>, User <user@spark.apache.org>
Subject: RE: [EXTERNAL] Spark Dataset withColumn issue




Ohhkkkk

Thanks a lot

On Thu, Nov 12, 2020, 21:23 Subash Prabakar <subashpraba...@gmail.com> wrote:
Hi Vikas,

He suggested using the select() function after your withColumn call.

val ds1 = ds.select("Col1", "Col3")
  .withColumn("Col2", lit("sample"))
  .select("Col1", "Col2", "Col3")
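
If the column list is long and you would rather not retype every name, one
option is to compute the order and pass it to select(). This is only a sketch:
insertAfter is a hypothetical helper written for illustration, not part of the
Spark API, and the Spark usage in the trailing comment assumes ds1 is the
Dataset produced by withColumn.

```scala
// Hypothetical helper (not part of Spark): returns `columns` with `newCol`
// placed immediately after `anchor`. Any existing occurrence of `newCol` is
// dropped first, so it can be fed ds1.columns after withColumn appended it.
def insertAfter(columns: Seq[String], anchor: String, newCol: String): Seq[String] = {
  val rest = columns.filterNot(_ == newCol)
  val idx = rest.indexOf(anchor)
  require(idx >= 0, s"anchor column '$anchor' not found")
  val (before, after) = rest.splitAt(idx + 1)
  (before :+ newCol) ++ after
}

// Intended use with Spark (assumes ds1 already contains Col2):
//   import org.apache.spark.sql.functions.col
//   val order = insertAfter(ds1.columns.toSeq, "Col1", "Col2")
//   val ds2 = ds1.select(order.map(col): _*)
```

The order is still set by an explicit select(), so this is the same approach as
above, only with the column list derived instead of typed by hand.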


Thanks,
Subash

On Thu, Nov 12, 2020 at 9:19 PM Vikas Garg <sperry...@gmail.com> wrote:
I am deriving Col2 using withColumn, which is why I can't use select the way
you suggested.

On Thu, Nov 12, 2020, 20:11 German Schiavon <gschiavonsp...@gmail.com> wrote:
ds.select("Col1", "Col2", "Col3")

On Thu, 12 Nov 2020 at 15:28, Vikas Garg <sperry...@gmail.com> wrote:
In a Spark Dataset, if we add an additional column using withColumn, the
column is added at the end.

e.g.
val ds1 = ds.select("Col1", "Col3").withColumn("Col2", lit("sample"))

Then the order of columns is >> Col1  |  Col3  |  Col2

I want the order to be  >> Col1  |  Col2  |  Col3

How can I achieve this?
