GitHub user wmellouli opened a pull request: https://github.com/apache/spark/pull/22332
[SPARK-25333][SQL] Ability to add new columns in the beginning of Dataset

## What changes were proposed in this pull request?

When we add new columns to a Dataset, they are automatically appended at the end. Depending on the use case, users generally want to add new columns either at the end or at the beginning. In my case, for example, we add technical columns at the beginning of a Dataset and business columns at the end.

This pull request adds the ability to add new columns at the beginning of a Dataset, using an optional flag **atTheEnd**:
- true (default behavior) means add the column at the end
- false means add the column at the beginning

The change is backward compatible with old versions, so we can:

1- add a new column without using the flag **atTheEnd** (default behavior):
```
val newDf = df.withColumn("newColumn", col("value") + 1)
newDf.printSchema
root
 |-- value: integer (nullable = true)
 |-- newColumn: integer (nullable = true)
```

2- add a new column using the flag **atTheEnd** with **true** value:
```
val newDf = df.withColumn("newColumn", col("value") + 1, true)
newDf.printSchema
root
 |-- value: integer (nullable = true)
 |-- newColumn: integer (nullable = true)
```

3- add a new column using the flag **atTheEnd** with **false** value:
```
val newDf = df.withColumn("newColumn", col("value") + 1, false)
newDf.printSchema
root
 |-- newColumn: integer (nullable = true)
 |-- value: integer (nullable = true)
```

## How was this patch tested?

This patch is tested with unit tests.
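For comparison, the column order shown in case 3 can also be produced with the existing `select` API, without the proposed overload. The following is a minimal sketch, not the code from this patch: `withColumnAt` and the `PrependColumnSketch` object are hypothetical names introduced here for illustration, and the helper only mirrors the semantics of the **atTheEnd** flag described above.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object PrependColumnSketch {

  // Hypothetical helper (not part of Spark and not the patch itself).
  // atTheEnd = true  -> delegates to the existing withColumn (append).
  // atTheEnd = false -> selects the new column first, then the others.
  // Assumption: column names contain no backtick characters.
  def withColumnAt(
      df: DataFrame,
      name: String,
      expr: Column,
      atTheEnd: Boolean = true): DataFrame = {
    if (atTheEnd) {
      df.withColumn(name, expr)
    } else {
      // Drop any existing column with the same name so it is replaced,
      // and quote the remaining names so dots are not parsed as nesting.
      val others = df.columns.filterNot(_ == name).map(c => col(s"`$c`"))
      df.select(expr.as(name) +: others: _*)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("prepend-column-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(1, 2, 3).toDF("value")

    // Column order: newColumn, value
    val prepended = withColumnAt(df, "newColumn", col("value") + 1, atTheEnd = false)
    prepended.printSchema()

    // Column order: value, newColumn (same as plain withColumn)
    val appended = withColumnAt(df, "newColumn", col("value") + 1)
    appended.printSchema()

    spark.stop()
  }
}
```

The sketch compares names with `==`, so unlike `withColumn` it does not follow Spark's configured case sensitivity when replacing an existing column.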
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wmellouli/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22332.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22332

----
commit f83afe5172086993756e750bd2c7e3bb05667f62
Author: Walid MELLOULI <walid_mellouli@...>
Date:   2018-09-04T16:30:32Z

    [SPARK-25333][SQL] Ability to add new columns in the beginning of Dataset

----