[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-06 Thread wmellouli
Github user wmellouli closed the pull request at:

https://github.com/apache/spark/pull/22332


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread wmellouli
Github user wmellouli commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215185048
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2226,16 +2226,18 @@ class Dataset[T] private[sql](
 * `column`'s expression must only refer to attributes supplied by this 
Dataset. It is an
 * error to add a column that refers to some other Dataset.
 *
-* You can choose to add new columns either at the end (default 
behavior) or at the beginning.
+* The position of the new column start from 0, and a negative position 
means at the end (default behavior).
 */
-  def withColumn(colName: String, col: Column, atTheEnd: Boolean): 
DataFrame =
-withColumns(Seq(colName), Seq(col), atTheEnd)
+  def withColumn(colName: String, col: Column, atPosition: Int): DataFrame 
=
--- End diff --

Added


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread wmellouli
Github user wmellouli commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215184928
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 
---
@@ -831,13 +831,21 @@ class DataFrameSuite extends QueryTest with 
SharedSQLContext {
   }.toSeq)
 assert(df.schema.map(_.name) === Seq("key", "value", "newCol"))
 
-val df2 = testData.toDF().withColumn("newCol", col("key") + 1, false)
+val df2 = testData.toDF().withColumn("newCol", col("key") + 1, 0)
--- End diff --

Test with negative position was covered for the public method `withColumn` 
and the private method `withColumns`:
- 
https://github.com/apache/spark/pull/22332/files#diff-5d2ebf4e9ca5a990136b276859769289R852
- 
https://github.com/apache/spark/pull/22332/files#diff-5d2ebf4e9ca5a990136b276859769289R907
 
I'm testing 3 cases (in the same time) that add new column at the end:
- negative position
- last position
- position greater than columns size


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread wmellouli
Github user wmellouli commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215179856
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2226,16 +2226,18 @@ class Dataset[T] private[sql](
 * `column`'s expression must only refer to attributes supplied by this 
Dataset. It is an
 * error to add a column that refers to some other Dataset.
 *
-* You can choose to add new columns either at the end (default 
behavior) or at the beginning.
+* The position of the new column start from 0, and a negative position 
means at the end (default behavior).
--- End diff --

I modified as you suggested


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215144932
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2226,16 +2226,18 @@ class Dataset[T] private[sql](
 * `column`'s expression must only refer to attributes supplied by this 
Dataset. It is an
 * error to add a column that refers to some other Dataset.
 *
-* You can choose to add new columns either at the end (default 
behavior) or at the beginning.
+* The position of the new column start from 0, and a negative position 
means at the end (default behavior).
--- End diff --

"starts at `0`. Any negative position means to add the column at the end"?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215145065
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2226,16 +2226,18 @@ class Dataset[T] private[sql](
 * `column`'s expression must only refer to attributes supplied by this 
Dataset. It is an
 * error to add a column that refers to some other Dataset.
 *
-* You can choose to add new columns either at the end (default 
behavior) or at the beginning.
+* The position of the new column start from 0, and a negative position 
means at the end (default behavior).
 */
-  def withColumn(colName: String, col: Column, atTheEnd: Boolean): 
DataFrame =
-withColumns(Seq(colName), Seq(col), atTheEnd)
+  def withColumn(colName: String, col: Column, atPosition: Int): DataFrame 
=
+withColumns(Seq(colName), Seq(col), atPosition)
 
   /**
* Returns a new Dataset by adding columns or replacing the existing 
columns that has
--- End diff --

s/has/have


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215144732
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2226,16 +2226,18 @@ class Dataset[T] private[sql](
 * `column`'s expression must only refer to attributes supplied by this 
Dataset. It is an
 * error to add a column that refers to some other Dataset.
 *
-* You can choose to add new columns either at the end (default 
behavior) or at the beginning.
+* The position of the new column start from 0, and a negative position 
means at the end (default behavior).
 */
-  def withColumn(colName: String, col: Column, atTheEnd: Boolean): 
DataFrame =
-withColumns(Seq(colName), Seq(col), atTheEnd)
+  def withColumn(colName: String, col: Column, atPosition: Int): DataFrame 
=
--- End diff --

`@since 2.4.0`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215145351
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala 
---
@@ -831,13 +831,21 @@ class DataFrameSuite extends QueryTest with 
SharedSQLContext {
   }.toSeq)
 assert(df.schema.map(_.name) === Seq("key", "value", "newCol"))
 
-val df2 = testData.toDF().withColumn("newCol", col("key") + 1, false)
+val df2 = testData.toDF().withColumn("newCol", col("key") + 1, 0)
--- End diff --

What about tests with negative positions?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22332: [SPARK-25333][SQL] Ability add new columns in Dat...

2018-09-05 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request:

https://github.com/apache/spark/pull/22332#discussion_r215144982
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2226,16 +2226,18 @@ class Dataset[T] private[sql](
 * `column`'s expression must only refer to attributes supplied by this 
Dataset. It is an
 * error to add a column that refers to some other Dataset.
 *
-* You can choose to add new columns either at the end (default 
behavior) or at the beginning.
+* The position of the new column start from 0, and a negative position 
means at the end (default behavior).
 */
-  def withColumn(colName: String, col: Column, atTheEnd: Boolean): 
DataFrame =
-withColumns(Seq(colName), Seq(col), atTheEnd)
+  def withColumn(colName: String, col: Column, atPosition: Int): DataFrame 
=
+withColumns(Seq(colName), Seq(col), atPosition)
 
   /**
* Returns a new Dataset by adding columns or replacing the existing 
columns that has
* the same names.
+   *
+   * The position of new columns start from 0, and a negative position 
means at the end (default behavior).
--- End diff --

Same as above


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org