GitHub user vinodkc closed the pull request at:
https://github.com/apache/spark/pull/20947
GitHub user TRANTANKHOA commented on a diff in the pull request:
https://github.com/apache/spark/pull/20947#discussion_r178444354
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1593,7 +1596,9 @@ class Dataset[T] private[sql](
def groupBy(col1:
GitHub user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20947#discussion_r178324135
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1593,7 +1596,9 @@ class Dataset[T] private[sql](
def groupBy(col1:
GitHub user vinodkc opened a pull request:
https://github.com/apache/spark/pull/20947
[SPARK-23705][SQL] Handle non-distinct columns in DataSet.groupBy
## What changes were proposed in this pull request?
If the input columns to DataSet.groupBy contain non-unique columns, remove