[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21802

2018-07-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205704109 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_co

2018-07-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205703558 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -3545,6 +3545,14 @@ object functions { */ def array_max(e: C

2018-07-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205703459 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1184,110 @@ case class Array

2018-07-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205702898 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_co

2018-07-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205534867 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/RandomIndicesGenerator.scala --- @@ -0,0 +1,45 @@ +/* + * Licensed to the A

2018-07-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205534846 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1184,110 @@ case class ArraySort(

2018-07-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205483672 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/RandomIndicesGenerator.scala --- @@ -0,0 +1,45 @@ +/* + * Licensed to th

2018-07-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205483502 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1184,110 @@ case class ArraySo

2018-07-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r204276502 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -1419,4 +1421,71 @@ class Collection

2018-07-22 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r204275498 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -1419,4 +1421,71 @@ class Collection

2018-07-22 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r204270437 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1186,137 @@ case class ArraySort(c

2018-07-22 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r204249027 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -1419,4 +1421,71 @@ class CollectionE

2018-07-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r204087168 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2086,6 +2087,20 @@ class Analyzer( } }

2018-07-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203974038 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1186,137 @@ case class ArraySort(

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203968939 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2086,6 +2087,20 @@ class Analyzer( }

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203968608 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1186,137 @@ case class ArraySo

2018-07-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203962951 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_column(

2018-07-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203963010 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2086,6 +2087,20 @@ class Analyzer( } }

2018-07-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203963022 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1186,137 @@ case class ArraySort(

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203956752 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1186,137 @@ case class ArraySo

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203955095 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2086,6 +2087,20 @@ class Analyzer( }

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203954268 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_colu

2018-07-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203560440 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_column(

2018-07-18 Thread pkuwm
Github user pkuwm commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203453412 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_column(c

2018-07-18 Thread mn-mikke
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203388798 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_colum

2018-07-18 Thread mn-mikke
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r203407122 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala --- @@ -1444,6 +1444,51 @@ class DataFrameFunctionsSuite extends Query

2018-07-18 Thread ueshin
GitHub user ueshin opened a pull request:

    https://github.com/apache/spark/pull/21802

[SPARK-23928][SQL] Add shuffle collection function.

## What changes were proposed in this pull request?

This PR adds a new collection function: shuffle. It generates a random permutation of
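
As a rough illustration of what a shuffle over an array involves, the sketch below produces a random permutation with the classic Fisher-Yates algorithm in plain Python. It is not the code from this pull request: the function name `shuffle_array` and its `seed` parameter are hypothetical, and Spark-specific concerns such as null handling, code generation, and per-partition seeding are not modeled.

```python
import random


def shuffle_array(values, seed=None):
    """Return a new list holding a random permutation of `values`.

    Hypothetical helper for illustration only; it does not mirror the
    Spark implementation discussed in this thread.
    """
    rng = random.Random(seed)      # private generator so a seed gives repeatable output
    result = list(values)          # copy, so the caller's input is left untouched
    # Fisher-Yates: walk from the last slot down, swapping each slot with a
    # uniformly chosen slot at or before it.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)      # randint is inclusive on both ends
        result[i], result[j] = result[j], result[i]
    return result


if __name__ == "__main__":
    print(shuffle_array([1, 20, 3, 5], seed=42))
```

Each of the n! orderings is equally likely under this scheme, assuming the underlying generator is uniform, and the shuffle runs in linear time with a single copy of the input.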