Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21802#discussion_r203560440

--- Diff: python/pyspark/sql/functions.py ---
@@ -2382,6 +2382,20 @@ def array_sort(col):
     return Column(sc._jvm.functions.array_sort(_to_java_column(col)))
+@since(2.4)
+def shuffle(col):
+    """
+    Collection function: Generates a random permutation of the given array.
+
+    .. note:: The function is non-deterministic because its results depends on order of rows which
--- End diff --

The permutation is chosen randomly, but it is deterministic for the same query plan as long as the order of rows is deterministic, because the analyzer assigns a random seed to it.
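
A minimal sketch of this behaviour in plain Python, for illustration only. It is not Spark's implementation: `shuffle_rows` and `plan_seed` are hypothetical names, and the sketch assumes a single generator that is seeded once and then consumes rows in order, so the permutation each row receives depends on both the seed and the row's position.

```python
import random


def shuffle_rows(rows, plan_seed):
    """Shuffle every array in `rows` using one generator seeded with `plan_seed`.

    Hypothetical stand-in for evaluating a seeded shuffle expression over rows;
    it only mimics the "fixed seed + fixed row order" property discussed above.
    """
    rng = random.Random(plan_seed)  # seed fixed once, as if assigned at analysis time
    result = []
    for row in rows:
        shuffled = list(row)   # copy so the input rows are not mutated
        rng.shuffle(shuffled)  # consumes generator state, so row order matters
        result.append(shuffled)
    return result


rows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# Same seed and same row order: the output repeats exactly.
first = shuffle_rows(rows, plan_seed=42)
second = shuffle_rows(rows, plan_seed=42)
print(first == second)
```

If the rows arrive in a different order, each row meets the generator in a different state, so the permutation it receives may change even though the seed is the same.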
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org