Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21045#discussion_r193974637

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2394,6 +2394,23 @@ def array_repeat(col, count):
         return Column(sc._jvm.functions.array_repeat(_to_java_column(col), count))


    +@since(2.4)
    +def zip(*cols):
    +    """
    +    Collection function: Merge two columns into one, such that the M-th element of the N-th
    +    argument will be the N-th field of the M-th output element.
    +
    +    :param cols: columns in input
    +
    +    >>> from pyspark.sql.functions import zip as spark_zip
    --- End diff --

    Eh, I think `zip` is fine (like `sum`, `min` or `max`). I agree it's a bad practice to shadow builtin functions but we already started this in this way ...
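The shadowing concern discussed above can be illustrated without Spark: a minimal sketch, assuming a hypothetical module-level `zip` stands in for the PR's function, showing that such a definition hides the builtin in its own module while callers can sidestep the clash with an import alias (as the doctest's `import zip as spark_zip` does).

```python
# Sketch of the builtin-shadowing issue raised in the review.
# The function body is hypothetical; only the naming pattern mirrors the PR.
import builtins


def zip(*cols):
    """Stand-in for the PR's collection function (hypothetical behavior)."""
    return ("spark_zip_column", cols)


# Within this module, a bare `zip` now resolves to the definition above,
# not the builtin:
assert zip("a", "b") == ("spark_zip_column", ("a", "b"))

# The builtin remains reachable explicitly via the builtins module:
assert list(builtins.zip([1, 2], [3, 4])) == [(1, 3), (2, 4)]
```

Callers of such a module avoid the shadowing entirely by aliasing at import time (`from pyspark.sql.functions import zip as spark_zip`), which is the pattern the quoted doctest demonstrates and the precedent set by `sum`, `min`, and `max` in the same module.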