Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22227#discussion_r213519898

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -2554,7 +2554,27 @@ object functions {
        * @since 1.5.0
        */
       def split(str: Column, pattern: String): Column = withExpr {
    -    StringSplit(str.expr, lit(pattern).expr)
    +    StringSplit(str.expr, Literal(pattern), Literal(-1))
    +  }
    +
    +  /**
    +   * Splits str around pattern (pattern is a regular expression).
    +   *
    +   * The limit parameter controls the number of times the pattern is applied and therefore
    +   * affects the length of the resulting array. If the limit n is greater than zero then the
    +   * pattern will be applied at most n - 1 times, the array's length will be no greater than
    +   * n, and the array's last entry will contain all input beyond the last matched delimiter.
    +   * If n is non-positive then the pattern will be applied as many times as possible and the
    +   * array can have any length. If n is zero then the pattern will be applied as many times as
    +   * possible, the array can have any length, and trailing empty strings will be discarded.
    --- End diff --

    Can you copy SQL's doc here? You could describe them via `@param` here as well.
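For context, here is a minimal sketch (not part of the PR) illustrating the limit semantics described in the quoted doc, assuming the three-argument overload under review has the shape split(str: Column, pattern: String, limit: Int):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.split

    object SplitLimitExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("split-limit")
          .getOrCreate()
        import spark.implicits._

        val df = Seq("a,b,c,,").toDF("s")

        // limit > 0: the pattern is applied at most limit - 1 times, and the
        // last array entry keeps everything after the last matched delimiter.
        // "a,b,c,," with limit = 2  ->  ["a", "b,c,,"]
        df.select(split($"s", ",", 2)).show(false)

        // limit < 0: the pattern is applied as many times as possible, and
        // trailing empty strings are kept.
        // "a,b,c,," with limit = -1 ->  ["a", "b", "c", "", ""]
        df.select(split($"s", ",", -1)).show(false)

        spark.stop()
      }
    }

Per the quoted doc, a limit of 0 would behave like a negative limit except that trailing empty strings are discarded, so the second select with limit = 0 would yield ["a", "b", "c"].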