Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22227#discussion_r214562493

    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
    @@ -952,6 +952,11 @@ public static UTF8String concatWs(UTF8String separator, UTF8String... inputs) {
       }

       public UTF8String[] split(UTF8String pattern, int limit) {
    +    // Java String's split method supports "ignore empty string" behavior when the limit is 0.
    +    // To avoid this, we fall back to -1 when the limit is 0.
    --- End diff --

    I also would leave a short justification for this given https://github.com/apache/spark/pull/22227#issuecomment-417471241
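    For readers unfamiliar with the behavior the code comment refers to, here is a minimal sketch in plain Java. It assumes nothing about the actual body of UTF8String.split, which the diff does not show. The class SplitLimitDemo and the helper splitKeepingEmpties are hypothetical names used only for illustration. The sketch shows that java.lang.String.split drops trailing empty strings when the limit is 0, and that mapping 0 to -1 keeps them.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Hypothetical helper (not Spark's code): treat a limit of 0 as -1 so that
    // trailing empty strings are kept, as the comment in the diff describes.
    public static String[] splitKeepingEmpties(String input, String regex, int limit) {
        int effectiveLimit = (limit == 0) ? -1 : limit;
        return input.split(regex, effectiveLimit);
    }

    public static void main(String[] args) {
        String input = "a,b,,";

        // limit == 0: trailing empty strings are discarded.
        System.out.println(Arrays.toString(input.split(",", 0)));   // [a, b]

        // limit == -1: the pattern is applied as often as possible and
        // trailing empty strings are kept.
        System.out.println(Arrays.toString(input.split(",", -1)));  // [a, b, , ]

        // With the fallback, a caller passing 0 gets the same result as -1.
        System.out.println(Arrays.toString(splitKeepingEmpties(input, ",", 0)));  // [a, b, , ]
    }
}
```

    A positive limit is passed through unchanged in this sketch, so splitKeepingEmpties("a,b,c", ",", 2) would still yield two elements, "a" and "b,c".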