[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186635312

--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
             column(jc)
           })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-          signature(x = "Column"),
-          function(x) {
-            jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-            column(jc)
-          })
-
--- End diff --

I will put them back to the original places and add back the examples. Thanks!

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186634471

--- Diff: R/pkg/R/functions.R ---
@@ -2043,34 +2033,6 @@ setMethod("countDistinct",
             column(jc)
           })
 
-#' @details
-#' \code{concat}: Concatenates multiple input columns together into a single column.
-#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
-#'
-#' @rdname column_string_functions
-#' @aliases concat concat,Column-method
-#' @examples
-#'
-#' \dontrun{
-#' # concatenate strings
-#' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
-#'               s2 = concat(df$Class, df$Sex, df$Age),
-#'               s3 = concat(df$Class, df$Sex, df$Age, df$Class),
-#'               s4 = concat_ws("_", df$Class, df$Sex),
-#'               s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived))
--- End diff --

Yes, let's not lose the examples. We should add concat to column_collection_functions and concat_ws to column_string_functions.
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186633965

--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
             column(jc)
           })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-          signature(x = "Column"),
-          function(x) {
-            jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-            column(jc)
-          })
-
--- End diff --

Right, but that's not strictly where the collection functions are. I'm not saying we need to re-sort/group all the functions - that would be thousands of lines to change.
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186631550

--- Diff: R/pkg/R/functions.R ---
@@ -2043,34 +2033,6 @@ setMethod("countDistinct",
             column(jc)
           })
 
-#' @details
-#' \code{concat}: Concatenates multiple input columns together into a single column.
-#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
-#'
-#' @rdname column_string_functions
-#' @aliases concat concat,Column-method
-#' @examples
-#'
-#' \dontrun{
-#' # concatenate strings
-#' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
-#'               s2 = concat(df$Class, df$Sex, df$Age),
-#'               s3 = concat(df$Class, df$Sex, df$Age, df$Class),
-#'               s4 = concat_ws("_", df$Class, df$Sex),
-#'               s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived))
--- End diff --

Since I have now put concat in the Collection functions section, I am not sure whether I should keep the column_string_functions examples there. I will add them back if you prefer to keep these examples.
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186630904

--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
             column(jc)
           })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-          signature(x = "Column"),
-          function(x) {
-            jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-            column(jc)
-          })
-
--- End diff --

@felixcheung Thanks for your comment. The reason I moved reverse and concat around is that I want to put them under line 2943, in the Collection functions section:

## Collection functions ##
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186624915

--- Diff: R/pkg/R/functions.R ---
@@ -2043,34 +2033,6 @@ setMethod("countDistinct",
             column(jc)
           })
 
-#' @details
-#' \code{concat}: Concatenates multiple input columns together into a single column.
-#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
-#'
-#' @rdname column_string_functions
-#' @aliases concat concat,Column-method
-#' @examples
-#'
-#' \dontrun{
-#' # concatenate strings
-#' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
-#'               s2 = concat(df$Class, df$Sex, df$Age),
-#'               s3 = concat(df$Class, df$Sex, df$Age, df$Class),
-#'               s4 = concat_ws("_", df$Class, df$Sex),
-#'               s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived))
--- End diff --

Also, what happened to all the other examples?
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186624766

--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
             column(jc)
           })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-          signature(x = "Column"),
-          function(x) {
-            jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-            column(jc)
-          })
-
--- End diff --

So these functions are not actually ordered or grouped - my preference would be not to move them around unless we have to.
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186624379

--- Diff: R/pkg/R/functions.R ---
@@ -209,6 +209,7 @@ NULL
 #' head(select(tmp, array_max(tmp$v1), array_min(tmp$v1)))
 #' head(select(tmp, array_position(tmp$v1, 21)))
 #' head(select(tmp, flatten(tmp$v1)))
+#' head(select(tmp, reverse(tmp$v1)))
--- End diff --

Let's merge this into an existing line - we don't need one line per function. We don't want each line to be too long, but we also don't want too many lines.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186612820

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1739,6 +1748,13 @@ test_that("string operators", {
     collect(select(df5, repeat_string(df5$a, -1)))[1, 1],
     ""
   )
+
+  l6 <- list(list(a = "abc"))
+  df6 <- createDataFrame(l6)
+  expect_equal(
+    collect(select(df6, reverse(df6$a)))[1, 1],
+    "cba"
+  )
--- End diff --

Let's make this inlined while we are here. I would also put this test around 1505L above.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186612845

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1739,6 +1748,13 @@ test_that("string operators", {
     collect(select(df5, repeat_string(df5$a, -1)))[1, 1],
     ""
   )
+
+  l6 <- list(list(a = "abc"))
+  df6 <- createDataFrame(l6)
--- End diff --

I would combine those two lines too.
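Putting the two suggestions above together (inline the assertion, combine the data-creation lines), the reworked test might look like the following sketch. This is only an illustration of the suggested change, not the final committed code, and it assumes the usual SparkR test setup (an active Spark session and testthat loaded):

```r
# One-line data creation, inlined assertion (sketch of the suggested change)
df6 <- createDataFrame(list(list(a = "abc")))
expect_equal(collect(select(df6, reverse(df6$a)))[1, 1], "cba")
```

Inlining keeps the string-reverse check to two lines, matching the style of the surrounding `expect_equal` calls in test_sparkSQL.R.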
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186610290

--- Diff: R/pkg/R/functions.R ---
@@ -218,6 +219,8 @@ NULL
 #' head(select(tmp3, map_keys(tmp3$v3)))
 #' head(select(tmp3, map_values(tmp3$v3)))
 #' head(select(tmp3, element_at(tmp3$v3, "Valiant")))}
--- End diff --

The closing `}` for this `\dontrun` block is wrongly placed here.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186604950

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1748,6 +1748,14 @@ test_that("string operators", {
     collect(select(df5, repeat_string(df5$a, -1)))[1, 1],
     ""
   )
+
+  l6 <- list(list(a = "abc"))
+  df6 <- createDataFrame(l6)
+  df7 <- select(df6, reverse(df6$a))
--- End diff --

`df7` is not used below?
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186604721

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1502,12 +1502,21 @@ test_that("column functions", {
   result <- collect(select(df, sort_array(df[[1]])))[[1]]
   expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L)))
 
-  # Test flattern
+  result <- collect(select(df, reverse(df[[1]])))[[1]]
--- End diff --

@viirya Thanks for your comments. I will make the changes.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186368498

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1502,12 +1502,21 @@ test_that("column functions", {
   result <- collect(select(df, sort_array(df[[1]])))[[1]]
   expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L)))
 
-  # Test flattern
+  result <- collect(select(df, reverse(df[[1]])))[[1]]
--- End diff --

It seems we don't have a test for `reverse` on strings. Can you add one for it too?
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21255#discussion_r186369103

--- Diff: R/pkg/R/functions.R ---
@@ -209,6 +209,7 @@ NULL
 #' head(select(tmp, array_max(tmp$v1), array_min(tmp$v1)))
 #' head(select(tmp, array_position(tmp$v1, 21)))
 #' head(select(tmp, flatten(tmp$v1)))
+#' head(select(tmp, reverse(tmp$v1)))
--- End diff --

Also add `concat` here?
GitHub user huaxingao opened a pull request:
https://github.com/apache/spark/pull/21255
[SPARK-24186][SparR][SQL]change reverse and concat to collection functions in R

## What changes were proposed in this pull request?

reverse and concat are already in functions.R as column string functions. Since these two functions are now categorized as collection functions in Scala and Python, we do the same in R.

## How was this patch tested?

Added tests in test_sparkSQL.R.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/huaxingao/spark spark-24186

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21255.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21255

commit 3985285089673e42a85a5d1ba3cd7419a6948909
Author: Huaxin Gao
Date: 2018-05-07T06:47:34Z

    [SPARK-24186][SparR][SQL]add array reverse and concat functions to R
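For context, the recategorization matters because `reverse` and `concat` operate on array columns as well as string columns. The following SparkR sketch mirrors the test code quoted in the review above; it is an illustration only and assumes a running Spark session (the variable names follow the quoted tests):

```r
library(SparkR)
sparkR.session()  # assumes a local Spark installation is available

# Array column (mirrors the "column functions" test quoted in the review):
# each row holds a list, so reverse() reverses the array elements
df <- createDataFrame(list(list(list(1L, 2L, 3L)), list(list(4L, 5L, 6L))))
result <- collect(select(df, reverse(df[[1]])))[[1]]
# first row becomes list(3L, 2L, 1L), second becomes list(6L, 5L, 4L)

# String column (behavior unchanged by the move):
df6 <- createDataFrame(list(list(a = "abc")))
collect(select(df6, reverse(df6$a)))[1, 1]  # "cba"
```

This is why the reviewers ask for both an array test and a string test: the same generic now dispatches over two element types, and both paths need coverage.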