[GitHub] [spark] MichaelChirico commented on a change in pull request #28386: [SPARK-31517][R] fix strategy for handling ... names in mutate
MichaelChirico commented on a change in pull request #28386:
URL: https://github.com/apache/spark/pull/28386#discussion_r416376995

## File path: R/pkg/R/DataFrame.R ##
@@ -2287,16 +2287,19 @@ setMethod("mutate",
 # For named arguments, use the names for arguments as the column names
 # For unnamed arguments, use the argument symbols as the column names
-args <- sapply(substitute(list(...))[-1], deparse)
 ns <- names(cols)
-if (!is.null(ns)) {
-  lapply(seq_along(args), function(i) {
-    if (ns[[i]] != "") {
-      args[[i]] <<- ns[[i]]
-    }
+if (is.null(ns)) ns <- rep('', length(cols))
+named_idx <- nzchar(ns)
+args <- character(length(ns))
+if (any(named_idx)) args[named_idx] <- ns[named_idx]
+if (!all(named_idx)) {
+  # SPARK-31517: deparse uses width.cutoff on wide input and the
+  # output is length>1, so need to collapse it to scalar
+  colsub <- substitute(list(...))[-1L]
+  args[!named_idx] <- sapply(which(!named_idx), function(ii) {
+    paste(trimws(deparse(colsub[[ii]])), collapse = ' ')

Review comment:
Have added `trimws` as a backport

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
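For readers wondering what a `trimws` backport might involve, here is a minimal sketch. The name `trimws_compat` and its body are assumptions made for illustration; this is not the definition added in the pull request. It relies only on `sub()`, which is available in R versions older than 3.2.0.

```r
# Hypothetical helper, not the definition used in the pull request:
# strip leading and trailing whitespace using only sub().
trimws_compat <- function(x) {
  sub("[ \t\r\n]+$", "", sub("^[ \t\r\n]+", "", x))
}

trimws_compat(c("  a + b ", "\tcol"))
```

Unlike `base::trimws`, this sketch has no `which` argument and always trims both sides, which is all the `mutate` code path shown in the diff needs.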
[GitHub] [spark] MichaelChirico commented on a change in pull request #28386: [SPARK-31517][R] fix strategy for handling ... names in mutate
MichaelChirico commented on a change in pull request #28386:
URL: https://github.com/apache/spark/pull/28386#discussion_r416376851

## File path: R/pkg/R/DataFrame.R ##
@@ -3445,7 +3448,7 @@ setMethod("as.data.frame",
 #' @note attach since 1.6.0
 setMethod("attach",
           signature(what = "SparkDataFrame"),
-          function(what, pos = 2L, name = deparse(substitute(what), backtick = FALSE),
+          function(what, pos = 2L, name = deparse1(substitute(what), backtick = FALSE),

Review comment:
This is now the signature of `base::attach` in R 4.0.0.
[GitHub] [spark] MichaelChirico commented on a change in pull request #28386: [SPARK-31517][R] fix strategy for handling ... names in mutate
MichaelChirico commented on a change in pull request #28386:
URL: https://github.com/apache/spark/pull/28386#discussion_r416375375

## File path: R/pkg/R/DataFrame.R ##
@@ -2287,16 +2287,19 @@ setMethod("mutate",
 # For named arguments, use the names for arguments as the column names
 # For unnamed arguments, use the argument symbols as the column names
-args <- sapply(substitute(list(...))[-1], deparse)
 ns <- names(cols)
-if (!is.null(ns)) {
-  lapply(seq_along(args), function(i) {
-    if (ns[[i]] != "") {
-      args[[i]] <<- ns[[i]]
-    }
+if (is.null(ns)) ns <- rep('', length(cols))
+named_idx <- nzchar(ns)
+args <- character(length(ns))
+if (any(named_idx)) args[named_idx] <- ns[named_idx]
+if (!all(named_idx)) {
+  # SPARK-31517: deparse uses width.cutoff on wide input and the
+  # output is length>1, so need to collapse it to scalar
+  colsub <- substitute(list(...))[-1L]
+  args[!named_idx] <- sapply(which(!named_idx), function(ii) {
+    paste(trimws(deparse(colsub[[ii]])), collapse = ' ')

Review comment:
Just remembered that `trimws` was added in R 3.2.0, and `SparkR`'s stated dependency is R 3.1.0.
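The naming strategy in the diff above can be tried outside of SparkR. The following is a minimal standalone sketch, assuming a hypothetical function `column_names` that is not part of SparkR: named arguments keep their names, and unnamed arguments are labelled with their deparsed expression collapsed to a single string. It uses `base::trimws`, so as written it needs R 3.2.0 or later.

```r
# Hypothetical standalone helper mirroring the strategy in the diff.
column_names <- function(...) {
  # Capture the unevaluated arguments, dropping the leading `list` symbol.
  exprs <- substitute(list(...))[-1L]
  ns <- names(exprs)
  if (is.null(ns)) ns <- rep("", length(exprs))
  named_idx <- nzchar(ns)
  out <- character(length(ns))
  # Named arguments: use the supplied name.
  out[named_idx] <- ns[named_idx]
  # Unnamed arguments: deparse may return several strings for wide
  # expressions, so collapse them into one.
  out[!named_idx] <- vapply(which(!named_idx), function(ii) {
    paste(trimws(deparse(exprs[[ii]])), collapse = " ")
  }, character(1L))
  out
}

column_names(total = a + b, a * 2)
```

The sketch swaps `sapply` for `vapply` so that the result is guaranteed to be a character vector even when no argument is unnamed.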
[GitHub] [spark] MichaelChirico commented on a change in pull request #28386: [SPARK-31517][R] fix strategy for handling ... names in mutate
MichaelChirico commented on a change in pull request #28386:
URL: https://github.com/apache/spark/pull/28386#discussion_r416372474

## File path: R/pkg/R/DataFrame.R ##
@@ -2287,16 +2287,19 @@ setMethod("mutate",
 # For named arguments, use the names for arguments as the column names
 # For unnamed arguments, use the argument symbols as the column names
-args <- sapply(substitute(list(...))[-1], deparse)

Review comment:
R 4.0.0 adds `deparse1`, which would have been more appropriate here:

> `deparse1()` is a simple utility added in R 4.0.0 to ensure a string result (character vector of length one), typically used in name construction, as `deparse1(substitute(.))`.

That function is just a wrapper, so it is easy to backport:

```
deparse1 = function (expr, collapse = " ", width.cutoff = 500L, ...)
  paste(deparse(expr, width.cutoff, ...), collapse = collapse)
```

(though personally I would still stick with `trimws`)
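To see the behaviour that SPARK-31517 addresses, the sketch below deparses a deliberately wide expression. The column names in it are made up for illustration. With the default `width.cutoff`, `deparse` splits the output across several strings, which cannot serve as a single column name until they are collapsed.

```r
# A deliberately wide expression (illustrative names only).
wide <- quote(
  some_long_column_name_a + some_long_column_name_b + some_long_column_name_c +
    some_long_column_name_d
)

# deparse() breaks long output into several strings.
pieces <- deparse(wide)
length(pieces) > 1L

# Trim the indentation deparse() adds to continuation lines, then
# collapse everything into one string.
collapsed <- paste(trimws(pieces), collapse = " ")
length(collapsed)
```

Raising `width.cutoff` to its maximum of 500, as `deparse1` does, avoids most splits, but the `paste(..., collapse = )` step is what guarantees a length-one result.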