nealrichardson commented on a change in pull request #10327: URL: https://github.com/apache/arrow/pull/10327#discussion_r632858178
########## File path: r/R/dplyr-functions.R ########## @@ -109,6 +109,39 @@ nse_funcs$as.numeric <- function(x) { Expression$create("cast", x, options = cast_options(to_type = float64())) } +# is.* type functions +nse_funcs$is.character <- function(x) { + x$type_id() %in% Type[c("STRING", "LARGE_STRING")] +} + +nse_funcs$is.numeric <- function(x) { + x$type_id() %in% c(2:12, 23:24) +} + +nse_funcs$is.double <- function(x) { + x$type_id() %in% Type["DOUBLE"] +} + +nse_funcs$is.integer <- function(x) { + x$type_id() %in% c(2:9) +} + +nse_funcs$is.integer64 <- function(x) { + x$type_id() %in% Type[c("UINT64", "INT64")] Review comment: Technically `bit64::integer64` can't do uint64 ```suggestion x$type_id() %in% Type["INT64"] ``` ########## File path: r/R/dplyr-functions.R ########## @@ -109,6 +109,39 @@ nse_funcs$as.numeric <- function(x) { Expression$create("cast", x, options = cast_options(to_type = float64())) } +# is.* type functions +nse_funcs$is.character <- function(x) { + x$type_id() %in% Type[c("STRING", "LARGE_STRING")] +} + +nse_funcs$is.numeric <- function(x) { + x$type_id() %in% c(2:12, 23:24) Review comment: I would spell these out too with the Type enum, just so they're obvious ########## File path: r/R/dplyr-functions.R ########## @@ -109,6 +109,39 @@ nse_funcs$as.numeric <- function(x) { Expression$create("cast", x, options = cast_options(to_type = float64())) } +# is.* type functions +nse_funcs$is.character <- function(x) { + x$type_id() %in% Type[c("STRING", "LARGE_STRING")] +} + +nse_funcs$is.numeric <- function(x) { + x$type_id() %in% c(2:12, 23:24) +} + +nse_funcs$is.double <- function(x) { + x$type_id() %in% Type["DOUBLE"] +} + +nse_funcs$is.integer <- function(x) { + x$type_id() %in% c(2:9) +} + +nse_funcs$is.integer64 <- function(x) { + x$type_id() %in% Type[c("UINT64", "INT64")] +} + +nse_funcs$is.logical <- function(x) { + x$type_id() %in% Type["BOOL"] +} + +nse_funcs$is.factor <- function(x) { + x$type_id() == Type["DICTIONARY"] +} + +nse_funcs$is.list <- function(x) { + x$type_id() %in% Type[c("LIST", "FIXED_SIZE_LIST", "LARGE_LIST")] +} + Review comment: You may want to consider other method(s) to do the same logic but with Arrow-specific capabilities. Perhaps something as simple as this (following the base R signature) ```r nse_funcs$is <- function(object, class2) { x$type$ToString() %in% class2 } ``` Also, are there hms/lubridate/etc. `is_*` functions, or tidyverse-spelling versions of these base R ones, to add? ########## File path: r/R/expression.R ########## @@ -76,7 +76,15 @@ Expression <- R6Class("Expression", inherit = ArrowObject, public = list( ToString = function() compute___expr__ToString(self), - type = function(schema) compute___expr__type(self, schema), + schema = NULL, + bind = function(schema) self$schema <- schema, + type = function() { + if (is.null(self$schema)) { + stop("Must bind() expression to a schema before returning its type", call. = FALSE) + } + compute___expr__type(self, self$schema) Review comment: How about this? ```suggestion type = function(schema = self$schema) { assert_that(!is.null(schema)) compute___expr__type(self, schema) ``` I think it's misleading to have a `$bind()` method that doesn't actually Bind the Expression. I think you can achieve the same by doing `expr$schema <- schema` inside `relocate()` And also this way you retain the schema arg to `$type()` (so you don't need to modify the print method above) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org