nealrichardson commented on a change in pull request #9606:
URL: https://github.com/apache/arrow/pull/9606#discussion_r589822176
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -723,6 +723,17 @@ test_that("[ accepts Arrays and otherwise handles bad input", {
   )
 })
+test_that("[ %in% looks up string key in dictionary", {
+  a1 <- Array$create(as.factor(c("A", "B", "C")))
+  a2 <- DictionaryArray$create(c(0L, 1L, 2L), c(4.5, 3.2, 1.1))
+  b1 <- Array$create(c(FALSE, TRUE, FALSE))
+  b2 <- Array$create(c(FALSE, FALSE, FALSE))
+  expect_equal(b1, arrow:::call_function("is_in_meta_binary", a1, Array$create("B")))
+  expect_equal(b2, arrow:::call_function("is_in_meta_binary", a1, Array$create("D")))
+  expect_equal(b1, arrow:::call_function("is_in_meta_binary", a2, Array$create(3.2)))
+  expect_error(arrow:::call_function("is_in_meta_binary", a2, Array$create("B")))

Review comment:
Within the context of a dplyr method, we can do nonstandard evaluation and map anything we want to the appropriate arrow compute functions, and we do that. But when using R's standard evaluation, we can only define arrow methods where the R function is defined as a generic. We do this for (most) arithmetic and logical operations, for example. However, `%in%` is not defined that way (nor is `match()`, the lower-level function it calls), so we can't define an arrow method for it.

I'm OK with adding an `is_in()` function, though I'm also OK with not doing it, as I don't think it will get used much. What do you think, @jonkeane @ianmcook?
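
To make the dispatch limitation in the review comment concrete, here is a minimal sketch in base R. It does not use arrow at all: `toy_array` is a hypothetical S3 class standing in for an Arrow Array, and the `is_in()` generic shown here is only an assumption about what such a function could look like, not arrow's actual API. The sketch shows that an `Ops` method is picked up for `==`, that `match()` and `%in%` are not standard S3 generics, and that an explicit generic is one way around that.

```r
# Hypothetical stand-in for an Arrow Array (NOT an arrow class).
toy_array <- function(values) {
  structure(list(values = values), class = "toy_array")
}

# `==` belongs to the Ops group generic, so R dispatches to this S3 method.
Ops.toy_array <- function(e1, e2) {
  v1 <- if (inherits(e1, "toy_array")) e1$values else e1
  v2 <- if (inherits(e2, "toy_array")) e2$values else e2
  get(.Generic)(v1, v2)
}

a <- toy_array(c("A", "B", "C"))
print(a == "B")  # FALSE TRUE FALSE, computed by Ops.toy_array

# `%in%` is an ordinary closure that calls match(); neither calls UseMethod(),
# so a method such as `match.toy_array` would never be dispatched to.
print(body(`%in%`))
print(utils::isS3stdGeneric(match))   # FALSE
print(utils::isS3stdGeneric(`%in%`))  # FALSE

# Hypothetical workaround: an explicit generic with its own name.
# The name and signature are assumptions made for this illustration.
is_in <- function(x, table) UseMethod("is_in")
is_in.default <- function(x, table) x %in% table
is_in.toy_array <- function(x, table) x$values %in% table

print(is_in(a, "B"))  # FALSE TRUE FALSE
print(is_in(a, "D"))  # FALSE FALSE FALSE
```

The trade-off is the one raised in the comment: users would have to call `is_in(x, table)` explicitly instead of writing `x %in% table`, which is why it may not see much use outside dplyr verbs.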