nealrichardson commented on a change in pull request #9606:
URL: https://github.com/apache/arrow/pull/9606#discussion_r589822176
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -723,6 +723,17 @@ test_that("[ accepts Arrays and otherwise handles bad input", {
)
})
+test_that("[ %in% looks up string key in dictionary", {
+ a1 <- Array$create(as.factor(c("A", "B", "C")))
+ a2 <- DictionaryArray$create(c(0L, 1L, 2L), c(4.5, 3.2, 1.1))
+ b1 <- Array$create(c(FALSE, TRUE, FALSE))
+ b2 <- Array$create(c(FALSE, FALSE, FALSE))
+ expect_equal(b1, arrow:::call_function("is_in_meta_binary", a1, Array$create("B")))
+ expect_equal(b2, arrow:::call_function("is_in_meta_binary", a1, Array$create("D")))
+ expect_equal(b1, arrow:::call_function("is_in_meta_binary", a2, Array$create(3.2)))
+ expect_error(arrow:::call_function("is_in_meta_binary", a2, Array$create("B")))
Review comment:
Within a dplyr method, we can use nonstandard evaluation to map any R
function call to the appropriate arrow compute function, and we do. Under R's
standard evaluation, though, we can only define arrow methods for R functions
that are defined as generics. We do this for (most) arithmetic and logical
operations, for example. However, `%in%` is not a generic (nor is `match()`,
the lower-level function it calls), so we can't.
I'm OK with adding an `is_in()` function, though I'm also OK with not adding
one, since I don't think it would get used much. What do you think, @jonkeane
@ianmcook?
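To make the generic vs. non-generic distinction concrete, here is a minimal base-R sketch. It does not use arrow at all: `toy_array`, `calls_use_method()`, and the `is_in()` generic below are hypothetical names made up for illustration, and the `UseMethod` check is only a rough heuristic for "is this an S3 generic".

```r
# Hypothetical stand-in for an Arrow array: a list wrapping an R vector.
toy_array <- function(x) structure(list(values = x), class = "toy_array")

# Arithmetic and comparison operators belong to the Ops group generic,
# so a class can attach a method and plain `a == 2` will dispatch to it.
Ops.toy_array <- function(e1, e2) {
  v1 <- if (inherits(e1, "toy_array")) e1$values else e1
  v2 <- if (inherits(e2, "toy_array")) e2$values else e2
  toy_array(match.fun(.Generic)(v1, v2))
}

a <- toy_array(c(1, 2, 3))
eq_result <- (a == 2)$values   # FALSE TRUE FALSE

# Rough heuristic: does the function body mention UseMethod()?
calls_use_method <- function(f) {
  any(grepl("UseMethod", deparse(body(f)), fixed = TRUE))
}

in_is_generic    <- calls_use_method(`%in%`)  # FALSE: plain closure around match()
match_is_generic <- calls_use_method(match)   # FALSE: goes straight to .Internal()
print_is_generic <- calls_use_method(print)   # TRUE: an ordinary S3 generic

# Since `%in%` cannot be given a method this way, one option is a separate
# generic (hypothetical here) that falls back to `%in%` by default.
is_in <- function(x, table) UseMethod("is_in")
is_in.default   <- function(x, table) x %in% table
is_in.toy_array <- function(x, table) toy_array(x$values %in% table)

in_result <- is_in(a, 2)$values   # FALSE TRUE FALSE
```

The trade-off is that users would have to call `is_in(x, table)` explicitly rather than writing `x %in% table`, which is part of why it might not see much use.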