ianmcook commented on a change in pull request #11009:
URL: https://github.com/apache/arrow/pull/11009#discussion_r699308976
##########
File path: r/tests/testthat/test-dplyr-aggregate.R
##########
@@ -235,6 +235,24 @@ test_that("Group by any/all", {
   )
 })

+test_that("Group by n_distinct() on dataset", {
+  expect_dplyr_equal(
+    input %>%
+      group_by(some_grouping) %>%
+      summarize(distinct = n_distinct(lgl, na.rm = FALSE)) %>%
+      collect(),
+    tbl
+  )
+  skip("ARROW-13764 - CountOptions (na.rm) not yet implemented for compute_distinct")

Review comment:
Looks like ARROW-13764 is merged now
```suggestion
```

##########
File path: r/R/dplyr-functions.R
##########
@@ -825,6 +824,17 @@ agg_funcs$var <- function(x, na.rm = FALSE, ddof = 1) {
     options = list(ddof = ddof)
   )
 }
+
+agg_funcs$n_distinct <- function(x, na.rm = FALSE) {
+  list(
+    fun = "count_distinct",
+    data = x,
+    # ARROW-13764 Passing in na.rm = TRUE doesn't actually work yet as
+    # CountOptions not yet implemented for count_distinct

Review comment:
```suggestion
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org