ianmcook commented on a change in pull request #11009:
URL: https://github.com/apache/arrow/pull/11009#discussion_r699308976



##########
File path: r/tests/testthat/test-dplyr-aggregate.R
##########
@@ -235,6 +235,24 @@ test_that("Group by any/all", {
   )
 })
 
+test_that("Group by n_distinct() on dataset", {
+  expect_dplyr_equal(
+    input %>%
+      group_by(some_grouping) %>%
+      summarize(distinct = n_distinct(lgl, na.rm = FALSE)) %>%
+      collect(),
+    tbl
+  )
+  skip("ARROW-13764 - CountOptions (na.rm) not yet implemented for compute_distinct")

Review comment:
       Looks like ARROW-13764 is merged now, so this `skip()` can be removed.
   ```suggestion
   ```

##########
File path: r/R/dplyr-functions.R
##########
@@ -825,6 +824,17 @@ agg_funcs$var <- function(x, na.rm = FALSE, ddof = 1) {
     options = list(ddof = ddof)
   )
 }
+
+agg_funcs$n_distinct <- function(x, na.rm = FALSE) {
+  list(
+    fun = "count_distinct",
+    data = x,
+    # ARROW-13764 Passing in na.rm = TRUE doesn't actually work yet as
+    # CountOptions not yet implemented for count_distinct

Review comment:
       ```suggestion
   ```





