[
https://issues.apache.org/jira/browse/ARROW-13766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461577#comment-17461577
]
Dewey Dunnington commented on ARROW-13766:
------------------------------------------
Without ties this isn't bad:
{code:R}
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
df <- tibble(a = rep(letters, 10), b = 1:260, c = 260:1)
# slice_*() without ties is easier
record_batch(df) %>%
  arrange(c) %>%
  head(5) %>%
  collect()
#> # A tibble: 5 × 3
#>   a         b     c
#>   <chr> <int> <int>
#> 1 z       260     1
#> 2 y       259     2
#> 3 x       258     3
#> 4 w       257     4
#> 5 v       256     5
record_batch(df) %>%
  arrange(desc(c)) %>%
  head(5) %>%
  collect()
#> # A tibble: 5 × 3
#>   a         b     c
#>   <chr> <int> <int>
#> 1 a         1   260
#> 2 b         2   259
#> 3 c         3   258
#> 4 d         4   257
#> 5 e         5   256
{code}
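For reference, here is a minimal sketch of what the pattern above is meant to reproduce, run on a plain in-memory tibble with dplyr only (no Arrow involved). It assumes a dplyr version that provides {{slice_min()}} and {{slice_max()}}; the variable names are just for illustration.
{code:R}
library(dplyr, warn.conflicts = FALSE)
df <- tibble(a = rep(letters, 10), b = 1:260, c = 260:1)
# Reference behaviour from dplyr itself, ignoring ties
min_ref <- df %>% slice_min(c, n = 5, with_ties = FALSE)
max_ref <- df %>% slice_max(c, n = 5, with_ties = FALSE)
# The arrange() + head() emulation used in the Arrow pipelines
min_emul <- df %>% arrange(c) %>% head(5)
max_emul <- df %>% arrange(desc(c)) %>% head(5)
# Both approaches should select the same rows in the same order
stopifnot(identical(min_ref$b, min_emul$b))
stopifnot(identical(max_ref$b, max_emul$b))
{code}
Since {{c}} has no duplicate values in this data, the sort order is unambiguous and the two approaches should agree row for row.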
With ties it isn't too bad either (it just needs a join):
{code:R}
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
df <- tibble(a = rep(letters, 10), b = 1:260, c = 260:1)
# slice_*() with ties needs a join
rb <- record_batch(df)
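# Take the first 5 values of the ordering column, deduplicate them,
# then join back to pick up every row that ties with those values
rb %>%
  arrange(a) %>%
  select(a) %>%
  head(5) %>%
  distinct() %>%
  left_join(rb, by = "a") %>%
  collect()
#> # A tibble: 10 × 3
#>    a         b     c
#>    <chr> <int> <int>
#>  1 a         1   260
#>  2 a        27   234
#>  3 a        53   208
#>  4 a        79   182
#>  5 a       105   156
#>  6 a       131   130
#>  7 a       157   104
#>  8 a       183    78
#>  9 a       209    52
#> 10 a       235    26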
{code}
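As a sanity check on the join approach, here is a small sketch comparing it with dplyr's own {{slice_min()}} (which keeps ties by default) on a plain tibble. This uses dplyr only, not Arrow, and the variable names are hypothetical.
{code:R}
library(dplyr, warn.conflicts = FALSE)
df <- tibble(a = rep(letters, 10), b = 1:260, c = 260:1)
# Reference: slice_min() keeps all rows tied with the selected values
ties_ref <- df %>% slice_min(a, n = 5)
# Emulation: first 5 key values, deduplicated, joined back to the data
ties_emul <- df %>%
  arrange(a) %>%
  select(a) %>%
  head(5) %>%
  distinct() %>%
  left_join(df, by = "a")
# Compare the selected rows without relying on row order
stopifnot(nrow(ties_ref) == nrow(ties_emul))
stopifnot(identical(sort(ties_ref$b), sort(ties_emul$b)))
{code}
The first five sorted values of {{a}} are all "a", so both versions should return every row where {{a == "a"}}, which is why asking for 5 yields 10 rows.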
> [R] Add Arrow methods slice_min(), slice_max()
> ----------------------------------------------
>
> Key: ARROW-13766
> URL: https://issues.apache.org/jira/browse/ARROW-13766
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Ian Cook
> Priority: Major
> Labels: query-engine
> Fix For: 7.0.0
>
>
> Implement [{{slice_min()}} and
> {{slice_max()}}|https://dplyr.tidyverse.org/reference/slice.html] methods for
> {{ArrowTabular}}, {{Dataset}}, and {{arrow_dplyr_query}} objects.
> These dplyr functions supersede the older dplyr function
> [{{top_n()}}|https://dplyr.tidyverse.org/reference/top_n.html] which I
> suppose we should also consider implementing a method for.