dragosmg opened a new pull request, #13789:
URL: https://github.com/apache/arrow/pull/13789
Once this PR is merged, users should be able to call their own functions in
an {arrow} dplyr-like pipeline, provided those functions can be translated
using the existing bindings.
``` r
library(dplyr)
library(arrow)

nchar2 <- function(x) {
  1 + nchar(x)
}

# simple expression
tibble::tibble(my_string = "1234") %>%
  mutate(
    var1 = nchar(my_string),
    var2 = nchar2(my_string)) %>%
  collect()
#> # A tibble: 1 × 3
#>   my_string  var1  var2
#>   <chr>     <int> <dbl>
#> 1 1234          4     5

# a slightly more complicated expression
tibble::tibble(my_string = "1234") %>%
  mutate(
    var1 = nchar(my_string),
    var2 = 1 + nchar2(my_string)) %>%
  collect()
#> # A tibble: 1 × 3
#>   my_string  var1  var2
#>   <chr>     <int> <dbl>
#> 1 1234          4     6

nchar3 <- function(x) {
  2 + nchar(x)
}

# multiple unknown calls in the same expression (to test the iteration)
tibble::tibble(my_string = "1234") %>%
  mutate(
    var1 = nchar(my_string),
    var2 = nchar2(my_string) + nchar3(my_string)) %>%
  collect()
#> # A tibble: 1 × 3
#>   my_string  var1  var2
#>   <chr>     <int> <dbl>
#> 1 1234          4    11

# user function defined using namespacing
nchar4 <- function(x) {
  2 + base::nchar(x)
}

tibble::tibble(my_string = "1234") %>%
  mutate(
    var1 = nchar(my_string),
    var2 = 1 + nchar4(my_string)) %>%
  collect()
#> # A tibble: 1 × 3
#>   my_string  var1  var2
#>   <chr>     <int> <dbl>
#> 1 1234          4     7
```
The design considerations are captured in this [design
doc](https://docs.google.com/document/d/1vBp8M8yXXfwMfpAPTXPKVT6QvhGrZWc0Uv5_xpEOq00/edit#).
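
As a rough illustration of what "translated with the help of existing bindings" could mean, the base R sketch below replaces a call to a user-defined function with that function's body, so that only calls such as `nchar()` and `+` remain. `inline_call()` is a hypothetical helper written for this example. It is not part of {arrow}, and it is an assumption about the general idea rather than a description of how this PR implements the translation.

```r
# Hypothetical helper (not an {arrow} function): replace a call to a
# user-defined function with the function's body, substituting the
# expressions supplied in the call for the formal parameters.
inline_call <- function(call, env = parent.frame()) {
  # Look up the function named in the call
  fn <- get(as.character(call[[1]]), envir = env, mode = "function")

  # Name the supplied arguments after the formals of fn
  matched <- match.call(fn, call)
  args <- as.list(matched)[-1]

  # Drop the enclosing `{` when the body is a single expression
  fn_body <- body(fn)
  if (is.call(fn_body) &&
      identical(fn_body[[1]], as.name("{")) &&
      length(fn_body) == 2) {
    fn_body <- fn_body[[2]]
  }

  # Replace each formal parameter with the matching argument expression
  do.call(substitute, list(fn_body, args))
}

nchar2 <- function(x) {
  1 + nchar(x)
}

inline_call(quote(nchar2(my_string)))
#> 1 + nchar(my_string)
```

After inlining, the expression contains only `+` and `nchar()`, i.e. functions for which a translation can already exist. The sketch assumes the call names a plain function (not `pkg::fn`) whose body is a single expression.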