[ 
https://issues.apache.org/jira/browse/ARROW-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17585019#comment-17585019
 ] 

Travis Lim commented on ARROW-12711:
------------------------------------

[~icook] Any updates on bindings for dplyr summarise with paste(collapse) or 
str_c(collapse) in upcoming releases?

A potential workaround for Python was floated in 
https://issues.apache.org/jira/browse/ARROW-12710, but having this in R would 
be a game changer, especially for NLP applications :pray: :pray: :pray:

 

> [R] Bindings for paste(collapse), str_c(collapse), and str_flatten()
> --------------------------------------------------------------------
>
>                 Key: ARROW-12711
>                 URL: https://issues.apache.org/jira/browse/ARROW-12711
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Ian Cook
>            Priority: Major
>              Labels: query-engine
>
> These are the aggregating versions of string concatenation—they combine 
> values from a set of rows into a single value. 
> The bindings for {{paste()}} and {{str_c()}} might be tricky to implement 
> because when these functions are called with the {{collapse}} argument 
> unset, they do _not_ aggregate.
> In {{summarise()}} we need to be able to use scalar concatenation within 
> aggregate concatenation, like this: 
> {code:java}
> starwars %>%
>   filter(!is.na(hair_color) & !is.na(eye_color)) %>%
>   group_by(homeworld) %>%
>   summarise(
>     hair_and_eyes = paste0(
>       paste0(hair_color, "-haired and ", eye_color, "-eyed"),
>       collapse = ", "
>     )
>   )
> {code}
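
For reference, the scalar vs. aggregate distinction described in the issue can be seen in plain base R, with no Arrow or dplyr involved. This is only a rough sketch: the three rows below are made-up sample values (not taken from the starwars dataset), and tapply() stands in for group_by() + summarise() purely for illustration. It says nothing about how an Arrow binding would be implemented.

```r
# Made-up sample data, for illustration only
hair      <- c("blond", "brown", "black")
eyes      <- c("blue", "brown", "brown")
homeworld <- c("Tatooine", "Naboo", "Tatooine")

# collapse unset: element-wise (scalar) concatenation, one string per row
scalar <- paste0(hair, "-haired and ", eyes, "-eyed")
length(scalar)      # 3

# collapse set: aggregating concatenation, all rows become one string
aggregated <- paste0(scalar, collapse = ", ")
length(aggregated)  # 1

# Scalar concatenation nested inside aggregate concatenation, per group.
# tapply() passes collapse = ", " through to paste0() for each group.
by_world <- tapply(scalar, homeworld, paste0, collapse = ", ")
print(by_world)
```

The same function therefore has two different shapes of output depending on whether collapse is supplied, which is the case a binding would need to tell apart.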



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
