[ https://issues.apache.org/jira/browse/ARROW-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17585019#comment-17585019 ]
Travis Lim commented on ARROW-12711:
------------------------------------

[~icook] Any updates on bindings for dplyr summarise with paste(collapse) or str_c(collapse) in upcoming releases? A potential workaround was floated for Python here: https://issues.apache.org/jira/browse/ARROW-12710, but having this in R would be a game changer, especially for NLP applications :pray: :pray: :pray:

> [R] Bindings for paste(collapse), str_c(collapse), and str_flatten()
> --------------------------------------------------------------------
>
>                 Key: ARROW-12711
>                 URL: https://issues.apache.org/jira/browse/ARROW-12711
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Ian Cook
>            Priority: Major
>              Labels: query-engine
>
> These are the aggregating versions of string concatenation: they combine
> values from a set of rows into a single value.
> The bindings for {{paste()}} and {{str_c()}} might be tricky to implement
> because when these functions are called with the {{collapse}} argument
> unset, they do _not_ aggregate.
> In {{summarise()}} we need to be able to use scalar concatenation within
> aggregate concatenation, like this:
> {code:java}
> starwars %>%
>   filter(!is.na(hair_color) & !is.na(eye_color)) %>%
>   group_by(homeworld) %>%
>   summarise(hair_and_eyes = paste0(paste0(hair_color, "-haired and ",
>     eye_color, "-eyed"), collapse = ", ")){code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
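For readers unfamiliar with the distinction the issue draws, here is a minimal base R sketch of scalar versus aggregating concatenation. It uses no Arrow or dplyr code; the vectors hair, eyes, and homeworld are made-up illustrative values (not the starwars dataset), and tapply() merely stands in for group_by() plus summarise().

```r
# Illustrative data only (assumption): three rows, two groups.
hair      <- c("blond", "brown", "black")
eyes      <- c("blue", "brown", "brown")
homeworld <- c("Tatooine", "Tatooine", "Naboo")

# collapse unset: element-wise (scalar) concatenation, one string per row.
scalar <- paste0(hair, "-haired and ", eyes, "-eyed")

# collapse set: aggregating concatenation, all rows become one string.
all_rows <- paste0(scalar, collapse = ", ")

# Scalar concatenation nested inside aggregate concatenation, per group.
# Extra arguments to tapply() are passed through to paste0().
by_world <- tapply(scalar, homeworld, paste0, collapse = ", ")

print(scalar)
print(all_rows)
print(by_world)
```

The same function therefore changes from a row-wise operation to an aggregation depending only on whether collapse is supplied, which is the difficulty the issue description points out for the bindings.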