Jonathan Keane created ARROW-14027:
--------------------------------------

             Summary: [R] Allow me to group_by + summarise() with partitioning fields
                 Key: ARROW-14027
                 URL: https://issues.apache.org/jira/browse/ARROW-14027
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
            Reporter: Jonathan Keane
             Fix For: 6.0.0

If one puts a field that is one of the partitioning variables in {{group_by()}} and then summarises, we get a segfault:

{code:r}
library(arrow)
library(dplyr)

temp <- tempfile()
write_dataset(mtcars, path = temp, partitioning = "cyl")

ds <- open_dataset(temp)

# this works just fine
ds %>%
  group_by(gear) %>%
  summarise(
    sum(mpg)
  ) %>%
  collect()

# however this segfaults (regardless of the aggregation, even simply n())
#  *** caught segfault ***
# address 0x0, cause 'memory not mapped'
ds %>%
  group_by(cyl) %>%
  summarise(
    sum(mpg)
  ) %>%
  collect()
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)