[ https://issues.apache.org/jira/browse/ARROW-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-13761.
-------------------------------------
    Fix Version/s: 6.0.0
       Resolution: Fixed

Issue resolved by pull request 11033
[https://github.com/apache/arrow/pull/11033]

> [R] arrow::filter() crashes (aborts R session)
> ----------------------------------------------
>
>                 Key: ARROW-13761
>                 URL: https://issues.apache.org/jira/browse/ARROW-13761
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 5.0.0
>            Reporter: Carl Boettiger
>            Assignee: Weston Pace
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 6.0.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Arrow crashes (aborts the R session) when attempting to evaluate `filter` with a
> `collect()` command, e.g. following arrow's dplyr vignette:
> https://cran.r-project.org/web/packages/arrow/vignettes/dataset.html
> ```r
> library(arrow)
> library(dplyr)
> ds <- open_dataset("nyc-taxi", partitioning = c("year", "month"))
> x <- ds %>%
>   filter(total_amount > 100, year == 2015)
> x %>% collect() # crashes R
> ```
> (Note: for simplicity I downloaded only years 2009 and 2010 using the R loop
> you provide in the vignette.)
> I observe this behavior in an RStudio Server instance on an Ubuntu 20.04 Linux
> server with 128 cores and 256 GB RAM.
> Here's my sessionInfo():
> ```r
> sessionInfo()
> R version 4.1.0 (2021-05-18)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 20.04.2 LTS
> Matrix products: default
> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> other attached packages:
> [1] dplyr_1.0.7 arrow_5.0.0
> loaded via a namespace (and not attached):
>  [1] fansi_0.5.0      crayon_1.4.1     utf8_1.2.2       assertthat_0.2.1
>  [5] R6_2.5.1         DBI_1.1.1        lifecycle_1.0.0  magrittr_2.0.1
>  [9] pillar_1.6.2     rlang_0.4.11     vctrs_0.3.8      generics_0.1.0
> [13] ellipsis_0.3.2   tools_4.1.0      bit64_4.0.5      glue_1.4.2
> [17] purrr_0.3.4      bit_4.0.4        compiler_4.1.0   pkgconfig_2.0.3
> [21] tidyselect_1.1.1 tibble_3.1.3
> ```

--
This message was sent by Atlassian Jira (v8.3.4#803005)