Lucas Mation created ARROW-18314:
------------------------------------

             Summary: "open_dataset(f) %>% filter(id %in% myvec) %>% collect" causes cpp11::unwind_exception, crashed R
                 Key: ARROW-18314
                 URL: https://issues.apache.org/jira/browse/ARROW-18314
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Lucas Mation
         Attachments: image-2022-11-11-14-55-36-430.png, image-2022-11-11-14-59-30-132.png

I issued two calls:

```
ft <- path_to_dataset1
fa <- path_to_dataset2

tic()
d2 <- ft %>% open_dataset %>% filter( pis %in% mypis ) %>% collect
toc()
927.11 sec elapsed
# returned a dataset with 44 obs, 38 columns; took an abnormally long time, 16 min

ft <- paste0(p2,'/RAIS_operacional/vinc_1976_2001/parquet_temp')
fa <- paste0(p2,'/RAIS_operacional/vinc_1976_2001/parquet')

tic()
d3 <- fa %>% open_dataset %>% filter( pis %in% mypis ) %>% collect
terminate called after throwing an instance of 'cpp11::unwind_exception'
```

Then I got an error that crashpad_handler.exe had stopped working. R froze, and after a while it crashed too.

!image-2022-11-11-14-59-30-132.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)