[ https://issues.apache.org/jira/browse/ARROW-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8748:
-----------------------------------
    Summary: [R] Add bindings to ConcatenateTables  (was: [R] Implementing methodes for combining arrow tabels using dplyr::bind_rows and dplyr::bind_cols)

> [R] Add bindings to ConcatenateTables
> -------------------------------------
>
>                 Key: ARROW-8748
>                 URL: https://issues.apache.org/jira/browse/ARROW-8748
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Dominic Dennenmoser
>            Priority: Major
>              Labels: features, performance
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> First of all, many thanks for your hard work! I was quite excited when you
> guys implemented some basic functions of the {{dplyr}} package. Is there a
> way to combine two or more arrow tables into one by rows or columns? At the
> moment my workaround looks like this:
> {code:r}
> dplyr::bind_rows(
>   "a" = arrow.table.1 %>% dplyr::collect(),
>   "b" = arrow.table.2 %>% dplyr::collect(),
>   "c" = arrow.table.3 %>% dplyr::collect(),
>   "d" = arrow.table.4 %>% dplyr::collect(),
>   .id = "ID"
> ) %>%
>   arrow::write_ipc_stream(sink = "file_name_combined_tables.arrow")
> {code}
> But this is not really a sensible approach, because it pulls the data back
> into the R environment as data frames/tibbles, which might exhaust the
> available RAM. Perhaps you have a better workaround at hand.
> It would be great if you guys could implement the {{bind_rows}} and
> {{bind_cols}} methods provided by {{dplyr}}, for example:
> {code:r}
> dplyr::bind_rows(
>   "a" = arrow.table.1,
>   "b" = arrow.table.2,
>   "c" = arrow.table.3,
>   "d" = arrow.table.4,
>   .id = "ID"
> ) %>%
>   arrow::write_ipc_stream(sink = "file_name_combined_tables.arrow")
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)