[ https://issues.apache.org/jira/browse/ARROW-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Keane updated ARROW-15168:
-----------------------------------
    Fix Version/s: 8.0.0

> [R] Add S3 generics to create main Arrow objects
> ------------------------------------------------
>
>                 Key: ARROW-15168
>                 URL: https://issues.apache.org/jira/browse/ARROW-15168
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Dewey Dunnington
>            Assignee: Dewey Dunnington
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 8.0.0
>
>          Time Spent: 7h
>  Remaining Estimate: 0h
>
> Right now we create Tables, RecordBatches, ChunkedArrays, and Arrays using
> the corresponding {{$create()}} functions (or a few shortcut functions). This
> works well for converting other Arrow or base R types to Arrow objects but
> doesn’t work well for objects in other packages (e.g., sf). This is related
> to ARROW-14378 in that it provides a mechanism for other packages to support
> writing objects to Arrow in a more Arrow-native form instead of serializing
> attributes that are unlikely to be readable in other packages. Many of these
> came up when experimenting with {{carrow}} when trying to provide seamless
> arrow package compatibility for S3 objects that wrap external pointers to C
> API data structures. S3 is a good way to do this because the other package
> doesn't have to put arrow in {{Imports}}, which matters because arrow is a
> heavy dependency.
>
> For argument’s sake I’ll propose adding the following methods:
> - {{as_arrow_array(x, type = NULL)}} -> {{Array}}
> - {{as_arrow_chunked_array(x, type = NULL)}} -> {{ChunkedArray}}
> - {{as_arrow_record_batch(x, schema = NULL)}} -> {{RecordBatch}}
> - {{as_arrow_table(x, schema = NULL)}} -> {{Table}}
> - {{as_arrow_data_type(x)}} -> {{DataType}}
> - {{as_arrow_record_batch_reader(x, schema = NULL)}} -> {{RecordBatchReader}}
>
> I’ll note that we use {{as_adq()}} internally for similar reasons (to convert
> a few different object types into an arrow dplyr query when that’s the data
> structure we need).
>
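To make the S3 proposal above concrete, here is a minimal sketch in base R of how such generics dispatch. It does not use the arrow package: the generics are redefined locally so the example runs on its own, the {{toy_array}} and {{toy_chunked_array}} classes are hypothetical stand-ins for {{Array}} and {{ChunkedArray}}, and {{my_pkg_vctr}} is an invented class representing an object from another package. The {{...}} argument is also an assumption; the ticket only lists {{x}} and {{type}}.

```r
# Hypothetical local generic mirroring the proposed as_arrow_array();
# this is NOT the arrow package's implementation.
as_arrow_array <- function(x, ..., type = NULL) {
  UseMethod("as_arrow_array")
}

# Default method: a stand-in for something like Array$create(x, type = type).
as_arrow_array.default <- function(x, ..., type = NULL) {
  structure(list(values = x, type = type), class = "toy_array")
}

# Method that another package could supply for its own (invented) class.
# It unwraps the object and defers to the default for the underlying data.
as_arrow_array.my_pkg_vctr <- function(x, ..., type = NULL) {
  as_arrow_array(unclass(x)$data, ..., type = type)
}

# Second hypothetical generic whose default is written in terms of
# as_arrow_array(), so classes that only implement as_arrow_array()
# are handled here as well.
as_arrow_chunked_array <- function(x, ..., type = NULL) {
  UseMethod("as_arrow_chunked_array")
}

as_arrow_chunked_array.default <- function(x, ..., type = NULL) {
  chunk <- as_arrow_array(x, ..., type = type)
  structure(list(chunks = list(chunk)), class = "toy_chunked_array")
}

# Usage: an object of the invented class goes through both generics,
# even though only as_arrow_array() has a method for it.
x <- structure(list(data = c(1, 2, 3)), class = "my_pkg_vctr")
arr <- as_arrow_array(x)
chunked <- as_arrow_chunked_array(x)
```

In this sketch the chunked-array default delegates to {{as_arrow_array()}}, which is one possible way to layer the defaults so that a single method covers several conversions.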
> As part of this ticket, if we choose to move forward, we should implement the
> default methods with some internal consistency (i.e., somebody wanting to
> provide Arrow support in a package probably only has to implement
> {{as_arrow_array()}} to get most support).

--
This message was sent by Atlassian Jira
(v8.20.7#820007)