[ 
https://issues.apache.org/jira/browse/ARROW-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Keane updated ARROW-15168:
-----------------------------------
    Fix Version/s: 8.0.0

> [R] Add S3 generics to create main Arrow objects
> ------------------------------------------------
>
>                 Key: ARROW-15168
>                 URL: https://issues.apache.org/jira/browse/ARROW-15168
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Dewey Dunnington
>            Assignee: Dewey Dunnington
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 8.0.0
>
>          Time Spent: 7h
>  Remaining Estimate: 0h
>
> Right now we create Tables, RecordBatches, ChunkedArrays, and Arrays using 
> the corresponding {{$create()}} functions (or a few shortcut functions). This 
> works well for converting other Arrow or base R types to Arrow objects but 
> doesn’t work well for objects in other packages (e.g., sf). This is related 
> to ARROW-14378 in that it provides a mechanism for other packages to support 
> writing objects to Arrow in a more Arrow-native form instead of serializing 
> attributes that are unlikely to be readable in other packages. Many of these 
> issues came up when experimenting with {{carrow}} while trying to provide 
> seamless arrow package compatibility for S3 objects that wrap external 
> pointers to C API data structures. S3 is a good way to do this because arrow 
> is a heavy dependency and, with S3, the other package doesn't have to put it 
> in {{Imports}}.
> For argument’s sake I’ll propose adding the following generics: 
> -   {{as_arrow_array(x, type = NULL)}} -> {{Array}} 
> -   {{as_arrow_chunked_array(x, type = NULL)}} -> {{ChunkedArray}} 
> -   {{as_arrow_record_batch(x, schema = NULL)}} -> {{RecordBatch}} 
> -   {{as_arrow_table(x, schema = NULL)}} -> {{Table}} 
> -   {{as_arrow_data_type(x)}} -> {{DataType}} 
> -   {{as_arrow_record_batch_reader(x, schema = NULL)}} -> 
> {{RecordBatchReader}} 
> I’ll note that we use {{as_adq()}} internally for similar reasons (to convert 
> a few different object types into an arrow dplyr query when that’s the data 
> structure we need). 
> As part of this ticket, if we choose to move forward, we should implement the 
> default methods with some internal consistency (i.e., somebody wanting to 
> provide Arrow support in a package probably only has to implement 
> {{as_arrow_array()}} to get most support).
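
The dispatch pattern described above can be sketched in base R without loading arrow. This is only an illustration under stated assumptions: the two generics below are stand-ins that mirror the proposed signatures, the "arrays" are plain tagged lists rather than real Arrow objects, and the {{temperature_sketch}} class is hypothetical. It shows how a default {{as_arrow_chunked_array()}} method that falls back on {{as_arrow_array()}} would let another package implement a single method and still get support from both generics.

```r
# Stand-in generics mirroring the proposed signatures (not arrow's actual code).
as_arrow_array <- function(x, type = NULL) UseMethod("as_arrow_array")
as_arrow_chunked_array <- function(x, type = NULL) {
  UseMethod("as_arrow_chunked_array")
}

# Stand-in default: tags the values instead of building a real Arrow Array.
as_arrow_array.default <- function(x, type = NULL) {
  structure(list(values = x, type = type), class = "sketch_array")
}

# Default chunked method delegates to as_arrow_array(), so a class that only
# implements as_arrow_array() is handled here as well.
as_arrow_chunked_array.default <- function(x, type = NULL) {
  structure(
    list(chunks = list(as_arrow_array(x, type = type))),
    class = "sketch_chunked_array"
  )
}

# Hypothetical S3 class owned by another package, with the one method
# that package would have to write.
as_arrow_array.temperature_sketch <- function(x, type = NULL) {
  if (is.null(type)) type <- "float64"  # assumed storage type for this sketch
  as_arrow_array(unclass(x), type = type)
}

temps <- structure(c(20.5, 21.0), class = "temperature_sketch")

# No as_arrow_chunked_array() method exists for the class; the default
# reaches the class-specific as_arrow_array() method through dispatch.
chunked <- as_arrow_chunked_array(temps)
```

In a real package the method would be registered against arrow's generic rather than a local stand-in. Assuming the generic ends up exported as {{as_arrow_array()}}, a delayed {{S3method(arrow::as_arrow_array, temperature_sketch)}} directive in NAMESPACE (available since R 3.6.0) is one way to do that while keeping arrow in {{Suggests}} instead of {{Imports}}.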



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
