Ian Cook created ARROW-13186:
--------------------------------

             Summary: [R] Implement type determination more cleanly
                 Key: ARROW-13186
                 URL: https://issues.apache.org/jira/browse/ARROW-13186
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
    Affects Versions: 5.0.0
            Reporter: Ian Cook


The 5.0.0 release of the R package includes several improvements in data type
determination. The implementation of these improvements used a kludge: it
made it possible to store a {{Schema}} in an {{Expression}} object in the R
package; when set, this {{Schema}} is retained in derivative {{Expression}}s.
This was the most convenient way to make the {{Schema}} available to the
{{type_id()}} method, which requires it. But it makes the R package's
{{Expression}} object deviate from the C++ library's {{Expression}} object,
and it makes our type determination functions work differently from the
other R functions in {{nse_funcs}}.

The Jira issues in which these somewhat kludgy improvements were made are:
 * allowing a schema to be stored in the {{Expression}} object, and 
implementing type determination functions in a way that uses that schema 
(ARROW-12781)
 * retaining a schema in derivative {{Expression}} objects (ARROW-13117)
 * setting an empty schema in scalar literal {{Expression}} objects 
(ARROW-13119)
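
To make the three changes above concrete, here is a minimal toy model of the
pattern, written in Python for illustration only. It is not the R package's
code and not the Arrow API: the Expression class, the helpers field_ref,
scalar, call and type_of, the string type names, and the rule that a
non-comparison call takes the type of its first argument are all hypothetical.
It only sketches the idea of a schema slot that is set once, carried into
derivative expressions, and left empty for scalar literals.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical stand-ins for Arrow types; a schema is just {field name: type name}.
SCALAR_TYPES = {bool: "bool", int: "int64", float: "double", str: "string"}
COMPARISONS = {"equal", "less", "greater"}


@dataclass(frozen=True)
class Expression:
    kind: str                                  # "field_ref", "scalar", or "call"
    value: object = None                       # field name, literal, or function name
    args: Tuple["Expression", ...] = ()
    schema: Optional[Dict[str, str]] = None    # the extra slot discussed above


def field_ref(name, schema=None):
    # Storing a schema in the expression (the ARROW-12781 idea).
    return Expression("field_ref", name, schema=schema)


def scalar(value):
    # Scalar literals carry an empty schema (the ARROW-13119 idea).
    return Expression("scalar", value, schema={})


def call(func, *args):
    # Derivative expressions retain a schema from their arguments
    # (the ARROW-13117 idea): prefer a non-empty one, else an empty one.
    schemas = [a.schema for a in args if a.schema is not None]
    schema = next((s for s in schemas if s), schemas[0] if schemas else None)
    return Expression("call", func, tuple(args), schema)


def _resolve(expr, schema):
    if expr.kind == "scalar":
        return SCALAR_TYPES[type(expr.value)]
    if expr.kind == "field_ref":
        return schema[expr.value]
    if expr.value in COMPARISONS:
        return "bool"
    # Toy rule: any other call takes the type of its first argument.
    return _resolve(expr.args[0], schema)


def type_of(expr):
    # Type determination only works when a schema travelled with the expression.
    if expr.schema is None:
        raise ValueError("expression has no schema attached")
    return _resolve(expr, expr.schema)
```

In this sketch, type_of(call("add", field_ref("x", schema), scalar(1)))
succeeds only because the schema rode along on the expression, which is the
coupling this issue proposes to revisit.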

From the perspective of the R package, an ideal way to implement type
determination functions would be to call a {{type_id}} kernel through the
{{call_function}} interface, but this was rejected in ARROW-13167. Consider
other ways that we might improve this implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
