Ian Cook created ARROW-13186:
--------------------------------

             Summary: [R] Implement type determination more cleanly
                 Key: ARROW-13186
                 URL: https://issues.apache.org/jira/browse/ARROW-13186
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
    Affects Versions: 5.0.0
            Reporter: Ian Cook

In the R package, the 5.0.0 release includes several improvements to data type determination. The implementation of these improvements relied on a kludge: it made it possible to store a {{Schema}} in an {{Expression}} object in the R package; when set, this {{Schema}} is retained in derivative {{Expression}}s. This was the most convenient way to make the {{Schema}} available to pass to the {{type_id()}} method, which requires it. But it makes the R package's {{Expression}} object deviate from the C++ library's {{Expression}} object, and it makes our type determination functions work differently from the other R functions in {{nse_funcs}}.

The Jira issues in which these somewhat kludgy improvements were made are:
* allowing a schema to be stored in the {{Expression}} object, and implementing type determination functions in a way that uses that schema (ARROW-12781)
* retaining a schema in derivative {{Expression}} objects (ARROW-13117)
* setting an empty schema in scalar literal {{Expression}} objects (ARROW-13119)

From the perspective of the R package, an ideal way to implement type determination functions would be to call a {{type_id}} kernel through the {{call_function}} interface, but this was rejected in ARROW-13167. Consider other ways that we might improve this implementation.