paleolimbot commented on code in PR #12817:
URL: https://github.com/apache/arrow/pull/12817#discussion_r854273061
##########
r/R/array.R:
##########
@@ -217,6 +217,125 @@ Array$create <- function(x, type = NULL) {
Array$import_from_c <- ImportArray
+#' Convert an object to an Arrow Array
+#'
+#' Whereas `Array$create()` constructs an [Array] from the built-in data types
+#' for which the Arrow package implements fast converters, `as_arrow_array()`
+#' provides a means by which other packages can define conversions to Arrow
+#' objects.
+#'
+#' @param x An object to convert to an Arrow Array
+#' @param ... Passed to S3 methods
+#' @param type A [type][data-type] for the final Array. A value of `NULL`
+#' will default to the type guessed by [type()].
+#'
+#' @return An [Array].
+#' @export
+#'
+#' @examplesIf arrow_available()
+#' as_arrow_array(1:5)
+#'
+as_arrow_array <- function(x, ..., type = NULL) {
+  UseMethod("as_arrow_array")
+}
+
+#' @rdname as_arrow_array
+#' @export
+as_arrow_array.Array <- function(x, ..., type = NULL) {
+  if (is.null(type)) {
+    x
+  } else {
+    x$cast(type)
+  }
+}
+
+#' @rdname as_arrow_array
+#' @export
+as_arrow_array.ChunkedArray <- function(x, ..., type = NULL) {
+  concat_arrays(!!! x$chunks, type = type)
+}
+
+#' @rdname as_arrow_array
+#' @export
+as_arrow_array.vctrs_vctr <- function(x, ..., type = NULL) {
+  if (is.null(type)) {
+    vctrs_extension_array(x)
+  } else if (inherits(type, "VctrsExtensionType")) {
+    vctrs_extension_array(
+      x,
+      ptype = type$ptype(),
+      storage_type = type$storage_type()
+    )
+  } else {
+    stop_cant_convert_array(x, type)
+  }
+}
+
+#' @export
+as_arrow_array.POSIXlt <- function(x, ..., type = NULL) {
+  as_arrow_array.vctrs_vctr(x, ..., type = type)
+}
+
+#' @export
+as_arrow_array.data.frame <- function(x, ..., type = NULL) {
+  type <- type %||% infer_type(x)
+
+  if (inherits(type, "VctrsExtensionType")) {
+    storage <- as_arrow_array(x, type = type$storage_type())
+    new_extension_array(storage, type)
+  } else if (inherits(type, "StructType")) {
+    fields <- type$fields()
+    names <- map_chr(fields, "name")
+    types <- map(fields, "type")
+    arrays <- Map(as_arrow_array, x, types)
+    names(arrays) <- names
+
+    # ...because there isn't a StructArray$create() yet
Review Comment:
@wjones127's review highlighted the fact that the C++ conversion didn't
handle `ExtensionType` fields within a `StructArray`, and because
`as_arrow_array()` doesn't necessarily return an `ExtensionType`, it might do
the wrong thing for an S3 vctr that maps to a regular `Array` type (e.g.,
`narrow::narrow_vctr()`). I went down a bit of a rabbit hole trying to make it
work in C++; however, the `StructArray` converter is tightly coupled to the
`StructBuilder`, whereas the `AsArrowArrayConverter` completely circumvents the
Builder. All this to say that the C++-level `can_convert_native()` now recurses
through data.frames and returns `false` if any nested column won't work in C++
🤯. This means we also need an S3 method to handle that case and, hence, have to
create a `StructArray` from `Array` objects. I have a vague recollection that I
didn't see the point of `StructArray$create()` at the time (I was wrong!). I
added a note about that next to the S3 method for data.frame since it's a
little confusing.
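
To make the recursion concrete, here is a rough base-R sketch of the idea. It is not the C++ implementation: `can_convert_native_sketch()` is a hypothetical name, and the set of classes treated as "native" is an assumption chosen for illustration only.

```r
# Hypothetical R analogue of the recursive check; the real check lives in C++.
# Assumption: "native" means an unclassed atomic vector or list, or one of the
# few base classes listed below. The actual C++ rules may differ.
can_convert_native_sketch <- function(x) {
  if (is.data.frame(x)) {
    # A data.frame is convertible only if every column is, at any depth.
    return(all(vapply(x, can_convert_native_sketch, logical(1))))
  }
  if (!is.object(x)) {
    # No class attribute: plain vectors and lists.
    return(is.atomic(x) || is.list(x))
  }
  # Classed objects: accept only an (assumed) allow-list of classes.
  inherits(x, c("factor", "Date", "POSIXct", "difftime", "integer64"))
}

# A nested data.frame column holding a custom S3 vector fails the check,
# so conversion would fall back to the S3 method for data.frame.
inner <- data.frame(v = 1:3)
inner$v <- structure(1:3, class = "my_vctr") # hypothetical S3 class
outer <- data.frame(a = 1:3)
outer$inner <- inner

can_convert_native_sketch(data.frame(a = 1:3, b = c("x", "y", "z")))
can_convert_native_sketch(outer)
```

The first call should pass and the second should not, because a single unsupported column anywhere in the nesting is enough to reject the whole data.frame.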
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.