paleolimbot commented on code in PR #12817:
URL: https://github.com/apache/arrow/pull/12817#discussion_r852224810
##########
r/R/array.R:
##########
@@ -217,6 +217,93 @@ Array$create <- function(x, type = NULL) {
Array$import_from_c <- ImportArray
+#' Convert an object to an Arrow Array
+#'
+#' Whereas `Array$create()` constructs an [Array] from the built-in data types
+#' for which the Arrow package implements fast converters, `as_arrow_array()`
+#' provides a means by which other packages can define conversions to Arrow
+#' objects.
+#'
+#' @param x An object to convert to an Arrow Array
+#' @param ... Passed to S3 methods
+#' @param type A [type][data-type] for the final Array. A value of `NULL`
+#' will default to the type guessed by [type()].
+#'
+#' @return An [Array].
+#' @export
+#'
+#' @examplesIf arrow_available()
+#' as_arrow_array(1:5)
+#'
+as_arrow_array <- function(x, ..., type = NULL) {
+ UseMethod("as_arrow_array")
+}
+
+#' @rdname as_arrow_array
+#' @export
+as_arrow_array.Array <- function(x, ..., type = NULL) {
+ if (is.null(type)) {
+ x
+ } else {
+ x$cast(type)
+ }
+}
+
+#' @rdname as_arrow_array
+#' @export
+as_arrow_array.ChunkedArray <- function(x, ..., type = NULL) {
+ concat_arrays(!!! x$chunks, type = type)
+}
+
+#' @rdname as_arrow_array
+#' @export
+as_arrow_array.vctrs_vctr <- function(x, ..., type = NULL) {
+ if (is.null(type)) {
+ vctrs_extension_array(x)
+ } else if (inherits(type, "VctrsExtensionType")) {
+ vctrs_extension_array(
+ x,
+ ptype = type$ptype(),
+ storage_type = type$storage_type()
+ )
+ } else {
+ NextMethod()
+ }
+}
+
+#' @export
+as_arrow_array.POSIXlt <- function(x, ..., type = NULL) {
+ as_arrow_array.vctrs_vctr(x, ..., type = type)
+}
+
+
+#' @export
+as_arrow_array.default <- function(x, ..., type = NULL, from_constructor = FALSE) {
+ # If from_constructor is TRUE, this is a call from C++ for which S3 dispatch
+ # failed to find a method for the object. If this is the case, we error.
+ if (from_constructor && is.null(type)) {
+ abort(
+ sprintf(
+ "Can't create Array from object of type %s",
+ paste(class(x), collapse = " / ")
+ )
+ )
+ } else if (from_constructor) {
+ abort(
+ sprintf(
+ "Can't create Array<%s> from object of type %s",
+ format(type$code()),
+ paste(class(x), collapse = " / ")
+ )
+ )
+ }
+
+ # If from_constructor is FALSE, we use the built-in logic exposed by
+ # Array$create(). If there is no built-in conversion, C++ will call back
+ # here with from_constructor = TRUE to generate a nice error message.
Review Comment:
I agree that it's weird. The reason is that array creation can start from
either R or C++, and most of the time the call comes from C++. For example,
`Table$create()` does:
- `[R] Table$create()`
- `[C++] Table__from_dots()`
- `[C++] arrow::r::RConverter` (using all the old code pathways if we
handle that object type internally), or
- `[C++] arrow::r::RExtensionConverter` -> `[R] as_arrow_array()` (if we
don't).

Other R calls that use this or a similar path are `Array$create()`,
`ChunkedArray$create()` and `RecordBatch$create()`. Perhaps some of that usage
should call `as_arrow_array()` directly instead, but table creation in
particular makes good use of threading, and this was the best I came up with
that didn't undo all of that optimization!
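
To make the round trip concrete, here is a toy sketch in plain R. It is not arrow's implementation: `builtin_convert()`, `as_toy_array()` and the `temperature` class are all made-up names, with `builtin_convert()` standing in for the C++ converter. It only illustrates why the default method needs the `from_constructor` flag to avoid bouncing between the two sides forever.

```r
# Toy stand-in for the pattern in the diff; none of these names are arrow's.
# builtin_convert() plays the role of the C++ converter: it handles the plain
# types it knows and otherwise calls back into the S3 generic.
builtin_convert <- function(x) {
  if (!is.object(x) && (is.numeric(x) || is.character(x) || is.logical(x))) {
    return(structure(list(values = x), class = "toy_array"))
  }
  # No built-in conversion: call back into R so that S3 methods defined by
  # other packages get a chance, flagging where the call came from.
  as_toy_array(x, from_constructor = TRUE)
}

as_toy_array <- function(x, ..., from_constructor = FALSE) {
  UseMethod("as_toy_array")
}

as_toy_array.default <- function(x, ..., from_constructor = FALSE) {
  if (from_constructor) {
    # The converter already gave up on this object and dispatch found no
    # specific method, so error here instead of looping.
    stop(
      sprintf(
        "Can't create toy_array from object of type %s",
        paste(class(x), collapse = " / ")
      ),
      call. = FALSE
    )
  }
  # Called from R: defer to the built-in logic first.
  builtin_convert(x)
}

# A method that another package might define for its own (hypothetical) class.
as_toy_array.temperature <- function(x, ..., from_constructor = FALSE) {
  builtin_convert(unclass(x))
}
```

With these definitions, `as_toy_array(1:3)` goes through the built-in path, `builtin_convert(structure(c(20.5, 22), class = "temperature"))` reaches the `temperature` method via the callback, and an object of an unknown class fails in the default method with the error message built there.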