paleolimbot commented on code in PR #35543:
URL: https://github.com/apache/arrow/pull/35543#discussion_r1235299471
##########
r/R/schema.R:
##########
@@ -244,10 +230,59 @@ print_schema_fields <- function(s) {
paste(map_chr(s$fields, ~ .$ToString()), collapse = "\n")
}
-#' @param ... [fields][field] or field name/[data type][data-type] pairs
+#' Schemas
+#'
+#' Create a schema or extract one from an object.
+#'
+#' @seealso [Schema] for detailed documentation of the Schema R6 object
+#' @param ... [fields][field], field name/[data type][data-type] pairs, or object from which to extract a schema
+#' @examples
+#' # Create schema using pairs of field names and data types
+#' schema(a = int32(), b = float64())
+#'
+#' # Create schema using fields
+#' schema(
+#' field("b", double()),
+#' field("c", bool(), nullable = FALSE),
+#' field("d", string())
+#' )
+#'
+#' # Extract schemas from objects
+#' df <- data.frame(col1 = 2:4, col2 = c(0.1, 0.3, 0.5))
+#' tab1 <- arrow_table(df)
+#' schema(tab1)
+#' tab2 <- arrow_table(df, schema = schema(col1 = int8(), col2 = float32()))
+#' schema(tab2)
+#' @export
+schema <- function(...) {
+ dots <- list2(...)
+
+ if (length(dots) == 1 && !is.list(dots[[1]]) && is.null(names(dots)) && !inherits(dots[[1]], "Field")) {
Review Comment:
I think that `length(dots) == 1 && rlang::is_named2(dots[[1]])` may be
sufficient if (1) we add
`infer_schema.Field <- function(x, ...) Schema$create(x)` and (2) we go back
to using `!!!items`. If we do allow a single list to be passed as the first
argument, `rlang::is_bare_list()` may be a better choice than `is.list()`,
since `is.list()` returns `TRUE` for S3 objects that are lists under the hood,
such as a data.frame.
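A quick sketch of the `is.list()` concern, assuming the rlang package is installed; `df` and `plain` are made-up values for illustration only:

```r
library(rlang)

df <- data.frame(col1 = 2:4)
plain <- list(a = 1L, b = 2)

# A data.frame is a list under the hood, so is.list() cannot tell the two apart
is.list(df)
is.list(plain)

# is_bare_list() only accepts lists that carry no class attribute
is_bare_list(df)
is_bare_list(plain)
```

With `is.list()` both values pass the check, whereas `is_bare_list()` lets only the plain list through, which is the distinction that matters if a single list is accepted as the first argument.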
##########
r/R/schema.R:
##########
@@ -244,10 +230,59 @@ print_schema_fields <- function(s) {
paste(map_chr(s$fields, ~ .$ToString()), collapse = "\n")
}
-#' @param ... [fields][field] or field name/[data type][data-type] pairs
+#' Schemas
+#'
+#' Create a schema or extract one from an object.
+#'
+#' @seealso [Schema] for detailed documentation of the Schema R6 object
+#' @param ... [fields][field], field name/[data type][data-type] pairs, or object from which to extract a schema
+#' @examples
+#' # Create schema using pairs of field names and data types
+#' schema(a = int32(), b = float64())
+#'
+#' # Create schema using fields
+#' schema(
+#' field("b", double()),
+#' field("c", bool(), nullable = FALSE),
+#' field("d", string())
+#' )
+#'
+#' # Extract schemas from objects
+#' df <- data.frame(col1 = 2:4, col2 = c(0.1, 0.3, 0.5))
+#' tab1 <- arrow_table(df)
+#' schema(tab1)
+#' tab2 <- arrow_table(df, schema = schema(col1 = int8(), col2 = float32()))
+#' schema(tab2)
+#' @export
+schema <- function(...) {
+ dots <- list2(...)
+
+ if (length(dots) == 1 && !is.list(dots[[1]]) && is.null(names(dots)) && !inherits(dots[[1]], "Field")) {
+ return(infer_schema(dots[[1]]))
+ }
+
+ Schema$create(...)
+}
+
+#' Create a schema or extract one from an object.
Review Comment:
```suggestion
#' Extract a schema from an object
```
(maybe?)
##########
r/tests/testthat/test-schema.R:
##########
@@ -292,3 +292,9 @@ test_that("schema name assignment", {
expect_identical(names(schm2), c("col1", "col2"))
expect_identical(names(schm2$r_metadata$columns), c("col1", "col2"))
})
+
+test_that("schema extraction", {
+ tbl <- arrow_table(example_data)
+ schema(tbl)
+ expect_equal(schema(tbl), tbl$schema)
+})
Review Comment:
Some other tests that may be useful:
- Test `schema(!!!some_list)` (or `schema(some_list)` if you choose that
direction)
- One test for each type of object that can have a schema extracted with
`schema()`
##########
r/R/schema.R:
##########
@@ -244,10 +230,59 @@ print_schema_fields <- function(s) {
paste(map_chr(s$fields, ~ .$ToString()), collapse = "\n")
}
-#' @param ... [fields][field] or field name/[data type][data-type] pairs
+#' Schemas
+#'
+#' Create a schema or extract one from an object.
+#'
+#' @seealso [Schema] for detailed documentation of the Schema R6 object
+#' @param ... [fields][field], field name/[data type][data-type] pairs, or object from which to extract a schema
+#' @examples
+#' # Create schema using pairs of field names and data types
+#' schema(a = int32(), b = float64())
+#'
+#' # Create schema using fields
+#' schema(
+#' field("b", double()),
+#' field("c", bool(), nullable = FALSE),
+#' field("d", string())
+#' )
+#'
+#' # Extract schemas from objects
+#' df <- data.frame(col1 = 2:4, col2 = c(0.1, 0.3, 0.5))
+#' tab1 <- arrow_table(df)
+#' schema(tab1)
+#' tab2 <- arrow_table(df, schema = schema(col1 = int8(), col2 = float32()))
+#' schema(tab2)
+#' @export
+schema <- function(...) {
+ dots <- list2(...)
+
+ if (length(dots) == 1 && !is.list(dots[[1]]) && is.null(names(dots)) && !inherits(dots[[1]], "Field")) {
+ return(infer_schema(dots[[1]]))
+ }
+
+ Schema$create(...)
+}
+
+#' Create a schema or extract one from an object.
+#'
+#' @param x An object which has a schema, e.g. a `Dataset`
+#' @export
+infer_schema <- function(x) {
+ UseMethod("infer_schema")
+}
+
+#' @export
+infer_schema.ArrowTabular <- function(x) x$schema
+
+#' @export
+infer_schema.RecordBatchReader <- function(x) x$schema
+
+#' @export
+infer_schema.Dataset <- function(x) x$schema
+
Review Comment:
I think you already added a way to do this for `data.frame`. Might it make
sense to wire that up here?
##########
r/R/schema.R:
##########
@@ -244,10 +230,59 @@ print_schema_fields <- function(s) {
paste(map_chr(s$fields, ~ .$ToString()), collapse = "\n")
}
-#' @param ... [fields][field] or field name/[data type][data-type] pairs
+#' Schemas
+#'
+#' Create a schema or extract one from an object.
+#'
+#' @seealso [Schema] for detailed documentation of the Schema R6 object
+#' @param ... [fields][field], field name/[data type][data-type] pairs, or object from which to extract a schema
+#' @examples
+#' # Create schema using pairs of field names and data types
+#' schema(a = int32(), b = float64())
+#'
+#' # Create schema using fields
+#' schema(
+#' field("b", double()),
+#' field("c", bool(), nullable = FALSE),
+#' field("d", string())
+#' )
+#'
+#' # Extract schemas from objects
+#' df <- data.frame(col1 = 2:4, col2 = c(0.1, 0.3, 0.5))
+#' tab1 <- arrow_table(df)
+#' schema(tab1)
+#' tab2 <- arrow_table(df, schema = schema(col1 = int8(), col2 = float32()))
+#' schema(tab2)
+#' @export
+schema <- function(...) {
+ dots <- list2(...)
+
+ if (length(dots) == 1 && !is.list(dots[[1]]) && is.null(names(dots)) && !inherits(dots[[1]], "Field")) {
+ return(infer_schema(dots[[1]]))
+ }
+
+ Schema$create(...)
Review Comment:
```suggestion
Schema$create(!!!dots)
```
(This is what I usually see after capturing dots, but I forget the details
and whether it matters.)
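For reference, a minimal sketch of the capture-then-splice pattern, assuming rlang is attached. `inner()` and `outer_splice()` are hypothetical stand-ins for illustration, not arrow functions:

```r
library(rlang)

# Hypothetical callee that collects its arguments with list2()
inner <- function(...) names(list2(...))

# Hypothetical wrapper: capture the dots once, then splice them back in
outer_splice <- function(...) {
  dots <- list2(...) # any !!! used by the caller is resolved here
  inner(!!!dots)     # pass the captured values on as separate arguments
}

res_direct <- outer_splice(a = 1L, b = 2)
res_spliced <- outer_splice(!!!list(a = 1L, b = 2))
```

In this sketch both calls hand `inner()` the same two named arguments, so the callee sees exactly the values that were already captured in `dots`.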