jonkeane commented on a change in pull request #10601:
URL: https://github.com/apache/arrow/pull/10601#discussion_r658846244
##########
File path: r/R/metadata.R
##########
@@ -56,7 +56,21 @@ apply_arrow_r_metadata <- function(x, r_metadata) {
if (is.data.frame(x)) {
if (length(names(x)) && !is.null(columns_metadata)) {
for (name in intersect(names(columns_metadata), names(x))) {
- x[[name]] <- apply_arrow_r_metadata(x[[name]],
columns_metadata[[name]])
+ x[[name]] <- tryCatch({
+ x[[name]] <- apply_arrow_r_metadata(x[[name]],
columns_metadata[[name]])
+ },
+ error = function(e) {
+ # if we are erroring because of incompatible data, try and make this
+ # a tibble
+ # TODO: also check if this is a list?
+ # TODO: only if there are exactly as many sub-list elements as rows?
+ # TODO: decide if this obviates the need for the option
+ # arrow.strucs_as_dfs (or if that is actually a better way to handle that)
+ if (grepl("must be compatible with existing data", e$message))
+ x[[name]] <- as.data.frame(x[[name]])
+ class(x[[name]]) <- c("tbl_df", "tbl", "data.frame")
+ apply_arrow_r_metadata(x[[name]], columns_metadata[[name]])
+ })
Review comment:
This chunk lets us read data saved in Parquet files that don't also store
the data.frame metadata needed to reconstruct them (until this PR we always
stripped that metadata).
If we go this route, we should probably implement the TODOs listed here and
also warn that this setup is deprecated.
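
For discussion, here is a rough sketch of what the fallback might look like with the first two TODOs and a deprecation warning in place. Everything in it is an assumption rather than code from this PR: `restore_column()` is a hypothetical helper, `apply_fun` stands in for `apply_arrow_r_metadata()`, the warning text is a placeholder, and the length check is only one reading of "as many sub-list elements as rows". Note that it also braces the `if` body; in the hunk above the `if` guards only the first statement that follows it.

```r
# Hypothetical helper, not part of the arrow package: apply metadata to one
# column and fall back to a data.frame conversion for legacy files.
restore_column <- function(col, col_metadata, n_rows, apply_fun) {
  tryCatch(
    apply_fun(col, col_metadata),
    error = function(e) {
      msg <- conditionMessage(e)
      # Only handle the incompatibility error; re-raise anything else.
      if (!grepl("must be compatible with existing data", msg, fixed = TRUE)) {
        stop(e)
      }
      # TODOs from the diff: only convert lists whose elements each have
      # one value per row of the parent data.frame.
      if (!is.list(col) || !all(lengths(col) == n_rows)) {
        stop(e)
      }
      # Placeholder wording for the deprecation warning.
      warning(
        "Reading struct columns without data.frame metadata is deprecated.",
        call. = FALSE
      )
      converted <- as.data.frame(col, stringsAsFactors = FALSE)
      class(converted) <- c("tbl_df", "tbl", "data.frame")
      apply_fun(converted, col_metadata)
    }
  )
}
```

Inside the loop this would be called roughly as `x[[name]] <- restore_column(x[[name]], columns_metadata[[name]], nrow(x), apply_arrow_r_metadata)`, which also avoids assigning to `x[[name]]` from inside the `tryCatch()` body.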