ianmcook commented on a change in pull request #9927:
URL: https://github.com/apache/arrow/pull/9927#discussion_r608988128

##########
File path: r/R/dplyr.R
##########
@@ -619,9 +619,23 @@ collect.arrow_dplyr_query <- function(x, as_data_frame = TRUE, ...) {
     restore_dplyr_features(tab, x)
   }
 }
-collect.ArrowTabular <- as.data.frame.ArrowTabular
+collect.ArrowTabular <- function(x, as_data_frame = TRUE, ...) {
+  if (as_data_frame) as.data.frame(x, ...) else x
+}
 collect.Dataset <- function(x, ...) dplyr::collect(arrow_dplyr_query(x), ...)
+compute.arrow_dplyr_query <- function(x, ...) dplyr::collect(x, as_data_frame = FALSE)
+compute.ArrowTabular <- function(x, ...) x
+compute.Dataset <- function(x, ...) {

Review comment:
   I spent about an hour trying various different things here, such as paring back what `restore_dplyr_features()` does and also factoring out the compute code into a separate internal function and calling it from both `collect` and `compute`, but this caused test failures. Investigating these seems beyond the scope of what we're trying to achieve here and I think my time would be better spent on other work. Can we leave this as is for now and open a Jira for improvement later?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org