ianmcook commented on a change in pull request #9927:
URL: https://github.com/apache/arrow/pull/9927#discussion_r608988128



##########
File path: r/R/dplyr.R
##########
@@ -619,9 +619,23 @@ collect.arrow_dplyr_query <- function(x, as_data_frame = TRUE, ...) {
     restore_dplyr_features(tab, x)
   }
 }
-collect.ArrowTabular <- as.data.frame.ArrowTabular
+collect.ArrowTabular <- function(x, as_data_frame = TRUE, ...) {
+  if (as_data_frame) as.data.frame(x, ...) else x
+}
 collect.Dataset <- function(x, ...) dplyr::collect(arrow_dplyr_query(x), ...)
 
+compute.arrow_dplyr_query <- function(x, ...) dplyr::collect(x, as_data_frame = FALSE)
+compute.ArrowTabular <- function(x, ...) x
+compute.Dataset <- function(x, ...) {

Review comment:
       I spent about an hour trying different approaches here, including paring 
back what `restore_dplyr_features()` does and factoring the compute code out 
into a separate internal function called from both `collect` and `compute`, but 
these changes caused test failures. Investigating those failures seems beyond 
the scope of what we're trying to achieve here, and I think my time would be 
better spent on other work. Can we leave this as is for now and open a Jira to 
improve it later?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

