nealrichardson commented on code in PR #19706:
URL: https://github.com/apache/arrow/pull/19706#discussion_r1067152162

##########
r/tests/testthat/test-dplyr-query.R:
##########
@@ -714,3 +714,48 @@ test_that("Scalars in expressions match the type of the field, if possible", {
     collect()
   expect_equal(result$tpc_h_1, result$as_dbl)
 })
+
+test_that("Can use nested field refs", {
+  nested_data <- tibble(int = 1:5, df_col = tibble(a = 6:10, b = 11:15))
+
+  compare_dplyr_binding(
+    .input %>%
+      mutate(
+        nested = df_col$a,
+        times2 = df_col$a * 2
+      ) %>%
+      filter(nested > 7) %>%
+      collect(),
+    nested_data
+  )
+
+  compare_dplyr_binding(
+    .input %>%
+      mutate(
+        nested = df_col$a,
+        times2 = df_col$a * 2
+      ) %>%
+      filter(nested > 7) %>%
+      summarize(sum(times2)) %>%
+      collect(),
+    nested_data
+  )
+
+  # Now with Dataset

Review Comment:
   There is a different code path used to create a ScanNode for a Dataset, and it identifies all fields used in the projection so they can be pushed down into the scan: https://github.com/apache/arrow/blob/master/r/R/query-engine.R#L37-L41
   
   This was segfaulting (nested FieldRef doesn't have `.name()`) until I modified that function. I'll add more to this comment.
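
   For readers unfamiliar with this kind of pushdown, the following is a minimal base-R sketch of the idea of collecting the fields a projection touches. It is not the code in `query-engine.R` and not the change made in this PR. The function name `top_level_fields()` and the representation of a field reference as a character-vector path are hypothetical, and it assumes that a nested reference such as `df_col$a` can be satisfied by reading its parent column `df_col`.

```r
# Hypothetical stand-in for field references: each ref is a character vector
# path. A plain column has length 1; a nested ref like df_col$a is
# c("df_col", "a"). None of this is arrow's actual API.
top_level_fields <- function(field_refs) {
  # Keep only the first path element of each ref, so a nested ref maps to
  # the parent column (assumed here to be what the scan needs to read).
  unique(vapply(field_refs, function(path) path[[1]], character(1)))
}

# Refs resembling those used in the test above: int, df_col$a, df_col$b
refs <- list("int", c("df_col", "a"), c("df_col", "b"))
fields <- top_level_fields(refs)
print(fields)
```

   The point of the sketch is only that a lookup which assumes every reference has a single name needs a separate branch for nested paths.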