Jonathan Keane created ARROW-16701: -------------------------------------- Summary: [R] Can we execute SQL in a dply pipeline? Key: ARROW-16701 URL: https://issues.apache.org/jira/browse/ARROW-16701 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Jonathan Keane
Now that we have {{to_duckdb()}} and {{to_arrow()}} is it possible to wrap those and allow someone to insert arbitrary SQL into a dplyr query? Something like: {code:r} sql <- function(data, sql) { tbl <- to_duckdb(data) res <- DBI::dbSendQuery(dbplyr::remote_con(.data), sql, arrow = TRUE) duckdb::duckdb_fetch_record_batch(res) } ds %>% filter(year > 2020) %>% sql("SELECT tip_amount, fare_amount, total_amount FROM ") %>% compute() {code} This won't work totally, but is vaguely what we're looking for. One part that we need to think about is how to deal with the {{from}} clause, a few possibilities: * bis does this by making you "name" the table before doing sql so you can FROM explicitly * though maybe you could get away with FROM . like it is a magrittr thing and sub that * empty string, and we add it in based on the lazy_tbl object Possibly related prior art: https://dbplyr.tidyverse.org/reference/build_sql.html (though the name isn't perfect IMO, and I think this is more geared towards package developers than end users?) -- This message was sent by Atlassian Jira (v8.20.7#820007)