paleolimbot commented on a change in pull request #12564:
URL: https://github.com/apache/arrow/pull/12564#discussion_r819650353
##########
File path: r/R/query-engine.R
##########
@@ -296,3 +296,39 @@ ExecNode <- R6Class("ExecNode",
schema = function() ExecNode_output_schema(self)
)
)
+
+do_exec_plan_substrait <- function(.data, substrait_plan) {
+ if (is.string(substrait_plan)) {
+ substrait_plan <- engine__internal__SubstraitFromJSON(substrait_plan)
+ } else if (is.raw(substrait_plan)) {
+ substrait_plan <- buffer(substrait_plan)
+ } else {
+ abort("`substrait_plan` must be a JSON string or raw() vector")
+ }
+
+ plan <- ExecPlan$create()
+
+ if (inherits(.data, "RecordBatchReader")) {
+ source_node <- ExecNode_ReadFromRecordBatchReader(plan, .data)
+ } else if (inherits(.data, "ArrowTabular")) {
+ dataset <- InMemoryDataset$create(.data)
+ source_node <- ExecNode_Scan(
+ plan,
+ dataset,
+ Expression$scalar(TRUE),
+ colnames %||% character(0)
+ )
+ } else if (inherits(.data, "Dataset")) {
+ source_node <- ExecNode_Scan(
+ plan,
+ .data,
+ Expression$scalar(TRUE),
+ colnames %||% character(0)
+ )
+ } else {
+ obj_desc <- paste0(class(.data), collapse = " / ")
+ abort(glue("Can't construct source node from object of type {obj_desc}"))
+ }
Review comment:
My reading was that `plan$Scan(.data)` needs an `arrow_dplyr_query`, and I
was trying to keep the initial "this thing works" version as simple as
possible.
I imagine we could inspect the substrait plan to extract which field
references actually show up (entirely possible from what I currently know
about substrait), or do that inspection in the C++ consumer.
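A rough sketch of what the R-side inspection might look like, assuming the
JSON plan has already been parsed into nested lists and that column
references appear as `structField` nodes (the protobuf JSON spelling of
`struct_field`). `collect_field_indices()` and `plan_fragment` are
hypothetical names for illustration, not part of arrow or of this PR:

```r
# Hypothetical helper: walk a Substrait plan parsed into nested R lists and
# collect the zero-based index of every top-level `structField` reference.
collect_field_indices <- function(node) {
  if (!is.list(node)) {
    return(integer(0))
  }

  found <- integer(0)
  rest <- node
  if (!is.null(names(node)) && "structField" %in% names(node)) {
    field <- node[["structField"]][["field"]]
    # protobuf JSON omits default values, so a missing `field` means index 0
    found <- if (is.null(field)) 0L else as.integer(field)
    # Assumption: nested struct children are not column references, so the
    # walk does not descend into the structField node itself
    rest <- node[names(node) != "structField"]
  }

  children <- unlist(lapply(rest, collect_field_indices), use.names = FALSE)
  sort(unique(c(found, children)))
}

# Hand-written fragment shaped like two field references in a projection
plan_fragment <- list(
  project = list(
    expressions = list(
      list(selection = list(
        directReference = list(structField = list(field = 2L)),
        rootReference = list()
      )),
      list(selection = list(
        directReference = list(structField = list()),
        rootReference = list()
      ))
    )
  )
)

used <- collect_field_indices(plan_fragment)
```

The resulting indices are zero-based, so something like
`names(schema)[used + 1]` could then stand in for the currently undefined
`colnames` passed to the scan node, if the plan's base schema matches the
input's column order.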