Nicola Crane created ARROW-16649: ------------------------------------ Summary: [C++] Add support for sorting to the Substrait consumer Key: ARROW-16649 URL: https://issues.apache.org/jira/browse/ARROW-16649 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Nicola Crane
The streaming execution engine supports sorting (I believe, as a sink node option?), but the Substrait consumer does not currently consume sort relations. Please can we have support for this? Here's the example code/plan I tested with: {code:java} library(dplyr) library(substrait) # create a basic table and order it out <- tibble::tibble(a = 1, b = 2) %>% arrow_substrait_compiler() %>% arrange(a) # take a look at the plan created out$plan() #> message of type 'substrait.Plan' with 2 fields set #> extension_uris { #> extension_uri_anchor: 1 #> } #> relations { #> root { #> input { #> sort { #> input { #> read { #> base_schema { #> names: "a" #> names: "b" #> struct_ { #> types { #> fp64 { #> } #> } #> types { #> fp64 { #> } #> } #> } #> } #> named_table { #> names: "named_table_1" #> } #> } #> } #> sorts { #> expr { #> selection { #> direct_reference { #> struct_field { #> } #> } #> } #> } #> direction: SORT_DIRECTION_ASC_NULLS_LAST #> } #> } #> } #> names: "a" #> names: "b" #> } #> } # try to run the plan collect(out) #> Error: NotImplemented: conversion to arrow::compute::Declaration from Substrait relation sort { ... #> /home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:73 FromProto(plan_rel.rel(), ext_set) {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)