[ https://issues.apache.org/jira/browse/ARROW-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Tia updated ARROW-17061:
--------------------------------
    Description:
SQL
{code:java}
SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
GROUP BY
    o_orderpriority{code}
The substrait plan generated from SQL, using Isthmus.

substrait count: [https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]

Running the substrait plan with Acero returns this error:
{code:java}
E   pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned INVALID_ARGUMENT:(relations[0].root.input.aggregate.measures[0].measure) arguments: Cannot find field.
{code}

From substrait query plan:

relations[0].root.input.aggregate.measures[0].measure
{code:java}
"measure": {
  "functionReference": 0,
  "args": [],
  "sorts": [],
  "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
  "outputType": {
    "i64": {
      "typeVariationReference": 0,
      "nullability": "NULLABILITY_REQUIRED"
    }
  },
  "invocation": "AGGREGATION_INVOCATION_ALL",
  "arguments": []
}{code}
{code:java}
"extensions": [{
  "extensionFunction": {
    "extensionUriReference": 1,
    "functionAnchor": 0,
    "name": "count:opt"
  }
}],{code}
Count is a unary function and should be consumable, but isn't in this case.

  was:
SQL
{code:java}
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    '{}'
where
    l_shipdate <= date '1998-12-01' - interval '120' day (3)
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus
{code}
The substrait plan generated from SQL, using Isthmus.

substrait count: [https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]

Running the substrait plan with Acero returns this error:
{code:java}
E   pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned INVALID_ARGUMENT:(relations[0].root.input.sort.input.aggregate.measures[7].measure) arguments: Cannot find field.
{code}

From substrait query plan:

relations[0].root.input.sort.input.aggregate.measures[7].measure
{code:java}
"measure": {
  "functionReference": 7,
  "args": [],
  "sorts": [],
  "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
  "outputType": {
    "i64": {
      "typeVariationReference": 0,
      "nullability": "NULLABILITY_REQUIRED"
    }
  },
  "invocation": "AGGREGATION_INVOCATION_ALL",
  "arguments": []
}
{code}
{code:java}
"extensionFunction": {
  "extensionUriReference": 3,
  "functionAnchor": 7,
  "name": "count:opt"
}
{code}
Count is a unary function and should be consumable, but isn't in this case.


> [Python] Acero consumer is unable to consume count function from substrait query plan
> -------------------------------------------------------------------------------------
>
>                 Key: ARROW-17061
>                 URL: https://issues.apache.org/jira/browse/ARROW-17061
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Richard Tia
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)