[ 
https://issues.apache.org/jira/browse/ARROW-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Tia updated ARROW-17061:
--------------------------------
    Description: 
SQL
{code:java}
SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
GROUP BY
    o_orderpriority{code}
The Substrait plan was generated from the SQL above using Isthmus.

 

substrait count: 

[https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]

 

Running the substrait plan with Acero returns this error:
{code:java}
E   pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned 
INVALID_ARGUMENT:(relations[0].root.input.aggregate.measures[0].measure) 
arguments: Cannot find field.  {code}
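
The path in the error message can be followed through the plan JSON to inspect the message that fails to parse. The following is a minimal sketch, assuming the plan has been loaded into a Python dict; `resolve_path` is a hypothetical helper (not part of pyarrow), and the `plan` literal is a reduced stand-in for the real plan that keeps only the keys on that path.

```python
import re


def resolve_path(plan, path):
    """Follow a path such as 'relations[0].root.input' through nested dicts and lists.

    Hypothetical helper for illustration only; not a pyarrow API.
    """
    node = plan
    for part in path.split("."):
        match = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        name, index = match.group(1), match.group(2)
        node = node[name]
        if index is not None:
            node = node[int(index)]
    return node


# Reduced stand-in for the Isthmus output (assumption: only the keys needed here).
plan = {
    "relations": [
        {"root": {"input": {"aggregate": {"measures": [
            {"measure": {"functionReference": 0, "args": [], "arguments": []}}
        ]}}}}
    ]
}

measure = resolve_path(
    plan, "relations[0].root.input.aggregate.measures[0].measure"
)
# List the field names present on the measure the error points at.
print(sorted(measure))
```

Listing the keys shows that the field named in the error, `arguments`, is present on the measure alongside `args`.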
 

From substrait query plan:

relations[0].root.input.aggregate.measures[0].measure
{code:java}
"measure": {
  "functionReference": 0,
  "args": [],
  "sorts": [],
  "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
  "outputType": {
    "i64": {
      "typeVariationReference": 0,
      "nullability": "NULLABILITY_REQUIRED"
    }
  },
  "invocation": "AGGREGATION_INVOCATION_ALL",
  "arguments": []
}{code}
{code:java}
"extensions": [{
  "extensionFunction": {
    "extensionUriReference": 1,
    "functionAnchor": 0,
    "name": "count:opt"
  }
}],{code}
Count is a unary function and should be consumable, but isn't in this case.
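
The measure's `functionReference` is matched against the `functionAnchor` of an extension declaration to find the function name. A minimal sketch of that lookup, assuming the extension declarations are available as a list of Python dicts shaped like the fragment above; `function_name` is a hypothetical helper, not a pyarrow or Substrait API.

```python
def function_name(extensions, function_reference):
    """Return the declared name whose functionAnchor equals function_reference.

    Hypothetical helper for illustration only.
    """
    for ext in extensions:
        fn = ext.get("extensionFunction")
        if fn is not None and fn.get("functionAnchor") == function_reference:
            return fn["name"]
    return None


# Values copied from the plan fragments quoted in this issue.
extensions = [
    {
        "extensionFunction": {
            "extensionUriReference": 1,
            "functionAnchor": 0,
            "name": "count:opt",
        }
    }
]
measure = {"functionReference": 0, "args": [], "arguments": []}

name = function_name(extensions, measure["functionReference"])
print(name)
```

For the fragments quoted here, the lookup resolves reference 0 to `count:opt`.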

  was:
SQL
{code:java}
select
  l_returnflag,
  l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
  sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(*) as count_order
from
  '{}'
where
  l_shipdate <= date '1998-12-01' - interval '120' day (3)
group by
  l_returnflag,
  l_linestatus
order by
  l_returnflag,
  l_linestatus {code}
The Substrait plan was generated from the SQL above using Isthmus.

 

substrait count: 

[https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]

 

Running the substrait plan with Acero returns this error:
{code:java}
E   pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned 
INVALID_ARGUMENT:(relations[0].root.input.sort.input.aggregate.measures[7].measure)
 arguments: Cannot find field.
 {code}
 

From substrait query plan:

relations[0].root.input.sort.input.aggregate.measures[7].measure
{code:java}
"measure": {
  "functionReference": 7,
  "args": [],
  "sorts": [],
  "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
  "outputType": {
    "i64": {
      "typeVariationReference": 0,
      "nullability": "NULLABILITY_REQUIRED"
    }
  },
  "invocation": "AGGREGATION_INVOCATION_ALL",
  "arguments": []
} {code}
{code:java}
"extensionFunction": {
  "extensionUriReference": 3,
  "functionAnchor": 7,
  "name": "count:opt"
} {code}
Count is a unary function and should be consumable, but isn't in this case.


> [Python] Acero consumer is unable to consume count function from substrait 
> query plan
> -------------------------------------------------------------------------------------
>
>                 Key: ARROW-17061
>                 URL: https://issues.apache.org/jira/browse/ARROW-17061
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Richard Tia
>            Priority: Major
>
> SQL
> {code:java}
> SELECT
>     o_orderpriority,
>     count(*) AS order_count
> FROM
>     orders
> GROUP BY
>     o_orderpriority{code}
> The Substrait plan was generated from the SQL above using Isthmus.
>  
> substrait count: 
> [https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml]
>  
> Running the substrait plan with Acero returns this error:
> {code:java}
> E   pyarrow.lib.ArrowInvalid: JsonToBinaryStream returned 
> INVALID_ARGUMENT:(relations[0].root.input.aggregate.measures[0].measure) 
> arguments: Cannot find field.  {code}
>  
> From substrait query plan:
> relations[0].root.input.aggregate.measures[0].measure
> {code:java}
> "measure": {
>   "functionReference": 0,
>   "args": [],
>   "sorts": [],
>   "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
>   "outputType": {
>     "i64": {
>       "typeVariationReference": 0,
>       "nullability": "NULLABILITY_REQUIRED"
>     }
>   },
>   "invocation": "AGGREGATION_INVOCATION_ALL",
>   "arguments": []
> }{code}
> {code:java}
> "extensions": [{
>   "extensionFunction": {
>     "extensionUriReference": 1,
>     "functionAnchor": 0,
>     "name": "count:opt"
>   }
> }],{code}
> Count is a unary function and should be consumable, but isn't in this case.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
