mustafasrepo opened a new issue, #7128:
URL: https://github.com/apache/arrow-datafusion/issues/7128

   ### Describe the bug
   
   Assume we following tables.
   ```sql
   statement ok
   CREATE TABLE exchange_rates (
       sn INTEGER,
       ts TIMESTAMP,
       currency_from VARCHAR(3),
       currency_to VARCHAR(3),
       rate DECIMAL(10,2)
   ) as VALUES
       (0, '2022-01-01 06:00:00'::timestamp, 'EUR', 'USD', 1.10),
       (1, '2022-01-01 08:00:00'::timestamp, 'TRY', 'USD', 0.10),
       (2, '2022-01-01 11:30:00'::timestamp, 'EUR', 'USD', 1.12),
       (3, '2022-01-02 12:00:00'::timestamp, 'TRY', 'USD', 0.11),
       (4, '2022-01-03 10:00:00'::timestamp, 'EUR', 'USD', 1.12)
   ```
   and 
   ```sql
   statement ok
   CREATE TABLE sales_global (zip_code INT,
             country VARCHAR(3),
             sn INT,
             ts TIMESTAMP,
             currency VARCHAR(3),
             amount FLOAT
           ) as VALUES
             (0, 'GRC', 0, '2022-01-01 06:00:00'::timestamp, 'EUR', 30.0),
             (1, 'FRA', 1, '2022-01-01 08:00:00'::timestamp, 'EUR', 50.0),
             (1, 'TUR', 2, '2022-01-01 11:30:00'::timestamp, 'TRY', 75.0),
             (1, 'FRA', 3, '2022-01-02 12:00:00'::timestamp, 'EUR', 200.0),
             (1, 'TUR', 4, '2022-01-03 10:00:00'::timestamp, 'TRY', 100.0),
             (0, 'GRC', 4, '2022-01-03 10:00:00'::timestamp, 'EUR', 80.0)
   ```
   The following query 
   ```sql
   SELECT ARRAY_AGG(e.rate ORDER BY e.sn)
   FROM sales_global AS s
   JOIN exchange_rates AS e
   ON s.currency = e.currency_from AND
      e.currency_to = 'USD' AND
      s.ts >= e.ts
   GROUP BY s.sn
   ORDER BY s.sn;
   ```
   gives the error. When working in multi partitions.
   `External error: query failed: DataFusion error: Arrow error: Compute error: 
Sort not supported for list type Decimal128(10, 2)`
   
   I have examined the bug. Above query produces following physical plan
   
   ```sql
   +   ProjectionExec: expr=[ARRAY_AGG(e.rate) ORDER BY [e.sn ASC NULLS LAST]@0 
as ARRAY_AGG(e.rate)]
   +   --SortPreservingMergeExec: [sn@1 ASC NULLS LAST]
   +   ----SortExec: expr=[sn@1 ASC NULLS LAST]
   +   ------ProjectionExec: expr=[ARRAY_AGG(e.rate) ORDER BY [e.sn ASC NULLS 
LAST]@1 as ARRAY_AGG(e.rate), sn@0 as sn]
   +   --------AggregateExec: mode=FinalPartitioned, gby=[sn@0 as sn], 
aggr=[ARRAY_AGG(e.rate)]
   +   ----------SortExec: expr=[sn@1 ASC NULLS LAST]
   +   ------------CoalesceBatchesExec: target_batch_size=8192
   +   --------------RepartitionExec: partitioning=Hash([sn@0], 8), 
input_partitions=8
   +   ----------------AggregateExec: mode=Partial, gby=[sn@0 as sn], 
aggr=[ARRAY_AGG(e.rate)], ordering_mode=None
   +   ------------------SortExec: expr=[sn@1 ASC NULLS LAST]
   +   --------------------ProjectionExec: expr=[sn@0 as sn, sn@3 as sn, rate@6 
as rate]
   +   ----------------------CoalesceBatchesExec: target_batch_size=8192
   +   ------------------------HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(currency@2, currency_from@2)], filter=ts@0 >= ts@1
   +   --------------------------CoalesceBatchesExec: target_batch_size=8192
   +   ----------------------------RepartitionExec: 
partitioning=Hash([currency@2], 8), input_partitions=8
   +   ------------------------------RepartitionExec: 
partitioning=RoundRobinBatch(8), input_partitions=1
   +   --------------------------------MemoryExec: partitions=1, 
partition_sizes=[1]
   +   --------------------------CoalesceBatchesExec: target_batch_size=8192
   +   ----------------------------RepartitionExec: 
partitioning=Hash([currency_from@2], 8), input_partitions=8
   +   ------------------------------ProjectionExec: expr=[sn@0 as sn, ts@1 as 
ts, currency_from@2 as currency_from, rate@4 as rate]
   +   --------------------------------CoalesceBatchesExec: 
target_batch_size=8192
   +   ----------------------------------FilterExec: currency_to@3 = USD
   +   ------------------------------------MemoryExec: partitions=8, 
partition_sizes=[1, 0, 0, 0, 0, 0, 0, 0]
   
   ```
   The reason of the error is that `AggregateExec: mode=FinalPartitioned` 
requires its input to be sorted by `e.sn` as in `AggregateExec: mode=Partial`. 
Column `sn@1` is valid for the input `AggregateExec: mode=Partial`. However, 
this column is no longer valid for `AggregateExec: mode=FinalPartitioned`. at 
index 1 we have array_agg result.
   
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to