jiangzhx commented on issue #1246:
URL:
https://github.com/apache/arrow-datafusion/issues/1246#issuecomment-961646148
> If I recall correctly, datafusion doesn't do fine optimization about
> `group by` and `aggregate functions` at present. It's worth adding it to our
> RoadMap and doing it in the future.
I tried digging through the code in Trino and Doris; both have a streaming
aggregate node, but I can't understand how they work.
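For reference, the general idea behind a streaming aggregate can be sketched as below. This is a generic illustration only, not the Trino or Doris implementation; the function name `streaming_sum` and the row shape `(String, i64)` are hypothetical. It assumes the input already arrives sorted by the group key, so only one group's state is held at a time and a group can be emitted as soon as the key changes, with no hash table.

```rust
/// Minimal sketch of streaming (sorted) aggregation.
/// ASSUMPTION: `rows` is already ordered by group key; if it is not,
/// the same key would be emitted more than once.
/// `streaming_sum` is a hypothetical name used for illustration only.
fn streaming_sum<I>(rows: I) -> Vec<(String, i64)>
where
    I: IntoIterator<Item = (String, i64)>,
{
    let mut out = Vec::new();
    // State for the single group currently being accumulated.
    let mut current: Option<(String, i64)> = None;

    for (key, value) in rows {
        if let Some((k, sum)) = current.as_mut() {
            if *k == key {
                // Same group as the previous row: keep accumulating.
                *sum += value;
                continue;
            }
        }
        // Key changed (or this is the first row): emit the finished group.
        if let Some(done) = current.take() {
            out.push(done);
        }
        current = Some((key, value));
    }

    // Emit the last group once the input is exhausted.
    if let Some(done) = current {
        out.push(done);
    }
    out
}
```

Whether this approach helps here depends on the input already being sorted by the group key (or on sorting being cheap); otherwise a hash-based aggregate is the usual alternative.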
The `aggregate functions` themselves work fine: with or without
sum(LO_EXTENDEDPRICE), the performance shows no big difference, and the query
is still 5~10 times slower either way.
low cardinality:
select 1 FROM lineorder_flat group by LO_ORDERPRIORITY;
5 rows in set. Query took 0.236 seconds.
high cardinality:
select 1 FROM lineorder_flat group by S_ADDRESS;
20000 rows in set. Query took 1.429 seconds.