[ 
https://issues.apache.org/jira/browse/IGNITE-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-16396:
---------------------------------------
    Description: 
Currently, we allow only single output distribution for aggregates, but looks 
like if we have hash input distribution and all grouping set contains all of 
the distribution keys we can make aggregation on remote nodes and produce hash 
output distribution with the same keys. This will reduce memory consumption on 
the initiator node and make some other optimizations possible.

For example, query:
{noformat}
SELECT t1.aff_key, t2.cnt FROM t1 JOIN (SELECT aff_key, COUNT(*) AS cnt FROM t2 
GROUP BY id) AS t2 ON t1.aff_key = t2.aff_key{noformat}
Can do colocated join if both tables are colocated on {{{}aff_key{}}}. 
Currently, such a query does join on the initiator node.

The same for set-ops (EXCEPT, INTERSECT).

  was:
Currently, we allow only single output distribution for aggregates, but looks 
like if we have hash input distribution and all grouping set contains all of 
the distribution keys we can make aggregation on remote nodes and produce hash 
output distribution with the same keys. This will reduce memory consumption on 
the initiator node and make some other optimizations possible.

For example, query:
{noformat}
SELECT t1.aff_key, t2.cnt FROM t1 JOIN (SELECT aff_key, COUNT(*) AS cnt FROM t2 
GROUP BY id) AS t2 ON t1.aff_key = t2.aff_key{noformat}
Can do colocated join if both tables are colocated on {{{}aff_key{}}}. 
Currently, such a query does join on the initiator node.


> Calcite engine. Allow hash output distribution for aggregations 
> ----------------------------------------------------------------
>
>                 Key: IGNITE-16396
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16396
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>
> Currently, we allow only single output distribution for aggregates, but looks 
> like if we have hash input distribution and all grouping set contains all of 
> the distribution keys we can make aggregation on remote nodes and produce 
> hash output distribution with the same keys. This will reduce memory 
> consumption on the initiator node and make some other optimizations possible.
> For example, query:
> {noformat}
> SELECT t1.aff_key, t2.cnt FROM t1 JOIN (SELECT aff_key, COUNT(*) AS cnt FROM 
> t2 GROUP BY id) AS t2 ON t1.aff_key = t2.aff_key{noformat}
> Can do colocated join if both tables are colocated on {{{}aff_key{}}}. 
> Currently, such a query does join on the initiator node.
> The same for set-ops (EXCEPT, INTERSECT).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to