[jira] [Assigned] (IMPALA-3825) Distribute runtime filter aggregation across cluster
[ https://issues.apache.org/jira/browse/IMPALA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned IMPALA-3825:

    Assignee: Abhishek Rawat  (was: Rahul Shivu Mahadev)

> Distribute runtime filter aggregation across cluster
>
>                 Key: IMPALA-3825
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3825
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>   Affects Versions: Impala 2.6.0
>            Reporter: Henry Robinson
>            Assignee: Abhishek Rawat
>            Priority: Major
>              Labels: runtime-filters
>
> Runtime filters can be tens of MB or more, and incasting all filters from all
> shuffle joins to the coordinator can put a lot of memory pressure on that
> node. To alleviate this we should consider spreading out the aggregation
> operation across the cluster, so that a different node aggregates each
> runtime filter.
> This still restricts aggregation to #runtime-filters nodes, which will
> usually be less than the cluster size. If we want to smooth that out further
> we could use tree-based aggregation, but let's measure the benefits of simply
> distributing the aggregation work first.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
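The proposal in the description (a different node aggregates each runtime filter) might look roughly like the sketch below. This is not Impala code. The round-robin placement rule, the function names pick_aggregator and aggregate, the node names, and the use of plain integers as stand-ins for filter bitmaps are all assumptions made for illustration. It also assumes that partial filters are merged with a bitwise OR, as one would for Bloom filters.

```python
from collections import defaultdict


def pick_aggregator(filter_id, nodes):
    """Hypothetical placement rule: round-robin on the filter id."""
    return nodes[filter_id % len(nodes)]


def aggregate(updates, nodes):
    """Merge partial filters on the node chosen for each filter id.

    updates: iterable of (filter_id, bitmap) pairs, where bitmap is an int
             standing in for a partial filter produced by one join fragment.
    Returns {node: {filter_id: merged_bitmap}}.
    """
    per_node = defaultdict(dict)
    for filter_id, bitmap in updates:
        node = pick_aggregator(filter_id, nodes)
        merged = per_node[node].get(filter_id, 0)
        # Assumption: partial filters combine with bitwise OR.
        per_node[node][filter_id] = merged | bitmap
    return {node: dict(filters) for node, filters in per_node.items()}


nodes = ["node-a", "node-b", "node-c"]
updates = [(0, 0b0011), (1, 0b0100), (0, 0b1000), (4, 0b0001)]
result = aggregate(updates, nodes)
print(result)
```

In this toy run node-c receives no aggregation work, which matches the caveat in the description: with fewer filters than nodes (or an uneven mapping), only some nodes aggregate.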
[jira] [Assigned] (IMPALA-3825) Distribute runtime filter aggregation across cluster
[ https://issues.apache.org/jira/browse/IMPALA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sailesh Mukil reassigned IMPALA-3825:

    Assignee: Rahul Shivu Mahadev  (was: Sailesh Mukil)

> Distribute runtime filter aggregation across cluster
>
>                 Key: IMPALA-3825
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3825
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>   Affects Versions: Impala 2.6.0
>            Reporter: Henry Robinson
>            Assignee: Rahul Shivu Mahadev
>            Priority: Major
>              Labels: runtime-filters