Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/20612 )
Change subject: IMPALA-3825: Delegate runtime filter aggregation to some executors ...................................................................... Patch Set 7: (3 comments) The implementation looks great in general, just a few comments http://gerrit.cloudera.org:8080/#/c/20612/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20612/7//COMMIT_MSG@20 PS7, Line 20: Query option MAX_NUM_FILTER_AGGREGATOR is added to control : this feature. Given N as the number of backend executors excluding the : coordinator, the selected number of intermediate aggregators M = : min(MAX_NUM_FILTER_AGGREGATOR, N / 2). Setting MAX_NUM_FILTER_AGGREGATOR : <= 0 will disable the intermediate aggregator feature. MAX_NUM_FILTER_AGGREGATOR seems ok for me in the short term to test this feature, but I think that there should be a more sophisticated to tune this on the long term. My concern is that depending on the number of hosts used in the query, a static default for MAX_NUM_FILTER_AGGREGATOR can be easily too low or too large. I think that ideally we should have an option like MAX_NUM_FILTERS_AGGREGATED_PER_HOST, so if this value is 10, then a 20 host query will have 2 pre-aggregators, while a 100 host query will have 10. http://gerrit.cloudera.org:8080/#/c/20612/7/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/20612/7/be/src/scheduling/scheduler.cc@282 PS7, Line 282: inline bool KrpcAddressMatch(const NetworkAddressPB& lhs, const NetworkAddressPB& rhs) { : return lhs.hostname() == rhs.hostname() && lhs.port() == rhs.port(); : } Can you move this to network-util.h? I think that there should be a single function to compare addresses, as it is not completely self-evident (due to uds_ address) http://gerrit.cloudera.org:8080/#/c/20612/7/be/src/scheduling/scheduler.cc@317 PS7, Line 317: if (num_agg <= 0) return; Does num_agg = 1 make sense? -- To view, visit http://gerrit.cloudera.org:8080/20612 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0 Gerrit-Change-Number: 20612 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Comment-Date: Mon, 30 Oct 2023 14:37:24 +0000 Gerrit-HasComments: Yes