Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
......................................................................


Patch Set 7:

(3 comments)

The implementation looks great in general, just a few comments

http://gerrit.cloudera.org:8080/#/c/20612/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20612/7//COMMIT_MSG@20
PS7, Line 20: Query option MAX_NUM_FILTER_AGGREGATOR is added to control
            : this feature. Given N as the number of backend executors 
excluding the
            : coordinator, the selected number of intermediate aggregators M =
            : min(MAX_NUM_FILTER_AGGREGATOR, N / 2). Setting 
MAX_NUM_FILTER_AGGREGATOR
            : <= 0 will disable the intermediate aggregator feature.
MAX_NUM_FILTER_AGGREGATOR seems ok for me in the short term to test this 
feature, but I think that there should be a more sophisticated to tune this on 
the long term. My concern is that depending on the number of hosts used in the 
query, a static default for  MAX_NUM_FILTER_AGGREGATOR  can be easily too low 
or too large.

I think that ideally we should have an option like 
MAX_NUM_FILTERS_AGGREGATED_PER_HOST, so if this value is 10, then a 20 host 
query will have 2 pre-aggregators, while a 100 host query will have 10.


http://gerrit.cloudera.org:8080/#/c/20612/7/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/20612/7/be/src/scheduling/scheduler.cc@282
PS7, Line 282: inline bool KrpcAddressMatch(const NetworkAddressPB& lhs, const 
NetworkAddressPB& rhs) {
             :   return lhs.hostname() == rhs.hostname() && lhs.port() == 
rhs.port();
             : }
Can you move this to network-util.h?
I think that there should be a single function to compare addresses, as it is 
not completely self-evident (due to uds_ address)


http://gerrit.cloudera.org:8080/#/c/20612/7/be/src/scheduling/scheduler.cc@317
PS7, Line 317:   if (num_agg <= 0) return;
Does num_agg = 1 make sense?



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Oct 2023 14:37:24 +0000
Gerrit-HasComments: Yes

Reply via email to