Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/20366 )
Change subject: IMPALA-12357: Skip scheduling bloom filter from full-build scan
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG@22
PS4, Line 22: e any predicate filter
> You mean planner add null checking predicate on join key column? I don't th
Right, currently Impala does not add the IS NOT NULL predicate for the join columns. Hive does add this for inner joins (it is not applicable to outer joins).


http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG@29
PS4, Line 29: thus reducing the bloom filter aggregation
            : overhead in coordinator
This applies to the remote bloom filters, right? For local filters that are not sent to the coordinator, this patch could still potentially help in the high fpp case.


http://gerrit.cloudera.org:8080/#/c/20366/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/20366/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@252
PS4, Line 252:     private int rank_ = 1;
Is 'rank' the right term here? Normally, rank implies some type of comparison function. Perhaps 'level' is more appropriate to indicate the level in a subtree.


http://gerrit.cloudera.org:8080/#/c/20366/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@844
PS4, Line 844: Math.max(0.9
For testing purposes, or as a backdoor in a real deployment, would it be useful to make this tunable? In some cases we might want a higher threshold, or a way to work around a bug in the fpp estimation. I could also be convinced not to add yet another tuning option, so I am open to a counterargument.

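To make the tuning question on line 844 concrete, here is a minimal sketch of what an overridable threshold could look like. It assumes the 0.9 literal acts as a cap on the estimated fpp above which a filter is skipped, which the quoted fragment alone does not confirm. The class, method, and parameter names are hypothetical and are not taken from RuntimeFilterGenerator.java or from any existing Impala query option.

```java
// Hypothetical sketch only: none of these names exist in the Impala code base.
final class FppThresholdSketch {
  // Assumed default, mirroring the 0.9 literal quoted at PS4, Line 844.
  static final double DEFAULT_MAX_FPP = 0.9;

  /**
   * Returns the threshold to use. A null or out-of-range override falls back
   * to the default, so a bad setting cannot disable the check by accident.
   */
  static double resolveThreshold(Double overrideValue) {
    if (overrideValue == null || overrideValue <= 0.0 || overrideValue > 1.0) {
      return DEFAULT_MAX_FPP;
    }
    return overrideValue;
  }

  /** Assumed policy: skip the filter when the estimated fpp exceeds the threshold. */
  static boolean shouldSkipFilter(double estimatedFpp, Double overrideValue) {
    return estimatedFpp > resolveThreshold(overrideValue);
  }
}
```

With this shape, leaving the override unset keeps the current behavior, and a test or an operator could raise the threshold (for example to 0.95) without a code change. Whether that is worth one more tuning knob is the open question.
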
--
To view, visit http://gerrit.cloudera.org:8080/20366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I494533bc06da84e606cbd1ae1619083333089a5e
Gerrit-Change-Number: 20366
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Sat, 26 Aug 2023 00:10:50 +0000
Gerrit-HasComments: Yes