Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20366 )

Change subject: IMPALA-12357: Skip scheduling bloom filter from full-build scan
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG@22
PS4, Line 22: e any predicate filter
One filter I am confused about is null checking - won't we add a null check on 
the key for many kinds of joins?


http://gerrit.cloudera.org:8080/#/c/20366/2/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/20366/2/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@858
PS2, Line 858:
> It is because of rule 2: "The build scan does not have any predicate filter
I still don't get it - don't we already check rule 2 with hasScanConjuncts() 
and incomingFilters == null || incomingFilters.isEmpty()?

If I understand correctly, in a UNION ALL of different joins, if 
hasEliminatedFilter is false for the first join, this will prevent 
elimination in the other joins even where it would be possible, although the 
joins are completely independent of each other.
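
To make the concern concrete, here is a minimal sketch of the two behaviours I 
have in mind. It is not the patch's code: BuildScan, EliminationSketch and 
both decide* methods are hypothetical names, and only hasScanConjuncts and 
incomingFilters are taken from the discussion above. The shared-flag variant 
is my reading of how hasEliminatedFilter could behave, which may be wrong.

```java
import java.util.List;

// Hypothetical stand-in for the build-side scan of one join (not a real Impala class).
final class BuildScan {
  final boolean hasScanConjuncts;
  final List<String> incomingFilters; // may be null, as in the rule 2 check

  BuildScan(boolean hasScanConjuncts, List<String> incomingFilters) {
    this.hasScanConjuncts = hasScanConjuncts;
    this.incomingFilters = incomingFilters;
  }

  // Rule 2 as discussed: no scan conjuncts and no incoming runtime filters.
  boolean isUnfiltered() {
    return !hasScanConjuncts && (incomingFilters == null || incomingFilters.isEmpty());
  }
}

final class EliminationSketch {
  // Each join is judged only on its own build scan.
  static boolean[] decidePerJoin(List<BuildScan> builds) {
    boolean[] eliminate = new boolean[builds.size()];
    for (int i = 0; i < builds.size(); i++) {
      eliminate[i] = builds.get(i).isUnfiltered();
    }
    return eliminate;
  }

  // Assumed problematic variant: the outcome for the first join gates all later joins.
  static boolean[] decideWithSharedFlag(List<BuildScan> builds) {
    boolean[] eliminate = new boolean[builds.size()];
    boolean allowed = true;
    for (int i = 0; i < builds.size(); i++) {
      boolean candidate = builds.get(i).isUnfiltered();
      if (i == 0) allowed = candidate; // state leaks from the first union operand
      eliminate[i] = allowed && candidate;
    }
    return eliminate;
  }
}
```

With three union operands where only the first build scan has conjuncts, 
decidePerJoin marks the second and third joins for elimination, while 
decideWithSharedFlag marks none of them.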


http://gerrit.cloudera.org:8080/#/c/20366/2/testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test:

http://gerrit.cloudera.org:8080/#/c/20366/2/testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test@543
PS2, Line 543: # IMPALA-12357: filter size 512KB can achieve fpp lower than 0.9.
> I think this is because our test workload is relatively small while the def
I see, I was confused because I had a different optimization model in my head: 
I thought that generally in FK/PK joins the runtime filter is not useful if 
there are no predicates/runtime filters on the build side, regardless of FPP.

For example, in the test below the runtime filters do not seem useful: the 
NDV of the build side key >= the NDV of the probe side key, so they are 
unlikely to drop any rows.
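
A rough sketch of the estimate behind that reasoning, assuming uniformly 
distributed keys and that the smaller key domain is contained in the larger 
one. NdvSketch and estimatedPassFraction are hypothetical names, not anything 
in Impala, and the model ignores bloom filter false positives entirely.

```java
// Hypothetical helper; a simplistic containment model, not Impala's cost model.
final class NdvSketch {
  // Estimated fraction of probe rows that survive a filter built from an
  // unfiltered build side, given the NDV of the join key on each side.
  static double estimatedPassFraction(long buildNdv, long probeNdv) {
    if (buildNdv <= 0 || probeNdv <= 0) {
      return 1.0; // missing stats: assume the filter drops nothing
    }
    return Math.min(1.0, (double) buildNdv / (double) probeNdv);
  }
}
```

Under these assumptions, whenever buildNdv >= probeNdv the estimate is 1.0, 
i.e. the filter is not expected to drop rows no matter how low the FPP is.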

But this seems like a separate optimization with more chance of side effects 
than the current one.



--
To view, visit http://gerrit.cloudera.org:8080/20366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I494533bc06da84e606cbd1ae1619083333089a5e
Gerrit-Change-Number: 20366
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Wed, 23 Aug 2023 09:53:14 +0000
Gerrit-HasComments: Yes