Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: IMPALA-10325: Parquet scan should use min/max statistics to 
skip pages based on equi-join predicate
......................................................................


Patch Set 29:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16720/29/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test:

http://gerrit.cloudera.org:8080/#/c/16720/29/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@429
PS29, Line 429: ###################################################
> Add a new negative test by setting enabled_runtime_filter_types set to bloo
Ok, I think I misunderstood something about how this all works - so the min/max 
filters are not actually applied at the row level, but there are still filter 
contexts in the scan node?

I'm confused because I assumed there would be filter_exprs_ and filter_ctxs_ in 
the backend scan node generated for these, and therefore we would be evaluating 
the min-max filters over each row, but that doesn't seem to be the case. Can 
you explain where we disable the row runtime filtering for the min-max filters?


http://gerrit.cloudera.org:8080/#/c/16720/29/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@496
PS29, Line 496: explain select count(*) from
> It looks like in PlannerTests the actual plans should be included as the ex
I feel pretty strongly that we should keep these tests as java planner tests, 
it'll be easier to maintain if we keep the tests that validate plans (vs 
results/runtime behaviour) in the same framework.

Agree that it's sometimes a bit verbose including the whole plan, but that is 
maybe something that could be improved in the planner test framework.


http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test:

http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@429
PS34, Line 429: ###################################################
I think it would be good to add straight_join hints to some or all of these 
queries to prevent join ordering changes affecting the plans in future. 
Especially the ones which are joining two similarly-sized tables.


http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@489
PS34, Line 489: ====
I'd like to see more tests where we check the results for some of the edge 
cases below, e.g. where we are generating a plan with an implicit cast, 
checking that it can execute ok and produce the correct results.


http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@621
PS34, Line 621: STRING
I think the check is a bit weak because the test could pass even if the runtime 
filter existed but had a different number.


http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@655
PS34, Line 655: # Verify that the overlap predicate is not possible
For this and the following test, can we add identical tests except with the 
types reversed between the two tables (so that the cast is applied to the 
opposite join input).



--
To view, visit http://gerrit.cloudera.org:8080/16720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 29
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 15 Dec 2020 02:02:20 +0000
Gerrit-HasComments: Yes

Reply via email to