Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 )
Change subject: IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate ...................................................................... Patch Set 29: (6 comments) http://gerrit.cloudera.org:8080/#/c/16720/29/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/16720/29/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@429 PS29, Line 429: ################################################### > Add a new negative test by setting enabled_runtime_filter_types set to bloo Ok, I think I misunderstood something about how this all works - so the min/max filters are not actually applied at the row level, but there are still filter contexts in the scan node? I'm confused because I assumed there would be filter_exprs_ and filter_ctxs_ in the backend scan node generated for these, and therefore we would be evaluating the min-max filters over each row, but that doesn't seem to be the case. Can you explain where we disable the row runtime filtering for the min-max filters? http://gerrit.cloudera.org:8080/#/c/16720/29/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@496 PS29, Line 496: explain select count(*) from > It looks like in PlannerTests the actual plans should be included as the ex I feel pretty strongly that we should keep these tests as java planner tests, it'll be easier to maintain if we keep the tests that validate plans (vs results/runtime behaviour) in the same framework. Agree that it's sometimes a bit verbose including the whole plan, but that is maybe something that could be improved in the planner test framework. http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@429 PS34, Line 429: ################################################### I think it would be good to add straight_join hints to some or all of these queries to prevent join ordering changes affecting the plans in future. Especially the ones which are joining two similarly-sized tables. http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@489 PS34, Line 489: ==== I'd like to see more tests where we check the results for some of the edge cases below, e.g. where we are generating a plan with an implicit cast, checking that it can execute ok and produce the correct results. http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@621 PS34, Line 621: STRING I think the check is a bit weak because the test could pass even if the runtime filter existed but had a different number. http://gerrit.cloudera.org:8080/#/c/16720/34/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@655 PS34, Line 655: # Verify that the overlap predicate is not possible For this and the following test, can we add identical tests except with the types reversed between the two tables (so that the cast is applied to the opposite join input). -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 29 Gerrit-Owner: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Tue, 15 Dec 2020 02:02:20 +0000 Gerrit-HasComments: Yes