[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 15: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 15 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 22:57:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' AND date_col = CAST(timestamp_col as DATE) As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Reviewed-on: http://gerrit.cloudera.org:8080/16346 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 11 files changed, 412 insertions(+), 31 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 16 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6392/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 15 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:44:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 15 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:44:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:43:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7078/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:25:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 14: > Patch Set 13: > > > Patch Set 12: > > > > > Patch Set 12: Verified-1 > > > > > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ > > > > 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation: > > (this contains new tests added for this CR): seems related to ordering of > > the predicates and stats difference. The test passes on my dev machine. > > > > Actual does not match expected result: > > PLAN-ROOT SINK > > | > > 00:SCAN HDFS [functional.alltypes_date_partition] > >partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE > > '2009-02-01' > >HDFS partitions=32/55 files=32 size=15.99KB > >predicates: functional.alltypes_date_partition.int_col < 100, > > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 > > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP > > '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, > > 10), date_col = CAST(timestamp_col AS DATE) > > ^^^ > >row-size=64B cardinality=26 > > > > Expected: > > PLAN-ROOT SINK > > | > > 00:SCAN HDFS [functional.alltypes_date_partition] > >partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE > > '2009-02-01' > >HDFS partitions=32/55 files=32 size=15.99KB > >predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), > > functional.alltypes_date_partition.int_col < 100, > > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 > > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP > > '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE) > >row-size=65B cardinality=13 > > > > There's also 1 failure in > > org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I > > didn't quite understand yet. > > 1st failure was due to stats not present. On my local machine I think I had > run the compute stats manually. I have added the alltypes_date_partition to > the compute-table-stats.sh script. Second failure needed a minor update to > the FrontendTest.java. There's some discrepancy in the Jenkins run vs my local one..it seems the table 'functional.alltypes_datasource' was never loaded on my local machine and I am not sure how to load it. In any case, I have made an update to the FrontendTest.java in PatchSet14 that should hopefully work now. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:13:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#14). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' AND date_col = CAST(timestamp_col as DATE) As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 11 files changed, 412 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/14 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7077/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 13 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 16:54:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 13: > Patch Set 12: > > > Patch Set 12: Verified-1 > > > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ > > 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation: > (this contains new tests added for this CR): seems related to ordering of > the predicates and stats difference. The test passes on my dev machine. > > Actual does not match expected result: > PLAN-ROOT SINK > | > 00:SCAN HDFS [functional.alltypes_date_partition] >partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE > '2009-02-01' >HDFS partitions=32/55 files=32 size=15.99KB >predicates: functional.alltypes_date_partition.int_col < 100, > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP > '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, > 10), date_col = CAST(timestamp_col AS DATE) > ^^^ >row-size=64B cardinality=26 > > Expected: > PLAN-ROOT SINK > | > 00:SCAN HDFS [functional.alltypes_date_partition] >partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE > '2009-02-01' >HDFS partitions=32/55 files=32 size=15.99KB >predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), > functional.alltypes_date_partition.int_col < 100, > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP > '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE) >row-size=65B cardinality=13 > > There's also 1 failure in > org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I didn't > quite understand yet. 1st failure was due to stats not present. On my local machine I think I had run the compute stats manually. I have added the alltypes_date_partition to the compute-table-stats.sh script. Second failure needed a minor update to the FrontendTest.java. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 13 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 16:38:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#13). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' AND date_col = CAST(timestamp_col as DATE) As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 11 files changed, 410 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/13 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 13 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 12: > Patch Set 12: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation: (this contains new tests added for this CR): seems related to ordering of the predicates and stats difference. The test passes on my dev machine. Actual does not match expected result: PLAN-ROOT SINK | 00:SCAN HDFS [functional.alltypes_date_partition] partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE '2009-02-01' HDFS partitions=32/55 files=32 size=15.99KB predicates: functional.alltypes_date_partition.int_col < 100, functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, 10), date_col = CAST(timestamp_col AS DATE) ^^^ row-size=64B cardinality=26 Expected: PLAN-ROOT SINK | 00:SCAN HDFS [functional.alltypes_date_partition] partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE '2009-02-01' HDFS partitions=32/55 files=32 size=15.99KB predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), functional.alltypes_date_partition.int_col < 100, functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE) row-size=65B cardinality=13 There's also 1 failure in org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I didn't quite understand yet. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 12 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 15:33:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 12: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 12 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 12:34:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 12 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:19:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 11: > Patch Set 10: > > Was able to find some time to look for local joins in TPCDS. There are not > many at all. > > query2.sql:WHERE d_week_seq1 = d_week_seq2 - 53 > query59.sql: AND d_week_seq1 = d_week_seq2 - 52 > query59.sql:WHERE s_store_id1 = s_store_id2 > > Sounds like they can be used for the min/max filtering at least. Thanks Qifan. Looking at query2, the d_week_seq1, d_week_seq2 come from different derived tables 'y' and 'z' even though the underlying base table is the same. From the planner perspective, they would be treated as a regular join. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:16:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7072/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:13:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 11: (4 comments) > Patch Set 10: > > > Patch Set 10: > > > > Change looks good to me. I should wait for the e2e test you mentioned right? > > Thanks for the review. Yes, I plan to add at least 1 e2e test with the > modified dataset. I hope to do it later today after some other ongoing work. Made these changes. http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95 PS10, Line 95: public static final com.google.common.base.Predicate > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java@1234 PS10, Line 1234: BitSet changed = propagateConstants(tmpConjuncts, candidates, keepConjuncts, > line too long (95 > 90) Done http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE) > Made the code change to preserve the original conjunct date_col = cast(tim I changed the alltypes_date_partition table to have rows with odd values of 'id' to have matching timestamp_col and date_col values and even values of 'id' to have non-matching (I added an INTERVAL to the timestamp). The same e2e test from previous patchset is used. http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test File testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4 PS9, Line 4: alltypes_date_part > I would be ok with running with other data sets but I had some struggles in Removed the functional_parquet prefix. Also, I since these tests or the planner tests don't need parquet data, I modified the data generation script to only generate text version of alltypes_date_partition. Also reduced the size of the table so it does not run into the default maximum partitions limit. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:09:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#11). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' AND date_col = CAST(timestamp_col as DATE) As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 9 files changed, 408 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/11 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: > Patch Set 10: > > Change looks good to me. I should wait for the e2e test you mentioned right? Thanks for the review. Yes, I plan to add at least 1 e2e test with the modified dataset. I hope to do it later today after some other ongoing work. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 22:48:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: Change looks good to me. I should wait for the e2e test you mentioned right? -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 22:26:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: Was able to find some time to look for local joins in TPCDS. There are not many at all. query2.sql:WHERE d_week_seq1 = d_week_seq2 - 53 query59.sql: AND d_week_seq1 = d_week_seq2 - 52 query59.sql:WHERE s_store_id1 = s_store_id2 Sounds like they can be used for the min/max filtering at least. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 15:08:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7055/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 07:08:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: (6 comments) http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@55 PS9, Line 55:* predicates. The candidates BitSet is used to determine which members of > Mention how 'candidates' is used? Done http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@61 PS9, Line 61: co > nit: these parens prob aren't needed, right? Done http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@66 PS9, Line 66: = BinaryPredicate.IS_RANGE_PREDICATE.apply(b > can't this be !=? Done http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@128 PS9, Line 128: opagation > Can't this be Map.Entry to keep it type-safe and avoid cast? Done http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@132 PS9, Line 132: dtExpr = d > Map.Entry>? Done http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', timestamp_col >= TIMESTAMP '2009-12-01 00:00:00', date_col = CAST(timestamp_col AS DATE) > Good point. All the use cases I have seen so far were ones where date_col Made the code change to preserve the original conjunct date_col = cast(timestamp_col as date). Updated plans. TODO: Add e2e test against a dataset where date values != timestamp col's date component. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 06:59:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has removed a vote on this change. Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Removed Code-Review+1 by Aman Sinha -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: Code-Review+1 Carry Shant's +1 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 06:48:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95 PS10, Line 95: public static final com.google.common.base.Predicate IS_RANGE_PREDICATE = line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java@1234 PS10, Line 1234: BitSet changed = propagateConstants(tmpConjuncts, candidates, keepConjuncts, analyzer); line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 06:48:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#10). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 9 files changed, 401 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/10 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', timestamp_col >= TIMESTAMP '2009-12-01 00:00:00' > Don't we still need to keep the date_col = cast(timestamp_col as date) pred Good point. All the use cases I have seen so far were ones where date_col was derived from the timestamp column. Yeah, for your example, we need to keep the cast predicate if the constant is a range predicate. I think the code change isn't much but I need to think about how to create a test data set for this. http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test File testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4 PS9, Line 4: functional_parquet > We generally don't include database names in the test files, since the infr I would be ok with running with other data sets but I had some struggles in loading the alltypes_date_partition table and had offline discussion with Shant. For Text format loading, the following error occurred since it went through HIve load process rather than Impala: The load-functional-planner-core-hive-generated-text-none-none.sql.log had the following error: "Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict" Setting the partition.mode to nonstrict got past that but ran into a default limit of the # dynamic partitions: "The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to 100 partitions per node, number of dynamic partitions on this node: 101" I could bump this up too .. but the Tez job does take much longer to execute..so I wasn't sure if it is worthwhile. I could move this to TestQueriesParquetTables unless you have other suggestions. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 02:20:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (7 comments) http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@55 PS9, Line 55:* predicates. Mention how 'candidates' is used? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@61 PS9, Line 61: (E nit: these parens prob aren't needed, right? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@66 PS9, Line 66: !(bp.getOp() == BinaryPredicate.Operator.EQ) can't this be !=? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@128 PS9, Line 128: Map.Entry Can't this be Map.Entry to keep it type-safe and avoid cast? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@132 PS9, Line 132: Map.Entry Map.Entry>? http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', timestamp_col >= TIMESTAMP '2009-12-01 00:00:00' Don't we still need to keep the date_col = cast(timestamp_col as date) predicate for this to be correct in cases where this isn't guaranteed to be true in the underlying data set? E.g. one counter-example would be date_col timestamp_col 2009-12-01 2009-12-2 00:00:00 I.e. I think we need to keep the equality predicate around. http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test File testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4 PS9, Line 4: functional_parquet We generally don't include database names in the test files, since the infra should switch to the appropriate functional database. We can move it to TestQueriesParquetTables if we only want it to run on the parquet data set (not kudu, etc). -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 01:17:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: Code-Review+1 (1 comment) LGTM nice little fix. http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461 PS7, Line 461: timestamp_col <= '2010-12-01'; > Done Done -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 21:39:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. > The runtime filter is the bloom filter created after build side of the Has Thanks. That matches the same practice in my previous job: min/max/bloom filters work for HJ only. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 29 Aug 2020 02:31:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. > OK. I see. The runtime filter is the bloom filter created after build side of the HashJoin is completed..so it is just based on the build values rather than the original predicate. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 29 Aug 2020 02:08:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78 PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral || : constant instanceof TimestampLiteral)) > Yes, even without the CAST, in the simple cases e.g a1 = b1 AND b1 > 10, w Yes, I agree only the transitivity on equal predicate makes sense. We can hold the change until such a use case is encountered. I was thinking about local join as the use case. http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. > I tried an example with a join but given the sequence in which the optimiza OK. I see. I was under the impression that in Impala run-time filtering is applied only for equi-HJ. For MJ or NJ, any range predicate propagation will be from the join transitivity. Maybe I missed something here. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 29 Aug 2020 01:41:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7040/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 22:17:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461 PS7, Line 461: timestamp_col <= '2010-12-01'; > Added a test for the swapped order. Done -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 22:17:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78 PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral || : constant instanceof TimestampLiteral)) > I am not aware of other products doing the same :-). Yes, even without the CAST, in the simple cases e.g a1 = b1 AND b1 > 10, where these are numeric columns it should be fine. But won't be ok for something like 'a1 = 1/b1 AND b1 > 10'. So we would need to do additional checks for the expression. I haven't seen a good use case for the range propagation for other types besides date, timestamp. http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. > Thanks for the background info. Appreciate it. The Trafodion has similar co I tried an example with a join but given the sequence in which the optimizations are done (join transitivity is done earlier and then constant propagation), it doesn't do the propagation on the other side of the join. See below. I think it is not straightforward. But note that due to Impala's runtime filter propagation, the runtime filter eventually gets applied on the other side of the join. explain select t1.* from functional_parquet.alltypes_date_partition t1, functional_parquet.alltypes_date_partition t2 where t1.date_col = t2.date_col and t2.date_col = cast(t2.timestamp_col as date) and t2.timestamp_col <= '2010-10-01'; ++ | Explain String | ++ | Max Per-Host Resource Reservation: Memory=3.02MB Threads=5 | | Per-Host Resource Estimates: Memory=152MB | | WARNING: The following tables are missing relevant table and/or column statistics. | | functional_parquet.alltypes_date_partition | | | | PLAN-ROOT SINK | | | | | 04:EXCHANGE [UNPARTITIONED] | | | | | 02:HASH JOIN [INNER JOIN, BROADCAST] | | | hash predicates: t1.date_col = t2.date_col | | | runtime filters: RF000 <- t2.date_col | | | row-size=84B cardinality=180.66K | | | | | |--03:EXCHANGE [BROADCAST] | | | | | | | 01:SCAN HDFS [functional_parquet.alltypes_date_partition t2] | | | partition predicates: t2.date_col <= DATE '2010-10-01' | | | HDFS partitions=639/730 files=639 size=1.94MB | | | predicates: t2.timestamp_col <= TIMESTAMP '2010-10-01 00:00:00' | | | row-size=20B cardinality=15.81K | | | | | 00:SCAN HDFS [functional_parquet.alltypes_date_partition t1] | |HDFS partitions=730/730 files=730 size=2.22MB | |runtime filters: RF000 -> t1.date_col | |row-size=64B cardinality=180.66K | ++ Also, just to be clear if you have range predicate on one side of the join : t1.a1 = t2.a2 AND t2.a2 > 10, this already will be propagated to the other side. http://ge
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95 PS9, Line 95: public static final com.google.common.base.Predicate IS_RANGE_PREDICATE = line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 21:56:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#9). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 9 files changed, 390 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/9 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: (3 comments) Thanks for the answers! http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78 PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral || : constant instanceof TimestampLiteral)) > It is kind of tricky to make it extensible in that sense because the logic I am not aware of other products doing the same :-). However, it seems a cast in the exact numeric (i.e. from smallint to int), or char to varchar, the equivalence can be transferred. http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. > So the join scenario is handled differently. The planner has the notion of Thanks for the background info. Appreciate it. The Trafodion has similar concept (called it VEG: value equivalent group), which is a set for all values connecting by equal relationship. The new logic applies to a local join which is nice. On paper, I would guess the range can be applied to other side of an equi-join as well, similar to the min/max optimization for HJ. http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@410 PS8, Line 410: select * from functional_parquet.alltypes_date_partition : where date_col = cast(timestamp_col as date) : and timestamp_col between '2009-12-01' and '2010-12-01'; > Actually were were never supporting constant propagation where the source i OK. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 18:33:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78 PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral || : constant instanceof TimestampLiteral)) > For extensibility, maybe we could add a new argument to the method to indic It is kind of tricky to make it extensible in that sense because the logic for the range predicate propagation is closely tied with date and timestamp types. It is intentional to keep it limited to these types since semantically it may not be correct for other types. Let me know if you are aware of other products doing the propagation for any other data type. http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. > I wonder if the propagation here happens only for the predicates appear in So the join scenario is handled differently. The planner has the notion of a ValueTransferGraph (see [1]) which is a strongly connected graph built using equijoin predicates. This allows transitive closure of the predicates and constant predicates are transitively applied to the other side of the join. In your example, yeah this will create a new predicate s.b = 2. The constant predicate propagation comes into the picture only when we build the Scan operator (after the above phase is done), so at this point only single table predicates are passed into the Expr.optimizeConjuncts(List conjuncts, Analyzer analyzer) method. [1] https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L2358 http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@410 PS8, Line 410: select * from functional_parquet.alltypes_date_partition : where date_col = cast(timestamp_col as date) : and timestamp_col between '2009-12-01' and '2010-12-01'; > Can we add one more test here? Actually were were never supporting constant propagation where the source is a disjunctive predicate. Even for equality conditions, this is the plan: explain select * from functional_parquet.alltypes_date_partition where int_col = cast(bigint_col as int) and (bigint_col = 50 or bigint_col = 100) | | | 00:SCAN HDFS [functional_parquet.alltypes_date_partition] |HDFS partitions=730/730 files=730 size=2.22MB |predicates: int_col = CAST(bigint_col AS INT), bigint_col IN (50, 100) |row-size=65B cardinality=326 Currently, we only allow BinaryPredicate to be the one that triggers the propagation, not CompoundPredicate. I suppose for simple CompoundPredicates such the one above where both children are BinaryPredicate, one could propagate but in the general case it would be complicated. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 17:34:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: (4 comments) Looks good to me! http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78 PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral || : constant instanceof TimestampLiteral)) For extensibility, maybe we could add a new argument to the method to indicate the type(s) of predicate to handle, rather hard-code the date and timestamp here. The names of the data members in this class probably will be renamed as well. http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93 PS8, Line 93: Propagate equality constant predicates to other conjuncts. Propagate :* range constant predicates to conjuncts involving date and timestamp :* columns. I wonder if the propagation here happens only for the predicates appear in one place (say in the where clause in your example). propagation is also possible cross inner joins. select count(*) from t, s where t.a = s.b and t.a = 2; http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/Expr.java@1200 PS8, Line 1200: conjuncts, candidates May be add an example here. Specifying what are contained within these data structures are useful. http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@410 PS8, Line 410: select * from functional_parquet.alltypes_date_partition : where date_col = cast(timestamp_col as date) : and timestamp_col between '2009-12-01' and '2010-12-01'; Can we add one more test here? select * from functional_parquet.alltypes_date_partition where date_col = cast(timestamp_col as date) and timestamp_col <= '2009-12-01' or timestamp_col > '2010-12-01'; At least on paper, the propagation should take place :-). -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 14:27:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: (5 comments) http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG@7 PS7, Line 7: 10064 > Oops ! Changed it to 10064. Done http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@45 PS7, Line 45: ) { > Right..not needed. I had a version earlier using the arguments then forgot Done http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@65 PS7, Line 65: BinaryPredicate.IS_RANGE_PREDICATE.apply(bp); : if (!isRangeOp && !(bp.getOp() == BinaryPredicate.Operator.EQ)) continue; : SlotRef slotRef = bp.getBoundSlot(); : if (slotRef == null || !bp.getChild(1).isConsta > Makes sense to create a static api for this. I followed the com.google.com I was thinking just a simple method like BinaryPredicate::isEquivalence() but this works too. Thanks for the cleanup! http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@164 PS7, Line 164: & > Done Done http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@180 PS7, Line 180: : : > Done Done -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 07:25:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: (7 comments) http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG@7 PS7, Line 7: 10064 > Wrong JIRA? Oops ! Changed it to 10064. http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@45 PS7, Line 45: ) { > Are these constructor arguments required? Since you keep some state in this Right..not needed. I had a version earlier using the arguments then forgot to remove them. I have removed them. The second part of your comment wasn't quite clear. I did not want to maintain more state than necessary in the class. Also note below that the first method's bitset is the input candidate bitset while the second method's bitset is the output 'changed' bitset. http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@65 PS7, Line 65: BinaryPredicate.IS_RANGE_PREDICATE.apply(bp); : if (!isRangeOp && !(bp.getOp() == BinaryPredicate.Operator.EQ)) continue; : SlotRef slotRef = bp.getBoundSlot(); : if (slotRef == null || !bp.getChild(1).isConsta > *Suggestion* If you want to make this check a static public method in Bina Makes sense to create a static api for this. I followed the com.google.common.base.Predicate.apply() convention that is used in the Expr class but since the expr is already downcast to BinaryPredicate, I created the the api in the BinaryPredicate. Also modified HdfsScanNode accordingly. http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@164 PS7, Line 164: & > This can be static if you choose. Done http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@180 PS7, Line 180: : : > Not sure that this is a very useful exception. You could do: Done http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/datasets/functional/schema_constraints.csv File testdata/datasets/functional/schema_constraints.csv: http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/datasets/functional/schema_constraints.csv@22 PS7, Line 22: table_format:parquet > Usually these small test table are text types, any particular reason for ma Hmm..when I restricted it to text/none, the load script would do 2 loads..one through Impala and again through a Hive command . The latter generated this: 'Beginning execution of hive SQL' message which would fail due to something related to static vs dynamic partitioning. Is there a way to exclude Hive from the load process ? With parquet format I did not encounter this. http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461 PS7, Line 461: timestamp_col <= '2010-12-01'; > There is a NormalizeBinaryPredicate rule that should rewrite "const http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 01:37:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7025/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 01:32:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95 PS8, Line 95: public static final com.google.common.base.Predicate IS_RANGE_PREDICATE = line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 01:12:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#8). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 9 files changed, 371 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/8 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong