[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 15: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 15
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 22:57:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Reviewed-on: http://gerrit.cloudera.org:8080/16346
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
11 files changed, 412 insertions(+), 31 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 16
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6392/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 15
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 17:44:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 15
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 17:44:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 14: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 14
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 17:43:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7078/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 14
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 17:25:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 14:

> Patch Set 13:
>
> > Patch Set 12:
> >
> > > Patch Set 12: Verified-1
> > >
> > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/
> >
> > 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation:
> > (this contains new tests added for this CR):  seems related to ordering of 
> > the predicates and stats difference. The test passes on my dev machine.
> >
> > Actual does not match expected result:
> > PLAN-ROOT SINK
> > |
> > 00:SCAN HDFS [functional.alltypes_date_partition]
> >partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE 
> > '2009-02-01'
> >HDFS partitions=32/55 files=32 size=15.99KB
> >predicates: functional.alltypes_date_partition.int_col < 100, 
> > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 
> > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP 
> > '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, 
> > 10), date_col = CAST(timestamp_col AS DATE)
> > ^^^
> >row-size=64B cardinality=26
> >
> > Expected:
> > PLAN-ROOT SINK
> > |
> > 00:SCAN HDFS [functional.alltypes_date_partition]
> >partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE 
> > '2009-02-01'
> >HDFS partitions=32/55 files=32 size=15.99KB
> >predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), 
> > functional.alltypes_date_partition.int_col < 100, 
> > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 
> > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP 
> > '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE)
> >row-size=65B cardinality=13
> >
> > There's also 1 failure in 
> > org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I 
> > didn't quite understand yet.
>
> 1st failure was  due to stats not present. On my local machine I think I had 
> run the compute stats manually.  I have added the alltypes_date_partition to 
> the compute-table-stats.sh script.  Second failure needed a minor update to 
> the FrontendTest.java.

There's some discrepancy in the Jenkins run vs my local one..it seems the table 
'functional.alltypes_datasource' was never loaded on my local machine and I am 
not sure how to load it.  In any case, I have made an update to the 
FrontendTest.java in PatchSet14 that should hopefully work now.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 14
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 17:13:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16346

to look at the new patch set (#14).

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
11 files changed, 412 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/14
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 14
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7077/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 13
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 16:54:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 13:

> Patch Set 12:
>
> > Patch Set 12: Verified-1
> >
> > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/
>
> 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation:
> (this contains new tests added for this CR):  seems related to ordering of 
> the predicates and stats difference. The test passes on my dev machine.
>
> Actual does not match expected result:
> PLAN-ROOT SINK
> |
> 00:SCAN HDFS [functional.alltypes_date_partition]
>partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE 
> '2009-02-01'
>HDFS partitions=32/55 files=32 size=15.99KB
>predicates: functional.alltypes_date_partition.int_col < 100, 
> functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 
> 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP 
> '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, 
> 10), date_col = CAST(timestamp_col AS DATE)
> ^^^
>row-size=64B cardinality=26
>
> Expected:
> PLAN-ROOT SINK
> |
> 00:SCAN HDFS [functional.alltypes_date_partition]
>partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE 
> '2009-02-01'
>HDFS partitions=32/55 files=32 size=15.99KB
>predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), 
> functional.alltypes_date_partition.int_col < 100, 
> functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 
> 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP 
> '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE)
>row-size=65B cardinality=13
>
> There's also 1 failure in 
> org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I didn't 
> quite understand yet.

1st failure was  due to stats not present. On my local machine I think I had 
run the compute stats manually.  I have added the alltypes_date_partition to 
the compute-table-stats.sh script.  Second failure needed a minor update to the 
FrontendTest.java.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 13
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 16:38:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16346

to look at the new patch set (#13).

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
11 files changed, 410 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/13
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 13
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 12:

> Patch Set 12: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/

1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation:
(this contains new tests added for this CR):  seems related to ordering of the 
predicates and stats difference. The test passes on my dev machine.

Actual does not match expected result:
PLAN-ROOT SINK
|
00:SCAN HDFS [functional.alltypes_date_partition]
   partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE 
'2009-02-01'
   HDFS partitions=32/55 files=32 size=15.99KB
   predicates: functional.alltypes_date_partition.int_col < 100, 
functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 
00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP 
'2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, 
10), date_col = CAST(timestamp_col AS DATE)
^^^
   row-size=64B cardinality=26

Expected:
PLAN-ROOT SINK
|
00:SCAN HDFS [functional.alltypes_date_partition]
   partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE 
'2009-02-01'
   HDFS partitions=32/55 files=32 size=15.99KB
   predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), 
functional.alltypes_date_partition.int_col < 100, 
functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 
00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP 
'2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE)
   row-size=65B cardinality=13

There's also 1 failure in 
org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I didn't 
quite understand yet.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 12
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 15:33:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 12: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 12
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 12:34:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 12
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 07:19:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 11:

> Patch Set 10:
>
> Was able to find some time to look for local joins in TPCDS. There are not 
> many at all.
>
> query2.sql:WHERE  d_week_seq1 = d_week_seq2 - 53
> query59.sql:   AND d_week_seq1 = d_week_seq2 - 52
> query59.sql:WHERE  s_store_id1 = s_store_id2
>
> Sounds like they can be used for the min/max filtering at least.

Thanks Qifan.  Looking at query2, the d_week_seq1, d_week_seq2 come from 
different derived tables 'y' and 'z' even though the underlying base table is 
the same. From the planner perspective, they would be treated as a regular join.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 07:16:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7072/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 07:13:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 11:

(4 comments)

> Patch Set 10:
>
> > Patch Set 10:
> >
> > Change looks good to me. I should wait for the e2e test you mentioned right?
>
> Thanks for the review. Yes, I plan to add at least 1 e2e test with the 
> modified dataset. I hope to do it later today after some other ongoing work.

Made these changes.

http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95
PS10, Line 95:   public static final 
com.google.common.base.Predicate
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java@1234
PS10, Line 1234: BitSet changed = propagateConstants(tmpConjuncts, 
candidates, keepConjuncts,
> line too long (95 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419
PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', 
timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', date_col = CAST(timestamp_col 
AS DATE)
> Made the code change to preserve the original conjunct  date_col = cast(tim
I changed the alltypes_date_partition table to have  rows with odd values of 
'id' to have matching timestamp_col and date_col values and even values of 'id' 
to have non-matching (I added an INTERVAL to the timestamp). The same e2e test 
from previous patchset is used.


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
File 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4
PS9, Line 4: alltypes_date_part
> I would be ok with running with other data sets but I had some struggles in
Removed the functional_parquet prefix.  Also, I since these tests or the 
planner tests don't need parquet data, I modified the data generation script to 
only generate text version of alltypes_date_partition.  Also reduced the size 
of the table so it does not run into the default maximum partitions limit.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 07:09:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-02 Thread Aman Sinha (Code Review)
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16346

to look at the new patch set (#11).

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
9 files changed, 408 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/11
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-01 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10:

> Patch Set 10:
>
> Change looks good to me. I should wait for the e2e test you mentioned right?

Thanks for the review. Yes, I plan to add at least 1 e2e test with the modified 
dataset. I hope to do it later today after some other ongoing work.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 22:48:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10:

Change looks good to me. I should wait for the e2e test you mentioned right?


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 22:26:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-01 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10:

Was able to find some time to look for local joins in TPCDS. There are not many 
at all.

query2.sql:WHERE  d_week_seq1 = d_week_seq2 - 53
query59.sql:   AND d_week_seq1 = d_week_seq2 - 52
query59.sql:WHERE  s_store_id1 = s_store_id2

Sounds like they can be used for the min/max filtering at least.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 15:08:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7055/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 07:08:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-09-01 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@55
PS9, Line 55:* predicates. The candidates BitSet is used to determine which 
members of
> Mention how 'candidates' is used?
Done


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@61
PS9, Line 61: co
> nit: these parens prob aren't needed, right?
Done


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@66
PS9, Line 66:  = BinaryPredicate.IS_RANGE_PREDICATE.apply(b
> can't this be !=?
Done


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@128
PS9, Line 128: opagation
> Can't this be Map.Entry to keep it type-safe and avoid cast?
Done


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@132
PS9, Line 132: dtExpr = d
> Map.Entry>?
Done


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419
PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', 
timestamp_col >= TIMESTAMP '2009-12-01 00:00:00', date_col = CAST(timestamp_col 
AS DATE)
> Good point.  All the use cases I have seen so far were ones where date_col
Made the code change to preserve the original conjunct  date_col = 
cast(timestamp_col as date).  Updated plans.
TODO:  Add e2e test against a dataset where date values != timestamp col's date 
component.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 06:59:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Aman Sinha (Code Review)
Aman Sinha has removed a vote on this change.

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Removed Code-Review+1 by Aman Sinha 
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10: Code-Review+1

Carry Shant's +1


-- 
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 06:48:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95
PS10, Line 95:   public static final 
com.google.common.base.Predicate IS_RANGE_PREDICATE =
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java@1234
PS10, Line 1234: BitSet changed = propagateConstants(tmpConjuncts, 
candidates, keepConjuncts, analyzer);
line too long (95 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 06:48:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Aman Sinha (Code Review)
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16346

to look at the new patch set (#10).

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
9 files changed, 401 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/10
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 10
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419
PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', 
timestamp_col >= TIMESTAMP '2009-12-01 00:00:00'
> Don't we still need to keep the date_col = cast(timestamp_col as date) pred
Good point.  All the use cases I have seen so far were ones where date_col was 
derived from the timestamp column.  Yeah, for your example, we need to keep the 
cast predicate if the constant is a range predicate.  I  think the code change 
isn't much but I need to think about how to create a test data set for this.


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
File 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4
PS9, Line 4: functional_parquet
> We generally don't include database names in the test files, since the infr
I would be ok with running with other data sets but I had some struggles in 
loading the alltypes_date_partition table and had offline discussion with 
Shant.  For Text format loading, the following error occurred since it went 
through HIve load process rather than Impala:

The load-functional-planner-core-hive-generated-text-none-none.sql.log had the 
following error:
   "Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Dynamic 
partition strict mode requires at least one static partition column. To turn 
this off set hive.exec.dynamic.partition.mode=nonstrict"

Setting the partition.mode to nonstrict got past that but ran into a default 
limit of the # dynamic partitions:
"The maximum number of dynamic partitions is controlled by 
hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. 
Maximum was set to 100 partitions per node, number of dynamic partitions on 
this node: 101"

I could bump this up too .. but the Tez job does take much longer to 
execute..so I wasn't sure if it is worthwhile.

I could move this to TestQueriesParquetTables unless you have other suggestions.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 02:20:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@55
PS9, Line 55:* predicates.
Mention how 'candidates' is used?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@61
PS9, Line 61: (E
nit: these parens prob aren't needed, right?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@66
PS9, Line 66:  !(bp.getOp() == BinaryPredicate.Operator.EQ)
can't this be !=?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@128
PS9, Line 128: Map.Entry
Can't this be Map.Entry to keep it type-safe and avoid cast?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@132
PS9, Line 132: Map.Entry
Map.Entry>?


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419
PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', 
timestamp_col >= TIMESTAMP '2009-12-01 00:00:00'
Don't we still need to keep the date_col = cast(timestamp_col as date) 
predicate for this to be correct in cases where this isn't guaranteed to be 
true in the underlying data set?

E.g. one counter-example would be

  date_col timestamp_col
  2009-12-01   2009-12-2 00:00:00

I.e. I think we need to keep the equality predicate around.


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
File 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4
PS9, Line 4: functional_parquet
We generally don't include database names in the test files, since the infra 
should switch to the appropriate functional database.

We can move it to TestQueriesParquetTables if we only want it to run on the 
parquet data set (not kudu, etc).



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 01:17:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9: Code-Review+1

(1 comment)

LGTM nice little fix.

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461
PS7, Line 461: timestamp_col <= '2010-12-01';
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 21:39:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
> The runtime filter is the  bloom filter created after build side of the Has
Thanks. That matches the same practice in my previous job: min/max/bloom 
filters work for HJ only.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 29 Aug 2020 02:31:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
> OK. I see.
The runtime filter is the  bloom filter created after build side of the 
HashJoin is completed..so it is just based on the build values rather than the 
original predicate.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 29 Aug 2020 02:08:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78
PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral ||
: constant instanceof TimestampLiteral))
> Yes, even without the CAST, in the simple cases e.g  a1 = b1 AND b1 > 10, w
Yes, I agree only the transitivity on equal predicate makes sense.

We can hold the change until such a use case is encountered. I was thinking 
about local join as the use case.


http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
> I tried an example with a join but given the sequence in which the optimiza
OK. I see.

I was under the impression that in Impala run-time filtering is applied only 
for equi-HJ. For MJ or NJ, any range predicate propagation will be from the 
join transitivity.

Maybe I missed something here.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 29 Aug 2020 01:41:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7040/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 22:17:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461
PS7, Line 461: timestamp_col <= '2010-12-01';
> Added a test for the swapped order.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 22:17:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78
PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral ||
: constant instanceof TimestampLiteral))
> I am not aware of other products doing the same :-).
Yes, even without the CAST, in the simple cases e.g  a1 = b1 AND b1 > 10, where 
these are numeric columns it should be fine. But won't be ok for something like 
 'a1 = 1/b1  AND b1 > 10'.   So we would need to do additional checks for the 
expression.
I haven't seen a good use case for the range propagation for other types 
besides date, timestamp.


http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
> Thanks for the background info. Appreciate it. The Trafodion has similar co
I tried an example with a join but given the sequence in which the 
optimizations are done (join transitivity is done earlier and then constant 
propagation), it doesn't do the propagation on the other side of the join.  See 
below. I think it is not straightforward.  But note that due to Impala's 
runtime filter propagation, the runtime filter eventually gets applied on the 
other side of the join.

 explain select t1.* from functional_parquet.alltypes_date_partition t1, 
functional_parquet.alltypes_date_partition t2  where t1.date_col = t2.date_col 
and t2.date_col = cast(t2.timestamp_col as date) and t2.timestamp_col <= 
'2010-10-01';

++
| Explain String
 |
++
| Max Per-Host Resource Reservation: Memory=3.02MB Threads=5
 |
| Per-Host Resource Estimates: Memory=152MB 
 |
| WARNING: The following tables are missing relevant table and/or column 
statistics. |
| functional_parquet.alltypes_date_partition
 |
|   
 |
| PLAN-ROOT SINK
 |
| | 
 |
| 04:EXCHANGE [UNPARTITIONED]   
 |
| | 
 |
| 02:HASH JOIN [INNER JOIN, BROADCAST]  
 |
| |  hash predicates: t1.date_col = t2.date_col 
 |
| |  runtime filters: RF000 <- t2.date_col  
 |
| |  row-size=84B cardinality=180.66K   
 |
| | 
 |
| |--03:EXCHANGE [BROADCAST]
 |
| |  |  
 |
| |  01:SCAN HDFS [functional_parquet.alltypes_date_partition t2]   
 |
| | partition predicates: t2.date_col <= DATE '2010-10-01'  
 |
| | HDFS partitions=639/730 files=639 size=1.94MB   
 |
| | predicates: t2.timestamp_col <= TIMESTAMP '2010-10-01 00:00:00' 
 |
| | row-size=20B cardinality=15.81K 
 |
| | 
 |
| 00:SCAN HDFS [functional_parquet.alltypes_date_partition t1]  
 |
|HDFS partitions=730/730 files=730 size=2.22MB  
 |
|runtime filters: RF000 -> t1.date_col  
 |
|row-size=64B cardinality=180.66K   
 |
++


Also, just to be clear if you have range predicate on one side of the join : 
t1.a1 = t2.a2 AND t2.a2 > 10,  this already will be propagated to the other 
side.


http://ge

[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95
PS9, Line 95:   public static final 
com.google.common.base.Predicate IS_RANGE_PREDICATE =
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 21:56:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Aman Sinha (Code Review)
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16346

to look at the new patch set (#9).

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
9 files changed, 390 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/9
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(3 comments)

Thanks for the answers!

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78
PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral ||
: constant instanceof TimestampLiteral))
> It is kind of tricky to make it extensible in that sense because the logic
I am not aware of other products doing the same :-).

However, it seems a cast in the exact numeric (i.e. from smallint to int), or 
char to varchar, the equivalence can be transferred.


http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
> So the join scenario is handled differently.  The planner has the notion of
Thanks for the background info. Appreciate it. The Trafodion has similar 
concept (called it VEG: value equivalent group), which is a set for all values 
connecting by equal relationship.

The new logic applies to a local join which is nice. On paper,  I would guess 
the range can be applied to other side of an equi-join as well, similar to the 
min/max optimization for HJ.


http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@410
PS8, Line 410: select * from functional_parquet.alltypes_date_partition
 : where date_col = cast(timestamp_col as date)
 : and timestamp_col between '2009-12-01' and '2010-12-01';
> Actually were were never supporting constant propagation where the source i
OK.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 18:33:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78
PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral ||
: constant instanceof TimestampLiteral))
> For extensibility, maybe we could add a new argument to the method to indic
It is kind of tricky to make it extensible in that sense because the logic for 
the range predicate propagation is closely tied with date and timestamp types. 
It is intentional to keep it limited to these types since semantically it may 
not be correct for other types. Let me know if you are aware of other products 
doing the propagation for any other data type.


http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
> I wonder if the propagation here happens only for the predicates appear in
So the join scenario is handled differently.  The planner has the notion of a 
ValueTransferGraph (see [1]) which is a strongly connected graph built using 
equijoin predicates.  This allows transitive closure of the predicates and 
constant predicates are transitively applied to the other side of the join.  In 
your example, yeah this will create a new predicate s.b = 2.

The constant predicate propagation comes into the picture only when we build 
the Scan operator (after the above phase is done), so at this point only single 
table predicates are passed into the Expr.optimizeConjuncts(List 
conjuncts, Analyzer analyzer) method.

[1] 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L2358


http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@410
PS8, Line 410: select * from functional_parquet.alltypes_date_partition
 : where date_col = cast(timestamp_col as date)
 : and timestamp_col between '2009-12-01' and '2010-12-01';
> Can we add one more test here?
Actually were were never supporting constant propagation where the source is a 
disjunctive predicate.  Even for equality conditions, this is the plan:

explain select * from functional_parquet.alltypes_date_partition
where int_col = cast(bigint_col as int) and (bigint_col = 50 or bigint_col = 
100)
| | 
| 00:SCAN HDFS [functional_parquet.alltypes_date_partition]
|HDFS partitions=730/730 files=730 size=2.22MB   
|predicates: int_col = CAST(bigint_col AS INT), bigint_col IN (50, 100)
|row-size=65B cardinality=326  

Currently, we only allow BinaryPredicate to be the one that triggers the 
propagation, not CompoundPredicate. I suppose for simple CompoundPredicates 
such the one above where both children are BinaryPredicate, one could propagate 
but in the general case it would be complicated.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 17:34:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(4 comments)

Looks good to me!

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@78
PS8, Line 78: isRangeOp && !(constant instanceof DateLiteral ||
: constant instanceof TimestampLiteral))
For extensibility, maybe we could add a new argument to the method to indicate 
the type(s) of predicate to handle, rather hard-code the date and timestamp 
here.

The names of the data members in this class probably will be renamed as well.


http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@93
PS8, Line 93:  Propagate equality constant predicates to other conjuncts.  
Propagate
:* range constant predicates to conjuncts involving date and 
timestamp
:* columns.
I wonder if the propagation here happens only for the predicates appear in one 
place (say in the where clause in your example).

propagation is also possible cross inner joins.

select count(*) from t, s where
t.a = s.b and t.a = 2;


http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/Expr.java@1200
PS8, Line 1200: conjuncts, candidates
May be add an example here. Specifying what are contained within these data 
structures are useful.


http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@410
PS8, Line 410: select * from functional_parquet.alltypes_date_partition
 : where date_col = cast(timestamp_col as date)
 : and timestamp_col between '2009-12-01' and '2010-12-01';
Can we add one more test here?

select * from functional_parquet.alltypes_date_partition
where date_col = cast(timestamp_col as date)
and timestamp_col <= '2009-12-01' or timestamp_col > '2010-12-01';

At least on paper, the propagation should take place :-).



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 14:27:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG@7
PS7, Line 7: 10064
> Oops ! Changed it to 10064.
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@45
PS7, Line 45: ) {
> Right..not needed.  I had a version earlier using the arguments then forgot
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@65
PS7, Line 65: BinaryPredicate.IS_RANGE_PREDICATE.apply(bp);
:   if (!isRangeOp && !(bp.getOp() == 
BinaryPredicate.Operator.EQ)) continue;
:   SlotRef slotRef = bp.getBoundSlot();
:   if (slotRef == null || !bp.getChild(1).isConsta
> Makes sense to create a static api for this.  I followed the com.google.com
I was thinking just a simple method like BinaryPredicate::isEquivalence() but 
this works too. Thanks for the cleanup!


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@164
PS7, Line 164: &
> Done
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@180
PS7, Line 180:
 :
 :
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 07:25:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-27 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG@7
PS7, Line 7: 10064
> Wrong JIRA?
Oops ! Changed it to 10064.


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@45
PS7, Line 45: ) {
> Are these constructor arguments required? Since you keep some state in this
Right..not needed.  I had a version earlier using the arguments then forgot to 
remove them.  I have removed them.  The second part of your comment wasn't 
quite clear. I did not want to maintain more state than necessary in the class. 
Also note below that the first method's bitset is the input candidate bitset 
while the second method's bitset is the output 'changed' bitset.


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@65
PS7, Line 65: BinaryPredicate.IS_RANGE_PREDICATE.apply(bp);
:   if (!isRangeOp && !(bp.getOp() == 
BinaryPredicate.Operator.EQ)) continue;
:   SlotRef slotRef = bp.getBoundSlot();
:   if (slotRef == null || !bp.getChild(1).isConsta
> *Suggestion* If you want to make  this check a static public method in Bina
Makes sense to create a static api for this.  I followed the 
com.google.common.base.Predicate.apply() convention that is used in the Expr 
class but since the expr is already downcast to BinaryPredicate, I created the 
the api in the BinaryPredicate.  Also modified HdfsScanNode accordingly.


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@164
PS7, Line 164: &
> This can be static if you choose.
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@180
PS7, Line 180:
 : 
 :
> Not sure that this is a very useful exception. You could do:
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/datasets/functional/schema_constraints.csv@22
PS7, Line 22: table_format:parquet
> Usually these small test table are text types, any particular reason for ma
Hmm..when I restricted it to text/none,  the load script would do 2 loads..one 
through Impala and again through a Hive command .  The latter generated  this: 
'Beginning execution of hive SQL' message which would fail due to something 
related to static vs dynamic partitioning.
Is there a way to exclude Hive from the load process ?  With parquet format I 
did not encounter this.


http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461
PS7, Line 461: timestamp_col <= '2010-12-01';
> There is a NormalizeBinaryPredicate rule that should rewrite "const  http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 01:37:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7025/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 01:32:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/16346/8/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95
PS8, Line 95:   public static final 
com.google.common.base.Predicate IS_RANGE_PREDICATE =
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 01:12:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-27 Thread Aman Sinha (Code Review)
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16346

to look at the new patch set (#8).

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
9 files changed, 371 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/8
--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong