[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..

IMPALA-8647: fix round-to-zero in planner estimates

Zero estimates in query plans can be quite dangerous
(for causing bad plans) because they propagate up the
plan tree, causing many more estimates to be zero.

E.g. if the estimate for the right side of the join
becomes 0, but the left side of the join has 1 trillion
rows, the planner will think that the join produces
zero rows and make further decisions accordingly.
This could be catastrophic if the right side of the
join produces even a single row.

This adds some helper functions that avoid rounding
to zero when multiplying selectivities and uses them
in places where selectivities were computed. The scan
nodes already bumped up the estimates anyway, so this
doesn't change cardinality estimates or plans, but
SelectNode and AggregationNode will not round to zero
after this change.

I looked at using Math.ceil() instead of Math.round(),
but it changed estimates slightly in a lot of plans,
which didn't seem worth the hassle.

Testing:
Added targeted cardinality tests for SelectNode and
AggregationNode.

Updated and sanity checked planner tests where estimates
changed.

Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Reviewed-on: http://gerrit.cloudera.org:8080/14289
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
12 files changed, 642 insertions(+), 637 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 22:22:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4652/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 18:26:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:58:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:58:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5008/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:58:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 1:

Csaba correctly pointed out that I was wrong on those three cases and that the 
lines were actually no-ops :)


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:57:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Tim Armstrong (Code Review)
Hello Fang-Yu Rao, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14289

to look at the new patch set (#3).

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..

IMPALA-8647: fix round-to-zero in planner estimates

Zero estimates in query plans can be quite dangerous
(for causing bad plans) because they propagate up the
plan tree, causing many more estimates to be zero.

E.g. if the estimate for the right side of the join
becomes 0, but the left side of the join has 1 trillion
rows, the planner will think that the join produces
zero rows and make further decisions accordingly.
This could be catastrophic if the right side of the
join produces even a single row.

This adds some helper functions that avoid rounding
to zero when multiplying selectivities and uses them
in places where selectivities were computed. The scan
nodes already bumped up the estimates anyway, so this
doesn't change cardinality estimates or plans, but
SelectNode and AggregationNode will not round to zero
after this change.

I looked at using Math.ceil() instead of Math.round(),
but it changed estimates slightly in a lot of plans,
which didn't seem worth the hassle.

Testing:
Added targeted cardinality tests for SelectNode and
AggregationNode.

Updated and sanity checked planner tests where estimates
changed.

Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
12 files changed, 642 insertions(+), 637 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14289/3
--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5007/


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:44:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:34:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5007/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:34:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 1:

(3 comments)

The things you pointed out were legit, but I wanted to be conservative about 
making additional estimate changes outside of the scenario that I described.

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@352
PS1, Line 352:   // IMPALA-2165: Avoid setting the cardinality to 0 after 
rounding.
 :   cardinality_ = Math.max(1, cardinality_);
> This seems redundant.
It isn't exactly - if the cardinality was actually zero, it will increment it 
to 1. I didn't want to potentially mess with this.

I wanted to be a bit conservative here and avoid changing the estimate in 
another case unrelated to the problem I was trying to fix.


http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1148
PS1, Line 1148:   // IMPALA-2165: Avoid setting the cardinality to 0 after 
rounding.
  :   cardinality_ = Math.max(cardinality_, 1);
> This is redundant now.
It isn't exactly - if the cardinality was actually zero, it will increment it 
to 1. I didn't want to potentially mess with this.

I wanted to be a bit conservative here and avoid changing the estimate in 
another case unrelated to the problem I was trying to fix.


http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@267
PS1, Line 267: cardinality_ = Math.min(Math.max(1, cardinality_), 
kuduTable_.getNumRows());
> If applyConjunctsSelectivity() cannot increase cardinality, then this seems
Same as others - didn't want to change the estimate in the 0 cardinality case.



--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Sep 2019 17:34:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 1: Code-Review+2

(3 comments)

lgtm, just some comments about possible cleanups

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@352
PS1, Line 352:   // IMPALA-2165: Avoid setting the cardinality to 0 after 
rounding.
 :   cardinality_ = Math.max(1, cardinality_);
This seems redundant.


http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1148
PS1, Line 1148:   // IMPALA-2165: Avoid setting the cardinality to 0 after 
rounding.
  :   cardinality_ = Math.max(cardinality_, 1);
This is redundant now.


http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14289/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@267
PS1, Line 267: cardinality_ = Math.min(Math.max(1, cardinality_), 
kuduTable_.getNumRows());
If applyConjunctsSelectivity() cannot increase cardinality, then this seems 
redundant.



--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Sep 2019 14:49:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14289 )

Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4626/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 24 Sep 2019 14:27:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8647: fix round-to-zero in planner estimates

2019-09-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14289


Change subject: IMPALA-8647: fix round-to-zero in planner estimates
..

IMPALA-8647: fix round-to-zero in planner estimates

Zero estimates in query plans can be quite dangerous
(for causing bad plans) because they propagate up the
plan tree, causing many more estimates to be zero.

E.g. if the estimate for the right side of the join
becomes 0, but the left side of the join has 1 trillion
rows, the planner will think that the join produces
zero rows and make further decisions accordingly.
This could be catastrophic if the right side of the
join produces even a single row.

This adds some helper functions that avoid rounding
to zero when multiplying selectivities and uses them
in places where selectivities were computed. The scan
nodes already bumped up the estimates anyway, so this
doesn't change cardinality estimates or plans, but
SelectNode and AggregationNode will not round to zero
after this change.

I looked at using Math.ceil() instead of Math.round(),
but it changed estimates slightly in a lot of plans,
which didn't seem worth the hassle.

Testing:
Added targeted cardinality tests for SelectNode and
AggregationNode.

Updated and sanity checked planner tests where estimates
changed.

Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
12 files changed, 642 insertions(+), 632 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/14289/1
--
To view, visit http://gerrit.cloudera.org:8080/14289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I148e9f1aede0a1e99b875b0e6af534f4bccf49b7
Gerrit-Change-Number: 14289
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong