Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17821 )

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
......................................................................


Patch Set 11:

(8 comments)

Looks pretty good.

http://gerrit.cloudera.org:8080/#/c/17821/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17821/11//COMMIT_MSG@10
PS11, Line 10: In some cases
nit. When both conditions below are true,


http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.h
File be/src/exec/aggregation-node-base.h:

http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.h@71
PS11, Line 71:   bool fast_limit_check_;
nit. May init to false here.


http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.cc
File be/src/exec/aggregation-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.cc@93
PS11, Line 93: if (num_aggs == 1 && limit() != -1 && 
pnode.aggs_[0]->GetNumGroupingExprs() > 0
             :       && pnode.aggs_[0]->aggregate_functions_.empty()
             :       && pnode.aggs_[0]->conjuncts_.empty()) {
             :     fast_limit_check_ = true;
             :     GroupingAggregator* gpAgg = 
dynamic_cast<GroupingAggregator*>(aggs_[0].get());
             :     gpAgg->UnsetLimit();
I also wonder if fast_limit_check_ can be moved to the plan node for 
Aggregation and set by FE.

The same comment goes to limit. That is, set the limit in TAggregator in FE.

In this way, the solution is cleaner.


http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node.cc
File be/src/exec/aggregation-node.cc:

http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node.cc@75
PS11, Line 75: LOG_QUERY << "the number of rows returned by agg node has 
exceeded"
             :               " the limit.";
nit. May make the message more specific by plugging in the numbers. For 
example, the number of rows (3) returned from the aggregation node has exceeded 
the limit of 2.


http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/streaming-aggregation-node.cc
File be/src/exec/streaming-aggregation-node.cc:

http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/streaming-aggregation-node.cc@129
PS11, Line 129:         if (aggs_[0]->GetNumKeys() >= limit()) {
nit. We probably should DCHECK(limit() > -1) here before the IF.


http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/streaming-aggregation-node.cc@133
PS11, Line 133: VLOG_QUERY << "the number of rows returned by streaming agg 
node has exceeded"
              :               " the limit.";
nit. same as the other one to make the message more specific.


http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test:

http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test@2934
PS11, Line 2934: limit: 2
nit. I wonder if this correct. On paper, we need to look at all distinct values 
(via grouping) to make a decision.


http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/targeted-perf/queries/aggregation.test
File testdata/workloads/targeted-perf/queries/aggregation.test:

http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/targeted-perf/queries/aggregation.test@2731
PS11, Line 2731: count(*)
This violates condition 1?



--
To view, visit http://gerrit.cloudera.org:8080/17821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Gerrit-Change-Number: 17821
Gerrit-PatchSet: 11
Gerrit-Owner: liuyao <liu...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Sep 2021 15:08:25 +0000
Gerrit-HasComments: Yes

Reply via email to