liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17821 )
Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations ...................................................................... Patch Set 12: (8 comments) http://gerrit.cloudera.org:8080/#/c/17821/11//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17821/11//COMMIT_MSG@10 PS11, Line 10: When both con > nit. When both conditions below are true, Done http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.h File be/src/exec/aggregation-node-base.h: http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.h@71 PS11, Line 71: bool fast_limit_check_ = false; > nit. May init to false here. Done http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.cc File be/src/exec/aggregation-node-base.cc: http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node-base.cc@93 PS11, Line 93: : Status AggregationNodeBase::Prepare(RuntimeState* state) { : SCOPED_TIMER(runtime_profile_->total_time_counter()); : RETURN_IF_ERROR(ExecNode::Prepare(state)); : for (auto& agg : aggs_) { : agg->SetDebugOptions > I also wonder if fast_limit_check_ can be moved to the plan node for Aggreg Done Only GroupingAggregator has limit, maybe limit should not be added to TAggregator http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node.cc File be/src/exec/aggregation-node.cc: http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/aggregation-node.cc@75 PS11, Line 75: os = true; : VLOG_QUERY << "the > nit. May make the message more specific by plugging in the numbers. For exa Done http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/streaming-aggregation-node.cc File be/src/exec/streaming-aggregation-node.cc: http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/streaming-aggregation-node.cc@129 PS11, Line 129: DCHECK(limit() > -1); > nit. We probably should DCHECK(limit() > -1) here before the IF. Done http://gerrit.cloudera.org:8080/#/c/17821/11/be/src/exec/streaming-aggregation-node.cc@133 PS11, Line 133: child_batch_->Reset(); : VLOG_QUERY << "the > nit. same as the other one to make the message more specific. Done http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test File testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test: http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test@2934 PS11, Line 2934: limit: 2 > nit. I wonder if this correct. On paper, we need to look at all distinct va This is an SQL optimization. Look at SubqueryRewriter# mergeExpr. This is a scalar subquery, the number of rows returned by this subquery cannot be greater than 1, so set limit = 2, and then check cardinality of result of subquery http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/targeted-perf/queries/aggregation.test File testdata/workloads/targeted-perf/queries/aggregation.test: http://gerrit.cloudera.org:8080/#/c/17821/11/testdata/workloads/targeted-perf/queries/aggregation.test@2731 PS11, Line 2731: > This violates condition 1? Done -- To view, visit http://gerrit.cloudera.org:8080/17821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995 Gerrit-Change-Number: 17821 Gerrit-PatchSet: 12 Gerrit-Owner: liuyao <liu...@sensorsdata.cn> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: liuyao <liu...@sensorsdata.cn> Gerrit-Comment-Date: Fri, 10 Sep 2021 09:39:12 +0000 Gerrit-HasComments: Yes