Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16942 )

Change subject: IMPALA-10296: Fix analytic limit pushdown when predicates are 
present
......................................................................


Patch Set 11:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@9
PS11, Line 9: when
> nit: remove
Done


http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@22
PS11, Line 22: limit
> did you mean partition limit >= order by limit ?
Done


http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@32
PS11, Line 32: was
> nit: 'is'
Done


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java:

http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@507
PS11, Line 507: upper top-n.
> Since we do a distributed top-n, there are 3 top-n's in the plan and the 'u
Good point - used final/analytic instead of upper/lower.


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@511
PS11, Line 511: doesn't matter
> nit: worth clarifying that it doesn't matter for the purpose of the pushdow
Done


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@530
PS11, Line 530: include all of the rows in the final
> This does not literally mean all rows in the final partition right ? Should
Yup that's true.


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@531
PS11, Line 531: was
> nit: 'we'
Done


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@538
PS11, Line 538:    * TODO: this should also return the amount that needs to be 
added to the limit
I already did this, removed TODO


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@603
PS11, Line 603:     if (analyticLimit < limit) return falseStatus;
> One special case where this could work is if each partition had a maximum o
I think the partitioned top-n node would let us correctly apply the limit at 
runtime, if we wanted to further optimize this (i.e. I think at any point 
during the partitioned top-n, you could discard any in-memory partitions that 
didn't include the first N rows)


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/SortNode.java
File fe/src/main/java/org/apache/impala/planner/SortNode.java:

http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/SortNode.java@83
PS11, Line 83: row.s
> nit: 'rows'
Done


http://gerrit.cloudera.org:8080/#/c/16942/11/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test:

http://gerrit.cloudera.org:8080/#/c/16942/11/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test@1139
PS11, Line 1139: # rank() predicate is not pushed down because TOPN_BYTES_LIMIT 
prevents conversion
> The plan shows the lower top-n which indicates the rank was pushed down. Pe
Thanks for catching this. The problem was that the table only had 8 rows, so 
the estimate for the top n was 18B * 8 = 144B, under the threshold. I switched 
to a different table with more rows and it now shows the expected behavior



--
To view, visit http://gerrit.cloudera.org:8080/16942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I801d7799b0d649c73d2dd1703729a9b58a662509
Gerrit-Change-Number: 16942
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Jan 2021 19:40:05 +0000
Gerrit-HasComments: Yes

Reply via email to