Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16942 )
Change subject: IMPALA-10296: Fix analytic limit pushdown when predicates are present ...................................................................... Patch Set 11: (11 comments) http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@9 PS11, Line 9: when > nit: remove Done http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@22 PS11, Line 22: limit > did you mean partition limit >= order by limit ? Done http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@32 PS11, Line 32: was > nit: 'is' Done http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java: http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@507 PS11, Line 507: upper top-n. > Since we do a distributed top-n, there are 3 top-n's in the plan and the 'u Good point - used final/analytic instead of upper/lower. http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@511 PS11, Line 511: doesn't matter > nit: worth clarifying that it doesn't matter for the purpose of the pushdow Done http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@530 PS11, Line 530: include all of the rows in the final > This does not literally mean all rows in the final partition right ? Should Yup that's true. http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@531 PS11, Line 531: was > nit: 'we' Done http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@538 PS11, Line 538: * TODO: this should also return the amount that needs to be added to the limit I already did this, removed TODO http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@603 PS11, Line 603: if (analyticLimit < limit) return falseStatus; > One special case where this could work is if each partition had a maximum o I think the partitioned top-n node would let us correctly apply the limit at runtime, if we wanted to further optimize this (i.e. I think at any point during the partitioned top-n, you could discard any in-memory partitions that didn't include the first N rows) http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/SortNode.java File fe/src/main/java/org/apache/impala/planner/SortNode.java: http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/SortNode.java@83 PS11, Line 83: row.s > nit: 'rows' Done http://gerrit.cloudera.org:8080/#/c/16942/11/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test File testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test: http://gerrit.cloudera.org:8080/#/c/16942/11/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test@1139 PS11, Line 1139: # rank() predicate is not pushed down because TOPN_BYTES_LIMIT prevents conversion > The plan shows the lower top-n which indicates the rank was pushed down. Pe Thanks for catching this. The problem was that the table only had 8 rows, so the estimate for the top n was 18B * 8 = 144B, under the threshold. I switched to a different table with more rows and it now shows the expected behavior -- To view, visit http://gerrit.cloudera.org:8080/16942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I801d7799b0d649c73d2dd1703729a9b58a662509 Gerrit-Change-Number: 16942 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Tue, 19 Jan 2021 19:40:05 +0000 Gerrit-HasComments: Yes