Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16242 )
Change subject: IMPALA-9979: part 2: partitioned top-n ...................................................................... Patch Set 33: (3 comments) Just one other comment after which I am ready to +2. http://gerrit.cloudera.org:8080/#/c/16242/30/be/src/exec/topn-node.cc File be/src/exec/topn-node.cc: http://gerrit.cloudera.org:8080/#/c/16242/30/be/src/exec/topn-node.cc@559 PS30, Line 559: // We evict heaps starting with the heaps that were least effective at filtering > Maybe I didn't quite get the point, but we should be able to start filterin I think I was partly thinking of the computation within the AnalyticEval node and conflated it with the Analytic Sort. I understand the policy now..it effectively amounts to keeping the more 'popular' partitions in memory and evicting the less popular ones. i.e if partition limit =10 and at steady state if there are 2 partitions : one which has seen 100 tuples so far and another which has seen 20 tuples, you would choose to evict the second one. http://gerrit.cloudera.org:8080/#/c/16242/30/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java File fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java: http://gerrit.cloudera.org:8080/#/c/16242/30/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java@461 PS30, Line 461: } else { > I modified this to check for whether they are all literals and added a test Nice. http://gerrit.cloudera.org:8080/#/c/16242/33/testdata/workloads/functional-planner/queries/PlannerTest/analytic-rank-pushdown.test File testdata/workloads/functional-planner/queries/PlannerTest/analytic-rank-pushdown.test: http://gerrit.cloudera.org:8080/#/c/16242/33/testdata/workloads/functional-planner/queries/PlannerTest/analytic-rank-pushdown.test@22 PS33, Line 22: limit predicate When browsing through the plans one more time, it occurred to me that some users may find it confusing that the terms 'predicate' and 'row_numberr()' appear here because row_number() has not been computed yet in the bottom-up plan. Since this is the 'source expression', one option may be to display 'source expression: row_number() ...' Curious if you have any thought about this. Also ok to ignore for this patch .. we can revisit this based on user feedback. -- To view, visit http://gerrit.cloudera.org:8080/16242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5 Gerrit-Change-Number: 16242 Gerrit-PatchSet: 33 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Tue, 09 Feb 2021 07:39:17 +0000 Gerrit-HasComments: Yes