Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16242 )

Change subject: IMPALA-9979: part 2: partitioned top-n
......................................................................


Patch Set 33:

(3 comments)

Just one other comment after which I am ready to +2.

http://gerrit.cloudera.org:8080/#/c/16242/30/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/16242/30/be/src/exec/topn-node.cc@559
PS30, Line 559:   // We evict heaps starting with the heaps that were least 
effective at filtering
> Maybe I didn't quite get the point, but we should be able to start filterin
I think I was partly thinking of the computation within the AnalyticEval node 
and conflated it with the Analytic Sort.  I understand the policy now..it 
effectively amounts to keeping the more 'popular' partitions in memory and 
evicting the less popular ones. i.e if partition limit =10 and at steady state 
if there are 2 partitions : one which has seen 100 tuples so far and another 
which has seen 20 tuples, you would choose to evict the second one.


http://gerrit.cloudera.org:8080/#/c/16242/30/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
File fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java:

http://gerrit.cloudera.org:8080/#/c/16242/30/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java@461
PS30, Line 461:           } else {
> I modified this to check for whether they are all literals and added a test
Nice.


http://gerrit.cloudera.org:8080/#/c/16242/33/testdata/workloads/functional-planner/queries/PlannerTest/analytic-rank-pushdown.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/analytic-rank-pushdown.test:

http://gerrit.cloudera.org:8080/#/c/16242/33/testdata/workloads/functional-planner/queries/PlannerTest/analytic-rank-pushdown.test@22
PS33, Line 22: limit predicate
When browsing through the plans one more time, it occurred to me that some 
users may find it confusing that the terms 'predicate'  and 'row_numberr()'  
appear here because row_number() has not been computed yet in the bottom-up 
plan.  Since this is the 'source expression',  one option may be to display 
'source expression: row_number() ...'   Curious if you have any thought about 
this.  Also ok to ignore for this patch .. we can revisit this based on user 
feedback.



--
To view, visit http://gerrit.cloudera.org:8080/16242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 09 Feb 2021 07:39:17 +0000
Gerrit-HasComments: Yes

Reply via email to