Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16723 )

Change subject: IMPALA-10314: Optimize planning time for simple limits
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java:

http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@209
PS3, Line 209: for (FeFsPartition p : partitions) {
             :         numRows += p.getNumFileDescriptors();
             :         prunedPartitions.add(p);
             :         if (numRows >= analyzer.getSimpleLimitStatus().second) {
             :           // here we only limit the partitions; later in 
HdfsScanNode we will
             :           // limit the file descriptors within a partition
             :           break;
             :         }
> Yes, these are the pruned partitions that will be sent over to the runtime.
I think on balance it'd be better to make the partition selection 
deterministic, just cause it'll be easier to debug, support, etc.

Agree that it might be a good optimization for workloads that were mostly 
comprised of queries like this, but we're not really anticipating such 
workloads.


http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@845
PS3, Line 845:     long numRows = 0; // only used for the simple limit case
maybe simpleLimitNumRows to make it clearer what it is? Ok to ignore.



--
To view, visit http://gerrit.cloudera.org:8080/16723
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
Gerrit-Change-Number: 16723
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Shant Hovsepian <sh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Nov 2020 19:50:42 +0000
Gerrit-HasComments: Yes

Reply via email to