Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20434 )

Change subject: IMPALA-12408: Optimize HdfsScanNode.computeScanRangeLocations()
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20434/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20434/5//COMMIT_MSG@21
PS5, Line 21: in Impala
nit: during Impala planning,


http://gerrit.cloudera.org:8080/#/c/20434/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20434/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@333
PS5, Line 333: Map<Long, List<FileDescriptor>> sampledFiles_ = null;
Please document what the Long key in this map represents. It looks like a 
partition ID?
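
Something along these lines would be enough. This is only a sketch: it assumes 
the key really is the partition ID, and FileDescriptorStub is a hypothetical 
stand-in for the real FileDescriptor class so that the snippet compiles on its 
own.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Impala's FileDescriptor, only so the sketch compiles.
class FileDescriptorStub {
  final String name;
  FileDescriptorStub(String name) { this.name = name; }
}

class SampledFilesDocSketch {
  /**
   * Sampled files to scan, grouped by partition.
   * Key: partition ID (assumption, to be confirmed by the author).
   * Value: the file descriptors sampled from that partition.
   */
  Map<Long, List<FileDescriptorStub>> sampledFiles_ = null;
}
```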


http://gerrit.cloudera.org:8080/#/c/20434/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1150
PS5, Line 1150: for (FeFsPartition partition: partitions_) {
General question: is it worthwhile, or even possible, to parallelize this loop? 
Maybe using Java's parallel streams?
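
A rough sketch of what I have in mind. PartitionStub and countRanges() are 
hypothetical stand-ins, not the actual HdfsScanNode code, and this is only safe 
if the per-partition work does not mutate shared state without synchronization, 
which I have not checked for this loop.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for FeFsPartition; not the real Impala interface.
class PartitionStub {
  final long id;
  final List<String> files;
  PartitionStub(long id, List<String> files) {
    this.id = id;
    this.files = files;
  }
}

class ParallelLoopSketch {
  // Stand-in for the per-partition work done in the loop body.
  // It must be free of side effects on shared state for this to be safe.
  static int countRanges(PartitionStub partition) {
    return partition.files.size();
  }

  // Processes partitions in parallel. Collecting to a list keeps the
  // encounter order of the source list, so results line up with the input.
  static List<Integer> rangesPerPartition(List<PartitionStub> partitions) {
    return partitions.parallelStream()
        .map(ParallelLoopSketch::countRanges)
        .collect(Collectors.toList());
  }
}
```

Whether this pays off would depend on how expensive the per-partition work is 
compared to the overhead of splitting and merging, so it would need measuring.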


http://gerrit.cloudera.org:8080/#/c/20434/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1153
PS5, Line 1153: String partitionLocation = partition.getLocation();
              :       Path partitionPath = new Path(partitionLocation);
> question: is it make sense to pass down the partitionPath instead of the ra
OK, so the consistent hash code from Java's String.hashCode() turns out to be 
important. It may be a good idea to point that out in a comment.
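
For reference, the String.hashCode() Javadoc specifies the value as 
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], so it does not vary between JVM 
runs. A small sketch that recomputes it (the class and method names are made up 
for illustration):

```java
class StringHashSketch {
  // Recomputes the hash documented for String.hashCode():
  // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], in wrapping int arithmetic.
  static int documentedHash(String s) {
    int h = 0;
    for (int i = 0; i < s.length(); i++) {
      h = 31 * h + s.charAt(i);
    }
    return h;
  }
}
```

Any partition location string should give the same value from 
documentedHash() as from hashCode(), which is the property the comment could 
mention.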



--
To view, visit http://gerrit.cloudera.org:8080/20434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf3e9c169d65c15df6a6762cc68fbb477fe64a7c
Gerrit-Change-Number: 20434
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Aug 2023 17:02:26 +0000
Gerrit-HasComments: Yes
