Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 )
Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
......................................................................

Patch Set 13:

(4 comments)

Thanks, that looks simpler and clearer now. Just some minor things.

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302
PS13, Line 302: for (const TScanRangeLocationList& range : entry.second.concrete_range) {
              :   expanded_locations.push_back(range);
              : }
That could be just expanded_locations.insert(concrete_range.begin(), concrete_range.end())?

http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108
PS13, Line 108: concrete_range
Use the plural since the field is a list: concrete_ranges, split_specs.

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224
PS13, Line 224: scanRangeSpecs_.getSplit_specSize()
This seems wrong: a single spec may result in multiple hosts, no? Though I guess for HBase this won't be set, so in practice it doesn't matter. But should we instead just assert that the split_spec list size is 0?

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126
PS13, Line 1126: scanRangeSpecs_.getSplit_specSize();
Shouldn't that do some calculation based on file size, block size, and whether the file is splittable? I.e., after your change we'll get a different number for numRemoteRanges when running on S3, right?

-- 
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 13
Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 18 May 2018 23:09:26 +0000
Gerrit-HasComments: Yes
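
Note on the scheduler.cc comment at line 302: a minimal sketch of the range-insert idiom it suggests, assuming expanded_locations is a std::vector (the push_back in the quoted loop suggests a sequence container, but the declaration is not shown here). For sequence containers the range overload of insert also takes a position, so appending would use end(). ScanRangeLocationList and AppendRanges are hypothetical stand-ins for illustration, not names from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the Thrift-generated TScanRangeLocationList.
struct ScanRangeLocationList {
  std::string path;
};

// Appends every element of `src` to the end of `dst` with a single range
// insert, replacing a loop that calls push_back once per element.
void AppendRanges(const std::vector<ScanRangeLocationList>& src,
                  std::vector<ScanRangeLocationList>* dst) {
  // The first argument is the position to insert before; end() means append.
  dst->insert(dst->end(), src.begin(), src.end());
}
```

The single call lets the vector grow once for the whole range rather than possibly reallocating several times during the loop.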