Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8523 )
Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls (wip)
......................................................................

IMPALA-5931: Generates scan ranges in planner for s3/adls (wip)

Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example filesystems for which this is done include S3, ADLS, and the local fs.

This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are generated directly from such files by the HdfsScanNode when planning. As a result, less space is used for the catalog and less network bandwidth is needed during its replication. In addition, a bug is avoided where non-splittable files were being split anyway to support the query parameter that places a limit on scan ranges.

The WIP status is there pending tests for s3 and adls as well as to get initial feedback on the approach. The main thing I'm looking for is whether there are thoughts on pushing more of the logic directly into the coordinator.

Testing:
- local filesystem tests exercise this code path
- manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges.
- TODO: s3 and adls testing

Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
---
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
3 files changed, 180 insertions(+), 154 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/1
--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com>
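
For readers skimming the description, the idea of generating scan ranges from a file at plan time can be sketched roughly as follows. This is only an illustration and not the code in this patch: the class `ScanRangeSketch`, the nested `Range` type, and the `generateRanges` method are hypothetical names, `maxRangeLength` stands in for the query parameter that limits scan range size, and treating an empty file as producing no ranges is an assumption. The point it shows is that a non-splittable file always yields a single range, regardless of the limit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; names are illustrative and not Impala's actual API. */
final class ScanRangeSketch {
    /** A byte range [offset, offset + length) within a single file. */
    static final class Range {
        public final long offset;
        public final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "[offset=" + offset + ", length=" + length + "]";
        }
    }

    /**
     * Derives ranges directly from the file length, so no synthetic blocks
     * need to be stored with the file metadata.
     *
     * @param fileLength     size of the file in bytes
     * @param maxRangeLength assumed upper bound on a range; <= 0 means no limit
     * @param splittable     whether the file format allows splitting
     */
    static List<Range> generateRanges(long fileLength, long maxRangeLength,
                                      boolean splittable) {
        if (fileLength < 0) {
            throw new IllegalArgumentException("fileLength must be >= 0");
        }
        List<Range> ranges = new ArrayList<>();
        if (fileLength == 0) {
            return ranges; // assumption: empty files contribute no ranges
        }
        // Non-splittable files are never split, even if a limit is set.
        if (!splittable || maxRangeLength <= 0 || fileLength <= maxRangeLength) {
            ranges.add(new Range(0, fileLength));
            return ranges;
        }
        // Splittable file larger than the limit: emit fixed-size ranges,
        // with a shorter final range covering the remainder.
        for (long offset = 0; offset < fileLength; offset += maxRangeLength) {
            long length = Math.min(maxRangeLength, fileLength - offset);
            ranges.add(new Range(offset, length));
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(generateRanges(250, 100, true));
        System.out.println(generateRanges(250, 100, false));
    }
}
```

In this sketch a 250-byte splittable file with a 100-byte limit is covered by three ranges, while the same file marked non-splittable is covered by one range spanning the whole file.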