lipeng...@apache.org has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
......................................................................

IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance

Directory listing on cloud storage systems such as OSS or S3 costs more
than on HDFS, so we can create the file descriptors from the rich
metadata that Iceberg already provides instead of calling
org.apache.hadoop.fs.FileSystem#listFiles. The only thing missing from
that metadata is the last_modification_time of the files. But since
Iceberg files are immutable, we can assign a special placeholder
timestamp to these files.

At the same time, we can also construct the file descriptors ourselves
during time travel to reduce the cost of requests to OSS services.
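
The idea described above can be illustrated with a minimal sketch. It
is not the code in this patch: DataFileMeta, FileDesc, fromMetadata and
PLACEHOLDER_MTIME are hypothetical names, and the placeholder value 0
is an assumption made only for illustration. The sketch assumes that
the table metadata already supplies each file's full path and size, so
a descriptor can be built without any listing request to the object
store.

```java
import java.util.ArrayList;
import java.util.List;

public class IcebergFdSketch {
    // Assumption: a fixed placeholder stands in for the modification time,
    // which the table metadata does not carry. The value 0 is illustrative.
    static final long PLACEHOLDER_MTIME = 0L;

    /** Hypothetical stand-in for the per-file metadata kept by the table format. */
    static final class DataFileMeta {
        final String path;
        final long sizeBytes;

        DataFileMeta(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Hypothetical, simplified file descriptor. */
    static final class FileDesc {
        final String relPath;
        final long length;
        final long mtime;

        FileDesc(String relPath, long length, long mtime) {
            this.relPath = relPath;
            this.length = length;
            this.mtime = mtime;
        }
    }

    /** Builds descriptors purely from metadata; no file system call is made. */
    static List<FileDesc> fromMetadata(String tableLocation, List<DataFileMeta> files) {
        String prefix = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
        List<FileDesc> result = new ArrayList<>(files.size());
        for (DataFileMeta f : files) {
            // Store the path relative to the table location when possible;
            // otherwise keep the full path unchanged.
            String rel = f.path.startsWith(prefix)
                ? f.path.substring(prefix.length())
                : f.path;
            result.add(new FileDesc(rel, f.sizeBytes, PLACEHOLDER_MTIME));
        }
        return result;
    }

    public static void main(String[] args) {
        List<DataFileMeta> files = List.of(
            new DataFileMeta("oss://bucket/tbl/data/a.parquet", 1024L),
            new DataFileMeta("oss://bucket/tbl/data/b.parquet", 2048L));
        for (FileDesc fd : fromMetadata("oss://bucket/tbl", files)) {
            System.out.println(fd.relPath + " " + fd.length + " " + fd.mtime);
        }
    }
}
```

Because the files are immutable, a constant timestamp does not hide any
real change to a file in this sketch; a changed table shows up as a
different set of file paths instead.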

Test:
 * existing tests
 * test on COS with my local test environment

Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
9 files changed, 332 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/19379/9
--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Xiaoqing Gao <gaoxq...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
