Tamas Mate has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/18353 )

Change subject: IMPALA-10737: Optimize the number of Iceberg API Metadata 
requests
......................................................................

IMPALA-10737: Optimize the number of Iceberg API Metadata requests

Iceberg stores the table metadata next to the data files. When this
metadata is accessed through the Iceberg API, a filesystem call is
executed (HDFS, S3, ADLS). These calls were made in various places
during query processing; this patch unifies the Iceberg metadata
requests in CatalogD and ImpalaD:
 - CatalogD loads and caches the org.apache.iceberg.Table object.
 - When ImpalaDs request the Table metadata, the current catalog
   snapshot id is sent over and the ImpalaD loads and caches the
   org.apache.iceberg.Table object through the Iceberg API as well.
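
The ImpalaD-side behaviour described above can be pictured with a
minimal sketch. This is not the code in this patch: the class name
SnapshotKeyedCache, the loader function and the "reload when the
snapshot id differs" rule are assumptions made for illustration, and
a generic type stands in for org.apache.iceberg.Table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch of a cache keyed by table name that reloads an
 * entry when the catalog reports a different snapshot id.
 * T stands in for org.apache.iceberg.Table.
 */
class SnapshotKeyedCache<T> {
  /** A loaded table together with the snapshot id it was loaded for. */
  private static final class Entry<T> {
    final long snapshotId;
    final T table;

    Entry(long snapshotId, T table) {
      this.snapshotId = snapshotId;
      this.table = table;
    }
  }

  private final Map<String, Entry<T>> cache = new ConcurrentHashMap<>();

  // Assumed to perform the expensive filesystem-backed metadata load.
  private final Function<String, T> loader;

  SnapshotKeyedCache(Function<String, T> loader) {
    this.loader = loader;
  }

  /**
   * Returns the cached table if it was loaded for catalogSnapshotId,
   * otherwise loads it again and replaces the cached entry.
   */
  T get(String tableName, long catalogSnapshotId) {
    Entry<T> entry = cache.compute(tableName, (name, old) -> {
      if (old != null && old.snapshotId == catalogSnapshotId) {
        return old;  // Cache hit: no filesystem call needed.
      }
      return new Entry<>(catalogSnapshotId, loader.apply(name));
    });
    return entry.table;
  }
}
```

With such a cache, repeated requests for the same table and snapshot
id trigger a single load, and a new snapshot id sent by the catalog
triggers exactly one reload.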

This approach (loading the Iceberg table twice) was chosen because
the org.apache.iceberg.Table could not be meaningfully serialized and
deserialized. Serializing a Table yields a lightweight
SerializableTable object, which is in the Iceberg core package.

As a result, REFRESH/INVALIDATE METADATA is required to reload any
Iceberg metadata changes, and the metadata load time is improved.
This improvement is more significant for smaller queries, where the
metadata request has a larger impact on the query execution time.

Additionally, the dependency on the Iceberg core package has been
reduced: uses of the TableMetadata/BaseTable classes have been
replaced with the Table class from the Iceberg api package in most
places.

Testing:
 - Passed Iceberg E2E tests.

Change-Id: I5492e0cdb31602f0276029c2645d14ff5cb2f672
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergTableLoadingException.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
18 files changed, 269 insertions(+), 254 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/18353/5
--
To view, visit http://gerrit.cloudera.org:8080/18353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5492e0cdb31602f0276029c2645d14ff5cb2f672
Gerrit-Change-Number: 18353
Gerrit-PatchSet: 5
Gerrit-Owner: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
