Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/24049 )
Change subject: POC IMPALA-14805: LocalIcebergTable loads files in coordinator ...................................................................... Patch Set 7: (12 comments) Nice change! And sorry for the late response! http://gerrit.cloudera.org:8080/#/c/24049/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/24049/7//COMMIT_MSG@24 PS7, Line 24: weak pointer nit: weak reference http://gerrit.cloudera.org:8080/#/c/24049/7//COMMIT_MSG@49 PS7, Line 49: inconsistant nit: inconsistent http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java: http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@162 PS7, Line 162: // The following members are considered immutable and shouldn't change after : // constructing the object: At some places we have BEGIN / END sections to make it more explicit. See e.g. StatementBase.java, parquet-column-readers.h http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@231 PS7, Line 231: new IcebergContentFileStore(); I wonder if Java is smart enough to skip the creation of the empty containers. Alternatively, we could introduce a new constructor that takes every immutable member. http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@886 PS7, Line 886: prefixes nit: Could be just 'List.of(getLocation())' http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1257 PS7, Line 1257: getIcebergContentFileStore(final TableMetaRef table, nit: fits earlier line http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1263 PS7, Line 1263: do nit: to http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1271 PS7, Line 1271: WeakReference<MetaProvider.CachedIcebergFiles> lastFileStore = : (WeakReference<MetaProvider.CachedIcebergFiles>) : cache_.getIfPresent(tableKey); Is it possible to move this inside the call() method? http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1299 PS7, Line 1299: // Mutable concurrent hash maps are used to cache files from time-travel queries, : // which is thread-safe, but would increase the size of the cached object without : // updating weight. : // TODO: come up with a way to cache files in time-travel queries I wonder if it even make sense to store time-travel FDs and partitions in the IcebergContentFileStore, since they are not being re-used anyway (except maybe in legacy catalog mode). http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@138 PS7, Line 138: isInIcebergTable_ Is it necessary to have this? The internal FS table should be minimal since IMPALA-13737. http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@98 PS7, Line 98: loadedInCatalog_ It is only written, never read. http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@158 PS7, Line 158: // TODO: understand if this is needed with local loading LocalTable.load() invokes t.loadColumnStats() -> metaProvider.loadTableColumnStatistics() -> getPartialCatalogObject() after loading the table. Since file loading happens between the first getPartialCatalogObject() and the last, I'd assume we still need to warmup the cache. Or it would be nice to avoid the CatalogD interaction in this case. -- To view, visit http://gerrit.cloudera.org:8080/24049 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6732af76a2e040fa57e39260302951466037b934 Gerrit-Change-Number: 24049 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 24 Mar 2026 14:08:52 +0000 Gerrit-HasComments: Yes
