Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24049 )

Change subject: POC IMPALA-14805: LocalIcebergTable loads files in coordinator
......................................................................


Patch Set 7:

(3 comments)

Answered some of the high-level questions.

I created a review that includes part of this change and caches 
IcebergContentFileStore: https://gerrit.cloudera.org/#/c/24177/

It is also in WIP state, but is much closer to being mergeable than this one.

http://gerrit.cloudera.org:8080/#/c/24049/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/24049/7//COMMIT_MSG@15
PS7, Line 15: This assumes that local catalog mode is used, I didn't
            : test without it.
> To move out of PoC, will need some validation to ensure local catalog mode
Yes - ideally legacy catalog mode would be dropped, but if that is not 
possible, then we have to enforce that the table is loaded on catalogd.


http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/IcebergMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/IcebergMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/IcebergMetaProvider.java@570
PS7, Line 570:     // TODO: does it make sense to skip file loading here?
> Where else would file loading be skipped if not here?
This patch was mainly aimed at Iceberg tables from catalog, while this file 
deals with the Iceberg REST catalog. I didn't think that use case through 
properly.


http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/MultiMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/MultiMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/24049/7/fe/src/main/java/org/apache/impala/catalog/local/MultiMetaProvider.java@219
PS7, Line 219:   // TODO: this doesn't really make sense
> It might not right now, but in the future it could.  The `tryAllProviders`
My problem with MultiMetaProvider is that it checks each catalog in each 
function, not just in discovery functions. E.g. if a table is discovered in a 
specific catalog, we should always load it from that one, and not from 
whichever catalog happens to answer. I think that this could be nicely 
implemented by storing the MetaProvider in TableMetaRef, but I would not deal 
with this in the current patch.
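
A minimal sketch of the idea, to make it concrete. All types here 
(SimpleProvider, TableRef, RoutingProvider, InMemoryProvider) are hypothetical, 
simplified stand-ins and not the real MetaProvider / TableMetaRef interfaces in 
Impala: discovery tries every provider, but the returned ref remembers its 
owner, so later loads go only to that provider.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical, simplified stand-in for a metadata provider (not Impala's API).
interface SimpleProvider {
  String name();
  // Discovery: returns a ref only if this provider knows the table.
  Optional<TableRef> findTable(String db, String table);
  // Loading: meant to be called only on the provider that produced the ref.
  List<String> loadFiles(TableRef ref);
}

// Hypothetical table reference that remembers which provider discovered it.
final class TableRef {
  final String db;
  final String table;
  final SimpleProvider owner;

  TableRef(String db, String table, SimpleProvider owner) {
    this.db = db;
    this.table = table;
    this.owner = owner;
  }
}

// Tries all providers for discovery only; everything else is routed to the owner.
final class RoutingProvider {
  private final List<SimpleProvider> providers;

  RoutingProvider(List<SimpleProvider> providers) {
    this.providers = providers;
  }

  TableRef discover(String db, String table) {
    for (SimpleProvider p : providers) {
      Optional<TableRef> ref = p.findTable(db, table);
      if (ref.isPresent()) return ref.get();
    }
    throw new IllegalArgumentException("Table not found: " + db + "." + table);
  }

  List<String> loadFiles(TableRef ref) {
    // No fan-out here: the ref already says where the table lives.
    return ref.owner.loadFiles(ref);
  }
}

// Toy provider backed by a set of "db.table" names, used only for illustration.
final class InMemoryProvider implements SimpleProvider {
  private final String name;
  private final Set<String> tables;
  int loadCalls = 0;

  InMemoryProvider(String name, Set<String> tables) {
    this.name = name;
    this.tables = tables;
  }

  @Override public String name() { return name; }

  @Override public Optional<TableRef> findTable(String db, String table) {
    return tables.contains(db + "." + table)
        ? Optional.of(new TableRef(db, table, this))
        : Optional.empty();
  }

  @Override public List<String> loadFiles(TableRef ref) {
    loadCalls++;
    return List.of(name + ":/" + ref.db + "/" + ref.table + "/data-0.parquet");
  }
}
```

With two providers registered, a table found only in the second one is 
discovered by iterating, but the subsequent loadFiles() call touches only that 
second provider; the first one is never asked again.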



--
To view, visit http://gerrit.cloudera.org:8080/24049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6732af76a2e040fa57e39260302951466037b934
Gerrit-Change-Number: 24049
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 09 Apr 2026 18:37:46 +0000
Gerrit-HasComments: Yes
