Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17857 )
Change subject: IMPALA-10914: Consistently schedule scan ranges for Iceberg tables ...................................................................... Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/17857/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17857/1//COMMIT_MSG@31 PS1, Line 31: service ID and new catalog version for the Iceberg table > Doesn't Iceberg have something similar to catalog version, e.g. something l +1 if we have native version number support in Iceberg it is better to use that detect self-events. I am okay doing it in a separate patch if needed. The self-events logic for ALTER events is up for a overhaul in https://gerrit.cloudera.org/#/c/17756/ which would use the eventId based detection instead of catalog version number. http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@348 PS1, Line 348: hdfsTable_ : .load(false, msClient, msTable_, true, true, false, null, null,null, reason); : pathHashToFileDescMap_ = Utils.loadAllPartition(this); This is probably unrelated to this patch I was curious to understand what is the difference between the file descriptors which are loaded by hdfsTable_.load() and the loadAllPartition(this) method here. Is there a overlap in these 2 collections of file descriptors? http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@481 PS1, Line 481: TGetPartialCatalogObjectResponse resp = : getHdfsTable().getPartialInfo(req, missingPartialInfos); : if (req.table_info_selector.want_iceberg_snapshot) { : resp.table_info.setIceberg_snapshot( : FeIcebergTable.Utils.createTIcebergSnapshot(this)); : } are we sending some file descriptors twice here? Some from the hdfsTable and some from the pathHashToFileDescMap? http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@116 PS1, Line 116: pathHashToFileDescMap_ Am I correct in understanding that pathHashToFileDescMap_ is always loaded from the snapshot? If yes, does catalogd need to load the filedescriptors in the hdfsTable_ backing this iceberg table? http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2915 PS1, Line 2915: newCatalogVersion = catalog_.incrementAndGetCatalogVersion(); It is not clear to me why do we need to increment the catalog version here again? http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2918 PS1, Line 2918: iceTxn.commitTransaction(); : return newCatalogVersion; Can you move these statements into a finally block and avoid code duplication in the if block and later down below? http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6070 PS1, Line 6070: addVersionsForInflightEvents Can we add test coverage for this in the test_self_events? -- To view, visit http://gerrit.cloudera.org:8080/17857 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb8216b37d350469b573dad7fcefdd0ee0599ed5 Gerrit-Change-Number: 17857 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tamas Mate <tm...@cloudera.com> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Comment-Date: Mon, 20 Sep 2021 20:48:33 +0000 Gerrit-HasComments: Yes