Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17857 )

Change subject: IMPALA-10914: Consistently schedule scan ranges for Iceberg 
tables
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17857/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17857/1//COMMIT_MSG@31
PS1, Line 31: service ID and new catalog version for the Iceberg table
> Doesn't Iceberg have something similar to catalog version, e.g. something l
+1 if we have native version number support in Iceberg it is better to use that 
detect self-events. I am okay doing it in a separate patch if needed. The 
self-events logic for ALTER events is up for a overhaul in 
https://gerrit.cloudera.org/#/c/17756/ which would use the eventId based 
detection instead of catalog version number.


http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@348
PS1, Line 348: hdfsTable_
             :             .load(false, msClient, msTable_, true, true, false, 
null, null,null, reason);
             :         pathHashToFileDescMap_ = Utils.loadAllPartition(this);
This is probably unrelated to this patch I was curious to understand what is 
the difference between the file descriptors which are loaded by 
hdfsTable_.load() and the loadAllPartition(this) method here. Is there a 
overlap in these 2 collections of file descriptors?


http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@481
PS1, Line 481:     TGetPartialCatalogObjectResponse resp =
             :         getHdfsTable().getPartialInfo(req, missingPartialInfos);
             :     if (req.table_info_selector.want_iceberg_snapshot) {
             :       resp.table_info.setIceberg_snapshot(
             :           FeIcebergTable.Utils.createTIcebergSnapshot(this));
             :     }
are we sending some file descriptors twice here? Some from the hdfsTable and 
some from the pathHashToFileDescMap?


http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@116
PS1, Line 116: pathHashToFileDescMap_
Am I correct in understanding that pathHashToFileDescMap_ is always loaded from 
the snapshot? If yes, does catalogd need to load the filedescriptors in the 
hdfsTable_ backing this iceberg table?


http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2915
PS1, Line 2915: newCatalogVersion = catalog_.incrementAndGetCatalogVersion();
It is not clear to me why do we need to increment the catalog version here 
again?


http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2918
PS1, Line 2918:       iceTxn.commitTransaction();
              :       return newCatalogVersion;
Can you move these statements into a finally block and avoid code duplication 
in the if block and later down below?


http://gerrit.cloudera.org:8080/#/c/17857/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6070
PS1, Line 6070: addVersionsForInflightEvents
Can we add test coverage for this in the test_self_events?



--
To view, visit http://gerrit.cloudera.org:8080/17857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb8216b37d350469b573dad7fcefdd0ee0599ed5
Gerrit-Change-Number: 17857
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Sep 2021 20:48:33 +0000
Gerrit-HasComments: Yes

Reply via email to