Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20022 )

Change subject: IMPALA-11535: Skip older events in the event processor based on 
the latestRefreshEventID
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1527
PS4, Line 1527:             reloadFileMetadata, reloadTableSchema, false, partitionsToUpdate,
              :             debugAction, partitionToEventId, reason);
Shouldn't we take reloadFileMetadata/reloadTableSchema/partitionsToUpdate into 
account here?

My concern is that we may advance lastRefreshEventId even after a partial 
reload of the table, and then ignore events that would have triggered a full 
reload.
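
To illustrate the concern, here is a minimal standalone sketch, not the patch's code. The class and method names (RefreshEventIdSketch, TableState, isFullReload, maybeAdvance) are hypothetical, and it assumes that a null partitionsToUpdate means "all partitions". The table-level id only advances when the reload covered file metadata, schema, and every partition:

```java
import java.util.Set;

// Hypothetical sketch, not Impala code: all names and semantics are assumptions.
final class RefreshEventIdSketch {
  /** Stand-in for a table that remembers the last event id it was refreshed at. */
  static final class TableState {
    long lastRefreshEventId = -1;
  }

  /**
   * Treats a reload as full only if file metadata and schema were both reloaded
   * and no partition subset was given (null is assumed to mean "all partitions").
   */
  static boolean isFullReload(boolean reloadFileMetadata, boolean reloadTableSchema,
      Set<String> partitionsToUpdate) {
    return reloadFileMetadata && reloadTableSchema && partitionsToUpdate == null;
  }

  /** Advances the table-level id only after a full reload; never moves it backwards. */
  static void maybeAdvance(TableState tbl, long eventId, boolean reloadFileMetadata,
      boolean reloadTableSchema, Set<String> partitionsToUpdate) {
    if (isFullReload(reloadFileMetadata, reloadTableSchema, partitionsToUpdate)) {
      tbl.lastRefreshEventId = Math.max(tbl.lastRefreshEventId, eventId);
    }
  }
}
```

With a guard like this, a reload that only touched a subset of partitions would leave the table-level id untouched, so a later event that needs a full reload would not be skipped.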


http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1532
PS4, Line 1532:       // Update the lastRefreshEventId at the table level if it is unpartitioned table
              :       // if it is partitioned table, partitions are updated in HdfsTable#load() method
This also applies to other table types, e.g. Iceberg/Kudu/HBase, right?


http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3222
PS4, Line 3222:     createTable(testTblName, true);
Both this test and the Python test use a partitioned table. Since this feature 
is implemented differently for non-partitioned tables, it would be nice to 
test those too.


http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3231
PS4, Line 3231:     alterTableAddParameter(testTblName, "somekey", "someval");
Why is calling alterTable from Impala relevant here?


http://gerrit.cloudera.org:8080/#/c/20022/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3233
PS4, Line 3233:     testTbl = (HdfsTable)catalog_.getOrLoadTable(TEST_DB_NAME, testTblName,
              :         "test", null);
              :     assertTrue(testTbl.getLastRefreshEventId()
Shouldn't we load the table and call getLastRefreshEventId() before 
eventsProcessor_.processEvents()?
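
Roughly the ordering I have in mind, as a standalone sketch rather than real test code. FakeTable, FakeEventsProcessor, and idsAroundProcessing are hypothetical stand-ins; the real test would use catalog_ and eventsProcessor_. The point is only that the id is read once before and once after processing, so the assertion can compare the two values:

```java
// Hypothetical sketch, not Impala test code: the fakes below are stand-ins.
final class BeforeAfterSketch {
  /** Stand-in for a loaded table exposing its last refresh event id. */
  static final class FakeTable {
    private long lastRefreshEventId;
    FakeTable(long id) { lastRefreshEventId = id; }
    long getLastRefreshEventId() { return lastRefreshEventId; }
  }

  /** Stand-in for the events processor; processing one pending event may bump the id. */
  static final class FakeEventsProcessor {
    private final FakeTable tbl;
    private final long pendingEventId;
    FakeEventsProcessor(FakeTable tbl, long pendingEventId) {
      this.tbl = tbl;
      this.pendingEventId = pendingEventId;
    }
    void processEvents() {
      tbl.lastRefreshEventId = Math.max(tbl.lastRefreshEventId, pendingEventId);
    }
  }

  /** Reads the id before processing, processes events, then reads it again. */
  static long[] idsAroundProcessing(FakeTable tbl, FakeEventsProcessor processor) {
    long before = tbl.getLastRefreshEventId();
    processor.processEvents();
    long after = tbl.getLastRefreshEventId();
    return new long[] {before, after};
  }
}
```

The test can then assert on the relation between the two values instead of on the post-processing value alone.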



--
To view, visit http://gerrit.cloudera.org:8080/20022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0dc5c7396d80616680d8a5805ce80db293b72e1
Gerrit-Change-Number: 20022
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 07 Aug 2023 13:55:46 +0000
Gerrit-HasComments: Yes
