Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23942 )

Change subject: IMPALA-14230: Add catch-up mode for event processing
......................................................................


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@956
PS6, Line 956:     protected Collection<TableName> getTableNames() {
             :       return tableWriteIds_.stream()
             :           .map(writeId -> new TableName(writeId.getDbName(), 
writeId.getTblName()))
             :           .collect(Collectors.toSet());
> That's a good point. This won't make a big difference in performance, just
Thanks. I added getTableNames() as suggested. To avoid making backports 
difficult, I minimized the changes. As it's supposed to call only once during 
the event lifecycle, so it should be fine


http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3427
PS6, Line 3427: skipIfInvalidate
> nit: let's replace "incomplete" with "unloaded" or "invalidated" since they
Done


http://gerrit.cloudera.org:8080/#/c/23942/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/23942/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1978
PS4, Line 1978: canBeBatched(event));
> > a. my understanding is that each coordinator request will start to load t
That is a good idea. I looked into checking createEventId and 
lastRefreshEventId before catch-up mode. However, doing it correctly requires 
more consideration because some events (like AddPartitionEvent and 
CommitTxnEvent) don't seem to use isOlderEvent() at all. Would it be okay to 
defer this? I can create a follow-up ticket for this optimization, and this 
patch will provide the basic catch-up mode first


http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1087
PS6, Line 1087:         infoLog("Catch-up Mode: Skipping event due to event lag 
of {}s", lag);
> I think we can put the invalidate code here and just let subclass implement
Done


http://gerrit.cloudera.org:8080/#/c/23942/5/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/23942/5/tests/custom_cluster/test_events_custom_configs.py@680
PS5, Line 680:       impalad_args="--use_local_catalog=false",
> Oh, that may be true. I prefer to avoid magic like that.
I added a check for "events-skipped" metric in the catchup_mode_txn_test. But 
since the metric doesn't distinguish between event types, the log is still 
helpful to ensure we actually entered catch-up mode for the right event


http://gerrit.cloudera.org:8080/#/c/23942/6/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/23942/6/tests/custom_cluster/test_events_custom_configs.py@2101
PS6, Line 2101:
> Though we are using ALTER TABLE statements, the HMS events in this test are
I added the two test cases. For the Iceberg one, I wasn't completely sure how 
to measure that there is 'no lag', the process seems fast, so the test just 
ensures that event processing into the catch-up mode with the dalay



--
To view, visit http://gerrit.cloudera.org:8080/23942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib906c06346d5d3159999eeac632e1318bc060065
Gerrit-Change-Number: 23942
Gerrit-PatchSet: 7
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Mon, 02 Mar 2026 04:03:09 +0000
Gerrit-HasComments: Yes

Reply via email to