Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/23942 )
Change subject: IMPALA-14230: Add catch-up mode for event processing ...................................................................... Patch Set 7: (6 comments) http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@956 PS6, Line 956: protected Collection<TableName> getTableNames() { : return tableWriteIds_.stream() : .map(writeId -> new TableName(writeId.getDbName(), writeId.getTblName())) : .collect(Collectors.toSet()); > That's a good point. This won't make a big difference in performance, just Thanks. I added getTableNames() as suggested. To avoid making backports difficult, I minimized the changes. As it's supposed to call only once during the event lifecycle, so it should be fine http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3427 PS6, Line 3427: skipIfInvalidate > nit: let's replace "incomplete" with "unloaded" or "invalidated" since they Done http://gerrit.cloudera.org:8080/#/c/23942/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/23942/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1978 PS4, Line 1978: canBeBatched(event)); > > a. my understanding is that each coordinator request will start to load t That is a good idea. I looked into checking createEventId and lastRefreshEventId before catch-up mode. However, doing it correctly requires more consideration because some events (like AddPartitionEvent and CommitTxnEvent) don't seem to use isOlderEvent() at all. Would it be okay to defer this? I can create a follow-up ticket for this optimization, and this patch will provide the basic catch-up mode first http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/23942/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1087 PS6, Line 1087: infoLog("Catch-up Mode: Skipping event due to event lag of {}s", lag); > I think we can put the invalidate code here and just let subclass implement Done http://gerrit.cloudera.org:8080/#/c/23942/5/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/23942/5/tests/custom_cluster/test_events_custom_configs.py@680 PS5, Line 680: impalad_args="--use_local_catalog=false", > Oh, that may be true. I prefer to avoid magic like that. I added a check for "events-skipped" metric in the catchup_mode_txn_test. But since the metric doesn't distinguish between event types, the log is still helpful to ensure we actually entered catch-up mode for the right event http://gerrit.cloudera.org:8080/#/c/23942/6/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/23942/6/tests/custom_cluster/test_events_custom_configs.py@2101 PS6, Line 2101: > Though we are using ALTER TABLE statements, the HMS events in this test are I added the two test cases. For the Iceberg one, I wasn't completely sure how to measure that there is 'no lag', the process seems fast, so the test just ensures that event processing into the catch-up mode with the dalay -- To view, visit http://gerrit.cloudera.org:8080/23942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib906c06346d5d3159999eeac632e1318bc060065 Gerrit-Change-Number: 23942 Gerrit-PatchSet: 7 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Mon, 02 Mar 2026 04:03:09 +0000 Gerrit-HasComments: Yes
