Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20367

to look at the new patch set (#45).

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
......................................................................

IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

The idea is that when any DDL/DML operation is performed by Impala, it
also syncs the db/table to its latest event ID as per HMS. This way,
updates to a db/table are applied in the same order as they appear in
the Notification log table in HMS, which ensures consistency. Currently
catalogD applies any updates received from Impala clients in-place.
Instead, it should perform the HMS operation first and then replay all
the HMS events since the last synced event id.

Implementation:
When the enable_sync_to_latest_event_on_ddls flag is set to true, we
perform the DDL/DML operation first, i.e., the HMS operation, and then
sync the db/table in catalogD's cache to the latest event in HMS for
the corresponding db/table. By leveraging HIVE-27499, we directly fetch
only the events for the respective db/table and process them.
Set 'enable_sync_to_latest_event_on_ddls' to true to enable this
feature.

Performance impact:
DDL/DML operations might take longer to execute because other pending
events for the corresponding metadata object have to be fetched and
applied.

Note:
We don't modify the cache using MetastoreEventsProcessor for the alter
table rename operation, as this is a complex operation regarding cache
modification (IMPALA-12553 has more details about this). We also don't
modify the cache this way for the truncate table operation, unless the
table is replicated or an Iceberg table. The same applies to the insert
operation if the table is an Iceberg or transactional table. We don't
modify the cache using the above process for the 'refresh table' and
'invalidate metadata table' commands.
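
For reviewers who want a concrete picture of the flow described under "Implementation", here is a minimal toy model in Python. It is not Impala code: FakeHms, FakeCatalog, CachedTable and every method name are hypothetical stand-ins, and the notification log is just an in-memory list. It only illustrates the ordering idea (HMS operation first, then replay of the table's events up to the latest one) and why the background event processor can skip events that were already synced.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: int
    table: str
    change: str  # e.g. "add_column:a"


@dataclass
class FakeHms:
    """Hypothetical stand-in for HMS: every operation appends to a log."""
    log: list = field(default_factory=list)

    def apply(self, table, change):
        event = Event(len(self.log) + 1, table, change)
        self.log.append(event)
        return event

    def events_since(self, table, last_event_id):
        # Table-scoped fetch in event-id order (the role the commit message
        # attributes to HIVE-27499; the real API is not modelled here).
        return [e for e in self.log
                if e.table == table and e.event_id > last_event_id]


@dataclass
class CachedTable:
    changes: list = field(default_factory=list)
    last_synced_event_id: int = 0


class FakeCatalog:
    """Hypothetical catalog cache with the sync-on-DDL behaviour enabled."""

    def __init__(self, hms):
        self.hms = hms
        self.tables = {}

    def _replay(self, event):
        tbl = self.tables.setdefault(event.table, CachedTable())
        if event.event_id <= tbl.last_synced_event_id:
            return False  # already applied by an earlier sync; skip
        tbl.changes.append(event.change)
        tbl.last_synced_event_id = event.event_id
        return True

    def execute_ddl(self, table, change):
        # 1. Perform the HMS operation first.
        self.hms.apply(table, change)
        # 2. Replay every event for this table since the last synced id,
        #    which includes the event generated in step 1.
        tbl = self.tables.setdefault(table, CachedTable())
        for event in self.hms.events_since(table, tbl.last_synced_event_id):
            self._replay(event)

    def process_events(self):
        """Background event processor; returns how many events it applied."""
        return sum(self._replay(e) for e in self.hms.log)


hms = FakeHms()
catalog = FakeCatalog(hms)
hms.apply("db.t", "add_column:a")            # external change, e.g. from Hive
catalog.execute_ddl("db.t", "add_column:b")  # DDL issued through the catalog
applied_later = catalog.process_events()     # nothing left to apply
```

In this toy run the cached table ends up with both changes in notification-log order, and the later event-processor pass applies nothing because both event ids are at or below the table's last synced id.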

Testing:
1) Added a few tests in the MetaStoreEventProcessorForTest that simulate
   the metadata sync between HMS and Impala to verify this feature.
2) Added a few tests in the CatalogHmsSyncToLatestEventIdTest class to
   verify the metadata sync between the HMS endpoint, the Catalog
   Metastore Server and Impala. The HMS endpoint serves as a common
   interface for metadata changes made outside the current Impala
   service, such as by Hive, Spark or another Impala service. Also
   verified that the table's lastSyncEventId is updated after the events
   are synced, and confirmed that the metastore event processor ignores
   these synced events.
3) Added some end-to-end tests in test_sync_to_latest_hms_events.py

Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_sync_to_latest_hms_events.py
A tests/metadata/test_common_ddl.py
M tests/metadata/test_ddl.py
M tests/metadata/test_ddl_base.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_recover_partitions.py
20 files changed, 1,341 insertions(+), 545 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/45
--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 45
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>