[ https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818229#comment-17818229 ]
Maxwell Guo edited comment on IMPALA-12709 at 2/19/24 11:24 AM: ---------------------------------------------------------------- Hi [~VenuReddy] ,After reading the code ,I only found [EventsProcessorStressTest title|https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java] which may has some relations with performance, But I think some function customization is required if we want to the code. [~stigahuang] [~mylogi...@gmail.com] any more suggestions? Besides, What about make this patch configurable, one of the benefits is that you can visually see the comparison results through configuration without changing this code, and I think new features are generally turned off by default. was (Author: maxwellguo): Hi [~VenuReddy] ,After reading the code ,I only found [EventsProcessorStressTest title|https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java] which may has some relations with performance, But I think some function customization is required if we want to you the code. [~stigahuang] [~mylogi...@gmail.com] any more suggestions? Besides, What about make this patch configurable, one of the benefits is that you can visually see the comparison results through configuration without changing this code, and I think new features are generally turned off by default. > Hierarchical metastore event processing > --------------------------------------- > > Key: IMPALA-12709 > URL: https://issues.apache.org/jira/browse/IMPALA-12709 > Project: IMPALA > Issue Type: Improvement > Components: Catalog > Reporter: Venugopal Reddy K > Assignee: Venugopal Reddy K > Priority: Major > Attachments: Hierarchical metastore event processing.docx > > > *Current Issue:* > At present, metastore event processor is single threaded. Notification events > are processed sequentially with a maximum limit of 1000 events fetched and > processed in a single batch. Multiple locks are used to address the > concurrency issues that may arise when catalog DDL operation processing and > metastore event processing tries to access/update the catalog objects > concurrently. Waiting for a lock or file metadata loading of a table can slow > the event processing and can affect the processing of other events following > it. Those events may not be dependent on the previous event. Altogether it > takes a very long time to synchronize all the HMS events. > *Proposal:* > Existing metastore event processing can be turned into multi-level event > processing. Idea is to segregate the events based on their dependency, > maintain the order of events as they occur within the dependency and process > them independently as much as possible: > # All the events of a table are processed in the same order they have > actually occurred. > # Events of different tables are processed in parallel. > # When a database is altered, all the events relating to the database(i.e., > for all its tables) occurring after the alter db event are processed only > after the alter database event is processed ensuring the order. > Have attached an initial proposal design document > https://docs.google.com/document/d/1KZ-ANko-qn5CYmY13m4OVJXAYjLaS1VP-c64Pumipq8/edit?pli=1#heading=h.qyk8qz8ez37b -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org