[ 
https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818229#comment-17818229
 ] 

Maxwell Guo edited comment on IMPALA-12709 at 2/19/24 11:24 AM:
----------------------------------------------------------------

Hi [~VenuReddy] ,After reading the code ,I only found 
[EventsProcessorStressTest 
title|https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java]
 which may has some relations with performance, But I think some function 
customization is required if we want to  the code.  [~stigahuang] 
[~mylogi...@gmail.com] any more suggestions?
Besides, What about make this patch configurable, one of the benefits is that 
you can visually see the comparison results through configuration without 
changing this code, and 
I think new features are generally turned off by default. 


was (Author: maxwellguo):
Hi [~VenuReddy] ,After reading the code ,I only found 
[EventsProcessorStressTest 
title|https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java]
 which may has some relations with performance, But I think some function 
customization is required if we want to you the code.  [~stigahuang] 
[~mylogi...@gmail.com] any more suggestions?
Besides, What about make this patch configurable, one of the benefits is that 
you can visually see the comparison results through configuration without 
changing this code, and 
I think new features are generally turned off by default. 

> Hierarchical metastore event processing
> ---------------------------------------
>
>                 Key: IMPALA-12709
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12709
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Venugopal Reddy K
>            Assignee: Venugopal Reddy K
>            Priority: Major
>         Attachments: Hierarchical metastore event processing.docx
>
>
> *Current Issue:*
> At present, metastore event processor is single threaded. Notification events 
> are processed sequentially with a maximum limit of 1000 events fetched and 
> processed in a single batch. Multiple locks are used to address the 
> concurrency issues that may arise when catalog DDL operation processing and 
> metastore event processing tries to access/update the catalog objects 
> concurrently. Waiting for a lock or file metadata loading of a table can slow 
> the event processing and can affect the processing of other events following 
> it. Those events may not be dependent on the previous event. Altogether it 
> takes a very long time to synchronize all the HMS events.
> *Proposal:*
> Existing metastore event processing can be turned into multi-level event 
> processing. Idea is to segregate the events based on their dependency, 
> maintain the order of events as they occur within the dependency and process 
> them independently as much as possible:
>  # All the events of a table are processed in the same order they have 
> actually occurred.
>  # Events of different tables are processed in parallel.
>  # When a database is altered, all the events relating to the database(i.e., 
> for all its tables) occurring after the alter db event are processed only 
> after the alter database event is processed ensuring the order.
> Have attached an initial proposal design document
> https://docs.google.com/document/d/1KZ-ANko-qn5CYmY13m4OVJXAYjLaS1VP-c64Pumipq8/edit?pli=1#heading=h.qyk8qz8ez37b



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to