Venugopal Reddy K created IMPALA-12709:
------------------------------------------

             Summary: Hierarchical metastore event processing
                 Key: IMPALA-12709
                 URL: https://issues.apache.org/jira/browse/IMPALA-12709
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog
            Reporter: Venugopal Reddy K


*Current Issue:*

At present, the metastore event processor is single-threaded. Notification 
events are processed sequentially, with at most 1000 events fetched and 
processed in a single batch. Multiple locks are used to address the concurrency 
issues that may arise when catalog DDL operation processing and metastore event 
processing try to access or update catalog objects concurrently. Waiting for a 
lock, or for a table's file metadata to load, can stall event processing and 
delay the events that follow, even when those events do not depend on the 
stalled one. Altogether, it takes a very long time to synchronize all the HMS 
events.

 

*Proposal:*

The existing metastore event processing can be turned into multi-level event 
processing. The idea is to segregate events by their dependencies, preserve the 
order in which events occurred within each dependency, and process independent 
events in parallel as much as possible:
 # All the events of a table are processed in the same order in which they 
actually occurred.
 # Events of different tables are processed in parallel.
 # When a database is altered, all events relating to that database (i.e., 
events for all of its tables) that occur after the alter database event are 
processed only after the alter database event itself has been processed, which 
preserves the order.
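
The three ordering rules above could be sketched roughly as follows. This is 
only an illustration, not Impala's actual design: the class name 
HierarchicalEventDispatcher and the methods submitTableEvent and submitDbEvent 
are hypothetical, events are assumed to be submitted in event-id order, and a 
database event is additionally assumed to wait for earlier table events of the 
same database. Error handling and retries are omitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: chains events so that per-table order is preserved,
// different tables run in parallel, and database events act as barriers.
final class HierarchicalEventDispatcher {
    private final ExecutorService pool;
    // Last database-level event per database (the "barrier").
    private final Map<String, CompletableFuture<Void>> dbBarriers = new HashMap<>();
    // Last event per table, grouped by database.
    private final Map<String, Map<String, CompletableFuture<Void>>> tableTails =
        new HashMap<>();

    HierarchicalEventDispatcher(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    // Rules 1 and 2: a table event runs after the previous event of the same
    // table and after the latest database event, but not after other tables.
    synchronized CompletableFuture<Void> submitTableEvent(
            String db, String table, Runnable work) {
        CompletableFuture<Void> barrier =
            dbBarriers.getOrDefault(db, CompletableFuture.completedFuture(null));
        Map<String, CompletableFuture<Void>> tails =
            tableTails.computeIfAbsent(db, k -> new HashMap<>());
        CompletableFuture<Void> prev =
            tails.getOrDefault(table, CompletableFuture.completedFuture(null));
        CompletableFuture<Void> next =
            CompletableFuture.allOf(barrier, prev).thenRunAsync(work, pool);
        tails.put(table, next);
        return next;
    }

    // Rule 3: a database event becomes the new barrier for its database.
    // Assumption: it also waits for all earlier events of that database.
    synchronized CompletableFuture<Void> submitDbEvent(String db, Runnable work) {
        List<CompletableFuture<Void>> deps = new ArrayList<>();
        deps.add(dbBarriers.getOrDefault(db, CompletableFuture.completedFuture(null)));
        Map<String, CompletableFuture<Void>> tails =
            tableTails.getOrDefault(db, Collections.emptyMap());
        deps.addAll(tails.values());
        CompletableFuture<Void> next = CompletableFuture
            .allOf(deps.toArray(new CompletableFuture<?>[0]))
            .thenRunAsync(work, pool);
        dbBarriers.put(db, next);
        // The new barrier already covers the old table tails, so drop them.
        tableTails.remove(db);
        return next;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

In this sketch a slow event on one table only delays later events of that same 
table (and any later database-level event of its database); events of 
unrelated tables continue on other threads.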


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
