Csaba Ringhofer created IMPALA-12463:
----------------------------------------

             Summary: Allow batching of non-consecutive metastore events
                 Key: IMPALA-12463
                 URL: https://issues.apache.org/jira/browse/IMPALA-12463
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog
            Reporter: Csaba Ringhofer


Currently Impala tries to batch events such as partition insert/creation only if:
1. the next event is for the same table as the previous one
2. the next event's id is the previous one's id + 1
3. the next event has the same type as the previous one
(Condition 2 can be stricter than condition 1 if some events were filtered out between the two.)
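
The conditions above can be sketched roughly as follows. This is a minimal illustration, not the code in MetastoreEvents.java: the Event class, the method names, and the event type string are hypothetical, and the "relaxed" check is just one possible reading of the proposal (dropping condition 2).

```java
import java.util.Arrays;
import java.util.List;

public class BatchingSketch {
    // Hypothetical, simplified stand-in for a metastore event (not Impala's class).
    static final class Event {
        final long id;
        final String table;
        final String type;

        Event(long id, String table, String type) {
            this.id = id;
            this.table = table;
            this.type = type;
        }
    }

    // The three conditions listed above: same table, consecutive id, same type.
    static boolean canBatchStrict(Event prev, Event next) {
        return next.table.equals(prev.table)
            && next.id == prev.id + 1
            && next.type.equals(prev.type);
    }

    // Assumed relaxation: same table and type, but ids need not be consecutive.
    static boolean canBatchRelaxed(Event prev, Event next) {
        return next.table.equals(prev.table)
            && next.type.equals(prev.type);
    }

    // Counts the batches produced by greedily merging each event with its predecessor.
    static int countBatches(List<Event> events, boolean strict) {
        int batches = 0;
        Event prev = null;
        for (Event e : events) {
            boolean merge = prev != null
                && (strict ? canBatchStrict(prev, e) : canBatchRelaxed(prev, e));
            if (!merge) {
                batches++;
            }
            prev = e;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Event 11 belonged to another table and was filtered out, leaving an id gap.
        List<Event> events = Arrays.asList(
            new Event(10, "db.t", "ADD_PARTITION"),
            new Event(12, "db.t", "ADD_PARTITION"),
            new Event(13, "db.t", "ADD_PARTITION"));
        System.out.println("strict batches:  " + countBatches(events, true));
        System.out.println("relaxed batches: " + countBatches(events, false));
    }
}
```

With the id gap between events 10 and 12, the strict check splits the run into two batches, while the relaxed check keeps all three events in one.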

See 
https://github.com/apache/impala/blob/94f4f1d82461d8f71fbd0d2e9082aa29b5f53a89/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L315

Another limit is that only events fetched from HMS in the same batch can be merged.
Currently 1000 events are polled at a time:
https://github.com/apache/impala/blob/94f4f1d82461d8f71fbd0d2e9082aa29b5f53a89/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L218

Making this batch size configurable could also be useful.
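
To illustrate why the poll size matters, here is a small arithmetic sketch. It assumes a run of otherwise mergeable events that starts at the beginning of a poll; the class and method names are hypothetical and do not correspond to anything in MetastoreEventsProcessor.java.

```java
public class PollLimitSketch {
    // Minimum number of batches for a run of mergeable events when merging
    // cannot cross poll boundaries. Assumes the run starts at a poll boundary.
    static long minBatches(long runLength, long eventsPerPoll) {
        if (eventsPerPoll <= 0) {
            throw new IllegalArgumentException("eventsPerPoll must be positive");
        }
        if (runLength <= 0) {
            return 0;
        }
        // Ceiling division: each poll yields at most one batch for the run.
        return (runLength + eventsPerPoll - 1) / eventsPerPoll;
    }

    public static void main(String[] args) {
        // 2500 mergeable events with the current poll size of 1000.
        System.out.println(minBatches(2500, 1000));
        // The same run with a hypothetical larger, configured poll size.
        System.out.println(minBatches(2500, 5000));
    }
}
```

Under these assumptions, a run of 2500 events needs at least three batches at 1000 events per poll, but could be a single batch if the poll size were raised above the run length.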


