Benoit Tellier created JAMES-4159:
-------------------------------------
Summary: EventBus: improve slow listener handling
Key: JAMES-4159
URL: https://issues.apache.org/jira/browse/JAMES-4159
Project: James Server
Issue Type: Improvement
Reporter: Benoit Tellier
h3. Context
At LINAGORA we developed custom listeners that perform AI-related tasks,
either querying or feeding AI models. Mailbox listeners are a good fit
because they neither block nor delay mail reception.
The EventBus works as follows: it publishes a message onto RabbitMQ, which is
initially consumed once by a James node that executes all listeners.
Using QoS, each node consumes up to 10 messages simultaneously. Failed
messages are then re-published to individual per-listener retry queues.
This "bundling" limits the overhead of deserialization and event bus wiring,
which each listener would otherwise incur if it were executed separately.
The CPU time dedicated to those operations was significant (> 40%).
However, this bundling means we handle "slow" listeners inefficiently,
which at scale can significantly disturb and delay event processing.
h3. Proposal
Allow configuring a timeout on the initial EventBus run, applied to each listener.
That way, slow listeners would be retried individually.
In `listeners.xml`:
{code:xml}
<listeners>
  <executeGroupListeners>true</executeGroupListeners>
  <!-- New block -->
  <qos>20</qos>
  <initialExecutionTimeout>30s</initialExecutionTimeout>
  <executionTimeout>120s</executionTimeout>
  <!-- -->
  <listener>
    <class>org.apache.james.mailbox.cassandra.MailboxOperationLoggingListener</class>
  </listener>
</listeners>
{code}
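
To illustrate the intent of `initialExecutionTimeout`, here is a minimal sketch of a bundled run where each listener gets its own time budget. It is not the James EventBus implementation: the class `InitialExecutionTimeoutSketch`, the `dispatch` method and the listener names are hypothetical, and listeners are modelled as plain `Consumer<String>` instead of James mailbox listeners. Listeners that fail or exceed the timeout are returned so that a caller could re-publish them to their per-listener retry queue.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class InitialExecutionTimeoutSketch {

    /**
     * Runs all listeners for one event, giving each at most `timeout`.
     * Returns the names of listeners that failed or timed out, i.e. the ones
     * that would be handed over to their individual retry queue (assumption).
     */
    static List<String> dispatch(String event,
                                 Map<String, Consumer<String>> listeners,
                                 Duration timeout,
                                 ExecutorService executor) {
        // Start every listener concurrently, each with its own deadline.
        Map<String, CompletableFuture<Void>> runs = new LinkedHashMap<>();
        listeners.forEach((name, listener) -> runs.put(name,
            CompletableFuture.runAsync(() -> listener.accept(event), executor)
                .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)));

        List<String> toRetry = new ArrayList<>();
        runs.forEach((name, run) -> {
            try {
                run.join();
            } catch (RuntimeException e) {
                // Either the listener threw, or the deadline elapsed.
                // Note: orTimeout does not interrupt the underlying task.
                toRetry.add(name);
            }
        });
        return toRetry;
    }

    // Sleep helper that restores the interrupt flag instead of throwing.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newCachedThreadPool();
        Map<String, Consumer<String>> listeners = new LinkedHashMap<>();
        listeners.put("fast", event -> { });
        listeners.put("slow-ai", event -> pause(2_000));
        listeners.put("broken", event -> { throw new IllegalStateException("boom"); });

        List<String> toRetry = dispatch("event-1", listeners, Duration.ofMillis(200), executor);
        System.out.println("retry individually: " + toRetry);
        executor.shutdownNow(); // interrupts the still-sleeping slow listener
    }
}
```

With this shape, the bundled run completes after at most the configured timeout, so a single slow listener no longer holds one of the QoS slots for its full duration. A real implementation would also have to decide what to do with the work still running after the timeout, which this sketch leaves open.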