Re: RabbitMQ MailQueue & delay : implementation proposal

Matthieu Baechler Fri, 20 Sep 2019 02:00:46 -0700

Hi Benoit,

Thank you for bringing that subject to the mailing list.

On Fri, 2019-09-20 at 13:46 +0700, Tellier Benoit wrote:
> Hello all,
> 
> As of 20/09/2019, delays are not supported on top of the RabbitMQ
> MailQueue.
> 
> While this is not a problem for a "Mail Delivery Agent" server, it is
> a major concern for a "Mail Exchange" server, as pointed out by
> @splainez on the gitter channel.
> 
> A possible implementation came to my mind regarding this concern:
> 
>  - When a delay is specified, we save the message in the object
>    storage, fire a message on a **MailQueueDelayExchange**, and
>    persist it in the MailQueueView.
>  - Each James listens on a single queue bound to the
>    **MailQueueDelayExchange**.
>  - For each incoming message, the receiver sets a timer until the
>    planned delivery date.
>  - Upon timer completion, we ack the message from the
>    MailQueueDelayExchange, then put the corresponding message in the
>    RabbitMQMailQueue (no need to update the MailQueueView or store
>    the blob again).
>  - Upon connection loss, the message will be nacked and then handled
>    by another consumer (James server).
> 
> Obviously:
>  - We need "best effort" synchronized clocks - think NTP
>  - This solution can duplicate emails upon connection loss - a local
>    James needs to invalidate the entries it is waiting for upon
>    connection loss.
> 
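
To make the proposed flow concrete, here is a minimal sketch of the
receiver side, written in Python for brevity (James itself is Java).
Every name in it (DelayedMailScheduler, on_message, run_due,
on_connection_loss, the ack and enqueue callables) is hypothetical, and
a polled priority queue stands in for the per-message timers. It is an
illustration under those assumptions, not the actual implementation.

```python
import heapq


class DelayedMailScheduler:
    """Hypothetical per-node consumer of the delay queue (a sketch only)."""

    def __init__(self, ack, enqueue):
        self._ack = ack          # assumed: ack(delivery_tag) on the delay queue
        self._enqueue = enqueue  # assumed: enqueue(blob_id) on the regular mail queue
        self._pending = []       # min-heap of (deliver_at, delivery_tag, blob_id)

    def on_message(self, delivery_tag, blob_id, deliver_at):
        # Hold the message unacked until its planned delivery date.
        heapq.heappush(self._pending, (deliver_at, delivery_tag, blob_id))

    def run_due(self, now):
        # Release every message whose delivery date has passed:
        # ack on the delay queue first, then enqueue, as in the proposal.
        released = []
        while self._pending and self._pending[0][0] <= now:
            _, tag, blob_id = heapq.heappop(self._pending)
            self._ack(tag)
            self._enqueue(blob_id)
            released.append(blob_id)
        return released

    def on_connection_loss(self):
        # The broker requeues unacked deliveries once the channel is gone,
        # so the local entries must be dropped to avoid duplicate emails.
        dropped = [tag for _, tag, _ in self._pending]
        self._pending.clear()
        return dropped
```

Calling run_due(now) with the node's (NTP-synchronized) clock releases
the due messages; on_connection_loss() is the invalidation step from
the second "Obviously" item.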

It may work. However, I'm not satisfied with the state of the RabbitMQ
MailQueue implementation.

The design is very complex (mostly because of the coupling with
Cassandra) and it looks very brittle to me. We managed to break it
recently without even noticing the problem (AFAIR, we broke the delete
feature).

I would like to challenge the initial choice once more.

Here is a list of facts:

1. Given that we are not able, for now, to set up a reliable cluster of
RabbitMQ servers, we probably don't gain anything by using RabbitMQ
instead of a non-embedded ActiveMQ.

2. Once we have a clustered RabbitMQ, chances are high that we'll
need to fix some issues (see https://www.rabbitmq.com/ha.html about
mirrored queues).

3. The code is very complex and adds some load to Cassandra.

Here is the list of questions I think we should try to answer:

1. Do we have any evidence ActiveMQ is a limiting factor of MailQueue
handling?

2. Do we have any evidence that single-node RabbitMQ is better than
ActiveMQ?

3. What is the estimated load we think we can handle with ActiveMQ if
we invest in reasonable optimizations?

4. Do we think we'll ever get a robust MailQueue with that design? At
what cost?

All these questions should be evaluated from both a short-term and a
long-term perspective:

* Should we stop investing in RabbitMQ for now, because it solves no
issue without further investment?

* Do we think it's the right solution in the long term, or will we
switch to a better alternative?

Sorry to bring my uncertainties to this discussion, but there is
probably still time to look at what we've done, evaluate it and maybe
change our strategy if needed.

Cheers,

-- 
Matthieu Baechler
