Benoit,

This is very interesting work, thank you for contributing. I am interested in 
learning more about the event system.

Robert


> On Nov 28, 2015, at 10:09 AM, Tellier Benoit <btell...@apache.org> wrote:
> 
> Hi,
> 
> I just wanted to present my work on the James event system.
> 
> ## What is the James event system?
> 
> The mailbox event system conveys notifications about changes to the
> state of mailboxes and messages. You can register listeners with it so
> that you are notified of those changes.
> 
> ## What is it used for?
> 
> It is used for:
> 
> - IMAP IDLE: lets a client subscribe to a specific mailbox and get
> notified about changes without having to poll the mailbox.
> 
> - Quota system: updates to stored quotas are made outside the
> MailboxManager, as they may involve large quota calculations.
> 
> - Indexing of messages for the Search feature (ElasticSearch and Lucene
> implementations)
> 
> - IMAP Sequence Number handling.
> 
> - Cache invalidation (caching project, not yet exposed to configuration)
> 
> - Many others
> 
> ## Why do we need it to be distributed?
> 
> I want to see this feature distributed because I personally really love
> the IDLE feature. I want my Thunderbird to be able to use it in a
> distributed environment.
> 
> I also think one might be interested in making several James servers
> work in parallel with any kind of architecture (quotas, message search
> indexes).
> 
> ## What are the different configuration options?
> 
> I reviewed the event system.
> 
> The first thing is to explicitly specify a listener's distributed
> status. A listener can be one of:
> 
> - Registered per mailbox
> - Notified about all local events only
> - Notified about all events in your James cluster
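
To make the three statuses concrete, here is a minimal sketch in Java. Every name in it (`ListenerScope`, `MailboxEvent`, `ScopedListener`) is hypothetical and chosen for illustration; it is not James's actual API. It assumes a per-mailbox listener accepts events for its mailbox regardless of which server raised them.

```java
// Hypothetical names throughout: an illustration, not James's actual API.
enum ListenerScope {
    MAILBOX,      // registered for one specific mailbox
    LOCAL_ONLY,   // wants every event raised on this server only
    CLUSTER_WIDE  // wants every event raised anywhere in the cluster
}

final class MailboxEvent {
    final String mailboxPath; // mailbox the event is about
    final boolean local;      // true if the event was raised on this server

    MailboxEvent(String mailboxPath, boolean local) {
        this.mailboxPath = mailboxPath;
        this.local = local;
    }
}

final class ScopedListener {
    final ListenerScope scope;
    final String mailboxPath; // only meaningful for the MAILBOX scope

    ScopedListener(ListenerScope scope, String mailboxPath) {
        this.scope = scope;
        this.mailboxPath = mailboxPath;
    }

    // Decides whether this listener should see the given event.
    boolean accepts(MailboxEvent event) {
        switch (scope) {
            case MAILBOX:
                return mailboxPath.equals(event.mailboxPath);
            case LOCAL_ONLY:
                return event.local;
            case CLUSTER_WIDE:
                return true;
            default:
                throw new IllegalStateException("unknown scope " + scope);
        }
    }
}
```

Making the scope explicit is what lets an event system decide, per listener, whether an event has to leave the local server at all.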
> 
> Then, we keep the default in-memory implementation (slightly reworked
> using Guava), and I added two other architectures for the event system.
> 
> #### Registration-based event system
> 
> With this implementation, events are exchanged over the network, and a
> James server is only notified about events it explicitly registered
> for. Because of that:
> 
> - This approach is designed for architectures with a large number of
> James servers.
> - It does not support event listeners that need to be notified of all
> events in the cluster.
> 
> Each server listens on a message queue, and a registration mechanism is
> used to identify which servers the events need to be sent to. Of course,
> this involves event serialization and deserialization.
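
The routing idea above can be sketched as follows. This is only an in-memory illustration: the maps stand in for the registration store and the per-server message queues, and `RegistrationRouter` and its methods are hypothetical names, not James, Kafka or Cassandra APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory stand-ins only; every name here is hypothetical.
final class RegistrationRouter {
    // mailbox path -> servers holding at least one listener on it ("registration store")
    private final Map<String, Set<String>> registrations = new HashMap<>();
    // server name -> inbound queue of serialized events ("message queue")
    private final Map<String, List<String>> queues = new HashMap<>();

    void register(String server, String mailboxPath) {
        registrations.computeIfAbsent(mailboxPath, k -> new HashSet<>()).add(server);
        queues.computeIfAbsent(server, k -> new ArrayList<>());
    }

    void unregister(String server, String mailboxPath) {
        Set<String> servers = registrations.get(mailboxPath);
        if (servers != null) {
            servers.remove(server);
        }
    }

    // Sends a serialized event only to servers registered on the mailbox
    // and returns how many servers received it.
    int publish(String mailboxPath, String serializedEvent) {
        Set<String> targets =
            registrations.getOrDefault(mailboxPath, Collections.<String>emptySet());
        for (String server : targets) {
            queues.get(server).add(serializedEvent);
        }
        return targets.size();
    }

    List<String> queueOf(String server) {
        return queues.getOrDefault(server, Collections.<String>emptyList());
    }
}
```

An event for a mailbox nobody registered on is sent to no one, which is why this design cannot serve listeners that need every event in the cluster.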
> 
> Today:
> - Kafka is used for the messaging
> - Cassandra is used for registration management
> 
> This solution was presented at the Paris Cassandra Meet-up.
> 
> #### Broadcast event system
> 
> With this implementation, you have several James servers working
> together, but you rely on Mailbox Listeners that need to be notified
> about every event in your data center.
> 
> These listeners could be:
> 
> - Lucene document indexing
> - In memory quotas
> - In memory cache
> 
> The idea here is to naively broadcast the events to all your James
> servers. They are notified about every event (so scalability will be
> limited).
> 
> You also have to be aware that events can be duplicated or never emitted
> (James server crash, network partitions), so local data might be
> inconsistent. This seems acceptable for quota calculation, for instance.
> 
> ## What do I need to know as an administrator?
> 
> Distributed use of Message Sequence Numbers (which demands a high degree
> of coordination) is risky. The inconsistency window between servers may
> be large, and the correspondence between UIDs and message sequence
> numbers is not eventually consistent. This topic is under discussion on
> the dev mailing list.
> 
> I fixed an issue I spotted months ago: a faulty mailbox listener might
> stop the event delivery chain and make the IMAP service unavailable. I
> added a commit so that errors thrown inside Mailbox Listeners are no
> longer propagated.
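
The kind of isolation described above can be sketched like this. It is a hypothetical illustration of the technique (catch per listener, keep delivering), not the actual James commit; `IsolatingDispatcher` and its methods are invented names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of per-listener error isolation.
final class IsolatingDispatcher {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private final List<RuntimeException> failures = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    // Delivers the event to every listener in order. A listener that throws
    // is recorded and skipped, so later listeners still receive the event.
    void dispatch(String event) {
        for (Consumer<String> listener : listeners) {
            try {
                listener.accept(event);
            } catch (RuntimeException e) {
                failures.add(e);
            }
        }
    }

    List<RuntimeException> failures() {
        return failures;
    }
}
```

A real implementation would presumably log the failure rather than collect it in a list; the list is here only so the behavior can be observed.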
> 
> I want to finish this section with event serialization. You can choose
> either:
> 
> - JSON
> - MessagePack
> 
> The first one is faster to compute but larger, so the choice lets you
> trade compute power against network bandwidth.
> 
> ## Event delivery modes
> 
> As you might have noticed, Mailbox Listeners can take a long time to
> execute, and some of them can safely be executed asynchronously (IDLE,
> indexing and even quotas).
> 
> I added an Event Delivery abstraction. Thanks to this, you can configure
> your James to:
> 
> - Synchronously deliver events (today's behavior)
> - Asynchronously deliver events (the call returns before the events are
> delivered; Mailbox Listeners are notified in parallel in a thread pool)
> - Mixed mode: each Mailbox Listener indicates whether it should be
> executed synchronously or asynchronously.
> 
> The asynchronous option can be considered risky. The mixed one is
> safe, and significantly reduces latencies if you rely on document indexing.
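
The mixed mode can be sketched as below. This is an illustration under assumptions, not James's Event Delivery code: `ModeAwareListener` and `MixedDelivery` are hypothetical names, and the pool size of 4 is arbitrary.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical names; a sketch of mixed-mode delivery.
interface ModeAwareListener {
    boolean isSynchronous();   // the listener declares its own execution mode
    void onEvent(String event);
}

final class MixedDelivery {
    // Arbitrary pool size, chosen for the example.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Synchronous listeners run on the caller's thread before deliver() returns;
    // asynchronous ones are handed to the thread pool.
    void deliver(List<ModeAwareListener> listeners, String event) {
        for (ModeAwareListener listener : listeners) {
            if (listener.isSynchronous()) {
                listener.onEvent(event);
            } else {
                pool.execute(() -> listener.onEvent(event));
            }
        }
    }

    // Stops accepting work and waits for queued asynchronous deliveries.
    // Returns true if they all finished within the timeout.
    boolean shutdownAndWait(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, a slow listener such as document indexing can declare itself asynchronous and stop adding to the latency of the operation that raised the event, while listeners that must complete first stay synchronous.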
> 
> ## Re-indexers
> 
> I also added the ability to re-index documents in a Message Search
> index using the CLI:
> 
> - per mailbox: the event system is used to track changes made to the
> given mailbox and significantly reduce the window for concurrent changes.
> - all your James mailboxes: the event system is used to keep track
> of deleted mailboxes.
> 
> ## My future work on the event system
> 
> Finish the work on MAILBOX-257: one should be able to recalculate quotas.
> 
> Unfortunately, it is not yet planned in my to-do list...
> 
> Benoit
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
> For additional commands, e-mail: server-user-h...@james.apache.org
> 

