For sure :) Just having testing infrastructure that tests murder would go a long way to avoiding that mess again.
The more I think about it, the more having the SAME mailboxes.db for both local and remote data doesn't make sense. We should have a separate central database that the mupdate_activate, etc write to. It can just be a standalone SQL database, or a cluster database, or who cares... the main thing is that only a few of the MBOXLIST commands need to care (because they will return the remote information if needed)

Bron.

On Sat, Mar 14, 2015, at 09:54 AM, Dave McMurtrie wrote:
> From my phone, so excuse brevity and top-posting, but Fastmail running
> murder would be a huge bonus. I not-so-fondly recall the intimate
> relationship I developed with gdb debugging murder issues when we
> upgraded from 2.3 to 2.4 :)
>
> Sent via the Samsung GALAXY S® 5, an AT&T 4G LTE smartphone
>
> -------- Original message --------
> From: Bron Gondwana <br...@fastmail.fm>
> Date: 03/13/2015 6:50 PM (GMT-05:00)
> To: Cyrus Devel <cyrus-devel@lists.andrew.cmu.edu>
> Cc:
> Subject: What would it take for FastMail to run murder
>
> So I've been doing a lot of thinking about Cyrus clustering, with the
> underlying question being "what would it take to make FastMail run a
> murder". We've written a fair bit about our infrastructure - we use
> nginx as a frontend proxy to direct traffic to backend servers, and have
> no interdependencies between the backends, so that we can scale
> indefinitely. With murder as it exists now, we would be pushing the
> limits of the system already - particularly with the globally
> distributed datacentres.
>
> Why would FastMail consider running murder, given our existing
> nice system?
>
> a) we support folder sharing within businesses, so at the moment we are
>    limited by the size of a single slot. Some businesses already push
>    that limit.
> b) it's good to dogfood the server we put so much work into.
>
> Here are our deal-breaker requirements:
>
> 1) unified murder - we don't want to run both a frontend AND a backend
>    imapd process for every single connection. We already have nginx,
>    which is non-blocking, for the initial connection and auth handling.
> 2) no table scans - anything that requires a parse and ACL lookup for
>    every single row of mailboxes.db is going to be a non-starter when
>    you multiply the existing mailboxes.db size by hundreds.
> 3) no single-point-of-failure - having one mupdate master which can stop
>    the entire cluster working if it's offline, no thanks.
>
> Thankfully, the state of the art in distributed databases has moved a
> long way since mupdate was written. We'd have to at least change the
> mupdate protocol anyway to handle newly added fields, so why not just do
> away with it and have every server run a local node of a distributed
> database protocol for its mailboxes.db.
>
> Along with this, we need a reverse lookup for ACLs, so that any one user
> doesn't ever need to scan the entire mailboxes.db. This might be hooked
> into the distributed DB as well, or calculated locally on each node.
>
> And that's pretty much it. There are some interesting factors around
> replication, and I suspect the answer here is to have either
> multi-value support or embed the backend name into the mailboxes.db key
> (postfix) such that you wind up listing the same mailbox multiple
> times. We already suppress duplicates in the LIST command, so all we
> need then is logic for choosing the actual master. Rob N has done some
> work with consul and etcd already at FastMail, and we would use either
> that or a flag in the distributed DB to drive master choice for backend
> connection purposes.
>
> There are a bunch of "nice to have"s on top of this, but I think this
> would be enough for us to convert our existing standalone servers over
> to a murder.
>
> Bron.
>
> --
> Bron Gondwana
> br...@fastmail.fm
>

--
Bron Gondwana
br...@fastmail.fm
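
To make the reverse ACL lookup and the backend-in-the-key idea from the quoted message a bit more concrete, here is a rough Python sketch. It is not Cyrus code and not a proposed on-disk format: the `MailboxList` class, the NUL separator, the `is_master` flag and the backend names are all made up for illustration. It only shows that a per-user index avoids scanning every row, that suffixing the key with the backend name lets one mailbox appear once per replica, and that LIST-style output can still suppress the duplicates.

```python
import bisect
from collections import defaultdict

SEP = "\x00"  # hypothetical separator between mailbox name and backend name


class MailboxList:
    """Toy stand-in for a mailboxes.db with backend-suffixed keys."""

    def __init__(self):
        self.keys = []                    # sorted "name\0backend" keys
        self.records = {}                 # key -> {"acl": {user: rights}}
        self.by_user = defaultdict(set)   # reverse ACL index: user -> names
        self.master = {}                  # name -> backend flagged as master

    def add(self, name, backend, acl, is_master=False):
        key = name + SEP + backend
        if key not in self.records:
            bisect.insort(self.keys, key)
        self.records[key] = {"acl": dict(acl)}
        for user in acl:
            self.by_user[user].add(name)  # a set, so replicas collapse
        if is_master:
            self.master[name] = backend

    def list_for(self, user):
        """Mailboxes visible to user, without touching unrelated rows."""
        return sorted(self.by_user.get(user, ()))

    def backends(self, name):
        """All backends holding a copy, found by a key-prefix range walk."""
        prefix = name + SEP
        i = bisect.bisect_left(self.keys, prefix)
        found = []
        while i < len(self.keys) and self.keys[i].startswith(prefix):
            found.append(self.keys[i][len(prefix):])
            i += 1
        return found

    def locate(self, name):
        """Backend to connect to; here simply whichever copy was flagged."""
        return self.master.get(name)
```

In a real deployment the master choice would come from consul, etcd or a flag in the distributed DB as described in the quoted message; the dictionary here only stands in for that decision.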