Re: What would it take for FastMail to run murder
On 2015-03-18 01:51, Bron Gondwana wrote:

On Wed, Mar 18, 2015, at 09:00 AM, Jeroen van Meeuwen (Kolab Systems) wrote: We promote a standby frontend not otherwise used, to become the new mupdate server. The interruption is a matter of seconds this way, unless of course you're in the typical stalemate.

Hmm, so maybe it's affordable. It scales up with number-of-servers as well though. Making sure it's up to date costs at least O(number of backends).

I suppose in your specific case, which I'm not at all too familiar with, perhaps enhancing murder/mupdate to allow cascading and/or (geo-based) replication and/or multi-master would serve your deployment even better still? I suggest this because I would be concerned about the round-trip times between datacenters if there were only one mupdate master across all of them -- and perhaps the replicas are faster in issuing the cmd_set() than the mupdate master is(?).

Interesting. Does it also handle the case where the same mailbox gets accidentally created on two servers which aren't replica pairs, though? Or do you get a mailbox fork?

The race condition is not addressed with it, just as it is not addressed currently.

I'm not 100% happy living with unaddressed race conditions. Addressing this would be an important part of making FastMail happy to run it.

Heh, neither am I, but c'est la vie. That said, in ~5 years and dozens of murder deployments, I have yet to encounter a situation or even a support case in which one mailbox is -- accidentally or otherwise -- created in two locations without the second failing / being rejected.

It solely makes the MUPDATE server not reject the reservation request from a server that uses the same servername if it already has an entry for the same servername!partition, so that the replica successfully creates the local copy -- after which replication is happy.

Yeah, that makes sense. Of course, the backend should probably not be reserving so much.
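The relaxed reservation rule described here is small enough to sketch. This is a toy model with illustrative names only; the real check lives in mupdate's C reservation path, not in Python:

```python
# Sketch of mupdate's reservation check, relaxed so that a replica
# re-asserting a mailbox it already holds is not rejected.
# Class and method names are illustrative, not the real Cyrus internals.

class MupdateDB:
    def __init__(self):
        # mailbox name -> "servername!partition" location string
        self.entries = {}

    def reserve(self, mailbox, location):
        existing = self.entries.get(mailbox)
        if existing is None:
            self.entries[mailbox] = location
            return True
        # The one-or-two-line relaxation: accept the reservation when
        # the recorded location is identical (same servername!partition),
        # so the replica's local create succeeds and replication proceeds.
        if existing == location:
            return True
        # A genuinely different server is still rejected, as before.
        return False
```

Since both replicas present themselves under the same server name (pair-1.example.org in the scenario below), the replica's reservation matches the existing entry and is accepted, while a create from an unrelated backend still fails.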
There are two things conflated here:

1) I'm running cmd_create in an IMAPd and I want to see if this folder already exists.

2) I'm a replica backend getting a copy of an existing folder (or indeed, a backend which already has a folder) and I'm informing mupdate of the fact.

Those two should be treated differently. The first is "does this already exist?", which is a legitimate question to ask. The second should always succeed. MUPDATE is a representation of facts, and the backends are the masters of those facts.

With two-way replication safety however (and in your case, channelled as well, right?), which end of the replication (just in case things end up load-balanced across replicas?) gets to submit the original cmd_set() is up in the air, no? So this would build a scenario in which:

pair-1-replica-1.example.org and pair-1-replica-2.example.org present themselves as pair-1.example.org. A DNS IN A RR is created for the fail-over address(es) for pair-1.example.org and attached to whichever replica in the pair is considered the active node. Both replicas would be configured to replicate to one another, which works in a PoC scenario but may seem to require lmtpd/AF_INET delivery.

So they both have the same server name in mupdate.

Yes, and frontends proxy the connections for mailboxes on the backend to the same fake server address.

My plan is that they have different server names in mupdate. There's a separate channel somehow to say which is the primary out of those servers, which can be switched however (failover tooling) based on which servers are up, but the murder has the facts about where the mailbox really exists. It may even have statuscache. Man, how awesome would distributed statuscache be.

So there are multiple records for the same mailbox, with different server names, in the murder DB. Would this not reopen the door to a variety of race conditions (that would need to be addressed somehow), though?
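The multiple-records plan can be sketched minimally: the murder DB holds one record per holding server, backends assert facts unconditionally, and the primary is flagged through a separate channel. All names below are hypothetical, not Cyrus code:

```python
# Toy model of a murder DB that allows multiple records per mailbox
# (one per replica, each under its own server name), with the primary
# anointed out-of-band by failover tooling.

class MultiRecordMurderDB:
    def __init__(self):
        self.records = {}   # mailbox -> set of server names holding a copy
        self.primary = {}   # the separate channel: server -> is-primary flag

    def assert_fact(self, mailbox, server):
        # Backends are the masters of the facts: recording that a server
        # holds a copy always succeeds, never gets rejected.
        self.records.setdefault(mailbox, set()).add(server)

    def set_primary(self, server, is_primary):
        # Failover tooling flips primaries without touching the records.
        self.primary[server] = is_primary

    def locate(self, mailbox):
        # A frontend proxies to whichever holder is currently primary.
        for server in sorted(self.records.get(mailbox, ())):
            if self.primary.get(server):
                return server
        return None
```

The point of the split is that facts (who holds a copy) and policy (who is primary right now) change at very different rates and through different channels, so a failover never has to rewrite the mailbox records themselves.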
Should then one of the duplicate mailboxes be marked as the primary? A scenario that comes up often is the geographically close-yet-distant secondary site for disaster recovery, where a set of backends on the primary site replicate to a set of backends on the secondary site. While initially this succeeds perfectly fine, and the backends on the secondary site can participate in a local-to-site murder, transferring mailboxes from one backend to another on the primary site will fail to replicate to the secondary site's backends (because of their participation in the murder). This is in part because it is not the XFER being replicated as such, but the target backend's CREATE/cmd_set(), which will fail because the mailbox already resides on another backend. I suppose a scenario in which the mupdate master is in fact able to hold multiple records for the same mailbox might also allow us to overcome this conundrum? Would using shared memory address the in-memory problem?
Re: What would it take for FastMail to run murder
On Wed, Mar 18, 2015, at 09:49 PM, Jeroen van Meeuwen (Kolab Systems) wrote:

On 2015-03-18 01:51, Bron Gondwana wrote: On Wed, Mar 18, 2015, at 09:00 AM, Jeroen van Meeuwen (Kolab Systems) wrote: We promote a standby frontend not otherwise used, to become the new mupdate server. The interruption is a matter of seconds this way, unless of course you're in the typical stalemate. Hmm so maybe it's affordable. It scales up with number-of-servers as well though. Making sure it's up to date costs at least O(number of backends).

I suppose in your specific case, which I'm not at all too familiar with, perhaps enhancing murder/mupdate to allow cascading and/or (geo-based) replication and/or multi-master would serve your deployment even better?

Hmm, yeah - geo updates and mailboxes.db changes. I'm not super-concerned that it's a slightly slow path - they are rare. Might suck if you're making a ton of changes all at once - but that should be OK too - just make all the changes locally and then blat the whole lot in a single transaction to the murder DB.

Or hell, make it eventually consistent. All you need is a zookeeper-style way to anoint one server as the owner of each fragment of namespace. So you can only create a new user's mailbox in one place at once, and then every user can only create mailboxes on their home server. Stop clashes from ever forming that way. There are safe ways to do this that aren't a single mupdate master (which already sucks when you're geographically distributed, I'm sure - ask CMU; I'm pretty sure they are running it globally).

I'm not 100% happy living with unaddressed race conditions. Addressing this would be an important part of making FastMail happy to run it.

Heh, neither am I, but c'est la vie. That said, in ~5 years and dozens of murder deployments, I have yet to encounter a situation or even a support case in which one mailbox is -- accidentally or otherwise -- created in two locations without the second failing / being rejected.
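The eventually-consistent idea above hinges on one invariant: every fragment of the namespace has exactly one owner at a time, so clashes can never form. A toy sketch of that ownership rule (the coordination service itself, zookeeper or otherwise, is assumed and not modelled; names are illustrative):

```python
class NamespaceOwnership:
    """Toy model of zookeeper-style ownership: each namespace fragment
    (here, a user's nameroot) is anointed to exactly one server, and
    creates are accepted only on the owner ("home server")."""

    def __init__(self):
        self.owner = {}  # nameroot -> owning server

    def anoint(self, nameroot, server):
        # First writer wins; anointing an already-owned fragment fails,
        # which is what stops two servers creating the same user at once.
        return self.owner.setdefault(nameroot, server) == server

    def may_create(self, mailbox, server):
        # "every user can only create mailboxes on their home server":
        # derive the nameroot from a user.<name>[.<folder>...] mailbox.
        parts = mailbox.split(".")
        nameroot = parts[1] if parts[0] == "user" and len(parts) > 1 else mailbox
        return self.owner.get(nameroot) == server
```

In a real deployment the `anoint` step would be an atomic create in the coordination service rather than a dictionary `setdefault`, but the safety argument is the same: the race is resolved once at ownership time, not on every mailbox create.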
Yeah, it's a rare case because normal users can't do it, and at least in our setup, the user creation itself is brokered through a singleton database, and the location to create the user is calculated then. 1) I'm running cmd_create in an IMAPd and I want to see if this folder already exists. 2) I'm a replica backend getting a copy of an existing folder (or indeed, a backend which already has a folder) and I'm informing mupdate of the fact. Those two should be treated differently. The first is does this already exist, which is a legitimate question to ask. The second should always succeed. MUPDATE is a representation of facts, and the backends are the masters of those facts. With two-way replication safety however (and in your case, channelled as well, right?), which end of the replication (just in case things end up load-balanced across replicas?) gets to submit the original cmd_set() is up in the air, no? Er, not really. Worst case they both do and you resolve them, RIAK style, when they discover each other. So they both have the same server name in mupdate. Yes, and frontends proxy the connections for mailboxes on the backend to the same fake server address. Yeah, of course. We used to do this with FastMail - failover IP - but it doesn't work across datacentres, so instead we have a source of truth (a DB backed daemon for now, but consul soon) which says where the master is right now, and nginx just connects directly to the slot IP for the master end - so we can proxy to a different datacentre transparently. It may even have statuscache. Man, how awesome would distributed statuscache be. So there are multiple records for the same mailbox, with different server names, in the murder DB. Would this not open back up a route to entertaining a variety of race conditions (that would need to be addressed somehow) though? Not really - because writes are always sourced from the backend you are connected to. 
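The FastMail arrangement described here reduces to a lookup service: given a user, return the address of the current master end of that user's slot, which nginx then connects to directly, even across datacentres. A minimal sketch of that shape (hypothetical names; the real service is a DB-backed daemon, with consul planned):

```python
class SlotDirectory:
    """Minimal model of the source-of-truth service nginx consults:
    users map to slots, and each slot's master address can be repointed
    between datacentres on failover without client-visible changes."""

    def __init__(self):
        self.user_slot = {}    # user -> slot name
        self.slot_master = {}  # slot -> (host, port) of the current master

    def failover(self, slot, host, port):
        # Failover tooling repoints the slot to a new master end.
        self.slot_master[slot] = (host, port)

    def proxy_target(self, user):
        # What the auth/lookup step hands back to nginx for this login.
        return self.slot_master.get(self.user_slot.get(user))
```

Because the mapping is consulted per connection, moving a slot's master to another datacentre takes effect for new logins immediately, which is what makes this work where a shared failover IP cannot.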
What it COULD create, in theory, is stale reads - but only stale in the way that it would have been if you'd done the same read a second ago. IMAP makes no guarantees about parallel connections.

Should then one of the duplicate mailboxes be marked as the primary?

Of course. But not in mailboxes.db itself; separately, with either a per-server scoping or a per-user/nameroot scoping. There are arguments for both: per nameroot is a lot more data, particularly to update in a failover case, but it also allows you to do really amazing stuff like have per-user replicas configured directly with annotations or something - such that any one user can be moved to a set of machines within the murder and there's no need to actually define pairs of machines at all. I'd almost certainly have an external process monitor that though - monitor disk usage, user size, user locations, etc. - and rebalance users by issuing the correct commands.
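The external rebalancer mentioned at the end could be a simple control loop: watch sizes and locations, then issue the corresponding move commands. A greedy, purely illustrative sketch of the planning half (the policy and all names are assumptions, not anything Cyrus ships):

```python
def plan_moves(user_sizes, user_server, server_capacity):
    """Greedy rebalancer sketch: move the largest users off overfull
    servers onto the least-loaded ones. The real process would then
    issue the corresponding user-transfer commands."""
    load = {s: 0 for s in server_capacity}
    for user, server in user_server.items():
        load[server] += user_sizes[user]

    moves = []
    for user in sorted(user_sizes, key=user_sizes.get, reverse=True):
        src = user_server[user]
        if load[src] <= server_capacity[src]:
            continue  # source within capacity: leave this user alone
        # Pick the destination with the lowest relative load.
        dst = min(load, key=lambda s: load[s] / server_capacity[s])
        if dst != src:
            moves.append((user, src, dst))
            load[src] -= user_sizes[user]
            load[dst] += user_sizes[user]
    return moves
```

A production version would add hysteresis and rate limits so the cluster doesn't thrash, but the shape - observe, plan, issue commands - is the point being made above.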
Re: What would it take for FastMail to run murder
On 2015-03-14 22:48, Bron Gondwana wrote: On Sun, Mar 15, 2015, at 07:18 AM, Jeroen van Meeuwen (Kolab Systems) wrote: How, though, do you ensure that a mailbox for a new user in such business is created on the same backend as all the other users of said business?

If the business already exists, the create user code will fetch the server name from the business database table and make that the creation server. There's a cron job which runs every hour and looks for users who aren't on the right server, so if we import a user to the business, they get moved.

Right -- so you seem to agree that one business is limited to one backend server, which is precisely what the larger businesses that are our customers need to work around, when the number of mailboxes is typically tens of thousands, and the mechanism you describe stops working.

There's one particular "problem" with using NGINX as the IMAP proxy -- it requires that external service that responds with the address to proxy to. I say "problem" in quotes to emphasize I use the term very loosely -- whether it be a functioning backend+mupdate+frontend or a functioning backend+mupdate+frontend+nginx+service is a rather futile distinction, relatively speaking.

Sure, but backend+distributed mailbox service+nginx would be a much simpler setup.

Yes ;-)

I don't understand how this is an established problem already -- or not as much as I probably should. If 72k users can be happy on a murder topology, surely 4 times as many could also be happy -- inefficiencies notwithstanding, they're only a vertical scaling limitation.

"happy" is a relative term. You can get most of the benefit from using foolstupidclients, but otherwise you're paying O(N) for the number of users - and taking 4 times as long to do every list command is not ideal.

Sure -- the majority of the processing delays seem to lie on the client side taking off the wire what is being dumped on it, however.
You're far better entitled to speak to what is in a mailboxes.db and/or its in-memory representation by the time you get to scanning the complete list for items to which a user might have access; I just have to say we've not found this particular part to be as problematic for tens of thousands of users (yet). That said, of course I understand it has its upper limit, but getting updated lookup tables pushed in-memory when an update happens would seem to resolve the problem, no?

Solving the problem is having some kind of index/lookup table, indeed -- whether this is done all in-memory by some sort of LIST service which scans the mailboxes.db at startup time and then gets updates from mupdate, or otherwise.

For frontends specifically (discrete murder), we're able to use tmpfs for mailboxes.db (and some other stuff of course), solving a bit of the I/O constraints, but it's still a list of folders with parameters containing whether the user has access, and what I meant was perhaps the list can (in addition) be inverted to be a list of users with folders (and rights?).

This is not necessarily what a failed mupdate server does though -- new folders and folder renames (includes deletions!) and folder transfers won't work, but the cluster remains functional under both the SMTP-to-backend and LMTP-proxy-via-frontend topology -- autocreate for Sieve fileinto notwithstanding, and mailbox hierarchies distributed over multiple backends when also using the SMTP-to-backend topology notwithstanding.

Yeah, until you start up the mupdate server again or configure a new one. Again, you get user-visible failures (folder create, etc.) while the server is down. The reason I want to shave off all these edge cases is that in a big enough system over a long enough time, you will hit every one of them.

We promote a standby frontend not otherwise used, to become the new mupdate server. The interruption is a matter of seconds this way, unless of course you're in the typical stalemate.
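Inverting the list as suggested means: instead of scanning every mailboxes.db row and parsing its ACL on each LIST, keep a user-to-visible-folders index built once at startup and updated incrementally from the mupdate stream. A sketch with simplified ACL strings (real Cyrus ACLs carry more rights and identifiers than modelled here):

```python
class InvertedListIndex:
    """user -> set of folders the user can see, maintained incrementally
    so LIST never has to scan the whole mailboxes.db."""

    def __init__(self, mailboxes):
        # mailboxes: {folder: {user: rights_string}}, as scanned once
        # from mailboxes.db at startup.
        self.visible = {}
        for folder, acl in mailboxes.items():
            self.apply_update(folder, acl)

    def apply_update(self, folder, acl):
        # Called for each mupdate notification: re-derive visibility.
        # (Users dropped from the ACL entirely are ignored in this sketch.)
        for user, rights in acl.items():
            if "l" in rights:  # 'l' is the lookup right in Cyrus ACLs
                self.visible.setdefault(user, set()).add(folder)
            else:
                self.visible.get(user, set()).discard(folder)

    def list_for(self, user):
        # O(size of the user's own list), not O(size of mailboxes.db).
        return sorted(self.visible.get(user, ()))
```

This is exactly the "reverse lookup for ACLs" requirement stated later in the thread: the per-LIST cost becomes proportional to what the user can see, not to the cluster's total folder count.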
Thankfully, the state of the art in distributed databases has moved a long way since mupdate was written.

I have also written a one-or-two-line patch that enables backends that replicate to both be part of the same murder topology, preventing the replica slave from bailing out on the initial creation of a mailbox after consulting mupdate and finding that it would already exist.

Interesting. Does it also handle the case where the same mailbox gets accidentally created on two servers which aren't replica pairs though? Or do you get a mailbox fork?

The race condition is not addressed with it, just as it is not addressed currently. It solely makes the MUPDATE server not reject the reservation request from a server that uses the same servername if it already has an entry for the same servername!partition, so that the replica successfully creates the local copy -- after which replication is happy. So this would build a scenario in which:
Re: What would it take for FastMail to run murder
On Wed, Mar 18, 2015, at 09:00 AM, Jeroen van Meeuwen (Kolab Systems) wrote: On 2015-03-14 22:48, Bron Gondwana wrote: On Sun, Mar 15, 2015, at 07:18 AM, Jeroen van Meeuwen (Kolab Systems) wrote: How, though, do you ensure that a mailbox for a new user in such business is created on the same backend as all the other users of said business? If the business already exists, the create user code will fetch the server name from the business database table and make that the creation server. There's a cron job which runs every hour and looks for users who aren't on the right server, so if we import a user to the business, they get moved. Right -- so you seem to agree that one business is limited to one backend server, which is precisely what the larger businesses that are our customers need to work around, when the number of mailboxes is typically tens of thousands, and the mechanism you describe stops working.

Exactly. It's a limit that we want to avoid, hence looking for a murder-that-scales.

"happy" is a relative term. You can get most of the benefit from using foolstupidclients, but otherwise you're paying O(N) for the number of users - and taking 4 times as long to do every list command is not ideal.

Sure -- the majority of the processing delays seem to lie on the client side taking off the wire what is being dumped on it, however.

With over a million mailboxes in a single mailboxes.db I was seeing parsing cost go up, particularly with DLIST. I've written a dlist_sax interface, which cuts out some of the cost, but it's still not free. The easiest way to make things more efficient is not to do them at all ;)

You're far better entitled to speak to what is in a mailboxes.db and/or its in-memory representation by the time you get to scanning the complete list for items to which a user might have access, I just have to say we've not found this particular part to be as problematic for tens of thousands of users (yet).

It's going to hurt when you get to millions.
That's our issue. If we merged all the mailboxes.db across all our servers into one place, that's a huge database. For frontends specifically (discrete murder), we're able to use tmpfs for mailboxes.db (and some other stuff of course) solving a bit of the I/O constraints, but it's still a list of folders with parameters containing whether the user has access, and what I meant was perhaps the list can (in addition) be inverted to be a list of users with folders (and rights?). That's pretty much exactly the idea. That and avoiding the SPOF that's a murder master right now. They're kind of separate goals, we could do one without the other. We promote a standby frontend not otherwise used, to become the new mupdate server. The interruption is a matter of seconds this way, unless of course you're in the typical stalemate. Hmm so maybe it's affordable. It scales up with number-of-servers as well though. Making sure it's up to date costs at least O(number of backends). Interesting. Does it also handle the case where the same mailbox gets accidentally created on two servers which aren't replica pairs though? Or do you get a mailbox fork? The race condition is not addressed with it, like it is not addressed currently. I'm not 100% happy living with unaddressed race conditions. Addressing this would be an important part of making FastMail happy to run it. It solely makes the MUPDATE server not reject the reservation request from a server that uses the same servername if it already has an entry for the same servername!partition, so that the replica successfully creates the local copy -- after which replication is happy. Yeah, that makes sense. Of course, the backend should probably not be reserving so much. There are two things conflated here: 1) I'm running cmd_create in an IMAPd and I want to see if this folder already exists. 
2) I'm a replica backend getting a copy of an existing folder (or indeed, a backend which already has a folder) and I'm informing mupdate of the fact.

Those two should be treated differently. The first is "does this already exist?", which is a legitimate question to ask. The second should always succeed. MUPDATE is a representation of facts, and the backends are the masters of those facts.

So this would build a scenario in which: pair-1-replica-1.example.org and pair-1-replica-2.example.org present themselves as pair-1.example.org. A DNS IN A RR is created for the fail-over address(es) for pair-1.example.org and attached to whichever replica in the pair is considered the active node. Both replicas would be configured to replicate to one another, which works in a PoC scenario but may seem to require lmtpd/AF_INET delivery.

So they both have the same server name in mupdate.

My plan is that they have different server names in mupdate. There's a separate channel somehow to say which is the primary
Re: What would it take for FastMail to run murder
On 2015-03-13 23:50, Bron Gondwana wrote: So I've been doing a lot of thinking about Cyrus clustering, with the underlying question being what would it take to make FastMail run a murder. We've written a fair bit about our infrastructure - we use nginx as a frontend proxy to direct traffic to backend servers, and have no interdependencies between the backends, so that we can scale indefinitely. With murder as it exists now, we would be pushing the limits of the system already - particularly with the globally distributed datacentres. Why would FastMail consider running murder, given our existing nice system? a) we support folder sharing within businesses, so at the moment we are limited by the size of a single slot. Some businesses already push that limit.

How, though, do you ensure that a mailbox for a new user in such business is created on the same backend as all the other users of said business?

Here are our deal-breaker requirements: 1) unified murder - we don't want to run both a frontend AND a backend imapd process for every single connection. We already have nginx, which is non-blocking, for the initial connection and auth handling.

There's one particular "problem" with using NGINX as the IMAP proxy -- it requires that external service that responds with the address to proxy to. I say "problem" in quotes to emphasize I use the term very loosely -- whether it be a functioning backend+mupdate+frontend or a functioning backend+mupdate+frontend+nginx+service is a rather futile distinction, relatively speaking.

2) no table scans - anything that requires a parse and ACL lookup for every single row of mailboxes.db is going to be a non-starter when you multiply the existing mailboxes.db size by hundreds.

I don't understand how this is an established problem already -- or not as much as I probably should.
If 72k users can be happy on a murder topology, surely 4 times as many could also be happy -- inefficiencies notwithstanding, they're only a vertical scaling limitation. That said, of course I understand it has its upper limit, but getting updated lookup tables pushed in-memory when an update happens would seem to resolve the problem, no?

3) no single-point-of-failure - having one mupdate master which can stop the entire cluster working if it's offline, no thanks.

This is not necessarily what a failed mupdate server does though -- new folders and folder renames (includes deletions!) and folder transfers won't work, but the cluster remains functional under both the SMTP-to-backend and LMTP-proxy-via-frontend topology -- autocreate for Sieve fileinto notwithstanding, and mailbox hierarchies distributed over multiple backends when also using the SMTP-to-backend topology notwithstanding.

Thankfully, the state of the art in distributed databases has moved a long way since mupdate was written.

I have also written a one-or-two-line patch that enables backends that replicate to both be part of the same murder topology, preventing the replica slave from bailing out on the initial creation of a mailbox after consulting mupdate and finding that it would already exist.

Along with this, we need a reverse lookup for ACLs, so that any one user doesn't ever need to scan the entire mailboxes.db. This might be hooked into the distributed DB as well, or calculated locally on each node.

I reckon this may be the "rebuild more efficient lookup trees in-memory" I referred to just now, just not in so many words.

And that's pretty much it. There are some interesting factors around replication, and I suspect the answer here is to have either multi-value support or embed the backend name into the mailboxes.db key (postfix) such that you wind up listing the same mailbox multiple times.
In a scenario where only one backend is considered active for the given (set of) mailbox(es), and the other is passive, this has been more a matter of a one-line patch in mupdate plus the proper DNS/keepalived-type failover service IP infrastructure than of allowing duplicates and suppressing them.

Kind regards,

Jeroen van Meeuwen

-- Systems Architect, Kolab Systems AG e: vanmeeuwen at kolabsys.com m: +41 79 951 9003 w: https://kolabsystems.com pgp: 9342 BF08
RE: What would it take for FastMail to run murder
On 2015-03-13 23:54, Dave McMurtrie wrote: From my phone, so excuse brevity and top-posting, but Fastmail running murder would be a huge bonus. I not-so-fondly recall the intimate relationship I developed with gdb debugging murder issues when we upgraded from 2.3 to 2.4 :)

You won't have to for 2.5 (as much) because we're running it at supported customer sites, and I'm to blame for the alleged fixes ;-)

Kind regards,

Jeroen van Meeuwen
Re: What would it take for FastMail to run murder
14.03.2015 01:50, Bron Gondwana wrote: So I've been doing a lot of thinking about Cyrus clustering, with the underlying question being what would it take to make FastMail run a murder. We've written a fair bit about our infrastructure - we use nginx as a frontend proxy to direct traffic to backend servers, and have no interdependencies between the backends, so that we can scale indefinitely. With murder as it exists now, we would be pushing the limits of the system already - particularly with the globally distributed datacentres.

Btw (as you speak about clusters), I developed a proof-of-concept for a cyrus-imapd cluster a long time ago, using pacemaker as the cluster resource manager. A lot has happened in Linux clustering since then, including remote-node support in pacemaker, so that concept could be reworked to be even more robust and scalable. The only thing I did not like at the time is that cyrus replication was a bit weak at detecting changes after a rolling multi-node failure (node1 goes down, node2 takes over the replica, node2 goes down, node1 comes back up, and the changes made on node2 while node1 was down are lost). Please drop me a note (or just post here, as I'm a long-time silent reader) if you're interested in making cyrus-imapd rock-solid from the ha-clustering perspective and need some guidance; I'll share more details.

Best, Vladislav
Re: What would it take for FastMail to run murder
For sure :) Just having testing infrastructure that tests murder would go a long way to avoiding that mess again.

The more I think about it, the more having the SAME mailboxes.db for both local and remote data doesn't make sense. We should have a separate central database that the mupdate_activate, etc. write to. It can just be a standalone SQL database, or a cluster database, or who cares... the main thing is that only a few of the MBOXLIST commands need to care (because they will return the remote information if needed).

Bron.

On Sat, Mar 14, 2015, at 09:54 AM, Dave McMurtrie wrote: From my phone, so excuse brevity and top-posting, but Fastmail running murder would be a huge bonus. I not-so-fondly recall the intimate relationship I developed with gdb debugging murder issues when we upgraded from 2.3 to 2.4 :)

-- Bron Gondwana br...@fastmail.fm
RE: What would it take for FastMail to run murder
From my phone, so excuse brevity and top-posting, but Fastmail running murder would be a huge bonus. I not-so-fondly recall the intimate relationship I developed with gdb debugging murder issues when we upgraded from 2.3 to 2.4 :) Sent via the Samsung GALAXY S® 5, an ATT 4G LTE smartphone Original message From: Bron Gondwana br...@fastmail.fm Date:03/13/2015 6:50 PM (GMT-05:00) To: Cyrus Devel cyrus-devel@lists.andrew.cmu.edu Cc: Subject: What would it take for FastMail to run murder So I've been doing a lot of thinking about Cyrus clustering, with the underlying question being what would it take to make FastMail run a murder. We've written a fair bit about our infrastructure - we use nginx as a frontend proxy to direct traffic to backend servers, and have no interdependencies between the backends, so that we can scale indefinitely. With murder as it exists now, we would be pushing the limits of the system already - particularly with the globally distributed datacentres. Why would FastMail consider running murder, given our existing nice system? a) we support folder sharing within businesses, so at the moment we are limited by the size of a single slot. Some businesses already push that limit. b) it's good to dogfood the server we put so much work into. Here are our deal-breaker requirements: 1) unified murder - we don't want to run both a frontend AND a backend imapd process for every single connection. We already have nginx, which is non-blocking, for the initial connection and auth handling. 2) no table scans - anything that requires a parse and ACL lookup for every single row of mailboxes.db is going to be a non- starter when you multiply the existing mailboxes.db size by hundreds. 3) no single-point-of-failure - having one mupdate master which can stop the entire cluster working if it's offline, no thanks. Thankfully, the state of the art in distributed databases has moved a long way since mupdate was written. 
We'd have to at least change the mupdate protocol anyway to handle newly added fields, so why not just do away with it and have every server run a local node of a distributed database protocol for its mailboxes.db. Along with this, we need a reverse lookup for ACLs, so that any one user doesn't ever need to scan the entire mailboxes.db. This might be hooked into the distributed DB as well, or calculated locally on each node. And that's pretty much it. There are some interesting factors around replication, and I suspect the answer here is to have either multi- value support or embed the backend name into the mailboxes.db key (postfix) such that you wind up listing the same mailbox multiple times. We already suppress duplicates in the LIST command, so all we need then is logic for choosing the actual master. Rob N has done some work with consul and etcd already at FastMail, and we would use either that or a flag in the distributed DB to drive master choice for backend connection purposes. There are a bunch of nice to haves on top of this, but I think this would be enough for us to convert our existing standalone servers over to a murder. Bron. -- Bron Gondwana br...@fastmail.fm
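The multi-value / key-postfix idea can be sketched concretely: store one mailboxes.db entry per backend by postfixing the backend name onto the key, suppress the duplicates at LIST time, and pick the connection target from a master flag kept alongside (driven by consul/etcd in practice). The `!` delimiter and all names here are hypothetical, not the real mailboxes.db format:

```python
class PostfixedMailboxesDB:
    """One entry per (mailbox, backend), keyed 'mailbox!backend', so the
    same mailbox legitimately appears once per replica that holds it."""

    def __init__(self):
        self.db = {}       # "mailbox!backend" -> per-copy metadata
        self.master = {}   # backend -> True if currently the master

    def record(self, mailbox, backend):
        self.db["%s!%s" % (mailbox, backend)] = {"backend": backend}

    def list_mailboxes(self):
        # LIST already suppresses duplicates; here that's a set over
        # the key with its backend postfix stripped.
        return sorted({key.rsplit("!", 1)[0] for key in self.db})

    def connect_target(self, mailbox):
        # Master choice driven by the external flag, not by the records.
        for key, entry in sorted(self.db.items()):
            if key.rsplit("!", 1)[0] == mailbox and self.master.get(entry["backend"]):
                return entry["backend"]
        return None
```

The nice property is that replication never has to fight over a single row: each backend owns its own postfixed key, and only the read paths (LIST, proxy connection) need to reconcile.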