On 30 Oct 2017, at 11.05, Ralf Becker <r...@egroupware.org> wrote:
>
> It happened now twice that replication created folders and mails in the
> wrong mailbox :(
>
> Here's the architecture we use:
> - 2 Dovecot (2.2.32) backends in two different datacenters replicating
>   via a VPN connection
> - Dovecot directors in both datacenters talks to both backends with
>   vhost_count of 100 vs 1 for local vs remote backend
> - backends use proxy dict via a unix domain socket and socat to talk via
>   tcp to a dict on a different server (kubernetes cluster)
> - backends have a local sqlite userdb for iteration (also containing
>   home directories, as just iteration is not possible)
> - serving around 7000 mailboxes in a roughly 200 different domains
>
> Everything works as expected, until dict is not reachable eg. due to a
> server failure or a planed reboot of a node of the kubernetes cluster.
> In that situation it can happen that some requests are not answered,
> even with Kubernetes running multiple instances of the dict.
> I can only speculate what happens then: it seems the connection failure
> to the remote dict is not correctly handled and leads to situation in
> which last mailbox/home directory is used for the replication :(

It sounds to me like a userdb lookup changes the username during a dict failure, although I can't really think of how that could happen. The only thing that comes to my mind is auth_cache, but in that case I'd expect the same problem to happen even when there aren't dict errors.

For testing you could see if it's reproducible with:

 - get a random username
 - do doveadm user <user>
 - verify that the result contains the same input user

Then do that rapidly in a loop and restart your test kubernetes once in a while.
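
The test steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: `doveadm` is on the PATH of the backend, the output of `doveadm user <user>` contains the username verbatim somewhere (for example in the home directory path), and the helper names `lookup_user`, `check_once` and `run_loop` are made up for illustration. If your userdb output does not include the full username, compare a specific field instead of doing a substring match.

```python
import random
import subprocess
import time


def lookup_user(username):
    """Run `doveadm user <username>`; return (exit code, stdout)."""
    result = subprocess.run(
        ["doveadm", "user", username],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def check_once(username, lookup=lookup_user):
    """Classify a single lookup as 'ok', 'mismatch' or 'error'.

    Assumption: a correct reply mentions the input username verbatim.
    """
    returncode, output = lookup(username)
    if returncode != 0:
        # The lookup itself failed (e.g. dict unreachable); that is not a mismatch.
        return "error"
    return "ok" if username in output else "mismatch"


def run_loop(usernames, iterations, lookup=lookup_user, delay=0.0):
    """Look up random users repeatedly; return the users whose reply did not match."""
    mismatches = []
    for _ in range(iterations):
        user = random.choice(usernames)
        if check_once(user, lookup) == "mismatch":
            mismatches.append(user)
        if delay:
            time.sleep(delay)
    return mismatches
```

Calling something like `run_loop(usernames, 100000)` with a list of real usernames while restarting the test kubernetes nodes should show whether a lookup ever comes back with another user's data. Failed lookups are counted separately from mismatches on purpose, since only a successful reply for the wrong user would explain the replication problem.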