Hi,

Checking in - this is still an issue with 2.3-master as of today (2.3.devel (3a6537d59)).

I haven't been able to narrow the problem down to a specific commit. The best I can say is that this commit is relatively good (not perfect, but good enough):

d9a1a7cbec19f4c6a47add47688351f8c3a0e372 (from Feb 19, 2018)

whereas this commit:

6418419ec282c887b67469dbe3f541fc4873f7f0 (From Mar 12, 2018)

is pretty bad. Some commit in between has made the problem (which may have been introduced earlier) much worse.

There seem to be a handful of us with broken systems who are prepared to assist in debugging this and put in our own time to patch, test and get to the bottom of it, but it is starting to look like we're basically on our own.

What sort of debugging, short of bisecting 100+ patches between the commits above, can we do to progress this?
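
For scale, a plain binary search over that range needs far fewer test builds than there are commits. A minimal Python sketch of the arithmetic, assuming exactly one "first bad" commit among the candidates and a reliable good/bad verdict per build (the function name is made up for illustration; an intermittent replication failure would likely need several test runs per step):

```python
def bisect_steps(candidates: int) -> int:
    """Worst-case number of builds needed to find the first bad commit
    among `candidates` commits when the range is halved each time."""
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    # ceil(log2(candidates)), computed with integer arithmetic only
    return (candidates - 1).bit_length()


for n in (100, 150, 200):
    print(n, "commits ->", bisect_steps(n), "builds")
```

For 100 to 200 commits that works out to 7 or 8 builds, provided each build can be judged good or bad with confidence.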

Reuben



On 7/05/2018 5:54 am, Thore Bödecker wrote:
Hey all,

I've been affected by these replication issues too and finally downgraded
back to 2.2.35 since some newly created virtual domains/mailboxes
weren't replicated *at all* due to the bug(s).

My setup is more like a master-slave, where I only have a rather small
virtual machine as the slave host, which is also only MX 20.
The idea was to replicate all mails through dovecot and perform
individual (independent) backups on each host.

The clients use a CNAME with a low TTL of 60s so in case my "master"
(physical dedicated machine) goes down for a longer period I can simply
switch to the slave.

In order for this concept to work, the replication has to work without
any issue. Otherwise clients might notice missing mails, or it might
even result in conflicts when the master comes back online if the
slave was out of sync beforehand.


On 06.05.18 - 21:34, Michael Grimm wrote:
And please have a look for processes like:
        doveadm-server: [IP4 <user> INBOX import:1/3] (doveadm-server)

These processes will "survive" a dovecot reboot ...

This is indeed the case. Once the replication processes
(doveadm-server) got stuck, I had to resort to `kill -9` to get rid of
them. Something is really wrong there.
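
The leftover processes can be picked out of a process listing by their title. A minimal Python sketch, assuming the title format quoted above (`doveadm-server: [... import:N/M] ...`) and input lines of the form `PID ARGS`, for example from `ps ax -o pid=,args=`. The function name is hypothetical, and a match only means an import is in progress, not that it is stuck:

```python
def find_import_workers(ps_lines):
    """Return (pid, title) pairs for doveadm-server processes whose
    title shows a mailbox import, given lines of the form 'PID ARGS'."""
    matches = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        pid, title = int(parts[0]), parts[1]
        # Assumed title format, taken from the example quoted in this thread
        if title.startswith("doveadm-server:") and "import:" in title:
            matches.append((pid, title))
    return matches


sample = [
    "  101 dovecot/imap",
    "  202 doveadm-server: [IP4 <user> INBOX import:1/3] (doveadm-server)",
]
for pid, title in find_import_workers(sample):
    print(pid, title)
```

Comparing the matching PIDs before and after a dovecot restart would show which processes survived it.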

As stated multiple times in the #dovecot IRC channel, I'm happy to test
any patches for the 2.3 series in my setup and provide further details
if required.

Thanks to everyone participating in this thread; it's good to see these
issues finally getting some attention :)


Cheers,
Thore

