Re: Duplicate mails on pop3 expunge with dsync replication on 2.2.35 (2.2.33.2 works)
Hi Jan, unfortunately no news on this topic after 6 months. I understand this bug might not be a top priority but perhaps someone from the dovecot team can clarify if they will investigate eventually or if we're on our own. Thanks, Gerald > Am 18.09.2018 um 23:20 schrieb Jan Münnich : > > Hi, > > Has anyone any idea how to solve or further debug this issue? It seems indeed > that it was introduced in 2.2.34 and is still there in 2.3.2.1. I found a > couple of posts for this on the mailing list and elsewhere, but no solution: > > When a message is retrieved and immediately expunged, it gets replicated back > from the other dsync node. This usually happens with POP3 but with IMAP as > well, when the MUA fetches the mail and the user opens and reads it > immediately within seconds. It does not seem to happen when the message is > retrieved and only expunged a while after, which is mostly the case with IMAP. > > The bug occurs and is reproducible when the message is delivered to node A > and then fetched by the client from node B. If the message is delivered to > and fetched from the same node, the message does not get duplicated. > > I'm attaching the debug logs from both nodes for a full example transaction. > The message is delivered via lmtp to node A with UID 175261, fetched and > deleted on node B and then appears again with the new UID 175262. > > Thanks, > Jan > > > [...]
Re: Duplicate mails on pop3 expunge with dsync replication on 2.2.35 (2.2.33.2 works)
Hi, Has anyone any idea how to solve or further debug this issue? It seems indeed that it was introduced in 2.2.34 and is still there in 2.3.2.1. I found a couple of posts for this on the mailing list and elsewhere, but no solution: When a message is retrieved and immediately expunged, it gets replicated back from the other dsync node. This usually happens with POP3 but with IMAP as well, when the MUA fetches the mail and the user opens and reads it immediately within seconds. It does not seem to happen when the message is retrieved and only expunged a while after, which is mostly the case with IMAP. The bug occurs and is reproducible when the message is delivered to node A and then fetched by the client from node B. If the message is delivered to and fetched from the same node, the message does not get duplicated. I'm attaching the debug logs from both nodes for a full example transaction. The message is delivered via lmtp to node A with UID 175261, fetched and deleted on node B and then appears again with the new UID 175262. Thanks, Jan Node A: 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Loading modules from directory: /usr/lib/dovecot/modules 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib01_acl_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib10_quota_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib15_notify_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib20_fts_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib20_replication_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib21_fts_solr_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Module loaded: /usr/lib/dovecot/modules/lib90_sieve_plugin.so 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: auth USER input: u...@example.org home=/var/vmail/user/u...@example.org/ uid=2000 gid=2000 quota_rule=*:bytes=10737418240 2018-09-18 23:03:17 lmtp(u...@example.org)<6916>: Debug: Added userdb setting: plugin/quota_rule=*:bytes=10737418240 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Effective uid=2000, gid=2000, home=/var/vmail/user/u...@example.org/ 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota root: name=User quota backend=count args= 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota rule: root=User quota mailbox=* bytes=10737418240 messages=0 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota rule: root=User quota mailbox=Trash bytes=+1073741824 messages=0 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota warning: bytes=10630044057 (99%) messages=0 reverse=no command=quota-warning 100 u...@example.org 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota warning: bytes=10200547328 (95%) messages=0 reverse=no command=quota-warning 95 u...@example.org 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota warning: bytes=9663676416 (90%) messages=0 reverse=no command=quota-warning 90 u...@example.org 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota warning: bytes=8589934592 (80%) messages=0 reverse=no command=quota-warning 80 u...@example.org 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Quota grace: root=User quota bytes=1073741824 (10%) 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Namespace inbox: type=private, prefix=, sep=/, inbox=yes, hidden=no, list=yes, subscriptions=yes location=mdbox:~/mdbox 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: fs: root=/var/vmail/user/u...@example.org//mdbox, index=, indexpvt=, control=, inbox=, alt= 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl: initializing backend with data: vfile 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl: acl username = u...@example.org 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl: owner = 1 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl vfile: Global ACLs disabled 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: Namespace : type=shared, prefix=shared/%u/, sep=/, inbox=no, hidden=no, list=children, subscriptions=no location=mdbox:%h:INDEX=~/shared/%u 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: shared: root=/var/run/dovecot, index=, indexpvt=, control=, inbox=, alt= 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: fts: Indexes disabled for namespace 'shared/%u/' 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl: initializing backend with data: vfile 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl: acl username = u...@example.org 2018-09-18 23:03:17 lmtp(6916, u...@example.org): Debug: acl: owner = 0
Re: Duplicate mails on pop3 expunge with dsync replication on 2.2.35 (2.2.33.2 works)
Hi Tim, > Do you have any new insights on the problem with reappearing mail using > dovecot replication + pop3? > > It's driving me mad. I'm running dovecot 2.2.34 (874deae) on OpenBSD and it > looks like I have the same problem as you have. unfortunately there has been no response, I'm stuck with 2.2.33.2 for the time being. I can only suspect it has something to do with mailbox locking which was introduced in 2.2.34 (int dsync_mailbox_lock) or maybe with some cached values (drop_older_timestamp) as it just happens when the mail is deleted on the node that it has been synced to but not if it's deleted on the node where it has originally been received. Best regards, Gerald diff -Nru dovecot-2.2.33.2/src/doveadm/dsync/dsync-brain-mailbox.c dovecot-2.2.34/src/doveadm/dsync/dsync-brain-mailbox.c --- dovecot-2.2.33.2/src/doveadm/dsync/dsync-brain-mailbox.c2017-10-05 19:10:44.0 +0200 +++ dovecot-2.2.34/src/doveadm/dsync/dsync-brain-mailbox.c 2018-02-28 15:45:34.0 +0100 @@ -522,25 +529,33 @@ } /* mailbox appears to have changed. do a full sync here and get the - state again */ + state again. Lock before syncing. */ + if (dsync_mailbox_lock(brain, box, ) < 0) { + brain->failed = TRUE; + mailbox_free(); + return -1; + } if (mailbox_sync(box, MAILBOX_SYNC_FLAG_FULL_READ) < 0) { i_error("Can't sync mailbox %s: %s", mailbox_get_vname(box), mailbox_get_last_internal_error(box, >mail_error)); ... @@ -599,6 +615,7 @@ static void dsync_cache_fields_update(const struct dsync_mailbox *local_box, const struct dsync_mailbox *remote_box, + struct mailbox *box, struct mailbox_update *update) { ARRAY_TYPE(mailbox_cache_field) local_sorted, remote_sorted, changes; @@ -630,7 +647,8 @@ local_fields = array_get(_sorted, _count); remote_fields = array_get(_sorted, _count); t_array_init(, local_count + remote_count); - drop_older_timestamp = ioloop_time - MAIL_CACHE_FIELD_DROP_SECS; + drop_older_timestamp = ioloop_time - + box->index->optimization_set.cache.unaccessed_field_drop_secs; diff -Nru dovecot-2.2.33.2/src/doveadm/dsync/dsync-mailbox.c dovecot-2.2.34/src/doveadm/dsync/dsync-mailbox.c --- dovecot-2.2.33.2/src/doveadm/dsync/dsync-mailbox.c 2017-06-23 13:18:28.0 +0200 +++ dovecot-2.2.34/src/doveadm/dsync/dsync-mailbox.c2018-02-28 15:45:34.0 +0100 @@ -1,7 +1,9 @@ -/* Copyright (c) 2013-2017 Dovecot authors, see the included COPYING file */ +/* Copyright (c) 2013-2018 Dovecot authors, see the included COPYING file */ #include "lib.h" #include "istream.h" +#include "mail-storage-private.h" +#include "dsync-brain-private.h" #include "dsync-mailbox.h" void dsync_mailbox_attribute_dup(pool_t pool, @@ -20,3 +22,40 @@ dest_r->last_change = src->last_change; dest_r->modseq = src->modseq; } + +int dsync_mailbox_lock(struct dsync_brain *brain, struct mailbox *box, + struct file_lock **lock_r) +{ + const char *path, *error; + int ret; + + /* Make sure the mailbox is open - locking requires it */ + if (mailbox_open(box) < 0) { + i_error("Can't open mailbox %s: %s", mailbox_get_vname(box), + mailbox_get_last_internal_error(box, >mail_error)); + return -1; + } + + ret = mailbox_get_path_to(box, MAILBOX_LIST_PATH_TYPE_INDEX, ); + if (ret < 0) { + i_error("Can't get mailbox %s path: %s", mailbox_get_vname(box), + mailbox_get_last_internal_error(box, >mail_error)); + return -1; + } + if (ret == 0) { + /* No index files - don't do any locking. In theory we still + could, but this lock is mainly meant to prevent replication + problems, and replication wouldn't work without indexes. */ + *lock_r = NULL; + return 0; + } + + if (mailbox_lock_file_create(box, DSYNC_MAILBOX_LOCK_FILENAME, +brain->mailbox_lock_timeout_secs, +lock_r, ) <= 0) { + i_error("Failed to lock mailbox %s for dsyncing: %s", + box->vname, error); + return -1; + } + return 0; +}
Duplicate mails on pop3 expunge with dsync replication on 2.2.35 (2.2.33.2 works)
Hello, consider the following setup with dovecot 2.2.35: smtp/587 (subject: test 1535) | | mx2a.example.com --> dsync/ssh --> mx2b.example.com | pop3 fetch/expunge (uid 23) | !! dsync (copy from INBOX -> uid 24) /| dsync (expunge uid 23) / The pop3 client deletes mail from the server which triggers a copy from INBOX before it is expunged. On the next pop3 fetch you get the copy of the mail you thought had been expunged. This occurs only if mail is received by smtp on mx2a, synced to mx2b via dsync/ssh and then expunged via pop3 on mx2b. It does not occur if mail is received and expunged on mx2b. As a temporary workaround the system has been downgraded to 2.2.33.2. There are no duplicate emails after expunge with this version. 2.2.34 has not been tested. Does anyone know if there were changes in the dsync code from 2.2.33.2 to 2.2.35? Log: (mail received on mx2a.example.com and delivered via dsync to mx2b.example.com, then expunged via pop3 on mx2b.example.com -> copy/duplicate) Mar 26 15:35:57 mx2b.example.com dovecot[3825]: pop3-login: Login: user=, method=PLAIN, rip=91.0.0.1, lip=188.0.0.1, mpid=3922, TLS, TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits) Mar 26 15:35:58 mx2b.example.com dovecot[3825]: pop3(popt...@example.com): expunge: box=INBOX, uid=23, msgid= , size=1210, subject=test 1535 Mar 26 15:35:58 mx2b.example.com dovecot[3825]: pop3(popt...@example.com): Disconnected: Logged out top=0/0, retr=1/1259, del=1/1, size=1242 Mar 26 15:35:58 mx2b.example.com dovecot[3825]: doveadm: Error: dsync-remote(popt...@example.com): Info: copy from INBOX: box=INBOX, uid=24, msgid= , size=1210, subject=test 1535 Mar 26 15:35:58 mx2b.example.com dovecot[3825]: doveadm: Error: dsync-remote(popt...@example.com): Info: expunge: box=INBOX, uid=23, msgid= , size=1210, subject=test 1535 (mail received on mx2b.example.com and expunged via pop3 on mx2b.example.com -> no copy/duplicate) Mar 26 15:36:09 mx2b.example.com dovecot[3825]: pop3-login: Login: user= , method=PLAIN, rip=91.0.0.1, lip=188.0.0.1, mpid=3927, TLS, TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits) Mar 26 15:36:09 mx2b.example.com dovecot[3825]: pop3(popt...@example.com): expunge: box=INBOX, uid=24, msgid= , size=1210, subject=test 1535 Mar 26 15:36:09 mx2b.example.com dovecot[3825]: pop3(popt...@example.com): Disconnected: Logged out top=0/0, retr=1/1259, del=1/1, size=1242 Mar 26 15:36:10 mx2b.example.com dovecot[3825]: doveadm: Error: dsync-remote(popt...@example.com): Info: expunge: box=INBOX, uid=24, msgid= , size=1210, subject=test 1535 Thanks for looking into this Gerald