Hi Aki,

We just installed 2.3.19 and are seeing a couple of users throwing the "INBOX/dovecot.index reset, view is now inconsistent" error, with their replicator status erroring out as well. I tried a force-resync on the full mailbox, but to no avail so far. Wasn't this bug supposed to be fixed in 2.3.19?
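For the record, the attempt was along these lines (account anonymized):

# doveadm force-resync -u user@example.com '*'
# doveadm replicator status user@example.com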
Thanks,
Cassidy

On Thu, Apr 28, 2022 at 5:02 AM Aki Tuomi <aki.tu...@open-xchange.com> wrote:

> 2.3.19 is just around the corner, so not long. I cannot yet promise an exact date, but hopefully within a week or two.
>
> Aki
>
> > On 28/04/2022 13:57 Paul Kudla (SCOM.CA Internet Services Inc.) <p...@scom.ca> wrote:
> >
> > Thanks for the update.
> >
> > Is this for both replication issues (300+ folders etc.)?
> >
> > Just asking - any ETA?
> >
> > Happy Thursday !!!
> > Thanks - paul
> >
> > Paul Kudla
> >
> > Scom.ca Internet Services <http://www.scom.ca>
> > 004-1009 Byron Street South
> > Whitby, Ontario - Canada
> > L1N 4S3
> >
> > Toronto 416.642.7266
> > Main 1.866.411.7266
> > Fax 1.888.892.7266
> >
> > On 4/27/2022 9:01 AM, Aki Tuomi wrote:
> > >
> > > Hi!
> > >
> > > This is probably going to get fixed in 2.3.19; this looks like an issue we are already fixing.
> > >
> > > Aki
> > >
> > >> On 26/04/2022 16:38 Paul Kudla (SCOM.CA Internet Services Inc.) <p...@scom.ca> wrote:
> > >>
> > >> Agreed. There seems to be no way of posting these kinds of issues to see if they are even being addressed, or even known about, moving forward on new updates.
> > >>
> > >> I read somewhere there is a new branch coming out, but nothing as of yet?
> > >>
> > >> 2.4 maybe ....
> > >> 5.0 ........
> > >>
> > >> My previous replication issues (back in February) went unanswered.
> > >>
> > >> Not faulting anyone, but the developers do seem to be disconnected from issues as of late? Or concentrating on other issues.
> > >>
> > >> I have no problem with support contracts for day-to-day maintenance; however, as a programmer myself, they usually don't work, as the other end relies on the latest source code anyways and thus cannot help.
> > >>
> > >> I am trying to take apart the replicator C code based on 2.3.18, as most of it does work to some extent.
> > >>
> > >> tcps just does not work (i.e. it hits the 600-second default timeout in the C code).
> > >>
> > >> My thought is that tcp works OK, but the replicator fails in dsync-client.c when asked to return the folder list?
> > >>
> > >> replicator-brain.c seems to control the overall process and timing.
> > >>
> > >> replicator-queue.c seems to handle the queue file, which does seem to carry accurate info.
> > >>
> > >> Things in the source code are documented enough to figure this out, but I am still going through all the related .h files documentation-wise, which are all over the place.
> > >>
> > >> There is no clear documentation on the .h lib files, so I have to walk through the tree one at a time finding the relevant code.
> > >>
> > >> Since the dsync from doveadm does seem to work OK, I have to assume the dsync-client used to compile the replicator is at fault somehow, or a call from it upstream?
> > >>
> > >> Thanks for your input on the other issues noted below; I will keep that in mind when disassembling the source code.
> > >>
> > >> No sense in fixing one thing and leaving something else behind; it's probably all related anyways.
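> > >>
> > >> For reference, the manual dsync run that does work for me is along these lines (user anonymized; it is the same command my recovery script further down wraps):
> > >>
> > >> /usr/local/bin/doveadm -D sync -u user@example.com -d -N -l 30 -U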
> > >>
> > >> I have two test servers available, so I can play with all this offline to reproduce the issues.
> > >>
> > >> Unfortunately I have to make a living first; this will be addressed when possible, as I don't like live systems running this way, and I currently only have 5 accounts with this issue (mine included).
> > >>
> > >> Happy Tuesday !!!
> > >> Thanks - paul
> > >>
> > >> Paul Kudla
> > >>
> > >> Scom.ca Internet Services <http://www.scom.ca>
> > >> 004-1009 Byron Street South
> > >> Whitby, Ontario - Canada
> > >> L1N 4S3
> > >>
> > >> Toronto 416.642.7266
> > >> Main 1.866.411.7266
> > >> Fax 1.888.892.7266
> > >>
> > >> On 4/26/2022 9:03 AM, Reuben Farrelly wrote:
> > >>>
> > >>> I ran into this back in February and documented a reproducible test case (and sent it to this list). In short - I was able to reproduce this by having a valid and consistent mailbox on the source/local, creating a very standard empty Maildir/(new|cur|tmp) folder on the remote replica, and then initiating the replication from the source. This consistently caused dsync to fail replication with the error "dovecot.index reset, view is now inconsistent" and the sync aborted, leaving the replica mailbox in a screwed-up, inconsistent state. Client connections on the source replica were also dropped when this error occurred. You can see the error by enabling debug-level logging if you initiate dsync manually on a test mailbox.
> > >>>
> > >>> The only workaround I found was to remove the remote Maildir and let Dovecot create the whole thing from scratch. Dovecot did not like any existing folders on the destination replica, even if they had the same names as on the source and were completely empty. I was able to reproduce this with the bare minimum of folders - just an INBOX!
> > >>>
> > >>> I have no idea if any of the developers saw my post or if the bug has been fixed for the next release. But it seems to have been quite a common problem over time (I saw a few posts from people going back a long way with the same problem) and it is seriously disruptive to clients. The error message is not helpful in tracking down the problem either.
> > >>>
> > >>> Secondly, I also have an ongoing and longstanding problem using tcps: for replication. For some reason, using tcps: (with no other changes at all to the config) results in a lot of timeout messages: "Error: dsync I/O has stalled, no activity for 600 seconds". This goes away if I revert back to tcp: instead of tcps: - with tcp: I very rarely get timeouts. No idea why; I guess this is a bug of some sort also.
> > >>>
> > >>> It's disappointing that there appears to be no way to have these sorts of problems addressed like there once was. I am not using Dovecot for commercial purposes, so paying a fortune for a support contract for a high-end installation just isn't going to happen, and this list seems to be quite ordinary for getting support and reporting bugs nowadays....
> > >>>
> > >>> Reuben
> > >>>
> > >>> On 26/04/2022 7:21 pm, Paul Kudla (SCOM.CA Internet Services Inc.) wrote:
> > >>>>
> > >>>> Side issue: if you are getting inconsistent dsyncs, there is no real way to fix this in the long run.
> > >>>>
> > >>>> I know it's a pain (already had to myself).
> > >>>>
> > >>>> I needed to do a full sync, take one server offline, delete the user dir (with Dovecot offline), and then rsync (or somehow duplicate the main server's user data) over to the remote again.
> > >>>>
> > >>>> Then bring the remote back up, and it kind of worked.
> > >>>>
> > >>>> The best suggestion is to bring the main server down at night so the copy is clean?
> > >>>>
> > >>>> If using Postfix, you can enable the soft_bounce option and the mail will spool until everything comes back online (needs to be enabled on both servers).
> > >>>>
> > >>>> Replication was still an issue on accounts with 300+ folders in them; still working on a fix for that.
> > >>>>
> > >>>> Happy Tuesday !!!
> > >>>> Thanks - paul
> > >>>>
> > >>>> Paul Kudla
> > >>>>
> > >>>> Scom.ca Internet Services <http://www.scom.ca>
> > >>>> 004-1009 Byron Street South
> > >>>> Whitby, Ontario - Canada
> > >>>> L1N 4S3
> > >>>>
> > >>>> Toronto 416.642.7266
> > >>>> Main 1.866.411.7266
> > >>>> Fax 1.888.892.7266
> > >>>>
> > >>>> On 4/25/2022 10:01 AM, Arnaud Abélard wrote:
> > >>>>> Ah, I'm now getting errors in the logs, which would explain the increasing number of failed sync requests:
> > >>>>>
> > >>>>> dovecot: imap(xxxxx)<2961235><Bs6w43rdQPAqAcsFiXEmAInUhhA3Rfqh>: Error: Mailbox INBOX: /vmail/l/i/xxxxx/dovecot.index reset, view is now inconsistent
> > >>>>>
> > >>>>> And sure enough:
> > >>>>>
> > >>>>> # doveadm replicator status xxxxx
> > >>>>>
> > >>>>> xxxxx none 00:02:54 07:11:28 - y
> > >>>>>
> > >>>>> What could explain that error?
> > >>>>>
> > >>>>> Arnaud
> > >>>>>
> > >>>>> On 25/04/2022 15:13, Arnaud Abélard wrote:
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> On my side we are running Linux (Debian Buster).
> > >>>>>>
> > >>>>>> I'm not sure my problem is actually the same as Paul's or yours, Sebastian, since I have a lot of boxes, but they are actually small (quota of 110MB), so I doubt any of them have more than a dozen IMAP folders.
> > >>>>>>
> > >>>>>> The main symptom is that I have tons of full sync requests awaiting, but even though no other sync is pending, the replicator just waits for something to trigger those syncs.
> > >>>>>>
> > >>>>>> Today, with users back, I can see that normal and incremental syncs are being done on the 15 connections, with an occasional full sync here or there and lots of "Waiting 'failed' requests":
> > >>>>>>
> > >>>>>> Queued 'sync' requests              0
> > >>>>>> Queued 'high' requests              0
> > >>>>>> Queued 'low' requests               0
> > >>>>>> Queued 'failed' requests            122
> > >>>>>> Queued 'full resync' requests       28785
> > >>>>>> Waiting 'failed' requests           4294
> > >>>>>> Total number of known users         42512
> > >>>>>>
> > >>>>>> So, why didn't the replicator take advantage of the weekend to replicate the mailboxes while no users were using them?
> > >>>>>>
> > >>>>>> Arnaud
> > >>>>>>
> > >>>>>> On 25/04/2022 13:54, Sebastian Marske wrote:
> > >>>>>>> Hi there,
> > >>>>>>>
> > >>>>>>> Thanks for your insights and for diving deeper into this, Paul!
> > >>>>>>>
> > >>>>>>> For me, the users ending up in 'Waiting for dsync to finish' all have more than 256 IMAP folders as well (ranging from 288 up to >5500, as per 'doveadm mailbox list -u <username> | wc -l'). For more details on my setup, please see my post from February [1].
> > >>>>>>>
> > >>>>>>> @Arnaud: What OS are you running on?
> > >>>>>>>
> > >>>>>>> Best
> > >>>>>>> Sebastian
> > >>>>>>>
> > >>>>>>> [1] https://dovecot.org/pipermail/dovecot/2022-February/124168.html
> > >>>>>>>
> > >>>>>>> On 4/24/22 19:36, Paul Kudla (SCOM.CA Internet Services Inc.) wrote:
> > >>>>>>>>
> > >>>>>>>> Question - I am having similar replication issues.
> > >>>>>>>>
> > >>>>>>>> Please read everything below and advise the folder counts on the non-replicated users?
> > >>>>>>>>
> > >>>>>>>> I find the total number of folders per account seems to be a factor, and NOT the size of the mailbox.
> > >>>>>>>>
> > >>>>>>>> I.e. I have customers with 40G of email over 40 or so folders with no problem, and it works OK.
> > >>>>>>>>
> > >>>>>>>> 300+ folders seems to be the issue.
> > >>>>>>>>
> > >>>>>>>> I have been going through the replication code.
> > >>>>>>>>
> > >>>>>>>> No errors are being logged.
> > >>>>>>>>
> > >>>>>>>> I am assuming that the replicator --> dsync-client --> other server path is timing out or not reading the folder lists correctly (i.e. dies after X folders read).
> > >>>>>>>>
> > >>>>>>>> Thus I am going through the code, patching in log entries etc. to find the issues.
> > >>>>>>>>
> > >>>>>>>> See:
> > >>>>>>>>
> > >>>>>>>> [13:33:57] mail18.scom.ca [root:0] /usr/local/var/lib/dovecot
> > >>>>>>>> # ll
> > >>>>>>>> total 86
> > >>>>>>>> drwxr-xr-x 2 root wheel uarch   4B Apr 24 11:11 .
> > >>>>>>>> drwxr-xr-x 4 root wheel uarch   4B Mar  8  2021 ..
> > >>>>>>>> -rw-r--r-- 1 root wheel uarch  73B Apr 24 11:11 instances
> > >>>>>>>> -rw-r--r-- 1 root wheel uarch 160K Apr 24 13:33 replicator.db
> > >>>>>>>>
> > >>>>>>>> [13:33:58] mail18.scom.ca [root:0] /usr/local/var/lib/dovecot
> > >>>>>>>> #
> > >>>>>>>>
> > >>>>>>>> replicator.db seems to get updated OK but is never processed properly.
> > >>>>>>>>
> > >>>>>>>> # sync.users
> > >>>>>>>> n...@elirpa.com        high 00:09:41 463:47:01 -         y
> > >>>>>>>> ke...@elirpa.com       high 00:09:23 463:45:43 -         y
> > >>>>>>>> p...@scom.ca           high 00:09:41 463:46:51 -         y
> > >>>>>>>> e...@scom.ca           high 00:09:43 463:47:01 -         y
> > >>>>>>>> ed.ha...@dssmgmt.com   high 00:09:42 463:46:58 -         y
> > >>>>>>>> p...@paulkudla.net     high 00:09:44 463:47:03 580:35:07 y
> > >>>>>>>>
> > >>>>>>>> So ....
> > >>>>>>>>
> > >>>>>>>> Two things:
> > >>>>>>>>
> > >>>>>>>> First, to get the production stuff to work, I had to write a script that would find the bad syncs and then force a dsync between the servers.
> > >>>>>>>>
> > >>>>>>>> I run this every ten minutes on each server, in crontab:
> > >>>>>>>>
> > >>>>>>>> */10 * * * * root /usr/bin/nohup /programs/common/sync.recover > /dev/null
> > >>>>>>>>
> > >>>>>>>> Python script to sort things out:
> > >>>>>>>>
> > >>>>>>>> # cat /programs/common/sync.recover
> > >>>>>>>> #!/usr/local/bin/python3
> > >>>>>>>>
> > >>>>>>>> # Force sync between servers that are reporting bad?
> > >>>>>>>>
> > >>>>>>>> import os, sys, django, socket
> > >>>>>>>> import subprocess   # needed for the non-detail branch below
> > >>>>>>>> from optparse import OptionParser
> > >>>>>>>>
> > >>>>>>>> from lib import *   # local helpers, including commands() and sendmail()
> > >>>>>>>>
> > >>>>>>>> # Sample re-index of a mailbox:
> > >>>>>>>> # doveadm -D force-resync -u p...@scom.ca -f INBOX*
> > >>>>>>>>
> > >>>>>>>> USAGE_TEXT = '''\
> > >>>>>>>> usage: %%prog %s[options]
> > >>>>>>>> '''
> > >>>>>>>>
> > >>>>>>>> parser = OptionParser(usage=USAGE_TEXT % '', version='0.4')
> > >>>>>>>>
> > >>>>>>>> parser.add_option("-m", "--send_to", dest="send_to", help="Send Email To")
> > >>>>>>>> parser.add_option("-e", "--email", dest="email_box", help="Box to Index")
> > >>>>>>>> parser.add_option("-d", "--detail", action='store_true', dest="detail", default=False, help="Detailed report")
> > >>>>>>>> parser.add_option("-i", "--index", action='store_true', dest="index", default=False, help="Index")
> > >>>>>>>>
> > >>>>>>>> options, args = parser.parse_args()
> > >>>>>>>>
> > >>>>>>>> print (options.email_box)
> > >>>>>>>> print (options.send_to)
> > >>>>>>>> print (options.detail)
> > >>>>>>>>
> > >>>>>>>> #sys.exit()
> > >>>>>>>>
> > >>>>>>>> print ('Getting Current User Sync Status')
> > >>>>>>>> command = commands("/usr/local/bin/doveadm replicator status '*'")
> > >>>>>>>>
> > >>>>>>>> sync_user_status = command.output.split('\n')
> > >>>>>>>>
> > >>>>>>>> synced = []
> > >>>>>>>>
> > >>>>>>>> # skip the header line, then walk each user's status line
> > >>>>>>>> for n in range(1, len(sync_user_status)):
> > >>>>>>>>     user = sync_user_status[n]
> > >>>>>>>>     print ('Processing User : %s' % user.split(' ')[0])
> > >>>>>>>>     if user.split(' ')[0] != options.email_box:
> > >>>>>>>>         if options.email_box != None:
> > >>>>>>>>             continue
> > >>>>>>>>
> > >>>>>>>>     if options.index == True:
> > >>>>>>>>         command = '/usr/local/bin/doveadm -D force-resync -u %s -f INBOX*' % user.split(' ')[0]
> > >>>>>>>>         command = commands(command)
> > >>>>>>>>         command = command.output
> > >>>>>>>>
> > >>>>>>>>     # scan the status line backwards: '-' means in sync, 'y' flags a failed/full-resync user
> > >>>>>>>>     for nn in range(len(user) - 1, 0, -1):
> > >>>>>>>>         if user[nn] == '-':
> > >>>>>>>>             # in sync - skip this user
> > >>>>>>>>             break
> > >>>>>>>>
> > >>>>>>>>         if user[nn] == 'y':  # found a bad mailbox
> > >>>>>>>>             print ('syncing ... %s' % user.split(' ')[0])
> > >>>>>>>>
> > >>>>>>>>             if options.detail == True:
> > >>>>>>>>                 command = '/usr/local/bin/doveadm -D sync -u %s -d -N -l 30 -U' % user.split(' ')[0]
> > >>>>>>>>                 print (command)
> > >>>>>>>>                 command = commands(command)
> > >>>>>>>>                 command = command.output.split('\n')
> > >>>>>>>>                 print (command)
> > >>>>>>>>                 print ('Processed Mailbox for ... %s' % user.split(' ')[0])
> > >>>>>>>>                 synced.append('Processed Mailbox for ... %s' % user.split(' ')[0])
> > >>>>>>>>                 for nnn in range(len(command)):
> > >>>>>>>>                     synced.append(command[nnn] + '\n')
> > >>>>>>>>                 break
> > >>>>>>>>
> > >>>>>>>>             if options.detail == False:
> > >>>>>>>>                 # fire and forget; don't wait for the dsync to finish
> > >>>>>>>>                 command = subprocess.Popen(
> > >>>>>>>>                     ["/usr/local/bin/doveadm sync -u %s -d -N -l 30 -U" % user.split(' ')[0]],
> > >>>>>>>>                     shell=True, stdin=None, stdout=None, stderr=None, close_fds=True)
> > >>>>>>>>
> > >>>>>>>>                 print ('Processed Mailbox for ... %s' % user.split(' ')[0])
> > >>>>>>>>                 synced.append('Processed Mailbox for ... %s' % user.split(' ')[0])
> > >>>>>>>>                 break
> > >>>>>>>>
> > >>>>>>>> if len(synced) != 0:
> > >>>>>>>>     # send email showing the bad synced boxes?
> > >>>>>>>>     if options.send_to != None:
> > >>>>>>>>         send_from = 'moni...@scom.ca'
> > >>>>>>>>         send_to = ['%s' % options.send_to]
> > >>>>>>>>         send_subject = 'Dovecot Bad Sync Report for : %s' % (socket.gethostname())
> > >>>>>>>>         send_text = '\n\n'
> > >>>>>>>>         for n in range(len(synced)):
> > >>>>>>>>             send_text = send_text + synced[n] + '\n'
> > >>>>>>>>
> > >>>>>>>>         send_files = []
> > >>>>>>>>         sendmail(send_from, send_to, send_subject, send_text, send_files)
> > >>>>>>>>
> > >>>>>>>> sys.exit()
> > >>>>>>>>
> > >>>>>>>> Second:
> > >>>>>>>>
> > >>>>>>>> I posted this a month ago - no response.
> > >>>>>>>>
> > >>>>>>>> Please appreciate that I am trying to help ....
> > >>>>>>>>
> > >>>>>>>> After much testing I can now reproduce the replication issues at hand.
> > >>>>>>>>
> > >>>>>>>> I am running on FreeBSD 12 & 13 stable (both test and production servers), sdram drives etc ...
> > >>>>>>>>
> > >>>>>>>> Basically, replication works fine until reaching a folder quantity of ~256 or more.
> > >>>>>>>>
> > >>>>>>>> To reproduce, using doveadm I created folders like:
> > >>>>>>>>
> > >>>>>>>> INBOX/folder-0
> > >>>>>>>> INBOX/folder-1
> > >>>>>>>> INBOX/folder-2
> > >>>>>>>> INBOX/folder-3
> > >>>>>>>> and so forth ......
> > >>>>>>>>
> > >>>>>>>> I created 200 folders and they replicated OK on both servers.
> > >>>>>>>>
> > >>>>>>>> I created another 200 (400 total), and the replicator got stuck and would not update the mbox on the alternate server anymore - and it is still "updating" 4 days later?
> > >>>>>>>>
> > >>>>>>>> Basically the replicator goes so far and either hangs, or more likely bails on an error that is not reported even to the debug logging?
> > >>>>>>>>
> > >>>>>>>> However, dsync will sync the two servers, but only when run manually (i.e. all the folders will sync).
> > >>>>>>>>
> > >>>>>>>> I have two test servers available if you need any kind of access - again, here to help.
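> > >>>>>>>>
> > >>>>>>>> (For clarity: sync.status below, like sync.users above, is just my local wrapper around '/usr/local/bin/doveadm replicator status'.)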
> > >>>>>>>>
> > >>>>>>>> [07:28:42] mail18.scom.ca [root:0] ~
> > >>>>>>>> # sync.status
> > >>>>>>>> Queued 'sync' requests          0
> > >>>>>>>> Queued 'high' requests          6
> > >>>>>>>> Queued 'low' requests           0
> > >>>>>>>> Queued 'failed' requests        0
> > >>>>>>>> Queued 'full resync' requests   0
> > >>>>>>>> Waiting 'failed' requests       0
> > >>>>>>>> Total number of known users     255
> > >>>>>>>>
> > >>>>>>>> username               type         status
> > >>>>>>>> p...@scom.ca           normal       Waiting for dsync to finish
> > >>>>>>>> ke...@elirpa.com       incremental  Waiting for dsync to finish
> > >>>>>>>> ed.ha...@dssmgmt.com   incremental  Waiting for dsync to finish
> > >>>>>>>> e...@scom.ca           incremental  Waiting for dsync to finish
> > >>>>>>>> n...@elirpa.com        incremental  Waiting for dsync to finish
> > >>>>>>>> p...@paulkudla.net     incremental  Waiting for dsync to finish
> > >>>>>>>>
> > >>>>>>>> I have been going through the C code, and it seems the replication gets requested OK.
> > >>>>>>>>
> > >>>>>>>> replicator.db does get updated OK with the replication request for the mbox in question.
> > >>>>>>>>
> > >>>>>>>> However, I am still looking for the actual replicator function in the libs that makes the actual replication requests.
> > >>>>>>>>
> > >>>>>>>> The number of folders & subfolders is definitely the issue - not the mbox physical size, as thought originally.
> > >>>>>>>>
> > >>>>>>>> If someone can point me in the right direction: it seems either the replicator is not picking up on the number of folders to replicate properly, or it has a hard-set limit like 256 / 512 / 65535 etc. and stops the replication request thereafter.
> > >>>>>>>>
> > >>>>>>>> I am mainly a machine-code programmer from the 80's and have concentrated on Python as of late; 'c' I am just starting to go through - just to give you a background on my talents.
> > >>>>>>>>
> > >>>>>>>> It took 2 months to figure this out.
> > >>>>>>>>
> > >>>>>>>> This issue also seems to be indirectly causing the duplicate message suppression not to work as well.
> > >>>>>>>>
> > >>>>>>>> Python program to reproduce the issue (the loops are from the last run, which started @ 200 - fyi):
> > >>>>>>>>
> > >>>>>>>> # cat mbox.gen
> > >>>>>>>> #!/usr/local/bin/python2
> > >>>>>>>>
> > >>>>>>>> import os, sys
> > >>>>>>>> import commands   # python2 stdlib, provides commands.getoutput()
> > >>>>>>>>
> > >>>>>>>> from lib import *
> > >>>>>>>>
> > >>>>>>>> user = 'p...@paulkudla.net'
> > >>>>>>>>
> > >>>>>>>> """
> > >>>>>>>> # first run: create the top-level folders
> > >>>>>>>> for count in range (0,600) :
> > >>>>>>>>     box = 'INBOX/folder-%s' %count
> > >>>>>>>>     print count
> > >>>>>>>>     command = '/usr/local/bin/doveadm mailbox create -s -u %s %s' %(user,box)
> > >>>>>>>>     print command
> > >>>>>>>>     a = commands.getoutput(command)
> > >>>>>>>>     print a
> > >>>>>>>> """
> > >>>>>>>>
> > >>>>>>>> # second run: create subfolders under folder-0
> > >>>>>>>> for count in range (0,600) :
> > >>>>>>>>     box = 'INBOX/folder-0/sub-%s' %count
> > >>>>>>>>     print count
> > >>>>>>>>     command = '/usr/local/bin/doveadm mailbox create -s -u %s %s' %(user,box)
> > >>>>>>>>     print command
> > >>>>>>>>     a = commands.getoutput(command)
> > >>>>>>>>     print a
> > >>>>>>>>
> > >>>>>>>> #sys.exit()
> > >>>>>>>>
> > >>>>>>>> Happy Sunday !!!
> > >>>>>>>> Thanks - paul
> > >>>>>>>>
> > >>>>>>>> Paul Kudla
> > >>>>>>>>
> > >>>>>>>> Scom.ca Internet Services <http://www.scom.ca>
> > >>>>>>>> 004-1009 Byron Street South
> > >>>>>>>> Whitby, Ontario - Canada
> > >>>>>>>> L1N 4S3
> > >>>>>>>>
> > >>>>>>>> Toronto 416.642.7266
> > >>>>>>>> Main 1.866.411.7266
> > >>>>>>>> Fax 1.888.892.7266
> > >>>>>>>>
> > >>>>>>>> On 4/24/2022 10:22 AM, Arnaud Abélard wrote:
> > >>>>>>>>> Hello,
> > >>>>>>>>>
> > >>>>>>>>> I am working on replicating a server (and adding compression on the other side), and since I had "Error: dsync I/O has stalled, no activity for 600 seconds (version not received)" errors, I upgraded both the source and destination servers to the latest 2.3 version (2.3.18). While before the upgrade all 15 replication connections were busy, after upgrading, "doveadm replicator dsync-status" shows that most of the time nothing is being replicated at all. I can see some brief replications that last, but 99.9% of the time nothing is happening at all.
> > >>>>>>>>>
> > >>>>>>>>> I have a replication_full_sync_interval of 12 hours, but I have thousands of users with their last full sync over 90 hours ago.
> > >>>>>>>>>
> > >>>>>>>>> "doveadm replicator status" also shows that I have over 35,000 queued full resync requests, but no sync, high or low queued requests, so why aren't the full requests occurring?
> > >>>>>>>>>
> > >>>>>>>>> There are no errors in the logs.
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>>
> > >>>>>>>>> Arnaud