sync_client problems

2017-05-10 Thread Eric Cunningham
Hi list, I'm running cyrus replication on cyrus 2.5.10.  It seems 
whenever a folder is encountered in log-run that doesn't exist on the 
client and/or the server, the replication process logs "Error in 
do_sync(): bailing out! Bad protocol".  Sometimes, it dies completely, 
other times it restarts at the beginning of the log-run file.  But when 
it re-encounters the folder in question, the same failure occurs and the 
problem continuously loops until some manual action is taken.  I have a 
checker script that attempts to restart a stopped sync but I haven't 
tried to code it to get past these types of folder errors.  In the 
meantime, replication effectively stops and the replication log 
continuously builds in size.  You might imagine my horror at discovering 
yesterday that replication of my production server with 1200 accounts 
had been failing for over 7 weeks unnoticed, generating 9.5 million 
lines in the replication log files.  I had gotten out of the habit of 
daily monitoring of replication as it had been running fine and I 
eventually moved on to other things.  I'm trying to get replication to 
run through the enormous backlog but am running into more similar folder 
issues that cause the process to fail, I make manual fixes, and start 
the process again.  In my experience with this, reconstruction of the 
affected folders on the master host and copy host has no effect.


Can this otherwise excellent feature be made more robust to work past 
these folder issues?


Here are a couple of examples I've observed thus far of how the process 
is failing for me.


--

In this example, the subfolder "Fly America Foreign Travel" no longer 
existed on the master host but did exist on the copy host.  To resolve 
this, I had to manually delete that folder from the copy host.


May  9 15:17:48 imap1 sync_client[34614]: inefficient replication (8 > 
5) user.jtaft.PO.Fly America Foreign Travel
May  9 15:17:48 imap1 sync_client[34614]: FULLMAILBOX received NO 
response: IMAP_MAILBOX_NONEXISTENT No Such Mailbox
May  9 15:17:48 imap1 sync_client[34614]: do_folders(): update failed: 
user.jtaft.PO.Fly America Foreign Travel 'Mailbox does not exist'

May  9 15:17:48 imap1 sync_client[34614]: IOERROR: Mailbox does not exist
May  9 15:17:48 imap1 sync_client[34614]: Error in do_sync(): bailing 
out! Mailbox does not exist
May  9 15:17:48 imap1 sync_client[34614]: Processing sync log file 
/var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Mailbox does not exist
May  9 15:17:48 imap1 sync_client[34614]: Reprocessing sync log file 
/var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run


--

In this example, the subfolder "Sohn" did not exist on the master host 
nor on the copy host.  To resolve, I had to edit log-run and completely 
remove all entries for this "Sohn" subfolder.


May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length response 
to MAILBOXES (end of file reached)
May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length response 
to MAILBOXES (end of file reached)
May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length response 
to UNMAILBOX (end of file reached)
May 10 10:13:26 imap1 sync_client[47011]: folder_delete(): failed: 
user.gg-chair.Trash.Sohn 'Bad protocol'
May 10 10:13:26 imap1 sync_client[47011]: Error in do_sync(): bailing 
out! Bad protocol
May 10 10:13:26 imap1 sync_client[47011]: Processing sync log file 
/var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol


--

Often, errors are logged without "bailing out!" and replication 
continues to run:


May 10 10:45:00 imap1 sync_client[87438]: sync_mailboxes: doing 1000
May 10 10:45:01 imap1 sync_client[87438]: sync_mailboxes: doing 1000
May 10 10:45:24 imap1 sync_client[87438]: IOERROR: GUID mismatch 
/var/spool/cyrus/mail/c/user/cdm-lit/Sent/148.
May 10 10:45:37 imap1 sync_client[87438]: IOERROR: zero length response 
to MAILBOX (end of file reached)
May 10 10:45:37 imap1 sync_client[87438]: do_folders(): update failed: 
user.cdm-lit.Sent 'Bad protocol'

May 10 10:45:37 imap1 sync_client[87438]: sync_mailboxes: doing 1000

--

Thank you, list!

-Eric


Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus


Re: sync_client problems

2017-05-10 Thread Bron Gondwana
Looks like you have a corrupted mailboxes database - if you run a reconstruct 
at each end then it should be fine.  Replication won't bail for an 
actually-doesn't-exist mailbox, but it can't fix broken mailboxes, that's what 
reconstruct is for.

Bron.


On Thu, 11 May 2017, at 01:01, Eric Cunningham wrote:
> Hi list, I'm running cyrus replication on cyrus 2.5.10.  It seems 
> whenever a folder is encountered in log-run that doesn't exist on the 
> client and/or the server, the replication process logs "Error in 
> do_sync(): bailing out! Bad protocol".  Sometimes, it dies completely, 
> other times it restarts at the beginning of the log-run file.  But when 
> it re-encounters the folder in question, the same failure occurs and the 
> problem continuously loops until some manual action is taken.  I have a 
> checker script that attempts to restart a stopped sync but I haven't 
> tried to code it to get past these types of folder errors.  In the 
> meantime, replication effectively stops and the replication log 
> continuously builds in size.  You might imagine my horror at discovering 
> yesterday that replication of my production server with 1200 accounts 
> had been failing for over 7 weeks unnoticed, generating 9.5 million 
> lines in the replication log files.  I had gotten out of the habit of 
> daily monitoring of replication as it had been running fine and I 
> eventually moved on to other things.  I'm trying to get replication to 
> run through the enormous backlog but am running into more similar folder 
> issues that cause the process to fail, I make manual fixes, and start 
> the process again.  In my experience with this, reconstruction of the 
> affected folders on the master host and copy host has no effect.
> 
> Can this otherwise excellent feature be made more robust to work past 
> these folder issues?
> 
> Here are a couple of examples I've observed thus far of how the process 
> is failing for me.
> 
> --
> 
> In this example, the subfolder "Fly America Foreign Travel" no longer 
> existed on the master host but did exist on the copy host.  To resolve 
> this, I had to manually delete that folder from the copy host.
> 
> May  9 15:17:48 imap1 sync_client[34614]: inefficient replication (8 > 
> 5) user.jtaft.PO.Fly America Foreign Travel
> May  9 15:17:48 imap1 sync_client[34614]: FULLMAILBOX received NO 
> response: IMAP_MAILBOX_NONEXISTENT No Such Mailbox
> May  9 15:17:48 imap1 sync_client[34614]: do_folders(): update failed: 
> user.jtaft.PO.Fly America Foreign Travel 'Mailbox does not exist'
> May  9 15:17:48 imap1 sync_client[34614]: IOERROR: Mailbox does not exist
> May  9 15:17:48 imap1 sync_client[34614]: Error in do_sync(): bailing 
> out! Mailbox does not exist
> May  9 15:17:48 imap1 sync_client[34614]: Processing sync log file 
> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Mailbox does not exist
> May  9 15:17:48 imap1 sync_client[34614]: Reprocessing sync log file 
> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run
> 
> --
> 
> In this example, the subfolder "Sohn" did not exist on the master host 
> nor on the copy host.  To resolve, I had to edit log-run and completely 
> remove all entries for this "Sohn" subfolder.
> 
> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length response 
> to MAILBOXES (end of file reached)
> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length response 
> to MAILBOXES (end of file reached)
> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length response 
> to UNMAILBOX (end of file reached)
> May 10 10:13:26 imap1 sync_client[47011]: folder_delete(): failed: 
> user.gg-chair.Trash.Sohn 'Bad protocol'
> May 10 10:13:26 imap1 sync_client[47011]: Error in do_sync(): bailing 
> out! Bad protocol
> May 10 10:13:26 imap1 sync_client[47011]: Processing sync log file 
> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
> 
> --
> 
> Often, errors are logged without "bailing out!" and replication 
> continues to run:
> 
> May 10 10:45:00 imap1 sync_client[87438]: sync_mailboxes: doing 1000
> May 10 10:45:01 imap1 sync_client[87438]: sync_mailboxes: doing 1000
> May 10 10:45:24 imap1 sync_client[87438]: IOERROR: GUID mismatch 
> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148.
> May 10 10:45:37 imap1 sync_client[87438]: IOERROR: zero length response 
> to MAILBOX (end of file reached)
> May 10 10:45:37 imap1 sync_client[87438]: do_folders(): update failed: 
> user.cdm-lit.Sent 'Bad protocol'
> May 10 10:45:37 imap1 sync_client[87438]: sync_mailboxes: doing 1000
> 
> --
> 
> Thank you, list!
> 
> -Eric
> 
> 
> Cyrus Home Page: http://www.cyrusimap.org/
> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
> To Unsubscribe:
> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus


-- 
  Bron Gondwana
  br...@fastmail.fm

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cy