about the replication

2008-01-23 Thread Rudy Gevaert

Hi,

Over the past week I finally got around to writing some scripts that check 
whether a mailbox on the master is the same as on the replica.


My back-ends started out on 2.3.7, and I upgraded them to 2.3.10 some time ago.

I ran my script last night on one of the back-ends.  This one has 20928 
mailboxes.  And 1067 are not in sync.  Of those, 897 users didn't have 
all their mailboxes replicated to the replica.  The others had the same 
folder count, but one or more messages differed (or were missing).


To check if the folder count was the same I just counted the folders on 
the master and the replica.  To see if the messages were the same, I 
used the GUID of each mail message.
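
A minimal sketch of that comparison, assuming the folder list and the 
per-message GUIDs have already been fetched from each side into plain 
dictionaries.  How they are fetched depends on the site (the GUID commands 
come from the fastmail.fm patches), so `diff_user` and the sample data are 
hypothetical and are not the script described in this message.

```python
def diff_user(master, replica):
    """Compare two {folder_name: set_of_guids} mappings for one user."""
    missing_folders = sorted(set(master) - set(replica))
    extra_folders = sorted(set(replica) - set(master))

    guid_mismatches = {}
    for folder in sorted(set(master) & set(replica)):
        only_master = master[folder] - replica[folder]
        only_replica = replica[folder] - master[folder]
        if only_master or only_replica:
            guid_mismatches[folder] = (sorted(only_master), sorted(only_replica))

    return {
        "folder_count_ok": len(master) == len(replica),
        "missing_folders": missing_folders,   # on master, not on replica
        "extra_folders": extra_folders,       # on replica, not on master
        "guid_mismatches": guid_mismatches,   # folder -> (master-only, replica-only)
    }


if __name__ == "__main__":
    # Made-up data for illustration only.
    master = {"INBOX": {"a1", "b2"}, "INBOX.Sent": {"c3"}}
    replica = {"INBOX": {"a1"}}
    print(diff_user(master, replica))
```

Comparing sets of GUIDs rather than counts catches the case where both 
sides hold the same number of messages but not the same messages.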


So, if you are using replication, you should really check it, 
if you haven't started already :)


Also, I would really like to see the GUID commands taken upstream! 
Thanks to fastmail.fm for hacking that into Cyrus and releasing their 
patches.


Rudy

--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Rudy Gevaert  [EMAIL PROTECTED]  tel:+32 9 264 4734
Directie ICT, afd. Infrastructuur ICT Department, Infrastructure office
Groep Systemen  Systems group
Universiteit Gent Ghent University
Krijgslaan 281, gebouw S9, 9000 Gent, Belgie   www.UGent.be
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --


Re: about the replication

2008-01-23 Thread Rob Mueller



> I ran my script last night on one of the back-ends.  This one has 20928
> mailboxes.  And 1067 are not in sync.  From those, 897 users didn't have 
> all their mailboxes replicated to the replica.  The others had the same 
> folder count, but one or more messages were not the same (or weren't 
> there).


We also have a script which does this for all users every week. It checks 
for each user that the following match:


1. Folder list
2. Folder subscriptions
3. Quota (total + used)
4. Sieve scripts
5. For every folder, status output for (messages uidnext unseen recent 
   uidvalidity)
6. For every message in every folder, flags + message sizes + GUID

All of those are reasonably quick since they access only meta-data. We also 
regularly check, for a random sample of messages, that rfc822.sha1 matches 
on both sides.
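
A rough sketch of such a spot check, under the assumption that the raw 
message bytes from each side are already available in dictionaries keyed by 
some message identifier.  Here the SHA-1 is computed locally with `hashlib`, 
whereas rfc822.sha1 above refers to a value fetched from the server; the 
function names and parameters are made up for illustration.

```python
import hashlib
import random


def sha1_hex(raw_message):
    """Hex SHA-1 digest of the raw message bytes."""
    return hashlib.sha1(raw_message).hexdigest()


def sample_mismatches(master_msgs, replica_msgs, sample_size, seed=None):
    """Check a random sample of master messages against the replica.

    Both arguments map a message id to raw bytes. Returns the sorted ids
    from the sample that are missing on the replica or hash differently.
    """
    rng = random.Random(seed)
    ids = sorted(master_msgs)
    chosen = rng.sample(ids, min(sample_size, len(ids)))

    bad = []
    for msg_id in chosen:
        other = replica_msgs.get(msg_id)
        if other is None or sha1_hex(master_msgs[msg_id]) != sha1_hex(other):
            bad.append(msg_id)
    return sorted(bad)
```

Sampling keeps the expensive part (reading message bodies) small, while the 
meta-data checks still cover every message.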


The script also waits a few seconds and retries if there are any problems 
(replication is asynchronous, so one side may briefly lag behind the 
other).
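
The wait-and-retry idea could look roughly like this.  It is only a sketch: 
`check` stands for any callable that returns a list of problems (empty when 
both sides agree), and the attempt count and delay are arbitrary defaults, 
not the values our script uses.

```python
import time


def check_with_retry(check, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Run check() up to `attempts` times, pausing between tries.

    Returns the problems from the last run; an empty list means the two
    sides matched on some attempt.
    """
    problems = check()
    for _ in range(attempts - 1):
        if not problems:
            break
        sleep(delay_seconds)  # give replication time to catch up
        problems = check()
    return problems
```

Only mismatches that survive every retry are worth reporting, since a 
difference that disappears after a few seconds was just replication lag.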


If there's a mismatch problem, the script can be set to:

1) Run a sync_client -u
2) Run a reconstruct on both sides
3) Completely delete the replica mailbox and run a sync_client -u to build a 
complete new copy from scratch
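
As an illustration of how those three settings might be laid out, here is a 
small sketch that only plans the steps and runs nothing.  The mode names are 
invented, and only the `sync_client -u` invocation is taken from this 
message; the reconstruct and delete steps are left without a command line 
because their exact form is site-specific.

```python
def repair_steps(mode, user):
    """Return the ordered steps for a repair mode.

    Each step is (description, argv). argv is None where the exact command
    depends on the local setup and must be supplied by the caller.
    """
    sync = ("sync_client -u", ["sync_client", "-u", user])
    if mode == "sync":
        return [sync]
    if mode == "reconstruct":
        return [("reconstruct on master", None),
                ("reconstruct on replica", None)]
    if mode == "rebuild":
        return [("delete replica mailbox", None), sync]
    raise ValueError("unknown repair mode: %r" % (mode,))
```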


One annoying thing we've found recently is that sync_client -u doesn't fix 
quota problems. It seems the protocol doesn't include an action that sets 
the replica quota to an absolute value. We'd been meaning to look into that 
but hadn't got to it.


David: Don't suppose you could check that out? It would be nice if 
sync_client -u could really fix every user problem. It's very close to that 
right now; quota mismatches seem to be about the only thing it can't fix.


FYI, unfortunately at the moment this script is fairly closely tied to our 
system (it uses a bunch of other modules to work out master/replica servers 
for a user). It probably would be nice to factor that stuff out so others 
could use it...


Rob



Re: about the replication

2008-01-23 Thread Rudy Gevaert

Rob Mueller wrote:



> > I ran my script last night on one of the back-ends.  This one has 20928
> > mailboxes.  And 1067 are not in sync.  From those, 897 users didn't 
> > have all their mailboxes replicated to the replica.  The others had 
> > the same folder count, but one or more messages were not the same (or 
> > weren't there).
>
> We also have a script which does this for all users every week. It 
> checks for each user that the following match:
>
> 1. Folder list
> 2. Folder subscriptions
> 3. Quota (total + used)
> 4. Sieve scripts
> 5. For every folder, status output for (messages uidnext unseen recent 
>    uidvalidity)
> 6. For every message in every folder, flags + message sizes + GUID


I'm moving to a mix of those.  Once you have the framework, adding an 
extra check is only a little more work.
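
To illustrate what "adding an extra check" can amount to, here is a small 
sketch of a check registry.  It assumes each side has been summarised into a 
dictionary beforehand; the key names (`folders`, `quota`) and the decorator 
are invented for this example and are not taken from either script.

```python
CHECKS = {}


def check(name):
    """Register a comparison function under a human-readable name."""
    def register(func):
        CHECKS[name] = func
        return func
    return register


@check("folder list")
def folders_match(master, replica):
    return set(master["folders"]) == set(replica["folders"])


@check("quota")
def quota_match(master, replica):
    # quota is a (total, used) pair
    return master["quota"] == replica["quota"]


def run_checks(master, replica):
    """Return the names of all registered checks that failed."""
    return [name for name, func in CHECKS.items() if not func(master, replica)]
```

With this layout a new check, say for Sieve scripts, is one more decorated 
function and needs no change to `run_checks`.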


> FYI, unfortunately at the moment this script is fairly closely tied to 
> our system (it uses a bunch of other modules to work out master/replica 
> servers for a user). It probably would be nice to factor that stuff out 
> so others could use it...




Mine is fairly generic; when it's more robust I'll put it online somewhere.

Rudy
