about the replication
Hi,

This past week I finally got around to writing some scripts to check whether a mailbox on the master is the same as on the replica.

My back-ends started out on 2.3.7, and I upgraded to 2.3.10 some time ago.

I ran my script last night on one of the back-ends. This one has 20928 mailboxes, and 1067 are not in sync. Of those, 897 users didn't have all their mailboxes replicated to the replica. The others had the same folder count, but one or more messages were not the same (or weren't there).

To check whether the folder count was the same, I just counted the folders on the master and the replica. To see whether the messages were the same, I used the GUID of each mail message.

So, if you are using replication, you should really have a look at it, if you haven't started already :)

Also, I would really like to see the GUID commands taken up upstream! Thanks to fastmail.fm for hacking that into Cyrus and releasing their patches.

Rudy

--
Rudy Gevaert          [EMAIL PROTECTED]
tel:+32 9 264 4734
Directie ICT, afd. Infrastructuur / ICT Department, Infrastructure office
Groep Systemen / Systems group
Universiteit Gent / Ghent University
Krijgslaan 281, gebouw S9, 9000 Gent, Belgie
www.UGent.be
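The comparison described above might be sketched roughly as follows. This is not Rudy's script, which is not shown in the thread; it assumes the folder names and per-message GUIDs have already been collected from each server into plain dictionaries (how they are fetched is left out on purpose). The function name `compare_mailboxes` and the report keys are made up for illustration. It compares folder names rather than only the folder count, which is slightly stricter than the check described.

```python
from typing import Dict, List, Set


def compare_mailboxes(
    master: Dict[str, Set[str]],
    replica: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Compare one user's folders on master and replica.

    Both arguments map a folder name to the set of message GUIDs in it.
    Returns an empty dict when the two sides agree.
    """
    report: Dict[str, List[str]] = {}

    # Folders that exist on only one side.
    missing = sorted(set(master) - set(replica))
    extra = sorted(set(replica) - set(master))
    if missing:
        report["missing_folders"] = missing
    if extra:
        report["extra_folders"] = extra

    # Folders present on both sides whose GUID sets differ.
    differing = sorted(
        name for name in set(master) & set(replica)
        if master[name] != replica[name]
    )
    if differing:
        report["guid_mismatch"] = differing

    return report
```

An empty result means the user looks in sync by this measure; anything else lists which folders to look at.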
Re: about the replication
> I ran my script last night on one of the back-ends. This one has 20928
> mailboxes. And 1067 are not in sync. From those, 897 users didn't have
> all their mailboxes replicated to the replica. The others had the same
> folder count, but one or more messages were not the same (or weren't
> there).

We also have a script which does this for all users every week. It checks for each user that the following match:

1. Folder list
2. Folder subscriptions
3. Quota (total + used)
4. Sieve scripts
5. For every folder, status output for (messages uidnext unseen recent uidvalidity)
6. For every message in every folder, flags + message sizes + GUID

All of those are reasonably quick since they access only meta-data. We also regularly check, for a random sample of messages, that rfc822.sha1 matches on both sides.

The script also waits a few seconds and retries if there are any problems (replication is asynchronous, so the two sides may briefly differ).

If there's a mismatch, the script can be set to:

1) Run a sync_client -u
2) Run a reconstruct on both sides
3) Completely delete the replica mailbox and run a sync_client -u to build a complete new copy from scratch

One annoying thing we've found recently is that sync_client -u doesn't fix quota problems; it seems the protocol doesn't include an absolute "set replica quota to this" action. We'd been meaning to look into that but hadn't got to it.

David: Don't suppose you could check that out? It would be nice if sync_client -u could really fix every user problem. It's very close to that right now; quota mismatches seem to be about the only thing it can't fix.

FYI, unfortunately at the moment this script is fairly closely tied to our system (it uses a bunch of other modules to work out the master/replica servers for a user). It would probably be nice to factor that stuff out so others could use it...

Rob
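The wait-and-retry behaviour and the three repair actions could be wired together along these lines. This is only a sketch, not the FastMail script: the check and the repair actions are passed in as callables, so no command-line flags or server APIs are assumed, and the names `check_and_repair`, `in_sync` and `repairs` are hypothetical. The thread says the script "can be set to" one of the actions; trying them in order, as done here, is an assumption.

```python
import time
from typing import Callable, Sequence


def check_and_repair(
    in_sync: Callable[[], bool],
    repairs: Sequence[Callable[[], None]],
    retries: int = 3,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Check a user, tolerate replication lag, then escalate repairs."""

    def settled() -> bool:
        # Replication is asynchronous, so a mismatch may just be lag:
        # re-check a few times before treating it as real.
        for attempt in range(retries):
            if in_sync():
                return True
            if attempt < retries - 1:
                sleep(delay)
        return False

    if settled():
        return "in sync"

    # Try each repair in turn, mildest first, and re-check after each one.
    for step, repair in enumerate(repairs, start=1):
        repair()
        if settled():
            return f"fixed by repair step {step}"

    return "still out of sync"
```

In practice `repairs` would hold wrappers for the three actions listed above, in that order, and `in_sync` would run the six comparisons for one user.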
Re: about the replication
Rob Mueller wrote:

>> I ran my script last night on one of the back-ends. This one has 20928
>> mailboxes. And 1067 are not in sync. From those, 897 users didn't have
>> all their mailboxes replicated to the replica. The others had the same
>> folder count, but one or more messages were not the same (or weren't
>> there).
>
> We also have a script which does this for all users every week. It
> checks for each user that the following match:
>
> 1. Folder list
> 2. Folder subscriptions
> 3. Quota (total + used)
> 4. Sieve scripts
> 5. For every folder, status output for (messages uidnext unseen recent
>    uidvalidity)
> 6. For every message in every folder, flags + message sizes + GUID

I'm moving to a mix of those. Once you have the framework, adding an extra check is only a little more work.

> FYI, unfortunately at the moment this script is fairly closely tied to
> our system (it uses a bunch of other modules to work out master/replica
> servers for a user). It probably would be nice to factor that stuff out
> so others could use it...

Mine is fairly generic. When it's more robust I'll put it online somewhere.

Rudy
Re: about the replication
On Thu, 24 Jan 2008, Rob Mueller wrote:

> David: Don't suppose you could check that out? It would be nice if
> sync_client -u could really fix every user problem, it's very close to
> that right now, quota mismatches seems to be about the only thing it
> can't fix.

In my experience the master is wrong as often as the replica.

--
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.
Re: about the replication
>> David: Don't suppose you could check that out? It would be nice if
>> sync_client -u could really fix every user problem, it's very close to
>> that right now, quota mismatches seems to be about the only thing it
>> can't fix.
>
> In my experience the master is wrong as often as the replica.

Probably true, but still, doing a sync_client -u should always bring everything about a user on the replica in sync with the master, shouldn't it? It does everything else, except quotas...

Rob
Re: about the replication
On Thu, 24 Jan 2008, Rob Mueller wrote:

>> In my experience the master is wrong as often as the replica.
>
> Probably true, but still doing a sync_client -u should always bring
> everything about a user on the replica in sync with the master shouldn't
> it? It does everything else, except quotas...

I run "quota" on master and replica every once in a while and compare.

That's an easy way to find and fix the problem cases, regardless of whether the master or replica was at fault. There are typically only about half a dozen cases each time, so I don't think that it's a huge problem.

--
David Carter
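The compare step could look something like the sketch below. It assumes the output of "quota" on each side has already been parsed into a dictionary of quota root to (limit, used); the parsing is left out because the output format is not shown in this thread. The name `quota_mismatches` is made up for illustration.

```python
from typing import Dict, List, Tuple

Quota = Tuple[int, int]  # (limit, used), in whatever unit both sides report


def quota_mismatches(
    master: Dict[str, Quota],
    replica: Dict[str, Quota],
) -> List[str]:
    """Return the quota roots whose limit or usage differs between sides.

    A root that exists on only one side also counts as a mismatch.
    """
    roots = set(master) | set(replica)
    return sorted(r for r in roots if master.get(r) != replica.get(r))
```

The result does not say which side is wrong, which fits the observation that the master is at fault as often as the replica; each listed root still needs a look.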
Re: about the replication
> I run "quota" on master and replica every once in a while and compare.
>
> That's an easy way to find and fix the problem cases, regardless of
> whether the master or replica was at fault. There are typically only
> about half a dozen cases each time, so I don't think that it's a huge
> problem.

I agree, it's not a huge problem, and there are other ways to fix it.

However, coming back to my main point: shouldn't sync_client -u be guaranteed to bring a replica in sync with the master for a given user for everything (folders, subscriptions, flags, seen state, sieve scripts, quotas, etc.)? Currently it seems to do everything but quotas, which is annoyingly inconsistent!

Rob
Re: about the replication
Any idea what is breaking the sync?

On Jan 24, 2008 10:41 AM, Rob Mueller <[EMAIL PROTECTED]> wrote:
>
> > I run "quota" on master and replica every once in a while and compare.
> >
> > That's an easy way to find and fix the problem cases, regardless of
> > whether the master or replica was at fault. There are typically only about
> > half a dozen cases each time, so I don't think that it's a huge problem.
>
> I agree, it's not a huge problem, and there are other ways to fix it.
>
> However coming back to my main point, shouldn't sync_client -u be guaranteed
> to make a replica be in sync with the master for a given user for
> everything; folders, subscriptions, flags, seen state, sieve script, quotas,
> etc. Currently it seems to do everything but quotas, which is annoyingly
> inconsistent!
>
> Rob
>

--
Alain Spineux
aspineux gmail com
May the sources be with you