Re: cyrus replication validation
If I understand this patch correctly, it doesn't solve the larger problem that I'm interested in: is the data on my replica the same as the data on my primary, or more to the point, are the two data sets converging?

This patch *would* allow me to validate that the cyrus.* meta files are more or less the same on both sides, and it takes care of the sync_server stale staging file issue -- that is, whether the cyrus.* meta files match the other files. But I'm really interested in something that can run out of band from csync, imap, etc., and that examines files on the primary and replica to know what the variance is.

I think make_md5 is pretty ideal for what I'm after, as a source of data. We're working on scripts that compare the data files.

:wes

On 06 Apr 2007, at 22:31, Rob Mueller wrote:

>> The provided Cyrus tool make_md5 is for validating replication. It would, for instance, have found the recently discussed bug in sync_server that caused random files to be overwritten in the event that sync_server reused a stale staging file.
>>
>> It would probably be cool if there were documentation somewhere that advised people on how to run it and how to use it to validate replication.
>
> We have a patch that helps with this as well; see "MD5 UUIDs" here:
>
> http://cyrus.brong.fastmail.fm/
>
> Basically it does two things:
>
> 1. You can make the UUIDs of all messages the first 11 bytes of the MD5 of the message
> 2. You can fetch a computed MD5 of any message on disk via IMAP
>
> Using the second, you can do complete validation via IMAP: just iterate through all folders and all messages, get the computed MD5 and compare on both sides.
>
> The UUID bit is just designed to help replication when messages are moved between folders. Rather than having to resend the entire message on a move, it can just link them from one folder to the other at the replication end.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
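The out-of-band comparison asked for above could be sketched roughly as follows. This is not make_md5 and does not reproduce its output; it simply hashes the spool files directly. The function names `md5_manifest` and `compare_manifests` are hypothetical, and the sketch assumes that the metadata files are the ones whose names start with `cyrus.` and that a manifest is built on each host and then brought to one place for comparison.

```python
import hashlib
import os


def md5_manifest(spool_root):
    """Map each message file (path relative to spool_root) to its MD5 hex digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(spool_root):
        for name in files:
            if name.startswith("cyrus."):
                continue  # assumption: cyrus.* files are metadata, not message data
            path = os.path.join(dirpath, name)
            digest = hashlib.md5()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large messages are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, spool_root)] = digest.hexdigest()
    return manifest


def compare_manifests(primary, replica):
    """Report the variance between two manifests built by md5_manifest()."""
    shared = set(primary) & set(replica)
    return {
        "missing_on_replica": sorted(set(primary) - set(replica)),
        "extra_on_replica": sorted(set(replica) - set(primary)),
        "content_differs": sorted(p for p in shared if primary[p] != replica[p]),
    }
```

With rolling replication, changes still in flight will show up in the report, so running it repeatedly and watching whether the lists shrink is one way to judge whether the two sides are converging.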
Re: cyrus replication validation
Hi

> If I understand this patch correctly, it doesn't solve the larger problem that I'm interested in: is the data on my replica the same as the data on my primary, or more to the point, are the two data sets converging?
> ...
> But I'm really interested in something that can run out of band from csync, imap, etc, that examines files on the primary and replica to know what the variance

As mentioned, there are two parts to the patch. There's the UUID part, which helps with the replication, but there's also this bit:

> 2. You can fetch a computed MD5 of any message on disk via IMAP
>
> Using the second, you can do complete validation via IMAP, just iterate through all folders and all messages, get the computed MD5 and compare on both sides.

We wanted the same thing you did: some way to guarantee that the message data on both sides was exactly the same. One way of doing that was to use something that runs under the covers to check the messages on disk, which is fine. The other was to add something to the IMAP protocol which lets us do the same thing via IMAP.

We went with the second, because we already had code that, given a username, would check their master server and replica server to see that:

1. The folder list matched
2. For each folder, message count + unread count + uidvalidity + uidnext matched (i.e. the STATUS results)
3. For each folder, the UID listing matched
4. For each folder, the flags on each message (by UID) matched

These were all easy to get via IMAP on both sides and compare. However, they were all meta-data related, and didn't help check that the actual email spool data on disk was correct. That is why the above patch adds two FETCH items to the IMAP protocol:

FILE.MD5 and FILE.SIZE

With these, we can now compare each file on each side of the master/replica set to see that they match. This means we can check pretty much all meta data + spool data on both sides for consistency, all via IMAP connections, without having to do any more peeking under the hood.

Of course, actually having the patch in there is pretty heavily peeking under the hood, but it was easier for us to do that because we already had a script which did steps 1-4, so adding a hack to the IMAP protocol was easier for us than creating a whole new system. Whether this is easier or harder at your site is up to you.

Rob
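Checks 1-4 above could be sketched along these lines with Python's standard `imaplib`. This is not the script described in the message; the function names, the snapshot layout, and the decision to ignore `\Recent` (which is per-session) are all assumptions made for illustration. `conn` is expected to be an already logged-in `imaplib.IMAP4` connection, and `folders` is the folder list obtained separately on each side.

```python
import re

STATUS_ITEMS = "(MESSAGES UNSEEN UIDVALIDITY UIDNEXT)"


def parse_status(line):
    """Turn b'"INBOX" (MESSAGES 3 UNSEEN 1 ...)' into {'MESSAGES': 3, ...}."""
    inner = re.search(rb"\(([^()]*)\)\s*$", line).group(1).split()
    return {inner[i].decode(): int(inner[i + 1]) for i in range(0, len(inner), 2)}


def parse_flags(lines):
    """Map UID -> sorted flag tuple from 'UID FETCH ... (FLAGS)' response lines."""
    out = {}
    for line in lines:
        if not isinstance(line, bytes):
            continue  # imaplib returns [None] when nothing matched
        uid = re.search(rb"UID (\d+)", line)
        flags = re.search(rb"FLAGS \(([^)]*)\)", line)
        if uid and flags:
            names = flags.group(1).decode().split()
            # Assumption: \Recent is session state, so it is left out.
            kept = sorted(f for f in names if f.lower() != "\\recent")
            out[int(uid.group(1))] = tuple(kept)
    return out


def snapshot(conn, folders):
    """Collect STATUS results and per-UID flags for each folder on one server."""
    snap = {}
    for name in folders:
        _, status = conn.status(name, STATUS_ITEMS)
        conn.select(name, readonly=True)
        _, fetched = conn.uid("FETCH", "1:*", "(FLAGS)")
        snap[name] = {"status": parse_status(status[0]),
                      "flags": parse_flags(fetched)}
    return snap


def diff_snapshots(master, replica):
    """List mismatches in folder list, STATUS, UID listing and flags."""
    problems = []
    for name in sorted(set(master) | set(replica)):
        if name not in master or name not in replica:
            problems.append(f"{name}: folder missing on one side")
            continue
        m, r = master[name], replica[name]
        if m["status"] != r["status"]:
            problems.append(f"{name}: status differs")
        if set(m["flags"]) != set(r["flags"]):
            problems.append(f"{name}: UID listing differs")
            continue
        for uid in sorted(m["flags"]):
            if m["flags"][uid] != r["flags"][uid]:
                problems.append(f"{name}: flags differ on UID {uid}")
    return problems
```

On a server carrying the patch, the same per-folder loop could presumably also request FILE.MD5 and FILE.SIZE in the fetch items. The thread does not show what those responses look like, so parsing them is left out of this sketch.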
Re: cyrus replication validation
On Fri, Apr 06, 2007 at 05:52:28PM -0400, John Capo wrote:

>> On both servers:
>>
>>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server1.lst
>>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server2.lst
>>
>> and
>>
>>     diff -u server1.lst server2.lst
>
> Quick mailboxes.db check.
>
>     ctl_mboxlist -d | md5    on server1
>     ctl_mboxlist -d | md5    on server2
>
> Both hashes should be identical. Or diff the ctl_mboxlist -d outputs.

Please correct me if I'm wrong. It's just a check of the mailbox lists, not of the message numbers.

WBR.
Dmitriy
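Comparing hashes of the two dumps only says whether the mailbox lists differ, not where. A minimal sketch of the "diff the outputs" variant follows, assuming each line of `ctl_mboxlist -d` output starts with the mailbox name followed by a tab. The function names are hypothetical, and the two dump texts are assumed to have been saved to files and read in beforehand.

```python
def parse_dump(text):
    """Map mailbox name -> remainder of its dump line."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumption: the mailbox name is the first tab-separated field.
        name, _, rest = line.partition("\t")
        entries[name] = rest
    return entries


def diff_dumps(server1_text, server2_text):
    """Report mailboxes present on one side only, or with differing entries."""
    one, two = parse_dump(server1_text), parse_dump(server2_text)
    report = []
    for name in sorted(set(one) | set(two)):
        if name not in two:
            report.append(f"only on server1: {name}")
        elif name not in one:
            report.append(f"only on server2: {name}")
        elif one[name] != two[name]:
            report.append(f"differs: {name}")
    return report
```

As noted in the message, this still covers only the mailbox list, not the messages inside each mailbox.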
Re: cyrus replication validation
Quoting Dmitriy Kirhlarov ([EMAIL PROTECTED]):

> On Fri, Apr 06, 2007 at 05:52:28PM -0400, John Capo wrote:
>
>>> On both servers:
>>>
>>>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server1.lst
>>>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server2.lst
>>>
>>> and
>>>
>>>     diff -u server1.lst server2.lst
>>
>> Quick mailboxes.db check.
>>
>>     ctl_mboxlist -d | md5    on server1
>>     ctl_mboxlist -d | md5    on server2
>>
>> Both hashes should be identical. Or diff the ctl_mboxlist -d outputs.
>
> Please correct me if I'm wrong. It's just a check of the mailbox lists, not of the message numbers.

Correct.

> WBR.
> Dmitriy
Re: cyrus replication validation
On Thu, Apr 05, 2007 at 12:10:14PM -0400, Ilya Vishnyakov wrote:

> Hello Cyrus Gurus!
>
> I was wondering if there is any specific way to check if the replication was done properly? I set up cyrus replication between two servers (documentation I used: http://cyrusimap.web.cmu.edu/imapd/install-replication.html). However, before switching our production servers we would like to make sure that replication was done properly. We checked if the directories are

On both servers:

    find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server1.lst
    find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server2.lst

and

    diff -u server1.lst server2.lst

WBR.
Dmitriy
Re: cyrus replication validation
Hmm, it shows equal sizes for both files. Thank you.

Dmitriy Kirhlarov wrote:

> On Thu, Apr 05, 2007 at 12:10:14PM -0400, Ilya Vishnyakov wrote:
>
>> Hello Cyrus Gurus!
>>
>> I was wondering if there is any specific way to check if the replication was done properly? I set up cyrus replication between two servers (documentation I used: http://cyrusimap.web.cmu.edu/imapd/install-replication.html). However, before switching our production servers we would like to make sure that replication was done properly. We checked if the directories are
>
> On both servers:
>
>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server1.lst
>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server2.lst
>
> and
>
>     diff -u server1.lst server2.lst
>
> WBR.
> Dmitriy
Re: cyrus replication validation
Quoting Dmitriy Kirhlarov ([EMAIL PROTECTED]):

> On Thu, Apr 05, 2007 at 12:10:14PM -0400, Ilya Vishnyakov wrote:
>
>> Hello Cyrus Gurus!
>>
>> I was wondering if there is any specific way to check if the replication was done properly? I set up cyrus replication between two servers (documentation I used: http://cyrusimap.web.cmu.edu/imapd/install-replication.html). However, before switching our production servers we would like to make sure that replication was done properly. We checked if the directories are
>
> On both servers:
>
>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server1.lst
>     find imap/ -type f | awk '!/(cache|index|header)/ {print}' | sort > server2.lst
>
> and
>
>     diff -u server1.lst server2.lst

Quick mailboxes.db check.

    ctl_mboxlist -d | md5    on server1
    ctl_mboxlist -d | md5    on server2

Both hashes should be identical. Or diff the ctl_mboxlist -d outputs.

You should check the subscriptions on the replica too. I don't know of a simple way for you to verify the subscriptions other than software that fetches and compares each user's subscriptions.

Subscription replication is the only replication problem I am seeing these days, and I haven't had time to look into it.

Well, that's not completely true. I have seen some cases where the bits controlling the POP3 UIDL format differ on the replicas. If all mailboxes were created fairly recently, for some value of recent, or you have no POP3 users, you should not have a problem.

I have mailboxes that were originally created with early 1.X, and lots of POP3 users. The UIDL format has changed over the years, and we have yet another UIDL format that attempts to get around the Outlook problem. The jury is still out on that. The UIDL format differences are only a problem if mail is left on the server.

John Capo
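The "fetch and compare each user's subscriptions" approach mentioned above might look something like this with Python's `imaplib`. It is only a sketch: the function names are hypothetical, each `conn` is assumed to be an `imaplib.IMAP4` connection already logged in as the user in question, and the raw LSUB response lines are compared as-is, which is crude because any attribute difference in a line counts as a mismatch.

```python
def subscriptions(conn):
    """Return the LSUB response lines for one logged-in connection as a set."""
    typ, lines = conn.lsub()
    if typ != "OK":
        raise RuntimeError("LSUB failed")
    result = set()
    for line in lines:
        if line is None:
            continue  # no subscriptions at all
        if isinstance(line, tuple):
            line = b" ".join(line)  # name was sent as an IMAP literal
        result.add(line)
    return result


def diff_subscriptions(master, replica):
    """Return (missing_on_replica, extra_on_replica) as sorted lists."""
    return sorted(master - replica), sorted(replica - master)
```

Running `diff_subscriptions(subscriptions(master_conn), subscriptions(replica_conn))` per user and logging any non-empty result would be one way to drive it.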
Re: cyrus replication validation
On 06 Apr 2007, at 17:52, John Capo wrote:

> Quick mailboxes.db check.
>
>     ctl_mboxlist -d | md5    on server1
>     ctl_mboxlist -d | md5    on server2
>
> Both hashes should be identical. Or diff the ctl_mboxlist -d outputs.

The provided Cyrus tool make_md5 is for validating replication. It would, for instance, have found the recently discussed bug in sync_server that caused random files to be overwritten in the event that sync_server reused a stale staging file.

It would probably be cool if there were documentation somewhere that advised people on how to run it and how to use it to validate replication.

:wes
Re: cyrus replication validation
> The provided Cyrus tool make_md5 is for validating replication. It would, for instance, have found the recently discussed bug in sync_server that caused random files to be overwritten in the event that sync_server reused a stale staging file.
>
> It would probably be cool if there were documentation somewhere that advised people on how to run it and how to use it to validate replication.

We have a patch that helps with this as well; see "MD5 UUIDs" here:

http://cyrus.brong.fastmail.fm/

Basically it does two things:

1. You can make the UUIDs of all messages the first 11 bytes of the MD5 of the message
2. You can fetch a computed MD5 of any message on disk via IMAP

Using the second, you can do complete validation via IMAP: just iterate through all folders and all messages, get the computed MD5 and compare on both sides.

The UUID bit is just designed to help replication when messages are moved between folders. Rather than having to resend the entire message on a move, it can just link them from one folder to the other at the replication end.

Rob
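To illustrate the UUID idea, here is a toy sketch of deriving an identifier from the first 11 bytes of a message's MD5 and using it to satisfy a move without resending the body. Only the "first 11 bytes of the MD5" rule comes from the message above. The `ReplicaSketch` class and its methods are hypothetical and say nothing about how the patch or sync_server actually stores or links messages.

```python
import hashlib


def md5_uuid(message_bytes):
    """First 11 bytes of the message's MD5 digest."""
    return hashlib.md5(message_bytes).digest()[:11]


class ReplicaSketch:
    """Toy replica that remembers which message content it already holds."""

    def __init__(self):
        self.by_uuid = {}  # uuid -> message bytes already stored
        self.folders = {}  # folder name -> list of uuids

    def deliver(self, folder, message_bytes):
        """Store a full message that was sent over the wire."""
        uuid = md5_uuid(message_bytes)
        self.by_uuid[uuid] = message_bytes
        self.folders.setdefault(folder, []).append(uuid)
        return uuid

    def move(self, uuid, src, dst):
        """Return True if the move was done locally, False if the body is needed."""
        if uuid not in self.by_uuid:
            return False  # unknown content: the master would have to resend it
        self.folders[src].remove(uuid)
        self.folders.setdefault(dst, []).append(uuid)
        return True
```

Because the identifier depends only on the content, the master needs to send just the UUID and the two folder names for a move whenever the replica already holds the message.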