Re: choosing a file system
On Wed, Dec 31, 2008 at 07:47:31AM -0500, Nik Conwell wrote:
> On Dec 30, 2008, at 4:43 PM, Shawn Nock wrote:
>
> [...]
>
> > a scripted rename of mailboxes to balance partition utilization when
> > we add another partition.
>
> Just curious - how do you stop people from accessing their mailboxes
> during the time they are being renamed and moved to another partition?

All access goes via an nginx proxy. We block all new logins in the authentication daemon, then use the proc directory contents to detect currently active connections and terminate them. Once the mailboxes are fully moved, logins are enabled again.

Bron.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: choosing a file system
Nik Conwell wrote:
> On Dec 30, 2008, at 4:43 PM, Shawn Nock wrote:
>
> [...]
>
>> a scripted rename of mailboxes to balance partition utilization when we
>> add another partition.
>
> Just curious - how do you stop people from accessing their mailboxes during
> the time they are being renamed and moved to another partition?

We don't really bother. We run the script overnight (over several nights) to minimize storage utilization and we haven't run into a problem.

I haven't looked at the code in a while, but as I recall the rename operation is fairly atomic. In short: it doesn't take long to move a box.

The worst thing that I could imagine would be a momentary outage for a single user ("Mailbox does not exist" or similar). This sort of error (if it does occur in the wild) would clear almost immediately.

Shawn

--
Shawn Nock (OpenPGP: 0xFF7D08A3)
Unix Systems Group; UITS
University of Arizona
nock at email.arizona.edu
Re: choosing a file system
On Wed, 31 Dec 2008, Adam Tauno Williams wrote:

> On Wed, 2008-12-31 at 11:47 +0100, LALOT Dominique wrote:
>> Thanks to everybody. That was an interesting thread. Nobody seems to
>> use a NetApp appliance, maybe due to NFS architecture problems.
>
> Personally, I'd never use NFS for anything. Over the years I've had way
> too many NFS related problems on other things to ever want to try it
> again.

NFS has some very interesting capabilities and limitations.

It's really bad for multiple processes writing to the same file (the cyrus.* files, for example) and for atomic actions (writing the message files, for example).

There are ways that you can configure it that will work, but unless you already have a big NFS server you are probably much better off using a mechanism that makes the drives look more like local drives (SAN, iSCSI, etc.) or trying one of the cluster filesystems that has different tradeoffs than NFS does.

>> I believe I'll look to ext4, which seems to be available in the latest
>> kernel, and also to Solaris, but there are not enough of us to support
>> another OS.
>
> We've used Cyrus on XFS for almost a years, no problems.
>
> In regards to ext3 I'd pay attention to the vintage of problem reports
> and performance issues; ext3 of several years ago is not the ext3 of
> today, many improvements have been made. "data=writeback" mode can help
> performance quite a bit, as well as enabling "dir_index" if it isn't
> already (did it ever become the default?). The periodic fsck can also
> be disabled via tune2fs. I only point this out since, if you already
> have an ext3 setup, trying the above is painless and might buy
> you something.

It's definitely worth testing different filesystems. I last did a test about two years ago and confirmed XFS as my choice. I have one instance of Cyrus still running on ext3, and as a user I definitely notice the difference in performance.

David Lang
Re: choosing a file system
Ah, the saga of Hans Reiser. That, unfortunately, is the downfall of Reiserfs.

Yes, his company has disappeared, and a "void" has appeared from his lack of presence. However, the Reiserfs4 patch set is current against Linux kernel 2.6.28 (see http://www.kernel.org/pub/linux/kernel/people/edward/reiser4/reiser4-for-2.6/).

Still, I think that http://en.wikipedia.org/wiki/Reiser4 pretty much sums up the future of Reiserfs4.

...

I haven't really run into show-stopping bugs on Reiserfs3 in quite some time (with excellent hardware). Replace it with dodgy hardware, however, and things change.

I haven't looked at btrfs yet with Cyrus; perhaps I'll do that sometime soon.

On Dec 31, 2008, at 6:20 AM, Janne Peltonen wrote:

> On Wed, Dec 31, 2008 at 04:58:57AM -0800, Scott Likens wrote:
>> I would not discount using reiserfs (v3) by any means. It's still by
>> far a better choice for a filesystem with Cyrus than Ext3 or Ext4. I
>> haven't really seen anyone do any tests with Ext4, but I imagine it
>> should be about par for the course for Ext3.
>
> There are /lots/ of (comparative) tests done. The most recent I could
> find with a quick Google is here:
>
> http://www.phoronix.com/scan.php?page=article&item=ext4_benchmarks
>
> The problem with reiserfs is... well. The developers have explicitly
> stated that the development of v3 has come to its end, and there was
> the long argument between Hans Reiser and kernel developers about
> whether v4 could be included in the kernel. When Hans Reiser was
> charged with murder (not the crow or Cyrus variant), his company
> assured that the development (of v4) would continue, but the last time
> I tried to find out anything about the project, it appeared more or
> less dead. Of course, the current reiserfs (v3) is very stable, but if
> you run into any issues, there really isn't a developer you can contact
> (or send patches to, if you figure out the bug).
>
> --Janne
> --
> Janne Peltonen PGP Key ID: 0x9CFAC88B
> Please consider membership of the Hospitality Club
> (http://www.hospitalityclub.org)
Re: choosing a file system
On Wed, 2008-12-31 at 15:46 +0200, Janne Peltonen wrote:
> On Wed, Dec 31, 2008 at 07:38:21AM -0500, Adam Tauno Williams wrote:
> > In regards to ext3 I'd pay attention to the vintage of problem reports
> > and performance issues; ext3 of several years ago is not the ext3 of
> > today, many improvements have been made. "data=writeback" mode can help
> > performance quite a bit, as well as enabling "dir_index" if it isn't
> > already (did it ever become the default?). The periodic fsck can also
> > be disabled via tune2fs. I only point this out since, if you already
> > have an ext3 setup, trying the above is painless and might buy
> > you something.
> I wouldn't call data=writeback painless. I had it on in the testing phase
> of our current Cyrus installation, and if the filesystem had to be
> forcibly unmounted for any reason (yes, there are reasons), the amount of
> corruption in those files that happened to be active during the unmount
> - well, it wasn't a nice sight. And the files weren't recoverable,
> except from backup.
> I never really got the point of the data=writeback mode. Sure, it
> increases throughput, but so does disabling the journal completely, and
> it seems to me the end result as concerns data integrity is exactly the
> same.

The *filesystem* is recoverable as the meta-data is journaled. *Contents* of files may be lost/corrupted. I'm fine with that, since a serious abend usually leaves the data in a questionable state anyway for reasons other than the filesystem; I want something I can safely (and quickly) remount and investigate/restore. It is a trade-off.
Re: Basic question
Jason Voorhees wrote:
> Hi there:
>
> I'm planning to use Cyrus IMAP and OpenLDAP to authenticate users.
> A long time ago I used to configure Cyrus IMAP + Cyrus SASL using
> saslauthd with the pam module. It was something simple.
>
> Then I used to configure Cyrus IMAP + Cyrus SASL using saslauthd with
> the ldap module and /etc/saslauthd.conf without problems. That's fine.
>
> Now I would like to use Cyrus IMAP with OpenLDAP too, but I found that
> there are at least 2 ways:
>
> 1. Use Cyrus SASL with auxprop to authenticate users through LDAP using
> auxprop_plugin: ldapdb, sasl_ldap_servers among other sasl_* directives.
> Right?
>
> 2. The other way is to use ldap_* directives like ldap_uri, ldap_filter
> among others. But I believe that I would need to use the 'pts' module in
> the auth_mech directive, right?
>
> The question is: What are the pts, unix, krb and krb5 modules used for?
> What's the difference between them? Should I use the pts module to make
> Cyrus talk directly to OpenLDAP...? Or should I use Cyrus SASL with the
> auxprop plugin to make the authentication to OpenLDAP?
>
> Is there a place where I can get some clear information about these
> items? Man pages are not too clear :S
>
> Thanks people :)

Jason,

Available documentation that I'm aware of includes:

* /doc/options.html (within the cyrus-sasl source), which documents how to configure the ldapdb auxprop plugin
* /saslauthd/LDAP_SASLAUTHD (within the cyrus-sasl source), which discusses how to configure the ldap saslauthd backend
* /doc/overview.html (within the cyrus-imap source), in the 'Kerberos vs. Unix Authorization' section, which discusses authorization

As I understand it, the ldapdb auxprop plugin is entirely within the realm of Cyrus SASL (authentication), while the auth_mech directive in imapd.conf is specific to Cyrus imapd and only handles authorization.

The auth_mech options (pts, unix, krb and krb5) direct how Cyrus imapd authorizes users to access mailboxes/resources *after* they have been authenticated.

The kerberos options direct imapd to perform some canonicalization of the authenticating user before opening their mailbox - so if a user connects as jsm...@example.com, the kerberos options could canonicalize that to 'jsmith', so that the server can open the 'jsmith' mailbox instead of searching for a 'jsm...@example.com' mailbox.

The unix and pts options should only come into play if you have specified a 'group:staff' style ACL for your mailboxes. They tell the imapd server how to resolve group membership to grant access to the mailbox.

The 'unix' option will perform a unix getgrent call, or something like that, to determine if a user belongs to a group - using nss for instance, which in turn can use the nss-ldap or nss-mysql modules to look up groups. However, that's pretty slow in my experience and you'd need to make sure you're properly optimizing your LDAP database.

The pts route can be used to reference an LDAP server directly to resolve group membership within an LDAP database.

- Dan
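To make the 'unix' style of group resolution concrete, here is a minimal Python sketch of the kind of NSS lookup Dan describes. It is only an illustration and not how imapd implements it; the function name `user_in_group` is hypothetical, and counting the user's primary group as membership is an assumption of this sketch.

```python
import grp
import pwd


def user_in_group(username, groupname):
    """Return True if NSS reports `username` as a member of `groupname`."""
    try:
        group = grp.getgrnam(groupname)
    except KeyError:
        return False  # no such group
    if username in group.gr_mem:
        return True  # listed explicitly as a group member
    try:
        # Assumption: the user's primary group also counts as membership.
        return pwd.getpwnam(username).pw_gid == group.gr_gid
    except KeyError:
        return False  # no such user
```

With nss-ldap or nss-mysql configured, calls like these go through the same NSS layer, which is why every 'group:staff' ACL check can turn into a directory query.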
Re: choosing a file system
On Wed, Dec 31, 2008 at 04:58:57AM -0800, Scott Likens wrote:
> I would not discount using reiserfs (v3) by any means. It's still by far a
> better choice for a filesystem with Cyrus than Ext3 or Ext4. I haven't really
> seen anyone do any tests with Ext4, but I imagine it should be about par for
> the course for Ext3.

There are /lots/ of (comparative) tests done. The most recent I could find with a quick Google is here:

http://www.phoronix.com/scan.php?page=article&item=ext4_benchmarks

The problem with reiserfs is... well. The developers have explicitly stated that the development of v3 has come to its end, and there was the long argument between Hans Reiser and kernel developers about whether v4 could be included in the kernel. When Hans Reiser was charged with murder (not the crow or Cyrus variant), his company assured that the development (of v4) would continue, but the last time I tried to find out anything about the project, it appeared more or less dead. Of course, the current reiserfs (v3) is very stable, but if you run into any issues, there really isn't a developer you can contact (or send patches to, if you figure out the bug).

--Janne
--
Janne Peltonen PGP Key ID: 0x9CFAC88B
Please consider membership of the Hospitality Club (http://www.hospitalityclub.org)
Re: choosing a file system
On Tue, 2008-12-30 at 17:49 +0100, LALOT Dominique wrote:
> Once, there was a bad shutdown corrupting ext3fs and we spent 6 hours
> on an fsck.

Actually I have used reiserfs for over 2 years on cyrus-imapd. It performs great even with a really big number of files in the IMAP spool folders. But I don't know how it will perform on EMC.

4 years ago I tried ext3. It was a disaster. Slow as hell.

Reiser4 was used once too, and it did even better than reiserfs. But after 2 months of stable running it hit a kernel oops because of the FS, and I switched back to reiserfs.

--
Teresa
Re: choosing a file system
On Wed, Dec 31, 2008 at 07:38:21AM -0500, Adam Tauno Williams wrote:
> In regards to ext3 I'd pay attention to the vintage of problem reports
> and performance issues; ext3 of several years ago is not the ext3 of
> today, many improvements have been made. "data=writeback" mode can help
> performance quite a bit, as well as enabling "dir_index" if it isn't
> already (did it ever become the default?). The periodic fsck can also
> be disabled via tune2fs. I only point this out since, if you already
> have an ext3 setup, trying the above is painless and might buy
> you something.

I wouldn't call data=writeback painless. I had it on in the testing phase of our current Cyrus installation, and if the filesystem had to be forcibly unmounted for any reason (yes, there are reasons), the amount of corruption in those files that happened to be active during the unmount - well, it wasn't a nice sight. And the files weren't recoverable, except from backup.

I never really got the point of the data=writeback mode. Sure, it increases throughput, but so does disabling the journal completely, and it seems to me the end result as concerns data integrity is exactly the same.

--Janne
--
Janne Peltonen PGP Key ID: 0x9CFAC88B
Please consider membership of the Hospitality Club (http://www.hospitalityclub.org)
cyrus-sasl pam mysql connections are not getting closed
I am using cyrus-sasl with pam mysql (on CentOS 5). The mysql server is on a remote machine.

After some time I find that there are too many open connections to mysql (using netstat). I restart saslauthd, but these still don't go away.

How do I check what the mysql connections are being used for, and how do I avoid these piling up?

Thanks
Ram
Re: choosing a file system
> -- Nik Conwell is rumored to have mumbled on 31. Dezember 2008
> 07:47:31 -0500 regarding Re: choosing a file system:
>
> > Just curious - how do you stop people from accessing their mailboxes
> > during the time they are being renamed and moved to another partition?

I moved a few thousand mailboxes in a similar fashion (summer of 2007) and encountered no problems. New message deliveries were nicely "frozen" by Cyrus while the target Inbox was being renamed/moved.

Question: would it, stability-wise, make a difference if the mail data and metadata are split, allocating the metadata partitions on SAN-based LUNs and storing messages in NAS (NFS) space? In other words: are the Cyrus-over-NFS inconveniences confined to the cyrus.* files?

Rationale: NAS space can, typically, be "grown" more easily than SAN space. This could be an advantage to older server OSes and filesystems...

Eric Luyten, Brussels Free University Computing Centre
(Cyrus 2.2, 58k users, 2.3 TB)
Re: choosing a file system
-- Nik Conwell is rumored to have mumbled on 31. Dezember 2008 07:47:31 -0500 regarding Re: choosing a file system:

> Just curious - how do you stop people from accessing their mailboxes
> during the time they are being renamed and moved to another partition?

I just do a grep on the username in the proc directory - if there is no process for that user, I figure it's safe enough to move the mailbox. This approach has worked well so far. I experimented with accessing a mailbox while it was being moved and that seemed to be OK as well, i.e. access simply failed while the operation was in progress.

--
Sebastian Hagedorn - RZKR-R1 (Flachbau), Zi. 18, Robert-Koch-Str. 10
Zentrum für angewandte Informatik - Universitätsweiter Service RRZK
Universität zu Köln / Cologne University - Tel. +49-221-478-5587
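The "grep the proc directory" check can be sketched in Python roughly as follows. This is a rough sketch under stated assumptions, not anyone's production script: it assumes the Cyrus proc directory holds one file per server process, named after its PID, whose first line is tab-separated with the user ID in the second field. The directory path is site-specific, and the function names are hypothetical; verify the file format on your own installation before relying on it.

```python
import os


def active_pids_for_user(proc_dir, username):
    """Return PIDs of server processes whose proc file names `username`.

    Assumes one file per process, named after its PID, holding
    tab-separated fields with the user ID in the second field.
    """
    pids = []
    for name in sorted(os.listdir(proc_dir)):
        if not name.isdigit():
            continue  # skip anything that is not a PID-named file
        path = os.path.join(proc_dir, name)
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                fields = fh.readline().rstrip("\n").split("\t")
        except OSError:
            continue  # the process exited between listdir() and open()
        if len(fields) > 1 and fields[1] == username:
            pids.append(int(name))
    return pids


def safe_to_move(proc_dir, username):
    """True when no process currently lists `username`."""
    return not active_pids_for_user(proc_dir, username)
```

Note that this only narrows the window: a user can still log in between the check and the rename unless new logins are blocked first.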
Re: choosing a file system
Hi,

I would not discount using reiserfs (v3) by any means. It's still by far a better choice for a filesystem with Cyrus than Ext3 or Ext4. I haven't really seen anyone do any tests with Ext4, but I imagine it should be about par for the course for Ext3.

As far as NFS goes... NFS isn't itself that bad, it's just that people tend to find ways to use NFS in an incorrect manner that only ends up leading to failure.

Scott

On Dec 31, 2008, at 2:47 AM, LALOT Dominique wrote:

> Thanks to everybody. That was an interesting thread. Nobody seems to
> use a NetApp appliance, maybe due to NFS architecture problems.
>
> I believe I'll look to ext4, which seems to be available in the latest
> kernel, and also to Solaris, but there are not enough of us to support
> another OS.
>
> Dom
>
> And Happy New Year !
>
> 2008/12/31 Bron Gondwana
>> On Tue, Dec 30, 2008 at 02:43:14PM -0700, Shawn Nock wrote:
>>> Bron and the fastmail guys could tell you more about reiserfs... we've
>>> used RH&SuSE/reiserfs/EMC for quite a while and we are very happy.
>>
>> Yeah, sure could :)
>>
>> You can probably find plenty of stuff from me in the archives about our
>> setup - the basic things are:
>>
>> * separate metadata on RAID1 10kRPM (or 15kRPM in the new boxes) drives.
>> * data files on RAID5 big slow drives - data IO isn't a limiting factor
>> * 300GB "slots" with 15GB associated meta drives, like this:
>>
>> /dev/sdb6   14016208    8080360   5935848  58% /mnt/meta6
>> /dev/sdb7   14016208    8064848   5951360  58% /mnt/meta7
>> /dev/sdb8   14016208    8498812   5517396  61% /mnt/meta8
>> /dev/sdd2  292959500  248086796  44872704  85% /mnt/data6
>> /dev/sdd3  292959500  242722420  50237080  83% /mnt/data7
>> /dev/sdd4  292959500  248840432  44119068  85% /mnt/data8
>>
>> as you can see, that balances out pretty nicely. We also store
>> per-user bayes databases on the associated meta drives.
>>
>> We balance our disk usage by moving users between stores when usage
>> reaches 88% on any partition. We get emailed if it goes above 92%
>> and paged if it goes above 95%.
>>
>> Replication. We have multiple "slots" on each server, and since
>> they are all the same size, we have replication pairs spread pretty
>> randomly around the hosts, so the failure of any one drive unit
>> (SCSI attached SATA) or imap server doesn't significantly overload
>> any one other machine. By using Cyrus replication rather than,
>> say, DRBD, a filesystem corruption should only affect a single
>> partition, which won't take so long to fsck.
>>
>> Moving users is easy - we run a sync_server on the Cyrus master, and
>> just create a custom config directory with symlinks into the tree on
>> the real server and a rewritten piece of mailboxes.db so we can
>> rename them during the move if needed. It's all automatic.
>>
>> We also have a "CheckReplication" perl module that can be used to
>> compare two ends to make sure everything is the same. It does full
>> per-message flags checks, random sha1 integrity checks, etc.
>> It does require a custom patch to expose the GUID (as DIGEST.SHA1)
>> via IMAP.
>>
>> I lost an entire drive unit on the 26th. It stopped responding.
>> 8 x 1TB drives in it.
>>
>> I tried rebooting everything, then switched the affected stores over
>> to their replicas. Total downtime for those users was about 15
>> minutes because I tried the reboot first just in case (there's a
>> chance that some messages were delivered and not yet replicated,
>> so it's better not to bring up the replica uncleanly until you're
>> sure there's no other choice).
>>
>> In the end I decided that it wasn't recoverable quickly enough to
>> be viable, so chose new replica pairs for the slots that had been
>> on that drive unit (we keep some empty space on our machines for
>> just this eventuality) and started up another handy little script
>> "sync_all_users" which runs sync_client -u for every user, then
>> starts the rolling sync_client again at the end. It took about
>> 16 hours to bring everything back to fully replicated again.
>>
>> Bron.
>
> --
> Dominique LALOT
> Ingénieur Systèmes et Réseaux
> http://annuaire.univmed.fr/showuser?uid=lalot
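Bron's "sync_all_users" script is not shown in the thread, so the following Python fragment is only a guess at its overall shape: one `sync_client -u <user>` run per user, followed by restarting rolling replication. The function name is hypothetical, the use of `-r` for rolling mode is my reading of the sync_client man page, and a real script would presumably also need to select the right per-slot configuration. The sketch only builds the command lines and does not execute anything.

```python
def sync_all_users_commands(users):
    """Build the command lines for a full per-user resync.

    Returns one `sync_client -u <user>` command per user, followed by a
    final `sync_client -r` to resume rolling replication.
    """
    commands = [["sync_client", "-u", user] for user in users]
    commands.append(["sync_client", "-r"])  # restart rolling replication last
    return commands
```

Each list could be passed to `subprocess.run(cmd, check=True)` in turn; running the users sequentially keeps the I/O load predictable, at the cost of the many hours Bron mentions.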
Re: choosing a file system
On Dec 30, 2008, at 4:43 PM, Shawn Nock wrote:

[...]

> a scripted rename of mailboxes to balance partition utilization when
> we add another partition.

Just curious - how do you stop people from accessing their mailboxes during the time they are being renamed and moved to another partition?

-nik

Information Technology
Systems Programming
Boston University
Re: choosing a file system
On Wed, 2008-12-31 at 11:47 +0100, LALOT Dominique wrote:
> Thanks to everybody. That was an interesting thread. Nobody seems to
> use a NetApp appliance, maybe due to NFS architecture problems.

Personally, I'd never use NFS for anything. Over the years I've had way too many NFS related problems on other things to ever want to try it again.

> I believe I'll look to ext4, which seems to be available in the latest
> kernel, and also to Solaris, but there are not enough of us to support
> another OS.

We've used Cyrus on XFS for almost a years, no problems.

In regards to ext3 I'd pay attention to the vintage of problem reports and performance issues; ext3 of several years ago is not the ext3 of today, many improvements have been made. "data=writeback" mode can help performance quite a bit, as well as enabling "dir_index" if it isn't already (did it ever become the default?). The periodic fsck can also be disabled via tune2fs. I only point this out since, if you already have an ext3 setup, trying the above is painless and might buy you something.
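Before changing anything, it can help to see which journaling mode each ext3 filesystem is actually mounted with. The sketch below parses text in the /proc/mounts format and reports the `data=` mode per ext3 mount point. The function name and the sample data are made up for illustration, and treating a missing `data=` option as "ordered" is an assumption about the ext3 default that should be checked against your kernel. It does not cover dir_index or the tune2fs settings.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= journaling mode.

    Expects lines in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ext3":
            continue  # malformed line or a different filesystem type
        mode = "ordered"  # assumed default when no data= option is listed
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
        modes[fields[1]] = mode
    return modes


# Hypothetical sample in /proc/mounts format, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw,data=ordered 0 0
/dev/sdb1 /var/spool/imap ext3 rw,noatime,data=writeback 0 0
/dev/sdc1 /srv xfs rw 0 0
"""
```

On a Linux host the live table can be checked with `ext3_data_modes(open("/proc/mounts").read())`.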
Re: choosing a file system
Thanks to everybody. That was an interesting thread. Nobody seems to use a NetApp appliance, maybe due to NFS architecture problems.

I believe I'll look to ext4, which seems to be available in the latest kernel, and also to Solaris, but there are not enough of us to support another OS.

Dom

And Happy New Year !

2008/12/31 Bron Gondwana
> On Tue, Dec 30, 2008 at 02:43:14PM -0700, Shawn Nock wrote:
> > Bron and the fastmail guys could tell you more about reiserfs... we've
> > used RH&SuSE/reiserfs/EMC for quite a while and we are very happy.
>
> Yeah, sure could :)
>
> You can probably find plenty of stuff from me in the archives about our
> setup - the basic things are:
>
> * separate metadata on RAID1 10kRPM (or 15kRPM in the new boxes) drives.
> * data files on RAID5 big slow drives - data IO isn't a limiting factor
> * 300GB "slots" with 15GB associated meta drives, like this:
>
> /dev/sdb6   14016208    8080360   5935848  58% /mnt/meta6
> /dev/sdb7   14016208    8064848   5951360  58% /mnt/meta7
> /dev/sdb8   14016208    8498812   5517396  61% /mnt/meta8
> /dev/sdd2  292959500  248086796  44872704  85% /mnt/data6
> /dev/sdd3  292959500  242722420  50237080  83% /mnt/data7
> /dev/sdd4  292959500  248840432  44119068  85% /mnt/data8
>
> as you can see, that balances out pretty nicely. We also store
> per-user bayes databases on the associated meta drives.
>
> We balance our disk usage by moving users between stores when usage
> reaches 88% on any partition. We get emailed if it goes above 92%
> and paged if it goes above 95%.
>
> Replication. We have multiple "slots" on each server, and since
> they are all the same size, we have replication pairs spread pretty
> randomly around the hosts, so the failure of any one drive unit
> (SCSI attached SATA) or imap server doesn't significantly overload
> any one other machine. By using Cyrus replication rather than,
> say, DRBD, a filesystem corruption should only affect a single
> partition, which won't take so long to fsck.
>
> Moving users is easy - we run a sync_server on the Cyrus master, and
> just create a custom config directory with symlinks into the tree on
> the real server and a rewritten piece of mailboxes.db so we can
> rename them during the move if needed. It's all automatic.
>
> We also have a "CheckReplication" perl module that can be used to
> compare two ends to make sure everything is the same. It does full
> per-message flags checks, random sha1 integrity checks, etc.
> It does require a custom patch to expose the GUID (as DIGEST.SHA1)
> via IMAP.
>
> I lost an entire drive unit on the 26th. It stopped responding.
> 8 x 1TB drives in it.
>
> I tried rebooting everything, then switched the affected stores over
> to their replicas. Total downtime for those users was about 15
> minutes because I tried the reboot first just in case (there's a
> chance that some messages were delivered and not yet replicated,
> so it's better not to bring up the replica uncleanly until you're
> sure there's no other choice).
>
> In the end I decided that it wasn't recoverable quickly enough to
> be viable, so chose new replica pairs for the slots that had been
> on that drive unit (we keep some empty space on our machines for
> just this eventuality) and started up another handy little script
> "sync_all_users" which runs sync_client -u for every user, then
> starts the rolling sync_client again at the end. It took about
> 16 hours to bring everything back to fully replicated again.
>
> Bron.

--
Dominique LALOT
Ingénieur Systèmes et Réseaux
http://annuaire.univmed.fr/showuser?uid=lalot
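The thresholds Bron quotes (rebalance at 88%, email above 92%, page above 95%) can be expressed as a small Python sketch. The function names and the returned labels are hypothetical; the percentage is rounded up, which reproduces the Use% figures in the df listing quoted above.

```python
def usage_percent(used_kb, available_kb):
    """Percent of space used, rounded up to a whole number."""
    total = used_kb + available_kb
    return -(-used_kb * 100 // total)  # ceiling division on integers


def action_for(percent):
    """Pick the action for a partition from the thresholds in the thread."""
    if percent > 95:
        return "page"       # goes above 95%: page someone
    if percent > 92:
        return "email"      # goes above 92%: send an email
    if percent >= 88:
        return "rebalance"  # reaches 88%: move users to another store
    return "ok"
```

For example, /mnt/data6 from the listing (248086796 used, 44872704 available) comes out at 85% and needs no action yet, while a partition at 88% would be a candidate for moving users.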