Re: Cyrus replication and failover best practices
On Mon, Aug 09, 2010 at 08:15:36PM +0400, Dmitry Ivanov wrote:
> Hello!
> Folks, looking through the mailing list history I saw that many of you
> are running Cyrus in rolling replication mode. I am interested in
> configuring a Cyrus replica to use as a standby IMAP server, where we
> can switch DNS in case of problems with the primary backend. While
> testing on a playground I ran into some problems and several questions
> came up; maybe you can help me solve them.
>
> 1. Is it safe to leave the "sync_host:" options in imapd.conf and run
> sync_server (due to the record in cyrus.conf) on both master and
> replica, and start only sync_client -r on the master server? Or is it
> better to have different config files for different roles?

Yeah, that's pretty safe. We run sync_server on our masters as well so
that we can move users between machines.

I'm not such a fan of the sync_host config variables - I'd prefer to
pass the information on the sync_client command line. Should go fix
that!

> 2. Is there any way to solve the issue where the master overwrites
> messages with the same filename on the replica (messages that were not
> synced before the disaster happened) while syncing back to the primary
> host? "guid_mode: sha1" is set.

We have a patch at FastMail that does it. There's one against 2.3.16,
or soon it will be the default behaviour with the new sync protocol (I
keep talking about it ...)

It's actually up and running at FastMail now, so I'll be pushing it back
to CVS soon, and we'll work on making a release.

> Maybe someone can describe a method of switching between replicated
> backends in production? For now I want to switch DNS and then
> start/stop the sync_client daemon.

We do have slightly different configurations, so we have to shut down
both ends. In future I plan to have sync_client running at both ends,
so it's master-master, but with DNS only pointing at one end, and some
sort of "barrier" process where we kill off connections before
switching.

The barrier is needed if you don't want to be in split-brain recovery
mode ALL the time, because some clients hold IMAP connections open for
days.

Bron.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Cyrus replication and failover best practices
Hello!

Folks, looking through the mailing list history I saw that many of you
are running Cyrus in rolling replication mode. I am interested in
configuring a Cyrus replica to use as a standby IMAP server, where we
can switch DNS in case of problems with the primary backend. While
testing on a playground I ran into some problems and several questions
came up; maybe you can help me solve them.

1. Is it safe to leave the "sync_host:" options in imapd.conf and run
sync_server (due to the record in cyrus.conf) on both master and
replica, and start only sync_client -r on the master server? Or is it
better to have different config files for different roles?

2. Is there any way to solve the issue where the master overwrites
messages with the same filename on the replica (messages that were not
synced before the disaster happened) while syncing back to the primary
host? "guid_mode: sha1" is set.

Maybe someone can describe a method of switching between replicated
backends in production? For now I want to switch DNS and then
start/stop the sync_client daemon.

Thank you for assistance!

-- 
Dmitry S. Ivanov
Re: Replication and failover
On Thu, May 10, 2007 at 12:14:44PM -0400, Nik Conwell wrote:
> Do you have separate IP addresses for each instance of cyrus on the
> machine as well, or just the machine itself? If just the machine,
> what 'names' does the front-end know the back-end instances by?

Every store has an IP address for master (a.b.10.$storenumber) and one
for the replica (a.b.11.$storenumber) which maps to hosts files entries
(yay templating), so you can just refer to store6m.internal to connect
to the master IP address for store6.

Slots themselves don't have any IP addresses. Machines have their own
base IP address, and you can find them by, for example:

    my $store = ME::ImapStore->new($storename); # note, does DB lookup (cached for 5 seconds)
    my $slot = $store->MasterSlot();
    my $server = $slot->Machine();
    my $ip = $server->InternalAddress();

and if you don't have perl you can always invoke it or write a small
Template::Toolkit script to spit out what you want.

Bron.
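The addressing scheme in this message can be sketched as a small shell function that prints hosts-file lines for one store. This is only an illustration: the `10.0` prefix stands in for the unnamed `a.b`, and the `store<N>r.internal` replica name is an assumption (the message only gives the `store6m.internal` master form).

```shell
# Print hosts-file entries for one store, following the
# a.b.10.$storenumber (master) / a.b.11.$storenumber (replica) scheme.
# PREFIX is a hypothetical stand-in for the unnamed "a.b" network prefix.
PREFIX="${PREFIX:-10.0}"

store_hosts_entries() {
    n="$1"
    # Master address and name, as described in the message.
    printf '%s.10.%s\tstore%sm.internal\n' "$PREFIX" "$n" "$n"
    # Replica address; the "r" suffix is an assumed naming convention.
    printf '%s.11.%s\tstore%sr.internal\n' "$PREFIX" "$n" "$n"
}
```

Calling `store_hosts_entries 6` prints one line per role, which a templating step could append to the hosts file on every machine.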
Re: Replication and failover
On Jan 18, 2007, at 5:35 PM, Rob Mueller wrote:
>> Attached is our operation group's notes on the subject. It makes
>> reference to the tool we use to manage the OS of the machines
>> (radmind), but it should be pretty clear what they are talking
>> about without any radmind knowledge.
>
> As an FYI, we have a similar procedure to this, the main differences
> are:
>
> 1. We don't change the DNS. Instead we give each machine a primary IP
> address, but we also create IP addresses for "cyrusXmaster" and
> "cyrusXreplica" names (where X is numbers for each machine). When we
> swap roles, we rebind the different IPs to the particular machines
> and send ARPs to clear the router table, rather than changing the
> DNS. This means you can always access the master as "cyrusXmaster"
> from every machine without having to worry about DNS getting out of
> sync.
>
> 2. Every machine has cyrus-master.conf, cyrus-replica.conf,
> imapd-master.conf and imapd-replica.conf. We just symlink cyrus.conf
> and imapd.conf to the appropriate file depending on what mode the
> machine is currently in

Do you have separate IP addresses for each instance of cyrus on the
machine as well, or just the machine itself? If just the machine,
what 'names' does the front-end know the back-end instances by?

FWIW we use IP names for our 17 back-end UW mailstores...

Thanks.

-nik
Re: Replication and failover
> Attached is our operation group's notes on the subject. It makes
> reference to the tool we use to manage the OS of the machines
> (radmind), but it should be pretty clear what they are talking about
> without any radmind knowledge.

As an FYI, we have a similar procedure to this, the main differences
are:

1. We don't change the DNS. Instead we give each machine a primary IP
address, but we also create IP addresses for "cyrusXmaster" and
"cyrusXreplica" names (where X is numbers for each machine). When we
swap roles, we rebind the different IPs to the particular machines and
send ARPs to clear the router table, rather than changing the DNS. This
means you can always access the master as "cyrusXmaster" from every
machine without having to worry about DNS getting out of sync.

2. Every machine has cyrus-master.conf, cyrus-replica.conf,
imapd-master.conf and imapd-replica.conf. We just symlink cyrus.conf
and imapd.conf to the appropriate file depending on what mode the
machine is currently in

Rob
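The symlink approach in point 2 might look roughly like the following shell sketch. The four file names come from the message; the `CONF_DIR` variable, its `/etc` default, and the `set_cyrus_role` function name are assumptions for illustration. It does not cover rebinding IP addresses, sending ARPs, or restarting Cyrus.

```shell
# Point cyrus.conf and imapd.conf at the per-role files named in the
# message. CONF_DIR (default /etc) is an assumed location.
set_cyrus_role() {
    role="$1"
    conf_dir="${CONF_DIR:-/etc}"
    case "$role" in
        master|replica) ;;
        *) echo "usage: set_cyrus_role master|replica" >&2; return 2 ;;
    esac
    # Check both role files exist before touching any symlink, so a
    # missing file cannot leave the machine half switched.
    for base in cyrus imapd; do
        target="$conf_dir/$base-$role.conf"
        [ -f "$target" ] || { echo "missing $target" >&2; return 1; }
    done
    # Relative symlinks: cyrus.conf -> cyrus-<role>.conf, and so on.
    for base in cyrus imapd; do
        ln -sf "$base-$role.conf" "$conf_dir/$base.conf"
    done
}
```

Checking for both files first is a design choice of this sketch, not something the message describes.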
Re: Replication and failover
On 18 Jan 2007, at 05:41, Janne Peltonen wrote:
> Is there documentation about replication failover scenarios anywhere?
> I can, of course, conjure up a thing or two, but I'd like to see how
> other people have resolved 'corrupted mailspool -> services to the
> replica -> maintenance -> resync master -> services back to the
> master' situations. I did a short Google, but didn't find much of
> notice.

Attached is our operation group's notes on the subject. It makes
reference to the tool we use to manage the OS of the machines
(radmind), but it should be pretty clear what they are talking about
without any radmind knowledge.

:wes

1. Establish primary failure

   We believe that the failover procedure should take approximately 30
   minutes, so the failover procedure should be invoked whenever the
   estimated downtime on the primary would exceed this amount of time.

   An exception may be made if there is reason to believe that a
   substantial amount of data on the failed primary was not synched to
   the replica; we will discuss the feasibility of sanity checks which
   can be run prior to failover.

2. Stop cyrus/sync_client on primary if necessary / remove primary from
   network if necessary

     /etc/init.d/cyrus stop
     /etc/init.d/sync_client stop
     /etc/init.d/network stop (or unplug network cable)

3. stop cyrus on the replica

     /etc/init.d/cyrus stop

4. Change dns so that the name of -repl becomes
   -> ensure you change forward and reverse
   -> leave original entries commented out

5. Verify dns changes are working by checking on truelies

     dig .mail
     dnsrev the ip

6. Put special files of -repl in place for to reflect ip information of
   replica

   cd to special dir (generally /var/radmind/special/imap)

     cp -R .save
     cp -repl/etc/sysconfig/network /etc/sysconfig/network
     cp -repl/etc/sysconfig/network-devices/ifconfig.eth0 /etc/sysconfig/network-devices

   edit network to fix hostname

     vi /etc/sysconfig/network

7. radmind the replica

     ra.sh update
     Update command file and/or transcripts? [Yn] y
     /var/radmind/client/command.K: updated
     /var/radmind/client/special.T: updated
     c ./dev/ttyS0 0600 0 0 464
     special.T:
     + f ./etc/adsm/TSM.PWD 0444 0 0 1093046900 164 TIgISWWzEESwLKsM5TQx4CRH1hc=
     imap/imap-23backend.T:
     + f ./etc/cyrus.conf 0644 0 0 1156541554 1380 HqMdPv649xvUptagZY1X489CCpo=
     imap/imap.T:
     + f ./etc/imapd.common.conf 0644 0 0 1119845235 871 kTjkwR4x0SwRuK3qvpKi2ZGwANU=
     imap/imap-23backend.T:
     + f ./etc/imapd.conf 0644 0 0 1155789187 343 RIr24APHrHa8fp6YTCezsGUCK4U=
     special.T:
     + f ./etc/imapd.host.conf 0444 0 0 1156186085 104 RIgobQuTFI/HRQNmF4H4WEEoU1I=
     + f ./etc/krb5.keytab 0640 025 1093051728 952 hk7wwXNZgVqyiPgB8BQ55fGtULg=
     + f ./etc/sysconfig/network 0644 0 0 1166473054 81 pfuFsI4FuD763RKzCIXMHojQadc=
     + f ./etc/sysconfig/network-devices/ifconfig.eth0 0644 0 0 1166473074 78 yXkW7BokmxryTqqJKLmFl9zc3Qs=
     + f ./etc/sysconfig/network-devices/ifconfig.eth1 0444 0 0 1166473075 71 yvCcuy3ATic/4AXPPVa1zeoPnbo=
     - f ./opt/tivoli/tsm/client/ba/bin/dsm.sys 0644 0 0 1164130511 418 -
     + f ./var/imap/hostname.pem 0444 0 0 1155787168 2920 Hyfrb/Sg4WkWHp/dUYHe8q9/cv4=

8.   /etc/init.d/network restart
     hostname (remember to use fqdn)
     pkill syslogd ksyslogd
     or reboot (your choice)

9. start cyrus

     su cyrus
     (get tickets)
     /usr/local/heimdal-k5/bin/kinit -k -l 25h imap/[EMAIL PROTECTED]
     ctl_mboxlist -m -w (no output is good!!!)
     (exit so you are root)
     init 3

10. comment out replnag until new replica is brought up

11. restart nefu to catch ip change

*** bringing up a new replica, hopefully on same hardware **

1. update DNS for new replica

2. set up special files of -repl

   cd to special dir (generally /var/radmind/special/imap)

     cp -R -repl -repl.save
     cp .save/etc/sysconfig/network -repl/etc/sysconfig/network
     cp .save/etc/sysconfig/network-devices/ifconfig.eth0 -repl/etc/sysconfig/network-devices

   edit network to fix hostname

     vi -repl/etc/sysconfig/network

3. reload new replica with existing command file

4. boot new replica & start cyrus

5. generate list of mailboxes & sync to get mailboxes

     ctl_mboxlist -d > /tmp/users
     awk '{ print $1 }' /tmp/users | xargs sync_client -v -l -m

6. start sync client

*** switch back during next maintenance window ***

1. stop cyrus on primary

     init 2

2. verify that /var/imap/sync is empty (no pending syncs), if not run
   sync_client -v -l -r -f on any remaining log files, delete each file
   after syncing

3. swap DNS

4. move specials back into pl
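Step 2 of the switch-back procedure (replay any leftover sync logs, then delete them) could be scripted along these lines. This is a sketch under stated assumptions: the `drain_sync_logs` name and the `SYNC_DIR` and `SYNC_CLIENT` override variables are hypothetical, and the `sync_client` flags are copied from the notes rather than checked against a particular Cyrus version.

```shell
# Replay and remove leftover sync log files, one at a time.
# SYNC_DIR and SYNC_CLIENT are assumed overrides so the loop can be
# tried against a scratch directory and a stand-in command.
drain_sync_logs() {
    sync_dir="${SYNC_DIR:-/var/imap/sync}"
    sync_client="${SYNC_CLIENT:-sync_client}"
    for f in "$sync_dir"/*; do
        # Skips non-files, including the unexpanded glob when the
        # directory is empty (nothing pending).
        [ -f "$f" ] || continue
        if "$sync_client" -v -l -r -f "$f"; then
            # Delete each file only after it synced successfully.
            rm -f "$f"
        else
            echo "sync failed for $f, leaving it in place" >&2
            return 1
        fi
    done
}
```

Stopping at the first failure keeps the unsynced log on disk, so it can be inspected before DNS is swapped back.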
Replication and failover
Hi!

Is there documentation about replication failover scenarios anywhere? I
can, of course, conjure up a thing or two, but I'd like to see how other
people have resolved 'corrupted mailspool -> services to the replica ->
maintenance -> resync master -> services back to the master'
situations. I did a short Google, but didn't find much of notice.

--Janne Peltonen
Email admin
Univ. of Helsinki