Ronnie, I'm copying this to the list so it ends up in the archive for anyone to use in future.
In answer to your question about the floating IP for the master, I'm using heartbeat, more at http://www.samag.com/documents/s=1146/sam0109c/0109c.htm My semi-scripts and explanations are at the bottom of this mail as I sent them to the last person who asked. Best to all, Steve Rapaport. Ronnie Garcia wrote: Hi Steve, I just found your post about failover and mysql on google ( http://groups.google.com/groups?hl=fr&threadm=a11dqg%242slq%241%40FreeBSD.csie.NCTU.edu.tw&rnum=7&prev=/groups%3Fq%3Dfailover%2Bmysql%26hl%3Dfr%26scoring%3Dd%26selm%3Da11dqg%25242slq%25241%2540FreeBSD.csie.NCTU.edu.tw%26rnum%3D7) I would be pleased to get your "half-scripted solution", since i want my slave to get the INSERT and UPDATES while my master is down. Can you also explain me how do you handle the "floating IP address for the master" ? I'm thinking about nsupdate .. ? Thank you much, Ronnie. -- part of your post -- 1) Use circular replication, where all machines are simulataneously master and slave. Problem: recovery from a screwup is really hard. 2) Change roles upon failure: Use CHANGE MASTER TO on all the slaves. Problems: tricky to set up and test. Hints in the FAQ help but are not really tested. Configuration files may need to be changed at failure time. 3) Never change roles, slave becomes temporary master on failure, manual reset. This is easiest to setup and test, but inelegant to reset. For 2) and 3) you also need to have a floating IP address for your master that can be changed upon failure to point to the current master, using BIND or using a gratuitous ARP scheme. Anyway, I'm currently using scheme 3, and I'll be happy to send you my half-scripted solution which you are welcome to improve upon. I was also using scheme 1 before, but gave up after something mis-synchronized the replication logs and it was impossible to recover. ere are the scripts I'm (sort of) using, and how to run them, on a 2-machine system. Note that the order of operations could be improved to reduce downtime, but I don't see how it could be reduced to zero yet. When I tried it this way I had only about 30 seconds down, but that's partly because I only copied the databases that update! This is for use only when the master has failed, and the slave has been making updates for some time. [steve@db1 mysql]$ cat reset-slave.sh #!/bin/sh # the SLAVE half of what to do when recovering from a failover # this runs on db1, the SLAVE. # # Steve Dec 27, 2001 # Note the master half is reset-slave.sh, on db2. # stop slave replication mysqladmin slave-stop # stop updates and freeze data files : note this is incorrect syntax! mysqladmin flush-tables --with-read-lock # this won't work, you need to run something like: # mysql -uuser -ppassword -e "FLUSH TABLES WITH READ LOCK" #copy the new dynamic databases to db2 NOTE Omit any large read-only db's! cd /var/lib/mysql tar cvzf STEVE/reset.tgz atono mysql push scp STEVE/reset.tgz steve@db2:/var/lib/mysql/STEVE password # (perhaps redundant) tell slave that bin files are meaningless mysqladmin reset-slave # stop the slave mysqladmin shutdown # remove all the bin files to reset the slave rm db1-bin* # reboot this system so heartbeat forces dbhost to switch over to db2. sync sync /sbin/telinit 6 And now, on the other machine (db2) reset it to be the master: [steve@db2 mysql]$ cat reset-master.sh #!/bin/sh # the MASTER half of what to do when recovering from a failover # # Steve Dec 27, 2001 # Note the slave half is reset-slave.sh, on db1. # THE SLAVE HALF IS TO BE RUN FIRST, with master shut down. # Shutdown the master mysqladmin shutdown #copy the new dynamic databases from db1 cd /var/lib/mysql tar xvzf STEVE/reset.tgz # remove all the bin files to reset the master rm db2-bin* # start the master mysqladmin start # (perhaps redundant) tell master that bin files are gone mysqladmin reset-master Reminder: For 2) and 3) you also need to have a floating IP address for your master that can be changed upon failure to point to the current master, using BIND or using a gratuitous ARP scheme. (Mine is gratuitous ARP with heartbeat detection.) If you get scheme 2) to work, let me know, I might want to change. Scheme 2 would probably be best for our environment, as I don't plan on being here forever to baby it, so the more the system can do on its own the better:) I'd definitely like to take a look at your script to see if it gives me any ideas on how to get from 3 to 2 Agreed here. (they seem like pretty much the same idea, just one is more complete than the other) I wish I could agree with you. Scheme 2 is pretty different, since there's no data copying going on, but the configuration files must be changed (I think) as well as telling all the systems, in the correct order, to change masters. Also, the dying master must remember to record its SHOW MASTER STATUS bookmark so it can be passed on to the new master. See some of my postings to the mailing list over the last 2 weeks in the archives and the responses. See also the Replication FAQ in the Mysql manual. The last question is ours. Please let me know if you come up with a scripted solution, I may want to borrow it! -- Steve Rapaport still at large -- Steve Rapaport still at large --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php