Ronnie, I'm copying this to the list so it ends up in the archive for
anyone to use in future.

In answer to your question about the floating IP for the master, I'm
using heartbeat, more at http://www.samag.com/documents/s=1146/sam0109c/0109c.htm

My semi-scripts and explanations are at the bottom of this mail
as I sent them to the
last person who asked.

Best to all,
Steve Rapaport.

Ronnie Garcia wrote:

Hi Steve,
  I just found your post about failover and mysql on google ( 
http://groups.google.com/groups?hl=fr&threadm=a11dqg%242slq%241%40FreeBSD.csie.NCTU.edu.tw&rnum=7&prev=/groups%3Fq%3Dfailover%2Bmysql%26hl%3Dfr%26scoring%3Dd%26selm%3Da11dqg%25242slq%25241%2540FreeBSD.csie.NCTU.edu.tw%26rnum%3D7)

  I would be pleased to get your "half-scripted solution", since i want my 
slave to get the INSERT and UPDATES while my master is down.

  Can you also explain me how do you handle the "floating IP address for the 
master" ? I'm thinking about nsupdate .. ?

  Thank you much,
Ronnie.


-- part of your post --

1) Use circular replication, where all machines are simulataneously master and
slave.  Problem: recovery from a screwup is really hard.
2) Change  roles upon failure:  Use CHANGE MASTER TO on all the slaves.
Problems: tricky to set up and test.  Hints in the FAQ help but are not
really tested.  Configuration files may need to be changed at failure time.
3) Never change roles, slave becomes temporary master on failure, manual reset.
This is easiest to setup and test, but inelegant to reset.

For 2) and 3) you also need to have a floating IP address for your master
that can be changed upon failure to point to the current master, using
BIND or using a gratuitous ARP scheme.


Anyway, I'm currently using scheme 3, and I'll be happy to send you my

half-scripted solution which you are welcome to improve upon.  I was also
using scheme 1 before, but gave up after something mis-synchronized the
replication logs and it was impossible to recover.


ere are the scripts I'm (sort of) using, and how to run them, on a 2-machine 
system.  Note that the order of operations could be improved to reduce 
downtime, but I don't see how it could be reduced to zero yet.  When I
tried it this way I had only about 30 seconds down, but that's partly because
I only copied the databases that update!


This is for use only when the master has failed, and the slave has been
making updates for some time.

[steve@db1 mysql]$ cat reset-slave.sh
#!/bin/sh
# the SLAVE half of what to do when recovering from a failover
# this runs on db1, the SLAVE.
#
# Steve Dec 27, 2001

# Note the master half is reset-slave.sh, on db2.

# stop slave replication
mysqladmin slave-stop

# stop updates and freeze data files : note this is incorrect syntax!
mysqladmin flush-tables --with-read-lock
# this won't work, you need to run something like:
# mysql -uuser -ppassword -e "FLUSH TABLES WITH READ LOCK"

#copy the new dynamic databases to db2  NOTE Omit any large read-only db's!
cd /var/lib/mysql
tar cvzf STEVE/reset.tgz atono mysql push
scp STEVE/reset.tgz steve@db2:/var/lib/mysql/STEVE
password

# (perhaps redundant) tell slave that bin files are meaningless
mysqladmin reset-slave

# stop the slave
mysqladmin shutdown

# remove all the bin files to reset the slave
rm db1-bin*

# reboot this system so heartbeat forces dbhost to switch over to db2.
sync
sync
/sbin/telinit 6




And now, on the other machine (db2) reset it to be the master:

[steve@db2 mysql]$ cat reset-master.sh
#!/bin/sh
# the MASTER half of what to do when recovering from a failover
#
# Steve Dec 27, 2001

# Note the slave half is reset-slave.sh, on db1.
# THE SLAVE HALF IS TO BE RUN FIRST, with master shut down.

# Shutdown the master
mysqladmin shutdown

#copy the new dynamic databases from db1
cd /var/lib/mysql
tar xvzf STEVE/reset.tgz

# remove all the bin files to reset the master
rm db2-bin*

# start the master
mysqladmin start

# (perhaps redundant) tell master that bin files are gone
mysqladmin reset-master




Reminder:
For 2) and 3) you also need to have a floating IP address for your
master that can be changed upon failure to point to the current
master, using BIND or using a gratuitous ARP scheme.

(Mine is gratuitous ARP with heartbeat detection.)




If you get scheme 2) to work, let me know, I might want to change.


Scheme 2 would probably be best for our environment, as I don't plan
on being here forever to baby it, so the more the system can do on
its own the better:) I'd definitely like to take a look at your
script to see if it gives me any ideas on how to get from 3 to 2


Agreed here.

(they seem like pretty much the same idea, just one is more complete
than the other)


I wish I could agree with you.  Scheme 2 is pretty different, since
there's no data copying going on, but the configuration files must
be changed (I think) as well as telling all the systems, in the
correct order, to change masters.  Also, the dying master must
remember to record its SHOW MASTER STATUS bookmark so it can be
passed on to the new master.

See some of my postings to the mailing list over the last 2 weeks in
the archives and the responses.

See also the Replication FAQ in the Mysql manual.  The last question
is ours.

Please let me know if you come up with a scripted solution, I may want
to borrow it!



-- 
Steve Rapaport
still at large


-- 
Steve Rapaport
still at large


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to