Howdy,

There has been some interest lately in HA for MySQL services, both in my company and on the list. A few of us here sat down on Friday (at 5PM no less) and started hashing out the details of providing such a service. We followed several possible approaches and ran into major stumbling blocks on each path.

The Setup: One master database, 5 slave databases. The end applications are Perl CGIs connecting to DNS CNAMEs (db1, db2, db3, etc).

I started by demonstrating the use of slaves in an LVS (http://www.linuxvirtualserver.org) cluster. This proved to be very successful. I was able to build a cluster of slaves, load balance the queries among them, weight them differently, and have individual slaves removed from the cluster by shutting down MySQL on them. The goal then is to point queries to web-db, which is a cluster of 2 or 3 slaves.

The next step is to use heartbeat (http://www.linux-ha.org) to do IP address takeover of the master in the event of a failure. This is where it gets tricky. One of the slaves will be designated the master failover. Upon detection of a master failure, the program...

1) Runs a SLAVE STOP on the failover slave
2) Runs a script to enable writes to the slave tables
3) Removes itself from the cluster
4) Takes over the IP address of the master

The problem then lies in how to miss as few insert queries as possible. The easiest solution is to start the binlog on the failover slave as soon as it becomes the master. The downside is that some writes to the master will be lost, possibly forever in the case of a disk failure (but disk failure is a scenario you can't always prepare for at a software level).

What happens if other slaves in the cluster are "very far" behind, possibly due to long reporting queries? If the master goes down, these would have to rely on the new master to catch up. However, the new master has no binlog information, resulting in wildly out-of-sync data.

In order to provide true data redundancy, the failover slave's binlog would have to be identical to the master's, with the same filename and position. That's not an easy feat to accomplish. If you bring the slave down, the master down, the slave up, then the master up, you should get binlogs that match, but I can't confirm this yet.

So, I put it to the list. Am I missing the obvious here?

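For what it's worth, here is roughly how I picture the four failover steps fitting together, as a minimal Python sketch rather than our actual heartbeat script. The FailoverActions class, the promote_failover_slave function, and the four hook names are made up for illustration. Each hook would wrap whatever site-specific command does the real work (issuing SLAVE STOP, the write-enable script, the LVS change, the IP takeover).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FailoverActions:
    """Hypothetical hooks; none of these names come from heartbeat, LVS, or MySQL."""
    stop_replication: Callable[[], None]     # e.g. issue SLAVE STOP on the failover slave
    enable_writes: Callable[[], None]        # e.g. run the script that enables writes
    leave_cluster: Callable[[], None]        # e.g. remove this slave from the LVS cluster
    take_over_master_ip: Callable[[], None]  # e.g. let heartbeat claim the master's IP


def promote_failover_slave(actions: FailoverActions) -> List[str]:
    """Run the four failover steps in order and return the names of those completed.

    If a hook raises, the exception propagates and the later steps do not run.
    """
    steps = [
        ("stop replication", actions.stop_replication),
        ("enable writes", actions.enable_writes),
        ("leave cluster", actions.leave_cluster),
        ("take over master IP", actions.take_over_master_ip),
    ]
    completed: List[str] = []
    for name, action in steps:
        action()
        completed.append(name)
    return completed
```

In this sketch the IP takeover is deliberately last, so clients are only pointed at the new master once it has stopped replicating and accepts writes. Nothing here addresses the binlog problem described above.
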
How do YOU achieve a failover master?

-J