Hi Imran,

Have a look at MySQL MMM for Multi-Master Replication failover. The project's home is currently being refurbished, but you can start by looking at http://mysql-mmm.org for information.
This project is made for exactly what you want to achieve: Having multiple masters and multiple slaves with automatic failover.

Hope this helps!

Walter

On Wed, Aug 12, 2009 at 09:08, Imran Chaudhry <ichaud...@gmail.com> wrote:
> I want to fix a replication issue with a 2-node cluster (one active,
> one passive) that is using Heartbeat for failover. The nodes are in
> Master-Master configuration (that is, each is the slave and master of
> the other).
>
> I have several other hosts that are replication slaves from the active
> node. They connect to MySQL via TCP over an SSH tunnels.
>
> When failover occurs, the passive node becomes the active node.
> However the replication slaves stop replicating. The error from a log
> on one of the slaves is:
>
> Jul 15 07:43:32 <host> mysqld[1339]: 090715 7:43:32 [Note] Slave I/O thread: connected to master '<user>@127.0.0.1:3307', replication started in log 'mysql-bin.000978' at position 23923243
> Jul 15 07:43:32 <host> mysqld[1339]: 090715 7:43:32 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file ( server_errno=1236)
> Jul 15 07:43:32 <host> mysqld[1339]: 090715 7:43:32 [ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log
> Jul 15 07:43:32 <host> mysqld[1339]: 090715 7:43:32 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000978', position 23923243
>
> I do not think this is an SSH tunnel issue. I believe this is because
> of inconsistent binary log file names and positions between the two
> nodes. Probably because one of the nodes had been in operation a lot
> longer than the other.
>
> At the moment I have to get replication going by dumping the master
> databases again, re-import to the slave hosts and bootstrap the
> slaves.
>
> What is the best way to make this consistent and ensure that
> replication continues smoothly after a failover (and failback) event?
>
> Thank you,
> Imran Chaudhry

-- 
Walter Heck, Engineer @ Open Query (http://openquery.com)
Affordable Training and ProActive Support for MySQL & related technologies
Follow our blog at http://openquery.com/blog/
OurDelta: free enhanced builds for MySQL @ http://ourdelta.org
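
To illustrate the diagnosis in the quoted message, here is a rough Python sketch of why the slaves stop with error 1236 after a failover. It pulls the binary log file name and position out of the slave's "replication started" note, then checks whether that file name appears in the list of binary logs on the newly active node. The function names and the sample list of logs on the new node are hypothetical; in practice that list would come from running SHOW BINARY LOGS on that node.

```python
import re

# Matches the coordinates in a slave I/O thread start note such as:
#   ... replication started in log 'mysql-bin.000978' at position 23923243
START_RE = re.compile(r"replication started in log '([^']+)' at position (\d+)")


def parse_slave_start(log_line):
    """Return (binlog_file, position) from a slave start note, or None."""
    match = START_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def requested_log_missing(requested_file, master_binlogs):
    """True if the master does not list the file the slave asks for.

    This is the condition reported as error 1236 ("Could not find first
    log file name in binary log index file").
    """
    return requested_file not in master_binlogs


# Log line taken from the quoted message (unwrapped onto one line).
line = (
    "[Note] Slave I/O thread: connected to master '<user>@127.0.0.1:3307', "
    "replication started in log 'mysql-bin.000978' at position 23923243"
)

# Hypothetical binary logs on the newly active node (assumption for
# illustration; real values come from SHOW BINARY LOGS on that node).
new_master_binlogs = ["mysql-bin.000101", "mysql-bin.000102"]

coords = parse_slave_start(line)
if coords is not None:
    binlog_file, position = coords
    if requested_log_missing(binlog_file, new_master_binlogs):
        print(f"{binlog_file} is not on the new master; expect error 1236")
```

Note that a matching file name would not be enough on its own: each node writes its own binary log, so the same file name and position on the other node do not refer to the same events. The check above only detects the failure; it does not tell a slave where to resume.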