Re: Strange replication problem
I found the problem, which I will outline here in case anyone else runs across it in the future:

It appears that a slave will not apply events that carry its own server-id. In my case, a large portion of the binary logs on my slave had originally come from the master, so when the master tried to re-replicate the data, it simply ignored entries from its own server-id.

This makes complete sense; however, I can't find anywhere in the MySQL documentation that explains this behavior. The documentation only says that the master and slaves should have unique server-ids.

-Mike

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
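The server-id behaviour described in the reply above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not MySQL's actual implementation: the thread does not give the real server-ids, so 1 (master) and 2 (slave) are assumed, and `BinlogEvent` and `apply_relay_log` are hypothetical names invented for the example. With --log-slave-updates, events the slave copied from the master keep the master's server-id, which is why the rebuilt master would skip them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinlogEvent:
    server_id: int   # id of the server where the statement first ran
    statement: str


def apply_relay_log(own_server_id, relay_events):
    """Return the statements a replicating server would execute.

    Events stamped with the server's own id are skipped, which is
    the behaviour Mike describes (hypothetical model, not MySQL code).
    """
    applied = []
    for event in relay_events:
        if event.server_id == own_server_id:
            continue  # originated on this server: ignore it
        applied.append(event.statement)
    return applied


MASTER_ID = 1  # assumed value, not stated in the thread
SLAVE_ID = 2   # assumed value, not stated in the thread

# Binary log on the slave: the first events were replicated from the
# master, the later ones were written by clients after the CNAME switch.
slave_binlog = [
    BinlogEvent(MASTER_ID, "INSERT before CNAME switch #1"),
    BinlogEvent(MASTER_ID, "INSERT before CNAME switch #2"),
    BinlogEvent(SLAVE_ID, "INSERT after CNAME switch #1"),
    BinlogEvent(SLAVE_ID, "INSERT after CNAME switch #2"),
]

# The rebuilt master replicates from the slave and skips its own events.
applied_on_master = apply_relay_log(MASTER_ID, slave_binlog)
print(applied_on_master)
```

In this model the master only starts applying statements at the first event written directly to the slave, which matches the 2:10pm cut-over seen in the relay logs. A third server with a different id would apply all four events.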
Strange replication problem
I have one master and one slave which I am upgrading to 4.0.14 from 4.0.12.

To start the upgrade I stopped the slave, took a snapshot of its data directory, cleared its binary logs, and switched to the 4.0.14 binaries. I then restarted the slave thread to get it caught up with the master. The slave also runs with --log-slave-updates so that it has a copy of all of the data from the point of the snapshot.

This afternoon at 2:10pm I switched our mysql.domain.com CNAME to point to the slave instead of the master (this is relevant). At this point, the slave is acting as the master and taking all of the updates. When I was sure all of the clients were using the slave, I stopped its slave thread and took down the master server to upgrade it as well. I rebuilt the master's data directory from the snapshot I'd taken previously on the slave. At this point I told the master to replicate the data off of the slave.

Here's the strange part. The I/O thread seems to be grabbing the data off of the slave correctly. It writes relay logs just fine. However, the SQL thread doesn't update the database. SHOW SLAVE STATUS indicates that both parts are running normally. The I/O thread continues to write relay log files (deleting old ones as it goes, as if it doesn't need them anymore).

Then... at the point in the logs for 2:10pm today when the CNAME was switched, all of a sudden the SQL thread decides to start updating the database. There isn't anything strange in the binary logs that I can see, except that the 'log_pos' value drops a fair amount at the same time it decides to start updating the database. I don't know what that means, if anything.

Why isn't it updating the database for all of the relay data? Considering that I've completely wiped the master's data directory except for the snapshot, cleared its binary logs and its InnoDB logs... I'm completely at a loss for how it can know the exact time it stopped getting normal updates when its CNAME changed.
If you have any questions about my environment I'd be happy to answer them.

Thanks,
Mike
Strange Replication Problem in 3.23.33 (bug?)
I set up two MySQL servers to run in a failover configuration. Because queries will only ever be submitted to one server at a time, I decided to use a makeshift two-way replication scheme under MySQL as described in the MySQL manual.

First server (wallace) has this:

    server-id=1
    log-bin
    master-host=gromit
    master-user=repl
    master-password=password
    log-slave-updates

Second server (gromit) has this:

    server-id=2
    log-bin
    master-host=wallace
    master-user=repl
    master-password=ghoti
    log-slave-updates

I started by mirroring both data directories. Then I started both servers and all looked fine. Logs indicate no errors. When I inserted some data on wallace, gromit replicated it just fine. But when I tried to insert data on gromit, wallace never got the changes. The weird thing is, no real errors appeared in the logs.

Then I did a "SHOW SLAVE STATUS" on wallace and saw "Skip_counter" was set to 4294967295!!! Strange, I thought, so I ran "STOP SLAVE", "SET SQL_SLAVE_SKIP_COUNTER=0", and "START SLAVE" on wallace. Suddenly changes made on gromit were reflected on wallace.

But then I tried to insert data on wallace again and the same thing happened. This time gromit never got the changes. When I ran "SHOW SLAVE STATUS" on gromit, it indicated 4294967293. To get it to work, I had to run "SET SQL_SLAVE_SKIP_COUNTER=0" on gromit.

I just don't get it... Why are the skip counters being reset to these crazy high numbers? Shouldn't both servers just replicate each other? Is this a replication bug?

Please help!

Matt Hahnfeld
EverySoft

-
Before posting, please check:
    http://www.mysql.com/manual.php (the manual)
    http://lists.mysql.com/ (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
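One observation about the two Skip_counter values reported above: both sit just below 2^32, which is what small negative numbers look like when stored in an unsigned 32-bit field. The arithmetic below only demonstrates that correspondence. It is not taken from the MySQL source, the helper names are made up for the example, and it does not establish why the counter would have gone below zero in 3.23.33.

```python
UINT32_MODULUS = 2 ** 32


def as_unsigned_32(value):
    """Reinterpret an integer as an unsigned 32-bit value."""
    return value % UINT32_MODULUS


def as_signed_32(value):
    """Reinterpret an integer as a signed 32-bit value."""
    value %= UINT32_MODULUS
    if value >= 2 ** 31:
        return value - UINT32_MODULUS
    return value


# The two values reported by SHOW SLAVE STATUS in the message above.
for observed in (4294967295, 4294967293):
    print(observed, "->", as_signed_32(observed))
```

Read as signed values, the two counters are -1 and -3, which would be consistent with a counter being decremented past zero rather than being set to a large number on purpose.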