Re: Strange replication problem

2003-09-09 Thread Mike Dopheide

I found the problem, which I will outline here in case anyone else 
runs across it in the future:

It appears that a slave will not replicate events that carry its own 
server-id.  In my case, a large portion of the binary logs on my slave had 
originally come from the master, so when the master tried to re-replicate 
the data, it simply ignored the entries carrying its own server-id.

This makes complete sense; however, I can't find anywhere in the MySQL 
documentation that explains this behavior.  The documentation only says 
that the master and slaves should have unique server-ids.
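
The skipping described above can be modelled in a few lines of Python.  This is only a toy sketch of the behavior as observed in this thread, not MySQL's actual code: the Event class, the apply_relay_log function, and the ids 1 and 2 are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    server_id: int   # id of the server where the statement originally ran
    statement: str


def apply_relay_log(own_server_id, relay_log):
    """Return the statements a slave's SQL thread would execute.

    Toy model: events stamped with the slave's own server-id are skipped.
    """
    applied = []
    for event in relay_log:
        if event.server_id == own_server_id:
            continue  # originated here, so it is treated as already applied
        applied.append(event.statement)
    return applied


MASTER_ID, SLAVE_ID = 1, 2  # hypothetical ids, not taken from the thread

# With --log-slave-updates, the slave's binary log holds the master's
# events (still stamped with the master's id) followed by the writes the
# slave took directly once the CNAME pointed at it.
slave_binlog = [
    Event(MASTER_ID, "INSERT before switch #1"),
    Event(MASTER_ID, "INSERT before switch #2"),
    Event(SLAVE_ID, "INSERT after switch #1"),
    Event(SLAVE_ID, "INSERT after switch #2"),
]

# The rebuilt master now replicates from the slave's binary log.
applied_on_master = apply_relay_log(MASTER_ID, slave_binlog)
print(applied_on_master)
```

In this model only the statements that originated on the slave are applied, which matches the SQL thread starting to update the database exactly at the point where the CNAME was switched.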

-Mike

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Strange replication problem

2003-09-08 Thread Mike Dopheide

I have one master and one slave which I am upgrading to 4.0.14 from 
4.0.12.  To start the upgrade I stopped the slave, took a snapshot of its 
data directory, cleared its binary logs, and switched to the 4.0.14 
binaries.  I then restarted the slave thread to get it caught up with the 
master.  The slave also runs with --log-slave-updates so that it has a 
copy of all of the data from the point of the snapshot.

This afternoon at 2:10pm I switched our mysql.domain.com CNAME to point 
to the slave instead of the master (this is relevant).  At this point, the 
slave is acting as the master and taking all of the updates.  When I was 
sure all of the clients were using the slave I stopped its slave thread 
and took down the master server to upgrade it as well.

I rebuilt the master's data directory from the snapshot I'd taken 
previously on the slave.  At this point I told the master to replicate the 
data off of the slave.

Here's the strange part.  The I/O thread seems to be grabbing the data off 
of the slave correctly.  It writes relay logs just fine.  However, the SQL 
thread doesn't update the database.  SHOW SLAVE STATUS indicates that 
both parts are running normally.  The I/O thread continues to write 
relay log files (deleting old ones as it goes as if it doesn't need them 
anymore).  Then... at the point in the logs for 2:10pm today when the CNAME 
was switched, all of a sudden the SQL thread decides to start updating 
the database.  There isn't anything strange in the binary logs that I can 
see except that the 'log_pos' value drops a fair amount at the same time 
it decides to start updating the database.  I don't know what that means, 
if anything.

Why isn't it updating the database for all of the relay data?  Considering 
that I've completely wiped the master's data directory except for the 
snapshot, cleared its binary logs, and its InnoDB logs...  I'm completely 
at a loss for how it can know the exact time it stopped getting normal 
updates when its CNAME changed.

If you have any questions about my environment I'd be happy to answer 
them.

Thanks,
Mike





Strange Replication Problem in 3.23.33 (bug?)

2001-02-13 Thread Matt Hahnfeld

I set up two MySQL servers to run in a failover configuration.  Because
queries will only ever be submitted to one server at a time, I decided to
use a makeshift two-way replication scheme under MySQL as described in
the MySQL manual.

First server (wallace) has this:

server-id=1
log-bin
master-host=gromit
master-user=repl
master-password=password
log-slave-updates


Second server (gromit) has this:

server-id=2
log-bin
master-host=wallace
master-user=repl
master-password=ghoti
log-slave-updates
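
As a rough illustration of why a two-way setup with log-slave-updates relies on each server having a distinct server-id, here is a toy Python model of an event travelling around the wallace/gromit ring.  It is a sketch under simplifying assumptions, not MySQL's implementation: servers are identified by host name instead of numeric id, and the function name, the skip_own_id switch, and the max_hops cap are invented for the example.

```python
def servers_that_execute(origin, replicates_to, skip_own_id=True, max_hops=10):
    """Follow one event around a replication ring.

    replicates_to maps a server to the server that reads its binary log.
    Returns the servers that execute the event as slaves, in order.
    """
    executed_on = []
    current = origin
    for _ in range(max_hops):
        receiver = replicates_to[current]
        if skip_own_id and receiver == origin:
            break  # the event is back at its origin and gets skipped
        executed_on.append(receiver)
        # log-slave-updates: the receiver logs the event again, still
        # stamped with the origin's id, so the next server can read it.
        current = receiver
    return executed_on


ring = {"wallace": "gromit", "gromit": "wallace"}

with_filter = servers_that_execute("wallace", ring)
without_filter = servers_that_execute("wallace", ring, skip_own_id=False)

print(with_filter)          # the event is executed once, on gromit
print(len(without_filter))  # without the check it loops until max_hops
```

With the own-id check in place an insert on wallace is executed once on gromit and then stops; without it the event would keep circulating between the two servers.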


I started by mirroring both data directories.  Then I started both servers
and all looked fine.  Logs indicate no errors.  When I inserted some
data on wallace, gromit replicated them just fine.  But when I tried to
insert data on gromit, wallace never got the changes.  The weird thing is,
no real errors appeared in the logs.

Then I did a "SHOW SLAVE STATUS" on wallace and saw "Skip_counter" was
set to 4294967295!!!  Strange, I thought, so I ran "STOP SLAVE", "SET
SQL_SLAVE_SKIP_COUNTER=0", and "START SLAVE" on wallace.  Suddenly
changes made on gromit were reflected on wallace.

But then I tried to insert data on wallace again and the same thing
happened.  This time gromit never got the changes.  When I ran "SHOW SLAVE
STATUS" on gromit, it indicated a Skip_counter of 4294967293.  To get it to
work, I had to run "SET SQL_SLAVE_SKIP_COUNTER=0" on gromit.

I just don't get it...  Why are the skip counters being reset to these
crazy high numbers?  Shouldn't both servers just replicate each other?  Is
this a replication bug?  Please help!
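
One arithmetic observation about the two values: both sit just below 2^32.  The short Python sketch below reinterprets them as signed 32-bit integers.  Reading this as an unsigned counter that was decremented past zero is only a guess; nothing in this thread confirms it and it has not been checked against the 3.23.33 source.  The helper name as_signed_32 is made up for the example.

```python
UINT32_MODULUS = 2 ** 32


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - UINT32_MODULUS if value >= 2 ** 31 else value


# Skip_counter values reported by SHOW SLAVE STATUS in the message above.
observed = {"wallace": 4294967295, "gromit": 4294967293}

signed = {host: as_signed_32(value) for host, value in observed.items()}
print(signed)  # {'wallace': -1, 'gromit': -3}
```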

Matt Hahnfeld
EverySoft


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php