Re: mysql as cluster service, failover causes broken replication

2004-03-25 Thread Matt Sturtz
Yes, the clients (apparently) read to the end of the previous file, and
then sit there, while the server is writing to a new file.

I was thinking this had to do with the unclean shutdown of MySQL--
perhaps it's something else.
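
A rough way to confirm that symptom from the status output is sketched
below. It is only an illustration: the function name and the
old_file_size argument (size in bytes of the previous binlog on the
master) are hypothetical, and the dictionaries are assumed to hold the
Master_Log_File / Read_Master_Log_Pos columns of SHOW SLAVE STATUS and
the File column of SHOW MASTER STATUS.

```python
def slave_is_stuck(slave_status, master_status, old_file_size):
    """Return True when the slave has read the whole previous binlog but
    is still on it, although the master already writes to another file."""
    # Assumed column names from SHOW SLAVE STATUS / SHOW MASTER STATUS.
    on_same_file = slave_status["Master_Log_File"] == master_status["File"]
    read_pos = int(slave_status["Read_Master_Log_Pos"])
    # old_file_size is a hypothetical input: bytes in the file the slave reads.
    at_end_of_old_file = read_pos >= old_file_size
    return (not on_same_file) and at_end_of_old_file


slave = {"Master_Log_File": "db-bin.001", "Read_Master_Log_Pos": "1523"}
master = {"File": "db-bin.002", "Position": "79"}
print(slave_is_stuck(slave, master, old_file_size=1523))
```

A single snapshot cannot tell a stuck slave from one that is about to
roll over, so the check would need to be repeated a few seconds apart.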

-Matt-


 Matt Sturtz wrote:
 Hello--

 We're using Red Hat's cluster manager (RH AS 2.1, MySQL 4.0.16 RPM).  Due
 to a problem within the cluster software that we're working on with Red
 Hat, the cluster fails over from one node to the other sometimes when it
 shouldn't (one node will reboot, services will fail over-- at this point
 we think it's probably related to IO on the shared quorum partitions).

 When service is restored some seconds later, the slaves won't start
 replicating from the newly created binary-log, instead continuing to read
 from the previous one (i.e. db-bin.002 is created when MySQL is restarted,
 but the slaves keep reading from the old file, db-bin.001).  The only fix
 seems to be CHANGE MASTER TO..., which seems somewhat error prone.

 Anybody else running MySQL in this type of environment have any words of
 wisdom?  Thanks in advance for any info...

 They should keep reading from the old one until they catch up. Do they fail
 to roll over to the next one after finishing the old one? If yes, it would
 be a bug.

 --
 Sasha Pachev


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Re: mysql as cluster service, failover causes broken replication

2004-03-25 Thread Sasha Pachev
Matt Sturtz wrote:
 Yes, the clients (apparently) read to the end of the previous file, and
 then sit there, while the server is writing to a new file.

 I was thinking this had to do with the unclean shutdown of MySQL--
 perhaps it's something else.

It might, but it is a bug anyway. The whole idea of replication is to be able to
deal with things like unclean shutdown.

First upgrade to 4.0.18. Then, if it happens again, use
mysqlbinlog -j pos_at_which_the_slave_is_stuck along with od -c to gather some
more details (I suspect a truncated or corrupted binlog event), and send the
details to the MySQL developers.
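
As a complement to mysqlbinlog and od, the suspicion of a truncated event
could be checked with a small script along these lines. This is a sketch
under assumptions, not the server's own parser: it assumes the 4.0-era
binlog layout of a 4-byte magic number followed by events that each start
with a 19-byte little-endian header (timestamp, type, server id, event
length, next position, flags). The function name is made up for this
example.

```python
import struct

BINLOG_MAGIC = b"\xfebin"   # first 4 bytes of a binlog file
HEADER_LEN = 19             # assumed header size for the 4.0-era format


def find_truncated_event(data):
    """Walk the events in raw binlog bytes and return the offset of the
    first event that does not fit in the file, or None if all fit."""
    if data[:4] != BINLOG_MAGIC:
        return 0  # not a binlog, or damaged at the very start
    pos = 4
    while pos < len(data):
        if pos + HEADER_LEN > len(data):
            return pos  # header itself is cut off
        # Assumed layout: timestamp, type, server_id, event_length,
        # next_position, flags (little-endian, no padding).
        _ts, _type, _sid, event_len, _next, _flags = struct.unpack_from(
            "<IBIIIH", data, pos)
        if event_len < HEADER_LEN or pos + event_len > len(data):
            return pos  # length field is bogus or body is cut off
        pos += event_len
    return None
```

It would be run on the file contents, for example
find_truncated_event(open("db-bin.001", "rb").read()). An offset it
reports is a position one could then inspect more closely with
mysqlbinlog -j and od -c.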

--
Sasha Pachev


Re: mysql as cluster service, failover causes broken replication

2004-03-24 Thread Sasha Pachev
Matt Sturtz wrote:
Hello--

 We're using Red Hat's cluster manager (RH AS 2.1, MySQL 4.0.16 RPM).  Due
 to a problem within the cluster software that we're working on with Red
 Hat, the cluster fails over from one node to the other sometimes when it
 shouldn't (one node will reboot, services will fail over-- at this point
 we think it's probably related to IO on the shared quorum partitions).

 When service is restored some seconds later, the slaves won't start
 replicating from the newly created binary-log, instead continuing to read
 from the previous one (i.e. db-bin.002 is created when MySQL is restarted,
 but the slaves keep reading from the old file, db-bin.001).  The only fix
 seems to be CHANGE MASTER TO..., which seems somewhat error prone.

 Anybody else running MySQL in this type of environment have any words of
 wisdom?  Thanks in advance for any info...

They should keep reading from the old one until they catch up. Do they fail to
roll over to the next one after finishing the old one? If yes, it would be a bug.
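
For reference, the CHANGE MASTER TO workaround mentioned in the quoted
message could be scripted roughly as follows, to make the manual step
less error prone. This is a sketch, not a recommended procedure: the
function name is invented here, and starting at position 4 assumes the
first event of a binlog begins right after the 4-byte magic number. It
is only safe once the slave has applied everything in the old file;
otherwise events are skipped.

```python
def repoint_slave_statements(log_file, log_pos=4):
    """Build the SQL statements that point a slave at a given master
    binlog file and position. Nothing is executed here."""
    if not isinstance(log_pos, int) or log_pos < 4:
        raise ValueError("log_pos must be an integer >= 4")
    if "'" in log_file or "\\" in log_file:
        raise ValueError("unexpected character in binlog file name")
    return [
        "STOP SLAVE;",
        "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;"
        % (log_file, log_pos),
        "START SLAVE;",
    ]


for statement in repoint_slave_statements("db-bin.002"):
    print(statement)
```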

--
Sasha Pachev