4.0.4 replication slave "hangs" after master breakdown+reboot

Michael Zimmermann Mon, 14 Oct 2002 09:28:17 -0700

Hi friends,

Has anybody had similar experiences?



Environment:

Systems are x86 SuSE 8.0/7.3 Linux; MySQL-max
version 4.0.4 is used on all machines
(installed from the mysql.org RPMs).
The setup is a circular replication
with 3 machines. Neither the Linux distribution
nor the kernel version seems to play a role:
several combinations of SuSE releases (8.0 - 8.0,
7.3 - 8.0, 8.0 - 7.3) and kernels (2.4.10,
2.4.19, 2.4.20-pre10) reproduce the
same situation.


Problem:

After one server goes down the "hard" way
(which can be simulated with "rcnetwork stop")
and comes back up through a reboot, the MySQL
master process naturally opens a new bin-log.

But the slave process on the next machine in the
replication chain keeps 'hanging' at a position
in the previous bin-log (Slave is running,
Slave IO is also 'Yes'), probably the position
it had reached when its master jumped off the
cliff without prior notice.

A 'slave stop;' followed by 'slave start;' solves
this problem, but the slave does not recover
automatically. It looks as if the slave process
is still listening on the socket of the
dead connection and has not noticed
that there is no longer anybody on the
other side.
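
For reference, the manual 'push' described above can be sketched
as the following statements, run on the hanging slave. This is only
a sketch of the workaround, not a fix; the surrounding status checks
are an addition for illustration.

```sql
-- Check which bin-log and position the slave currently reports.
SHOW SLAVE STATUS;

-- Manual 'push': stop and restart the slave so that it reconnects
-- to the master. (4.0.4 syntax; later releases also accept
-- STOP SLAVE / START SLAVE.)
SLAVE STOP;
SLAVE START;

-- Check again: the slave should now have moved on to the
-- master's new bin-log.
SHOW SLAVE STATUS;
```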

There is no corrupted data on any machine or the like,
just this inability to resume slave
operation without that manual 'push'.
Without that slave stop+start the hang
lasts 'forever' (much longer than the
master-retry time, which is kept at the
default of 60 seconds).
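
One setting that may be worth checking here is the slave's network
read timeout. That it is related to this hang is an assumption,
not a confirmed diagnosis; the statement below only displays the
current value (in seconds) on the slave.

```sql
-- Assumption: the time the slave waits for data on the master
-- connection before giving up on the read is governed by
-- slave_net_timeout, not by the master-retry time.
SHOW VARIABLES LIKE 'slave_net_timeout';
```

If the value shown is much larger than the 60-second retry time,
that would at least be consistent with a hang that appears to
last 'forever'.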

If the reboot on the master is done normally
(without cutting the connection first),
then everything works fine.


Michael
-- 
Michael Zimmermann  (http://vegaa.de)
