We replicate our database server so that we have a hot spare
with a fully up-to-date dataset.

Every so often, perhaps once every 10 days, I get a page from a
watchdog script[1] telling me that the slave has fallen out of sync.

I've noticed that each time, the slave disconnects from the master
and reconnects a few minutes before this happens:

031025  6:24:06  Error reading packet from server: Lost connection to MySQL server during query (server_errno=2013)
031025  6:24:06  Slave: Failed reading log event, reconnecting to retry, log 'dbms2-bin.318' position 26879196
031025  6:24:06  Slave: reconnected to master '[EMAIL PROTECTED]:3306',replication resumed in log 'dbms2-bin.318' at position 26879196
ERROR: 1062  Duplicate entry '3133173' for key 2
031025  9:32:35  Slave:  error running query [**insert that failed goes here**]
031025  9:32:35  Error running query, slave aborted. Fix the problem, and re-start the slave thread with "mysqladmin start-slave". We stopped at log 'dbms2-bin.319' position 68555844
031025  9:32:35  Slave thread exiting, replication stopped in log 'dbms2-bin.319' at position 68555844

Is the slave's disconnect/reconnect after such an error not sync-safe?

[1] The watchdog script checks that a frequently updated table
  has a row with a timestamp younger than 5 minutes on the slave.
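
For reference, the check in [1] amounts to roughly the following. This
is a simplified sketch in Python, not the actual script: the table name
`heartbeat`, the column `updated_at`, and both function names are
placeholders, and `cursor` is assumed to be any DB-API cursor connected
to the slave that returns DATETIME values as `datetime` objects.

```python
from datetime import datetime, timedelta

# Threshold from footnote [1]: the newest row must be younger than 5 minutes.
MAX_AGE = timedelta(minutes=5)


def slave_is_stale(newest_row_time, now, max_age=MAX_AGE):
    """Return True when the newest row seen on the slave is older than max_age."""
    if newest_row_time is None:
        # No rows at all on the slave: treat it as out of sync.
        return True
    return now - newest_row_time > max_age


def check_slave(cursor, now=None):
    """Query the slave for the newest timestamp and apply the staleness test.

    `heartbeat` and `updated_at` are hypothetical names standing in for
    the frequently updated table and its timestamp column.
    """
    cursor.execute("SELECT MAX(updated_at) FROM heartbeat")
    (newest,) = cursor.fetchone()
    return slave_is_stale(newest, now or datetime.now())
```

When `check_slave` returns True, the script sends the page. Note that
this only detects that replication has stopped or lagged; it says
nothing about why.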

-- 
Michael Bacarella                24/7 phone: 1-646-641-8662
Netgraft Corporation                   http://netgraft.com/
