I've been having trouble with my master/slave setup. Recently the MySQL
slave repeatedly stopped with "invalid sql syntax" errors, even though
the same queries executed fine on the master. Each time I had to dig
through the logs, find the query, execute it manually on the slave, and
then use SQL_SLAVE_SKIP_COUNTER to resume replication past the corrupted
statement. I thought it might be hardware related since it only affected
the slave, so I moved the slave to a different blade (both servers are
blades).
However, today I was greeted with a Nagios alert that the slave had
stopped again. This time, the relay log definitely appears to be
corrupt. I ran mysqlbinlog > /dev/null on all of the master logs and
none are corrupt (including the one the slave had read up to). The
relay log on the slave is corrupt, though; mysqlbinlog reports:
"[EMAIL PROTECTED] mysql]# mysqlbinlog mysql02-relay-bin.010923 > /dev/null
ERROR: Error in Log_event::read_log_event(): 'read error', data_len:
38210134, event_type: 0
Could not read entry at offset 618730:Error in log format or read error"
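The check described above (decoding each log with mysqlbinlog and discarding the output) could be scripted along these lines. This is only a minimal sketch: it assumes mysqlbinlog exits with a non-zero status when it hits a read error, and CHECK_CMD is a hypothetical override variable (not a MySQL feature) so the loop can be tried without a MySQL installation.

```shell
#!/bin/sh
# Decode each log file passed as an argument and throw the output away.
# A failing decode is reported as CORRUPT, anything else as OK.
# CHECK_CMD is a hypothetical hook; by default mysqlbinlog is used.
check_logs() {
    bad=0
    for f in "$@"; do
        if ${CHECK_CMD:-mysqlbinlog} "$f" > /dev/null 2>&1; then
            echo "OK $f"
        else
            echo "CORRUPT $f"
            bad=$((bad + 1))
        fi
    done
    # Succeed only if no file failed to decode.
    [ "$bad" -eq 0 ]
}
```

Run from the slave's data directory, something like `check_logs mysql02-relay-bin.*` would list which relay logs fail to decode; the same function could be pointed at the master's mysql-bin.* files.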
Nothing too much different in the logs either:
071006 11:18:52 [Note] Slave I/O thread: connected to master
'[EMAIL PROTECTED]
4:3306', replication started in log 'mysql-bin.000104' at position
906124600
071008 9:07:12 [ERROR] Error reading packet from server: Lost
connection to MySQL server during query ( server_errno=2013)
071008 9:07:13 [Note] Slave I/O thread: Failed reading log event,
reconnecting to retry, log 'mysql-bin.000105' position 766367499
071008 9:07:13 [Note] Slave: connected to master
'[EMAIL PROTECTED]:3306',replication resumed in log 'mysql-bin.000105'
at position 766367499
071008 10:08:16 [ERROR] Error reading packet from server: Lost
connection to MySQL server during query ( server_errno=2013)
071008 10:08:16 [Note] Slave I/O thread: Failed reading log event,
reconnecting to retry, log 'mysql-bin.000105' position 819300906
071008 10:08:16 [Note] Slave: connected to master
'[EMAIL PROTECTED]:3306',replication resumed in log 'mysql-bin.000105'
at position 819300906
071008 12:12:40 [ERROR] Error reading packet from server: Lost
connection to MySQL server during query ( server_errno=2013)
071008 12:12:40 [Note] Slave I/O thread: Failed reading log event,
reconnecting to retry, log 'mysql-bin.000105' position 893443034
071008 12:12:40 [Note] Slave: connected to master
'[EMAIL PROTECTED]:3306',replication resumed in log 'mysql-bin.000105'
at position 893443034
071008 12:15:33 [ERROR] Error in Log_event::read_log_event(): 'read
error', data_len: 38210134, event_type: 0
071008 12:15:33 [ERROR] Error reading relay log event: slave SQL thread
aborted because of I/O error
071008 12:15:33 [ERROR] Slave: Could not parse relay log event entry.
The possible reasons are: the master's binary log is corrupted (you can
check this by running 'mysqlbinlog' on the binary log), the slave's
relay log is corrupted (you can check this by running 'mysqlbinlog' on
the relay log), a network problem, or a bug in the master's or slave's
MySQL code. If you want to check the master's binary log or slave's
relay log, you will be able to know their names by issuing 'SHOW SLAVE
STATUS' on this slave. Error_code: 0
071008 12:15:33 [ERROR] Error running query, slave SQL thread aborted.
Fix the problem, and restart the slave SQL thread with "SLAVE START". We
stopped at log 'mysql-bin.000105' position 893425700
Any help or ideas for tracking this down would be appreciated. I think
we are going to have to take down the production database to resync the
two and get replication going again. We mainly use the replica for
backups, to avoid downtime on the master while a backup runs, and as a
fallback in the event of a hardware issue with the master.
Thanks,
Frank