Re: problems w/ Replication over the Internet

Eric Bergen Sun, 20 Apr 2008 19:42:09 -0700

Hi Jan,

You have two separate issues here. The first is the link between the
external slave and the master. Running the MySQL connection through
something like stunnel may help with the connection and data loss
issues.
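
A rough sketch of what such a tunnel could look like, assuming stunnel
runs on both hosts. TLS adds an integrity check to every record, so a
packet damaged in transit should cause a dropped connection and a
reconnect instead of a bad event in the relay log. The host name
master.example.com, the ports 3307 and 3306, and the certificate path
are made up for illustration and are not taken from the setup described
below.

```ini
; --- on the master: accept TLS and hand plain traffic to local mysqld ---
cert = /etc/stunnel/stunnel.pem

[mysql-replication]
accept = 3307
connect = 127.0.0.1:3306

; --- on the external slave (separate stunnel.conf): wrap traffic in TLS ---
client = yes

[mysql-replication]
accept = 127.0.0.1:3307
connect = master.example.com:3307
```

The slave would then replicate from 127.0.0.1:3307 instead of
connecting to the master directly.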


The second problem is that your slave is corrupt. Duplicate key errors
are sometimes caused by a corrupt table but more often by restarting
replication from an incorrect binlog location. Try recloning the slave
and starting replication again through stunnel.
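
To make "the correct binlog location" concrete: one way to avoid
guessing is to take the coordinates from the clone itself. The sketch
below assumes the clone is a mysqldump taken with --master-data, which
writes a CHANGE MASTER TO line with the master's file and position into
the dump header. The helper names, the sample header and the local
tunnel endpoint 127.0.0.1:3307 are hypothetical, and MASTER_USER and
MASTER_PASSWORD are left out on purpose.

```python
import re

# Matches the coordinate line that mysqldump --master-data writes near the
# top of the dump (it is commented out with "-- " when --master-data=2 is used).
COORD_RE = re.compile(
    r"CHANGE MASTER TO\s+MASTER_LOG_FILE='([^']+)',\s*MASTER_LOG_POS=(\d+)",
    re.IGNORECASE,
)


def find_coordinates(dump_lines):
    """Return (log_file, log_pos) from the first CHANGE MASTER TO line found."""
    for line in dump_lines:
        match = COORD_RE.search(line)
        if match:
            return match.group(1), int(match.group(2))
    raise ValueError("no CHANGE MASTER TO line found in dump header")


def change_master_sql(log_file, log_pos, host="127.0.0.1", port=3307):
    """Build the statement to run on the freshly loaded slave.

    host and port default to a hypothetical local tunnel endpoint.
    """
    return (
        "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%d, "
        "MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;"
        % (host, port, log_file, log_pos)
    )


if __name__ == "__main__":
    # Made-up dump header, for illustration only.
    header = [
        "-- MySQL dump",
        "-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=98;",
    ]
    print(change_master_sql(*find_coordinates(header)))
```

Starting from the position recorded at dump time means the slave applies
exactly the events that came after the snapshot, so there should be no
need for sql_slave_skip_counter.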

-Eric

On Tue, Apr 15, 2008 at 1:11 AM, Jan Kirchhoff <[EMAIL PROTECTED]> wrote:
> I have a setup with a master and a bunch of slaves in my LAN as well as
>  one external slave that is running on a Xen-Server on the internet.
>  All servers run Debian Linux and its mysql version 5.0.32
>  Binlogs are around 2 GB per day. I have no trouble at all with my local
>  slaves, but the external one hangs once every two days.
>  As this server has no "other" problems like crashing programs, kernel
>  panics, corrupted files or such, I am pretty sure that the hardware is OK.
>
>  the slave's log:
>
>  Apr 15 06:39:19 db-extern mysqld[24884]: 080415  6:39:19 [ERROR] Error
>  reading packet from server: Lost connection to MySQL server during query
>  ( server_errno=2013)
>  Apr 15 06:39:19 db-extern mysqld[24884]: 080415  6:39:19 [Note] Slave
>  I/O thread: Failed reading log event, reconnecting to retry, log
>  'mysql-bin.045709' position 7334981
>  Apr 15 06:39:19 db-extern mysqld[24884]: 080415  6:39:19 [Note] Slave:
>  connected to master '[EMAIL PROTECTED]:1234',replication resumed in log
>  'mysql-bin.045709' at position 7334981
>  Apr 15 06:39:20 db-extern mysqld[24884]: 080415  6:39:20 [ERROR] Error
>  in Log_event::read_log_event(): 'Event too big', data_len: 503316507,
>  event_type: 16
>  Apr 15 06:39:20 db-extern mysqld[24884]: 080415  6:39:20 [ERROR] Error
>  reading relay log event: slave SQL thread aborted because of I/O error
>  Apr 15 06:39:20 db-extern mysqld[24884]: 080415  6:39:20 [ERROR] Slave:
>  Could not parse relay log event entry. The possible reasons are: the
>  master's binary log is corrupted (you can check this by running
>  'mysqlbinlog' on the binary log), the slave's relay log is corrupted
>  (you can check this by running 'mysqlbinlog' on the relay log), a
>  network problem, or a bug in the master's
>  or slave's MySQL code. If you want to check the master's binary log or
>  slave's relay log, you will be able to know their names by issuing 'SHOW
>  SLAVE STATUS' on this slave. Error_code: 0
>  Apr 15 06:39:20 db-extern mysqld[24884]: 080415  6:39:20 [ERROR] Error
>  running query, slave SQL thread aborted. Fix the problem, and restart
>  the slave SQL thread with "SLAVE START". We stopped at log
>  'mysql-bin.045709' position 172
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [Note] Slave
>  I/O thread killed while reading event
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [Note] Slave
>  I/O thread exiting, read up to log 'mysql-bin.045709', position 23801854
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [Note] Slave
>  SQL thread initialized, starting replication in log 'mysql-bin.045709'
>  at position 172, relay log './db-extern-relay-bin.000001' position: 4
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [Note] Slave
>  I/O thread: connected to master '[EMAIL PROTECTED]:1234',  replication
>  started in log 'mysql-bin.045709' at position 172
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [ERROR] Error
>  reading packet from server: error reading log entry ( server_errno=1236)
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [ERROR] Got
>  fatal error 1236: 'error reading log entry' from master when reading
>  data from binary log
>  Apr 15 06:40:01 db-extern mysqld[24884]: 080415  6:40:01 [Note] Slave
>  I/O thread exiting, read up to log 'mysql-bin.045709', position 172
>
>  slave start;
>  doesn't help.
>
>  slave stop, reset slave; change master to
>  master_log_file="mysql-bin.045709", master_log_pos=172;slave start
>  does not help either
>
>  the only way to get this up and running again is to do a change master
>  to master_log_file="mysql-bin.045709", master_log_pos=0 and use
>  sql_slave_skip_counter when I get duplicate key errors. this sucks.
>  When this problem occurs, the log position is always a small number, I
>  would say less than 500.
>
>  I also get connection errors in the log from time to time, but it
>  recovers itself:
>  Apr 14 22:27:17 db-extern mysqld[24884]: 080414 22:27:17 [ERROR] Error
>  reading packet from server: Lost connection to MySQL server during query
>  ( server_errno=2013)
>  Apr 14 22:27:17 db-extern mysqld[24884]: 080414 22:27:17 [Note] Slave
>  I/O thread: Failed reading log event, reconnecting to retry, log
>  'mysql-bin.045705' position 34671615
>  Apr 14 22:27:17 db-extern mysqld[24884]: 080414 22:27:17 [Note] Slave:
>  connected to master '[EMAIL PROTECTED]:1234',replication resumed in log
>  'mysql-bin.045705' at position 34671615
>
>  Sometimes I have
>  Apr 13 23:22:04 db-extern mysqld[24884]: 080413 23:22:04 [ERROR] Slave:
>  Error 'You have an error in your SQL syntax; check the manual that
>  corresponds to your MySQL server version for the right syntax to use
>  near '^\' at line 1' on query.
>  Apr 13 23:22:04 db-extern mysqld[24884]: 080413 23:22:04 [ERROR] Error
>  running query, slave SQL thread aborted. Fix the problem, and restart
>  the slave SQL thread with "SLAVE START". We stopped at log
>  'mysql-bin.045699' position 294101453
>  But this time
>  slave stop, reset slave; change master to
>  master_log_file="mysql-bin.045699", master_log_pos=294101453;slave start
>  helps!
>
>  master# mysqlbinlog --position=172 mysql-bin.045709
>  /*!40019 SET @@session.max_insert_delayed_threads=0*/;
>  /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
>
>  ERROR: Error in Log_event::read_log_event(): 'read error', data_len:
>  543519343, event_type: 116
>  # End of log file
>  ROLLBACK /* added by mysqlbinlog */;
>  /*!50003 SET [EMAIL PROTECTED]/;
>
>  so the position "172" seems to be wrong?
>
>  master# mysqlbinlog mysql-bin.045709  >/dev/null
>  master#
>
>  The binlog on the master is ok (As I said, all other slaves replicate
>  without any problems...)
>
>  Any suggestions? I have cronjobs running now that read the output of
>  "show slave status" and run queries like the above
>  slave stop, reset slave; change master to
>  master_log_file="mysql-bin.045699", master_log_pos=294101453; if
>  necessary and every second day I do a change master to
>  master_log_file="abc", master_log_pos=0;slave start in the console and
>  start a "sql_slave_skip_counter"-loop in the bash until everything is
>  running without error again.
>  btw: Although the master's binlog position the slave tells me (in this
>  case 172) is a relatively low number, I have to send at least a few
>  dozen "sql_slave_skip_counter" queries. So the problem seems to be
>  that the "172" should be something in the ten-thousands or more...
>
>  Has anybody seen something like this?
>
>  Jan
>
>
>  --
>  MySQL General Mailing List
>  For list archives: http://lists.mysql.com/mysql
>  To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
>
>



-- 
high performance mysql consulting.
http://provenscaling.com

