Hello,
I have a replication setup on two Linux boxes (Debian woody, kernel 2.4.21-xfs, MySQL 4.1.7-standard, the official Intel-compiler binary from mysql.com).
master:~# mysqladmin status
Uptime: 464848 Threads: 10 Questions: 296385136 Slow queries: 1752 Opens: 2629 Flush tables: 1 Open tables: 405 Queries per second avg: 637.596
slave:~# mysqladmin status
Uptime: 463460 Threads: 2 Questions: 292885156 Slow queries: 6 Opens: 2510 Flush tables: 1 Open tables: 327 Queries per second avg: 631.953
Both systems have identical hardware (P4 2.4 GHz, 3 GB RAM, SCSI hardware RAID); the connection between them is gigabit Ethernet.
Everything used to work fine, but I wanted to get rid of InnoDB, since I only used it for very big tables containing historical data, and those tables have been moved to another server. I was running out of disk space, InnoDB data files can only grow but never shrink, and I didn't need InnoDB anyway, so it had to go.
I stopped the slave, converted all remaining InnoDB tables to MyISAM, added skip-innodb to my.cnf on both the master and the slave, restarted the server, and re-initialized replication the "classical" way: flush tables with read lock, copy /var/lib/mysql to the slave (not much, just around 20GB), reset master, unlock tables. Then I started mysqld on the slave and ran reset slave and slave start.
Everything was fine and very fast for 4 days (from Saturday until Wednesday afternoon), then suddenly the slave stopped.
This is where the weird stuff starts:
"show slave status" tells me everything is fine, except that it shows "Slave_IO_Running: No".
After typing "slave start", it says "Slave_IO_Running: Yes" and "Slave_SQL_Running: No". Very strange. Then I did a "slave stop; slave start;" and everything was fine again: the slave caught up and carried on. Today (Thursday afternoon) the same thing happened again and was again solved by "slave stop; slave start;". Then it happened once more around 10pm, and again the stop/start trick got it working.
I have appended the output of my mysql shell below.
Can anybody help me with that?
This is a production system under heavy load and I can't play around with different mysql-versions and such...
If I don't find a solution really quickly, I'll have to help myself with some shell-script daemon that checks whether replication is running and issues "stop slave; start slave" commands if it isn't... not really the way it should be :(
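A minimal sketch of the check such a watchdog would make, written in Python rather than shell. It assumes the text of "show slave status\G" has already been captured somehow (for example through the mysql client); the function names are made up for illustration, and nothing here talks to the server.

```python
def parse_slave_status(text):
    """Turn the vertical-format output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "*** 1. row ***" separator and the
        # "1 row in set" footer (which contains no colon).
        if not line or line.startswith("*") or ":" not in line:
            continue
        # Split on the first colon only, so values such as Last_Error
        # may themselves contain colons.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def slave_needs_restart(status):
    """True if either replication thread is not reported as running."""
    return (status.get("Slave_IO_Running") != "Yes"
            or status.get("Slave_SQL_Running") != "Yes")


# The statements the workaround issues; actually sending them to the
# slave is left out of this sketch.
RESTART_STATEMENTS = ["stop slave", "start slave"]

if __name__ == "__main__":
    sample = """\
Slave_IO_Running: No
Slave_SQL_Running: Yes
Seconds_Behind_Master: 4384
"""
    if slave_needs_restart(parse_slave_status(sample)):
        print("; ".join(RESTART_STATEMENTS))
```

Treating a missing field as "not running" is deliberate: if the status output cannot be parsed, the check errs on the side of restarting the slave threads.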
Thanks,
Jan
SLAVE:
slave:~# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping : 7
cpu MHz : 2392.077
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 4771.02
slave:~# free
             total       used       free     shared    buffers     cached
Mem:       3105104    2355364     749740          0        440    1514104
-/+ buffers/cache:     840820    2264284
Swap:       779144     428072     351072
MASTER
master:~# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping : 7
cpu MHz : 2392.163
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 4771.02
master:~# free
             total       used       free     shared    buffers     cached
Mem:       3105104    3096016       9088          0        648    2087780
-/+ buffers/cache:    1007588    2097516
Swap:       779144     391732     387412
Slave shell:
wpdb2:~# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 23083 to server version: 4.1.7-standard
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
wpdb2 mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.10.26
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000210
Read_Master_Log_Pos: 146168522
Relay_Log_File: wpdb2-relay-bin.000210
Relay_Log_Pos: 146168608
Relay_Master_Log_File: mysql-bin.000210
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 146168522
Relay_Log_Space: 146168608
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 4384
1 row in set (0.00 sec)
slave mysql> slave start;
Query OK, 0 rows affected (0.01 sec)
slave mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.26
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000210
Read_Master_Log_Pos: 186399548
Relay_Log_File: slave-relay-bin.000210
Relay_Log_Pos: 146168608
Relay_Master_Log_File: mysql-bin.000210
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 146168522
Relay_Log_Space: 186399677
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 4395
1 row in set (0.00 sec)
slave mysql> slave stop;
Query OK, 0 rows affected (0.00 sec)
slave mysql> slave start;
Query OK, 0 rows affected (0.01 sec)
slave mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.26
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000211
Read_Master_Log_Pos: 501070714
Relay_Log_File: slave-relay-bin.000210
Relay_Log_Pos: 148765772
Relay_Master_Log_File: mysql-bin.000210
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 148765643
Relay_Log_Space: 1575017939
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 4729
1 row in set (0.00 sec)
slave mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.26
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000211
Read_Master_Log_Pos: 501273227
Relay_Log_File: slave-relay-bin.000210
Relay_Log_Pos: 155647931
Relay_Master_Log_File: mysql-bin.000210
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 155647802
Relay_Log_Space: 1575220452
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 4729
1 row in set (0.00 sec)
slave mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.26
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000211
Read_Master_Log_Pos: 502052054
Relay_Log_File: slave-relay-bin.000210
Relay_Log_Pos: 172407186
Relay_Master_Log_File: mysql-bin.000210
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 172407057
Relay_Log_Space: 1575999279
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 4693
1 row in set (0.00 sec)
slave mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: 192.168.10.26
Master_User: repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000211
Read_Master_Log_Pos: 987239824
Relay_Log_File: wpdb2-relay-bin.000211
Relay_Log_Pos: 987239782
Relay_Master_Log_File: mysql-bin.000211
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 987239653
Relay_Log_Space: 987239953
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
1 row in set (0.00 sec)
Now it looks like it's working again... At first it worked for 4 days, then for another 24 hours, and then for only 6 hours.
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]