Ratheesh K J wrote:
Thanks,
It helped me a lot. I wanted to know:
1. What are the various scenarios where my replication setup can
fail (considering even issues like network failure, server
reboot, etc.)? What is the normal procedure to correct the failure
when something unexpected happens?
You should first read the relevant parts of the manual at
https://dev.mysql.com/doc before asking such questions.
Basically:
-Use good hardware with ECC RAM and RAID controllers to
minimize trouble with faulty hardware.
-Never write on the slaves without knowing what this could do to your
replication setup.
-Watch the disk space and make sure there is always enough room for the
binlogs. Otherwise a full disk can leave you with half-written binlogs on
either the slave or the master, which causes trouble and takes
some work to get up and running again.
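
As a rough sketch of the disk space check: the two function names below are my own placeholders, and the limit is whatever you consider safe for your binlog volume.

```shell
#!/bin/sh
# Hypothetical helper: print the usage percentage (0-100) of the
# filesystem that holds the given directory.
disk_usage_pct() {
    # df -P guarantees one line per filesystem; field 5 is "Capacity", e.g. "42%"
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Hypothetical helper: succeed if the filesystem holding $1 is
# less than $2 percent full.
disk_has_room() {
    [ "$(disk_usage_pct "$1")" -lt "$2" ]
}
```

From cron you could then run something like `disk_has_room /var/lib/mysql 90 || bash my_alarm_script_that_sends_mail_or_whatever.sh`, where /var/lib/mysql is only an assumption about where your binlogs live.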
When the master goes down or the network connection is lost, the slave
automatically tries to reconnect once a minute or so. Restarting the
master or exchanging some network equipment is no problem. When the
slave reboots, it tries to reconnect on startup, too.
This is the "out-of-the-box" behaviour. You can modify it in the my.cnf
(e.g. with the "skip-slave-start" option).
2. What are the scenarios where the SQL THREAD stops running and
what are the scenarios where the IO THREAD stops running?
The SQL thread stops when it can't run an SQL query from the binlogs for any
reason, as you experienced when the table already existed.
The IO thread only stops when it gets an error reading a binlog from the
master. When it's only a lost connection, it reconnects automatically.
Other problems (e.g. being unable to read a binlog) should never happen as
long as you don't delete binlogs on the master that have not yet been
copied over to the slave by the IO thread (compare the output of
"show master status" and "show slave status") and you don't have faulty
hardware (I/O errors on the hard disk or similar).
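
To see which binlog a slave is still reading before you delete anything on the master, you can pull single fields out of the "show slave status" output. A small sketch follows; slave_field is a made-up helper name, and it only assumes the usual "Name: value" layout of the \G output.

```shell
#!/bin/sh
# Hypothetical helper: read 'show slave status\G' output on stdin and
# print the value of the field named in $1 (exact name match).
slave_field() {
    awk -v key="$1" '
        {
            i = index($0, ": ")              # first "name: value" separator
            if (i == 0) next
            name = substr($0, 1, i - 1)
            gsub(/^[ \t]+/, "", name)        # \G output right-aligns the names
            if (name == key) { print substr($0, i + 2); exit }
        }'
}
```

On the slave, `mysql -e 'show slave status\G' | slave_field Master_Log_File` prints the master binlog the IO thread is currently reading. On the master, PURGE MASTER LOGS TO '<that file>' removes only the binlogs before the named one, so that slave still finds everything it needs.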
3. Does SQL_SLAVE_SKIP_COUNTER skip the statement of the master
binlog from being replicated to the slave relay log, or has the
statement already been copied into the slave relay log and
been skipped from the relay log?
It skips the entry in the slave's local copy of the binlog (the relay
log). The IO thread copies the whole binlog, and the SQL thread skips an
entry in it when you set sql_slave_skip_counter.
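
The usual sequence for skipping is to stop the slave, set the counter, and start it again. Here is a sketch that just builds those statements; skip_events_sql is a name I made up, and you should only use it once you know why the statement failed.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL that skips N events (default 1)
# from the relay log on the slave.
skip_events_sql() {
    n=${1:-1}
    case $n in
        *[!0-9]*)
            echo "skip_events_sql: count must be a number" >&2
            return 1 ;;
    esac
    printf 'STOP SLAVE; SET GLOBAL sql_slave_skip_counter = %s; START SLAVE;\n' "$n"
}
```

Run it on the slave with `mysql -e "$(skip_events_sql 1)"` and check "show slave status" afterwards to confirm that both threads are running again.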
4. How do I know immediately that replication has failed? (I
have heard that the enterprise edition has some technique for
this.)
Watch the logfile, it is written there. Or run a cronjob once a minute
with something like
[ "$(mysql -e 'show slave status\G' | grep -c '_Running: Yes')" -eq 2 ] || bash my_alarm_script_that_sends_mail_or_whatever.sh
which raises the alarm unless both the IO thread and the SQL thread report "Yes".
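
If you prefer a small script over a one-liner, the check could look roughly like this. The function name replication_ok is hypothetical. It has to look for the value "Yes", because the Slave_IO_Running and Slave_SQL_Running lines are printed even when a thread has stopped.

```shell
#!/bin/sh
# Hypothetical helper: read 'show slave status\G' output on stdin and
# succeed only if both replication threads report "Yes".
replication_ok() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'Slave_IO_Running: Yes' &&
        printf '%s\n' "$status" | grep -q 'Slave_SQL_Running: Yes'
}
```

The cron entry then becomes `mysql -e 'show slave status\G' | replication_ok || bash my_alarm_script_that_sends_mail_or_whatever.sh`. Empty output (the server is down or is not a slave) also triggers the alarm.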
regards
Jan
--
MySQL General Mailing List