Ratheesh K J wrote:
Thanks,
It helped me a lot. I wanted to know

   1. what are the various scenarios where my replication setup can
      fail? (considering even issues like network failure and server
      reboot etc). What is the normal procedure to correct the failure
      when something unpredicted happens?

You should first read the relevant parts of the manual at https://dev.mysql.com/doc before asking such questions.
Basically:
- Use good hardware with ECC RAM and RAID controllers to minimize trouble from faulty hardware.
- Never write to the slaves without knowing what this could do to your replication setup.
- Watch the disk space and make sure there is always enough room for the binlogs. Otherwise a full disk can leave you with half-written binlogs on either the slave or the master, which causes trouble and takes some work to get up and running again.
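
For the disk space point, a minimal sketch of a check that could run from cron. The function name disk_usage_ok, the 80% default limit and the /var/lib/mysql path in the usage line are assumptions for illustration, not anything MySQL ships. It reads the output of "df -P" on stdin and succeeds only while usage is below the limit.

```shell
# Reads "df -P <path>" output on stdin (header line plus one data line).
# Exit status 0 only if the usage percentage is below the limit
# (first argument, default 80). Empty input counts as a failure.
disk_usage_ok() {
    limit=${1:-80}
    awk -v limit="$limit" '
        NR == 2 { sub(/%/, "", $5); used = $5 + 0; seen = 1 }
        END { exit !(seen && used < limit) }
    '
}
```

It could be called as: df -P /var/lib/mysql | disk_usage_ok 80 || bash my_alarm_script_that_sends_mail_or_whatever.sh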

When a master goes down or the network connection is lost, the slave automatically tries to reconnect once a minute or so. Restarting the master or replacing some network equipment is no problem. When the slave reboots, it tries to reconnect on startup, too.

This is the "out-of-the-box" behaviour. You can modify it in my.cnf (e.g. with the "skip-slave-start" option).

   2. What are the scenarios where the SQL THREAD stops running and
      what are the scenarios where the IO THREAD stops running?

The SQL thread stops when it can't execute an SQL statement from the replicated binlog (the relay log) for any reason, as you experienced when the table already existed.

The IO thread only stops when it hits an error reading a binlog from the master. When it's only a lost connection, it reconnects automatically. Other problems (e.g. being unable to read a binlog) should never happen as long as you don't delete binlogs on the master that the IO thread has not yet copied over to the slave (compare the output of "show master status" and "show slave status"), and as long as you don't have faulty hardware (I/O errors on the hard disk or similar).
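
To avoid deleting binlogs the slave still needs, one option is to derive the purge statement from the slave's own status. This is only a sketch: the function name is made up, and it assumes a single slave. With several slaves you would have to use the oldest Master_Log_File among them. It reads "show slave status\G" output on stdin and prints a PURGE BINARY LOGS statement that removes only the files before the one the IO thread is currently reading.

```shell
# Reads "SHOW SLAVE STATUS\G" output on stdin and prints a purge statement
# for the master. PURGE BINARY LOGS TO 'x' deletes the logs *before* x,
# so the file the IO thread is still reading is kept.
# Exit status 1 (and no output) if no Master_Log_File line was found.
purge_sql_from_slave_status() {
    awk -v q="'" '
        $1 == "Master_Log_File:" { file = $2 }
        END {
            if (file == "") exit 1
            printf "PURGE BINARY LOGS TO %s%s%s;\n", q, file, q
        }
    '
}
```

Run mysql -e 'show slave status\G' | purge_sql_from_slave_status on the slave, review the printed statement, and only then execute it on the master.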

   3. Does SQL_SLAVE_SKIP_COUNTER skip the statement of the master
      binlog from being replicated to the slave relay log OR Has the
      statement already been copied into the slave relay log and has
      been skipped from the relay log?

It skips the entry in the slave's local copy of the binlog (the relay log). The IO thread replicates the whole binlog, and the SQL thread skips an entry in it when you use sql_slave_skip_counter.
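
The usual sequence for skipping an entry looks roughly like the sketch below. It assumes classic binlog-position replication and that you have already checked in "show slave status" which statement failed and decided it is safe to skip. The helper name skip_statements_sql is hypothetical. It only prints the statements, so you can read them before piping them into mysql on the slave.

```shell
# Prints the statements that skip the next N events (default 1) in the
# slave's relay log. Only the SQL thread is stopped and restarted; the
# IO thread keeps copying the master's binlog in the meantime.
skip_statements_sql() {
    n=${1:-1}
    case $n in
        ''|*[!0-9]*) echo "skip count must be a number" >&2; return 1 ;;
    esac
    printf 'STOP SLAVE SQL_THREAD;\n'
    printf 'SET GLOBAL sql_slave_skip_counter = %s;\n' "$n"
    printf 'START SLAVE SQL_THREAD;\n'
}
```

For example: skip_statements_sql 1 | mysql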

   4. How do I know immediately that replication has failed? (I
      have heard that the enterprise edition has some technique for
      this.)

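Watch the log file; it is written there. Or run a cron job once a minute with something like [ "$(mysql -e 'show slave status\G' | grep -c '_Running: Yes')" -eq 2 ] || bash my_alarm_script_that_sends_mail_or_whatever.sh (the check has to match "Yes", because the _Running lines are printed even when a thread has stopped).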
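
If you prefer something more readable than a one-liner for the cron job, here is a small sketch. The function name replication_ok is an assumption; the field names Slave_IO_Running and Slave_SQL_Running are the ones printed by "show slave status\G". It succeeds only when both threads report "Yes", so a stopped thread, a thread stuck in "Connecting", or an empty result (mysql not reachable) all trigger the alarm.

```shell
# Reads "SHOW SLAVE STATUS\G" output on stdin.
# Exit status 0 only if both replication threads report "Yes".
replication_ok() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

In the crontab entry this would be used as: mysql -e 'show slave status\G' | replication_ok || bash my_alarm_script_that_sends_mail_or_whatever.sh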



regards
Jan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
