Good issue - I had the same concerns, so we built our own system. As a side note, we run an admin server which we use to generate reports, run backups and so on - it takes load off the production servers, where speed is critical. Recovery from backup, however, is a whole other issue, and a corrupting error gets replicated very quickly. With the new system we have cut recovery from backup from 4 hours to 30 minutes, and the recovered data is newer.
Our approach is a little different, but basically the same. In an
ideal world I'd like to see MySQL add a feature to replication that
allowed me to set a variable controlling how quickly the SQL_THREAD
executes its queries... e.g. a variable such as
replication-delay=3600 could tell MySQL's replication thread to hold
off executing any command until the current time is 3600 seconds
beyond the timestamp in the binary log.
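In the meantime, the idea can be approximated from the outside with
START SLAVE UNTIL (available from MySQL 4.1.1). Here's a minimal
sketch, not something we actually run - the hostname and paths are
placeholders, and it assumes the IO_THREAD is left running all the
time so the relay logs stay current:

#!/bin/sh
#
# Rough userland approximation of the proposed replication-delay
# variable, run from cron once an hour on the slave. Each run
# replays events up to the master position recorded by the previous
# run, then records the master's current position for the next run,
# so the applied data lags the master by one to two hours.
#
master="insert.your.master.hostname.here"
posfile=/var/mysql/delaypos
#
# Replay up to the position we recorded an hour ago; the SQL
# thread stops by itself once it reaches that point
#
if [ -f $posfile ]
then
    read file pos < $posfile
    mysql --socket=/tmp/mysql.sock -e \
        "START SLAVE SQL_THREAD UNTIL MASTER_LOG_FILE='$file', MASTER_LOG_POS=$pos;"
fi
#
# Record where the master is right now, for the next run
#
mysql -h $master -e "SHOW MASTER STATUS\G" | \
    awk '/File:/ {f=$2} /Position:/ {p=$2} END {print f, p}' > $posfile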
Some of my questions:
1) What are the benefits to using relay_log_file and relay_log_pos
instead of master_log_file and master_log_pos? Is it that the slave's
relay logs would already exist locally? Perhaps that's good or bad?
Thoughts?
Relay logs are better to use for this for one major reason. Assuming
one of the reasons this server exists is to provide backup to the
primary, having the data copied from the master server to the slave
server puts a copy of the logs where you need them in the event of a
hardware failure on the master. In other words, if you manipulate the
slave's SQL_THREAD and keep the IO_THREAD running, you are copying
your data somewhere else pretty close to instantly. Most people would
put that in the "good" category.
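In practice that just means stopping only the SQL thread and leaving
the IO thread alone; for example:

# stop only the SQL thread; the IO thread keeps pulling binlog
# events from the master into the local relay logs
mysql --socket=/tmp/mysql.sock -e "STOP SLAVE SQL_THREAD;"
#
# confirm: Slave_IO_Running should still say Yes, and
# Slave_SQL_Running should now say No
mysql --socket=/tmp/mysql.sock -e "SHOW SLAVE STATUS\G" | \
    grep -E 'Slave_(IO|SQL)_Running'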
The solution I proposed above does this too, keeping the data copied
off the master all the time. Our home-built set of scripts doesn't do
this; we manage the process by controlling the IO_THREAD. It's easy,
and we have two primary servers running as a pair, so we have a live
"backup" on a separate system. We would, however, prefer to have the
data in the relay log current all the time and manipulate the
SQL_THREAD.
2) Has anyone done something like this?
Yes... our "like this" is a simple set of scripts, run by cron, that
start and stop the IO_THREAD and SQL_THREAD at certain times.
Specifically, the sequence (which repeats every two hours) goes like
this:
4:00pm stop SQL_THREAD
4:01pm flush logs, start IO_THREAD
4:04pm stop IO_THREAD
4:10pm start SQL_THREAD
The net impact of this is that the data on the admin server is, on
the low side, about 5 minutes behind live (right after the catch-up
that starts at 4:10pm), and on the high side a little over 2 hours
behind (just before the next catch-up, when the newest applied data
still dates from the previous fetch).
We also set all the "start" scripts to first check for the existence
of a file (/var/mysql/replicate); if the file doesn't exist, they
don't start anything. We then have a script that stops all
replication and nukes that flag file. It can be run on the machine,
and it is also tied to a TCP port (see the inetd sketch after the
scripts below), so all we have to do is hit that port with a telnet
connection or a web browser and replication is immediately stopped -
and it won't be started again by cron until the file gets put back.
That handles the emergency stop issue, which is why we felt
comfortable with a 5 minute low on this process.
By having the logs flush on the slave and the master every 2 hours as
part of this process, we get small chunks of binary log we can apply
to an overnight backup if we miss the replication stop before the
disaster hits this server. That makes recovery to a fairly recent
point in time simple - we could then spend some time munging through
the relevant binary log to eliminate the corrupting event before
applying the rest of them, while our service is back online.
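The replay itself is just mysqlbinlog piped into mysql, using
--stop-datetime (or --stop-position) to cut off just before the bad
statement; the file names and cutoff time below are made up for
illustration:

# restore last night's dump, then roll forward through the
# two-hour binlog chunks
mysql --socket=/tmp/mysql.sock < /var/backups/nightly-dump.sql
mysqlbinlog /var/mysql/logs/master-bin.041 | mysql --socket=/tmp/mysql.sock
#
# stop the last chunk just short of the corrupting event
mysqlbinlog --stop-datetime="2004-06-15 15:58:00" \
    /var/mysql/logs/master-bin.042 | mysql --socket=/tmp/mysql.sock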
3) If I made it robust and flexible would people be interested in it?
Because our production servers are a MASTER-MASTER pair, we are kind
of OK with the method we use of controlling the IO_THREAD in this
way, but I do acknowledge there is an attraction to having the relay
log (flushed regularly) hold the latest data, thus ensuring the admin
server has all the data in one form or another, even if it's not
actually executed on the database. So MAYBE we would be interested.
4) Is there a better way?
Yes - I still firmly believe the BEST solution here should come from
MySQL... give us a replication-delay type variable that allows us to
set an implementation delay on replication: the relay logs stay up to
date, and the queries are executed by the SQL_THREAD nnnn seconds
after the timestamp in the original binary log. There's no need to
change the log formats, and it's a fairly simple piece of code to add
at MySQL's end... clearly the default value would be 0 so it only
comes into play when someone wants it. I'd much rather set the value
to 30 minutes or an hour or something and know exactly how far behind
my backup server is, instead of the sliding 2 hour range I have now...
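Today the closest thing to knowing is spot-checking the slave while
its SQL thread is running - Seconds_Behind_Master in SHOW SLAVE
STATUS (available from MySQL 4.1.1) reports the lag, though it reads
NULL whenever the SQL thread is stopped, which is most of the time
under our scheme:

# check the current replication lag in seconds
# (NULL while the SQL thread is stopped)
mysql --socket=/tmp/mysql.sock -e "SHOW SLAVE STATUS\G" | \
    grep Seconds_Behind_Master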
Outside of a MySQL-based solution, our system works great. It's OK
for us to manage it without keeping the relay logs current, because
the production pair means there are two live copies of the data
protecting us from hardware failure. Something like what you describe
may or may not be worth doing for us.
Best Regards, Bruce
Here's a simplified version of our scripts (simplified because our
admin server actually manages multiple instances of mysql):
Crontab:
0 */2 * * * /usr/local/bin/sqlpause stop > /dev/null 2>&1
1 */2 * * * /usr/local/bin/iopause start > /dev/null 2>&1
4 */2 * * * /usr/local/bin/iopause stop > /dev/null 2>&1
10 */2 * * * /usr/local/bin/sqlpause start > /dev/null 2>&1
[mysql-admin:/usr/local/bin] root# more sqlpause
#!/bin/sh
#
# Bruce's MySQL Replication Management Scripts
#
# Suitable for use on LiveWorld's network only... There
# are other ways that may be more effective for other
# networks or setups; refer to the MySQL documentation.
#
# check to see if we have been told to "start"
#
if [ "$1" = "start" ]
then
    #
    # First flush local binlogs
    #
    mysqladmin --socket=/tmp/mysql.sock flush-logs
    #
    # Then start the SQL thread, but only if replication
    # is authorized (the flag file exists)
    #
    if [ -f /var/mysql/replicate ]
    then
        mysql --socket=/tmp/mysql.sock -e "START SLAVE SQL_THREAD;"
    fi
#
# If we are not told to start, then stop
#
else
    mysql --socket=/tmp/mysql.sock -e "STOP SLAVE SQL_THREAD;"
fi
[mysql-admin:/usr/local/bin] root# more iopause
#!/bin/sh
#
# Bruce's MySQL Replication Management Script
#
# Suitable for use on LiveWorld's network only... There
# are other ways that may be more effective for other
# networks or setups; refer to the MySQL documentation.
#
host="insert.your.master.hostname.here"
#
# check to see if we have been told to "start"
#
if [ "$1" = "start" ]
then
    #
    # First flush the remote (master) binlogs
    #
    mysqladmin -h $host flush-logs
    #
    # Then start the IO thread if replication is authorized
    #
    if [ -f /var/mysql/replicate ]
    then
        mysql --socket=/tmp/mysql.sock -e "START SLAVE IO_THREAD;"
    fi
#
# If we are not told to start, then stop
#
else
    mysql --socket=/tmp/mysql.sock -e "STOP SLAVE IO_THREAD;"
fi
And the emergency stop script:
[mysql-admin:/usr/local/bin] root# more noreplicate
#!/bin/sh
#
# Bruce's MySQL Replication Authorization Script
#
# This script stops replication and, by removing the file
# /var/mysql/replicate, prevents replication from starting
# again, since all the scripts used to start replication
# check for the existence of that file.
#
rm -f /var/mysql/replicate
/usr/local/bin/sqlpause stop
/usr/local/bin/iopause stop
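For the record, the "tied to a tcp port" part is just inetd doing the
work. A sketch of the hookup, assuming inetd is configured via
/etc/inetd.conf and port 9999 is free (both are arbitrary choices):

# /etc/services -- give the port a name:
noreplicate     9999/tcp
#
# /etc/inetd.conf -- run the script on any connection to the port,
# so "telnet admin-host 9999" (or a browser hit) kills replication:
noreplicate stream tcp nowait root /usr/local/bin/noreplicate noreplicate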