I've got an application that uses a fairly large (~50 million rows,
~1 GB on disk) table of read-only data.  The table changes maybe once
a month, but when it changes, (almost) EVERY row in the table changes.

The app needs to be replicated into several datacenters worldwide
using relatively slow backend links.  For this reason and others (I
need to be able to control when each datacenter picks up updates,
etc.), native MySQL replication isn't attractive.  I'm considering
building a scheme where I insert the data into a table once, ship
around a gzipped mysqldump, and load it into each datacenter -- this
is easy, uses less bandwidth, is easy to control via cron, and fits
well into the rest of our infrastructure.
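
Roughly what I have in mind there (db/table names, hosts and paths are
just placeholders):

    # master: dump only this table and compress it
    # (--opt includes --add-drop-table, so the reload replaces foo in place)
    mysqldump --opt mydb foo | gzip > /tmp/foo.sql.gz

    # ship it over the slow link however fits the infrastructure
    scp /tmp/foo.sql.gz remote-dc:/tmp/foo.sql.gz

    # slave, from cron, whenever that datacenter should pick up the update
    gunzip -c /tmp/foo.sql.gz | mysql mydb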

Then I found mysqlhotcopy.  Neato!  I've tested it, and this seems to work:

1) use mysqlhotcopy to copy the table on the "replication master"

2) gzip the table/index/data files and ship them someplace remote 

3) (on the slave) unzip them 

4) LOCK TABLES foo WRITE

5) FLUSH TABLE foo

6) copy the unzipped data files over the running mysql data files for
the single table I'm interested in.  There's clearly a problem here if
the machine crashes during this step, but it can be worked out to just
3 calls to rename(2), which is atomic on a POSIX fs, so that's less of
an issue than it could be.  (The whole slave-side swap is sketched
after this list.)

7) FLUSH TABLE foo

8) profit!  It looks like table foo now contains the new data.  It
takes a LOT less time than reinserting all the data into the table.
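
For concreteness, steps 4-7 on a slave would look something like the
following sketch.  mydb, foo, /var/lib/mysql and the staging directory
are placeholders; the table is assumed to be MyISAM (foo.frm/.MYD/.MYI,
which is what mysqlhotcopy handles anyway); the staging directory is
assumed to sit on the same filesystem as the data dir so each mv is a
single rename(2); and I'm assuming the client's \! shell escape behaves
here the way it does interactively:

    mysql mydb <<'EOF'
    LOCK TABLES foo WRITE;
    FLUSH TABLE foo;
    -- swap in the new files; each mv below is one rename(2)
    \! mv /var/lib/mysql/staging/foo.frm /var/lib/mysql/mydb/foo.frm
    \! mv /var/lib/mysql/staging/foo.MYD /var/lib/mysql/mydb/foo.MYD
    \! mv /var/lib/mysql/staging/foo.MYI /var/lib/mysql/mydb/foo.MYI
    FLUSH TABLE foo;
    UNLOCK TABLES;
    EOF

The point of doing the renames through the same connection is that the
write lock from LOCK TABLES only lives as long as that connection, so
the file swap has to happen while it's still open (an interactive
session plus a second shell would do just as well).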

Other than "you should really use mysql native replication", does
anyone have any comments on whether this is likely to be reliable, or
why it's a bad idea?  I'm using 3.23.49 (Debian stable); Is FLUSH
TABLE likely to change in future versions in a way that will break
this?

-- 
                                        thanks,
                
                                        Will
