On Thursday 10 January 2002 17:22, Matthew Walker wrote:
> Where I work, we're at the point of needing to maintain two database
> servers, one inhouse, and the other at another physical location. We
> need to replicate between the servers, to keep them both up to date, as
> both will be getting updated. This leaves us with the problem of
> resolving primary key conflicts if the servers should ever lose touch
> with each other for a while.
>
> We're looking at InnoDB, and wondering if it would be able to resolve
> those sorts of issues using transactions.
>
> Input please?

We looked closely at "native" MySQL replication, but after a few questions 
about the handling of auto-increment fields (we use them like row IDs all 
over the place), we decided that MySQL's replication was broken for our 
application.  We have more than one live database, and they replicate to 
each other.

We wrote our own replication system, with our own checkpointing and code.  
It starts from the update logs (which is why we're not happy about their 
planned removal).  I've looked through the MySQL source to see if I could 
figure out enough of the binary log format to write some Perl code that 
parses it.  The update log is really easy to use by comparison: every SQL 
statement ends with a semicolon as the last character on a line, and since 
Perl tends to be line-oriented, finding those boundaries is easy.
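
Roughly, the parse loop looks like the following (a simplified sketch, not 
our production code; handle_statement() is a stand-in for whatever you do 
with each statement):

    # Read an update log and hand off complete SQL statements.
    open my $log, '<', 'update.001' or die "open: $!";
    my $stmt = '';
    while (my $line = <$log>) {
        $stmt .= $line;
        if ($line =~ /;\s*$/) {       # semicolon ends the line
            handle_statement($stmt);  # one full statement accumulated
            $stmt = '';
        }
    }
    close $log;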

Here's what we do:

- we run a special server process alongside each database.  This 
replicator server spools update log files on demand.

- on each machine, we also run a special client process.  It connects to 
the remote replicator server (not to mysqld itself), reads the update log 
stream, and applies it to the local database.  If necessary, we can 
rewrite any SQL on the way through (we don't).  An outline of the client 
loop follows this list.
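
In outline, the client is just a loop like the one below (a bare sketch 
using IO::Socket and DBI; the host, port, credentials, and wire format are 
all made up for illustration):

    use IO::Socket::INET;
    use DBI;

    # Connect to the remote replicator server (not to mysqld) and to
    # the local database.
    my $spool = IO::Socket::INET->new(
        PeerAddr => 'other-site.example.com',
        PeerPort => 7777,
    ) or die "connect: $!";
    my $dbh = DBI->connect('DBI:mysql:database=ourdb', 'repl', 'secret',
                           { RaiseError => 1 });

    my $stmt = '';
    while (my $line = <$spool>) {
        $stmt .= $line;
        next unless $line =~ /;\s*$/;  # same end-of-statement test as above
        $dbh->do($stmt);               # apply it to the local database
        $stmt = '';
    }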

The server process puts a special checkpoint comment between each pair of 
SQL statements.  The checkpoint comment has a special header and trailer 
so that we can recognize it, and it carries the name of the file currently 
being spooled and the byte offset within that file.  The client stores 
this information on its side in a special checkpoint file.
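
The exact format doesn't matter much; the REPL-CKPT marker below is an 
invented stand-in for our real header and trailer.  On the server side, 
between statements, something like:

    # Emit a checkpoint naming the spool file and the byte offset
    # we have reached in it.
    printf $client "/* REPL-CKPT file=%s offset=%d REPL-CKPT */\n",
        $file, tell($log);

and on the client side, at the top of the read loop:

    # Recognize the checkpoint and record it, writing to a temp file
    # and renaming so a crash cannot leave it half-written.
    if ($line =~ m{/\* REPL-CKPT file=(\S+) offset=(\d+) REPL-CKPT \*/}) {
        open my $ckpt, '>', 'checkpoint.tmp' or die "open: $!";
        print $ckpt "$1 $2\n";
        close $ckpt;
        rename 'checkpoint.tmp', 'checkpoint' or die "rename: $!";
        next;  # a checkpoint is not SQL, so don't apply it
    }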

If the client process needs to be restarted, or the network drops, or 
anything else causes replication to quit, it picks up where it left off 
based on the checkpoint file.  This behavior is basically identical to 
that of native MySQL replication.
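
Resuming amounts to reading that file back at startup and telling the 
replicator server where to begin (again a sketch; the SEND request format 
is invented):

    # Ask the server to spool from the last checkpoint, or from the
    # beginning if we have never written one.
    my ($file, $offset) = ('', 0);
    if (open my $ckpt, '<', 'checkpoint') {
        my $saved = <$ckpt>;
        ($file, $offset) = split ' ', $saved;
        close $ckpt;
    }
    print $spool "SEND $file $offset\n";
    # ...then drop into the normal read-and-apply loop.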

The reasons for using the update log and not native MySQL replication are 
these:

- we can have more than two servers coupled together.

- we use auto-increment fields all over.  With native MySQL replication, 
these fields are filled in with concrete values on the source side, so we 
get collisions on the target side: if two live servers each hand out id 
101 locally, shipping those literal values to each other produces 
duplicate keys.  We have no need for the row IDs to match across machines, 
since every row has another unique ID field; we do need the row IDs for 
purging old data and other things that do not rely on the unique ID field.

- we can do SQL rewriting if we really need to (a sketch of the hook 
follows this list).
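
The rewriting hook is nothing clever: on the client, every statement 
passes through one sub before being applied, and today that sub is the 
identity (sketch; rewrite_sql() is an invented name):

    # Hook between reading a statement and applying it.  A real
    # rewrite would be a substitution or two in here.
    sub rewrite_sql {
        my ($sql) = @_;
        # e.g. $sql =~ s/\bold_table\b/new_table/g;
        return $sql;
    }

    $dbh->do(rewrite_sql($stmt));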

These problems led us to write our own replication.  If you have the 
luxury of redesigning your database or of designing with MySQL's native 
replication in mind (we did not), then I would use the native version.

Best,
Kyle
