Re: Input Needed: Replication issues...

2002-01-14 Thread Kyle Hayes

On Saturday 12 January 2002 20:53, Jeremy Zawodny wrote:
 On Fri, Jan 11, 2002 at 08:08:51AM -0800, Kyle Hayes wrote:
  We write our own replication system using our own checkpointing and
  code.  We start with the update logs (we're not happy about the
  future removal of the update logs because of this).  I've looked
  through the MySQL source to see if I can figure out enough of the
  format of the binary log to see if I can make some Perl code that
  will parse it.  It is really easy to use the update log since all
  SQL end with a semicolon as the last character on a line.  Since
  Perl tends to be line-oriented, it is easy to find this.

 [snip]

 Several questions:

   (1) Do you use transactions at all, or is this MyISAM only?

   (2) Have you considered publishing the code?

   (3) Do I understand correctly, that you have your own primary keys
   which have nothing to do with MySQL's unique IDs?

1) MyISAM only (we started on some of this when 3.23 was knee-high to a 
bullfrog :-)

2) Yes, but we probably won't.  The code has a large number of assumptions 
about our set up.  I don't know that management would be too thrilled with 
it.  The system I'm working on now (to replace the old one) is much 
cleaner, but would still need a work to remove any Quicknet IP from it.  I 
don't have time to do the clean up, but in general I'd like to give it 
back to the community.

3) We have sets of IDs for each row that are unique per server.  They are 
made with something like a sequence table using LAST_INSERT_ID tricks.  
These are the IDs that tie everything together.  The auto increment fields 
are for bulk record keeping and things like purging old data efficiently.

Honestly, it is not that much code.  I wrote the whole thing in about a 
week full time.  I'll check with management to see if I can release it.

Best,
Kyle

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: Input Needed: Replication issues...

2002-01-12 Thread Heikki Tuuri

Hi!

Two-way replication is complex, and even more difficult is coping with
broken communications.

MySQL only supports one-way replication (master - slaves). Auto-inc column
values are communicated in the binlog to the slaves.

Some ideas:

- Generate primary keys where you concatenate the id of the database site
before the id number:

Paris0234, NewYork0123

Writing SELECT queries will be difficult.

- Write your own replication script like Kyle has done (maybe Kyle could use
the 'query-log' of MySQL instead of the 'update-log'?). Communicate changes
to the databases in human-readable SQL strings between the database nodes.
But you must be very careful in your application to take into account that
the two nodes will not be identical, because changes to them will be
recorded in different orders. The problem of conflicting primary keys during
a communication break can be solved with the above trick.

- Transactions might help a little bit if you want to make several changes
to the database as one atomic operation (= transaction).

Best regards,

Heikki Tuuri
Innobase Oy
---
Order technical MySQL/InnoDB support at https://order.mysql.com/
See http://www.innodb.com for the online manual and latest news on InnoDB



.
On Thursday 10 January 2002 17:22, Matthew Walker wrote:
 Where I work, we're at the point of needing to maintain two database
 servers, one inhouse, and the other at another physical location. We
 need to replicate between the servers, to keep them both up to date, as
 both will be getting updated. This leaves us with the problem of
 resolving primary key conflicts if the servers should ever lose touch
 with eachother for a while.

 We're looking at InnoDB, and wondering if it would be able to resolve
 those sorts of issues using transactions.

 Input please?

We looked closely at native MySQL replication, but after a few questions
about the handling of auto increment fields (we use them like row IDs all
over the place), we decided that MySQL's replication was broken for our
application.  We have more than one live database and they replicate to
each other.

We write our own replication system using our own checkpointing and code.
We start with the update logs (we're not happy about the future removal of
the update logs because of this).   I've looked through the MySQL source
to see if I can figure out enough of the format of the binary log to see
if I can make some Perl code that will parse it.  It is really easy to use
the update log since all SQL end with a semicolon as the last character on
a line.  Since Perl tends to be line-oriented, it is easy to find this.

Here's what we do:

- we run a special server process on each database.  This replicator
server will spool update log files on demand.

- on each machine, we run a special client process.  This process connects
to the replicator server (not MySQL's), gets the update log information
and puts it into the local database.  If necessary, we can rewrite any SQL
(we don't).

The server process puts a special checkpoint comment between each SQL
statement.  The checkpoint comments has a special header and trailer so
that we can recognize it.  It contains the name of the file that is
currently being spooled and the byte offset in that file.  This
information is stored on the client side in a special checkpoint file.

If the client process needs to be restarted or the network drops or
something causes replication to quit, it will restart where it left off
based on the checkpoint file.  This functionality is basically identical
to the native MySQL replication.

The reasons for using the update log and not native MySQL replication are
these:

- we can have more that two servers coupled together.

- we use auto increment fields all over.  With native mySQL replication,
these fields are filled in with values on the source side.   Thus, we get
collisions on the target side.  We have no need for the row IDs to match
on each machine as every row has another unique ID field.  We do need the
row IDs for purging old data and other things that do not rely on the
unique ID field.

- we can do SQL rewriting if we really need to.

These problems led us to write our own replication.  If you have the
luxury of redesigning your database or of designing with MySQL's native
replication in mind (we did not), then I would use the native version.

Best,
Kyle



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: Input Needed: Replication issues...

2002-01-12 Thread Jeremy Zawodny

On Fri, Jan 11, 2002 at 08:08:51AM -0800, Kyle Hayes wrote:

 We write our own replication system using our own checkpointing and
 code.  We start with the update logs (we're not happy about the
 future removal of the update logs because of this).  I've looked
 through the MySQL source to see if I can figure out enough of the
 format of the binary log to see if I can make some Perl code that
 will parse it.  It is really easy to use the update log since all
 SQL end with a semicolon as the last character on a line.  Since
 Perl tends to be line-oriented, it is easy to find this.

[snip]

Several questions:

  (1) Do you use transactions at all, or is this MyISAM only?

  (2) Have you considered publishing the code?

  (3) Do I understand correctly, that you have your own primary keys
  which have nothing to do with MySQL's unique IDs?

Jeremy
-- 
Jeremy D. Zawodny, [EMAIL PROTECTED]
Technical Yahoo - Yahoo Finance
Desk: (408) 349-7878   Fax: (408) 349-5454   Cell: (408) 685-5936

MySQL 3.23.41-max: up 10 days, processed 248,268,346 queries (280/sec. avg)

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: Input Needed: Replication issues...

2002-01-11 Thread Kyle Hayes

On Thursday 10 January 2002 17:22, Matthew Walker wrote:
 Where I work, we're at the point of needing to maintain two database
 servers, one inhouse, and the other at another physical location. We
 need to replicate between the servers, to keep them both up to date, as
 both will be getting updated. This leaves us with the problem of
 resolving primary key conflicts if the servers should ever lose touch
 with eachother for a while.

 We're looking at InnoDB, and wondering if it would be able to resolve
 those sorts of issues using transactions.

 Input please?

We looked closely at native MySQL replication, but after a few questions 
about the handling of auto increment fields (we use them like row IDs all 
over the place), we decided that MySQL's replication was broken for our 
application.  We have more than one live database and they replicate to 
each other.

We write our own replication system using our own checkpointing and code.  
We start with the update logs (we're not happy about the future removal of 
the update logs because of this).   I've looked through the MySQL source 
to see if I can figure out enough of the format of the binary log to see 
if I can make some Perl code that will parse it.  It is really easy to use 
the update log since all SQL end with a semicolon as the last character on 
a line.  Since Perl tends to be line-oriented, it is easy to find this.

Here's what we do:

- we run a special server process on each database.  This replicator 
server will spool update log files on demand.

- on each machine, we run a special client process.  This process connects 
to the replicator server (not MySQL's), gets the update log information 
and puts it into the local database.  If necessary, we can rewrite any SQL 
(we don't).

The server process puts a special checkpoint comment between each SQL 
statement.  The checkpoint comments has a special header and trailer so 
that we can recognize it.  It contains the name of the file that is 
currently being spooled and the byte offset in that file.  This 
information is stored on the client side in a special checkpoint file.  

If the client process needs to be restarted or the network drops or 
something causes replication to quit, it will restart where it left off 
based on the checkpoint file.  This functionality is basically identical 
to the native MySQL replication.

The reasons for using the update log and not native MySQL replication are 
these:

- we can have more that two servers coupled together.

- we use auto increment fields all over.  With native mySQL replication, 
these fields are filled in with values on the source side.   Thus, we get 
collisions on the target side.  We have no need for the row IDs to match 
on each machine as every row has another unique ID field.  We do need the 
row IDs for purging old data and other things that do not rely on the 
unique ID field.

- we can do SQL rewriting if we really need to.

These problems led us to write our own replication.  If you have the 
luxury of redesigning your database or of designing with MySQL's native 
replication in mind (we did not), then I would use the native version.

Best,
Kyle

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php