Hi Kristian,

On Sep 23, 2010, at 11:50 PM PDT, Kristian Nielsen wrote:

Robert Hodges <[email protected]> writes:

#2 is faster but does have some hidden complexities.  One tricky problem is
to ensure that you can easily recreate the original serialized commit order
in case you hit something that can't be parallelized due to
application-level dependencies between apply streams.  This occurs commonly.

My thought is that it is sufficient to write the "COMMIT" event of each
transaction in-order to the replication stream.

That seems right to me.
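
To make that concrete, here is a minimal sketch in Python (the event tuples
and names are hypothetical, not Drizzle's actual binlog format) showing that
the serialized order falls out of the COMMIT events alone:

    # Fragments of different transactions may interleave freely, but
    # COMMIT events appear in the original serialized commit order.
    events = [
        (1, "BEGIN",  None),
        (2, "BEGIN",  None),
        (2, "ROW",    "insert into t1"),
        (1, "ROW",    "update t2"),
        (2, "COMMIT", None),   # txn 2 committed first on the master
        (1, "COMMIT", None),
    ]

    commit_order = [txn for (txn, kind, _) in events if kind == "COMMIT"]
    print(commit_order)  # [2, 1] -- the original serialized order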


Another practical problem of #2 is how to deal with transactions that run for
some period of time and then fail.  Would you not write them to the log at
all, or would you start to write them before they actually commit?  In the
latter case you would need to do the standard DBMS log thing and have a
rollback at the end to cancel them.

Right, a rollback must be written at the end for transactions that do not
commit, but I read that Drizzle already does that (in one special case).

You also need to write some kind of "server restart" event, so the slave knows
to roll back any uncommitted transactions after the master crashes and comes
back up.

At this point the binlog *does* look pretty much like a DBMS log, doesn't it?

One consequence, then, of design #2 is that fast and robust plugin
implementations will be more complex, and hence fewer.  You'll either have to
demultiplex interleaved transaction fragments, as others have described, or
maintain a cache of connections that apply transactions in parallel.  Either
way you have to deal with a number of special cases.
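
To illustrate the first option, here is a rough sketch of the demultiplexing
(the event names and the apply callback are my assumptions, and a real
implementation would hand committed transactions to parallel apply
connections rather than applying them inline):

    # Buffer interleaved fragments per transaction and apply a transaction
    # only when its COMMIT arrives, so slave commit order matches the
    # master's.  ROLLBACK and "server restart" events discard buffered work.
    def demultiplex(events, apply_txn):
        pending = {}  # txn_id -> list of buffered row events
        for txn, kind, payload in events:
            if kind == "BEGIN":
                pending[txn] = []
            elif kind == "ROW":
                pending[txn].append(payload)
            elif kind == "COMMIT":
                apply_txn(txn, pending.pop(txn))
            elif kind == "ROLLBACK":
                pending.pop(txn, None)   # transaction failed on the master
            elif kind == "SERVER_RESTART":
                pending.clear()          # master crashed; drop in-flight work

    demultiplex(
        [(1, "BEGIN", None), (1, "ROW", "a"), (2, "BEGIN", None),
         (2, "ROW", "b"), (2, "COMMIT", None), (1, "ROLLBACK", None)],
        lambda txn, rows: print("apply", txn, rows),
    )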

Personally I think being fast and readily parallelizable should win over a
bit of implementation complexity.  However, there is a real trade-off here
that runs deeper than a bit of light buffering to iron things out.

Yet another parallel apply problem is restart after a crash but this is less
difficult.  There are several ways for slaves to remember their positions on
different streams as long as transactions on those streams are fully
serialized with respect to other transactions within the stream.

Yes, some care is needed, but it should be possible.  The slave needs to
transactionally record the replication stream position corresponding to the
engine state.  I think one way is to first scan the stream from an "early
enough" position, recording all transactions with a commit sequence earlier
than the saved position (i.e. transactions already applied).  Then start
replication again from "early enough", skipping all transactions found to
already be applied.  Another way is to handle it like the MySQL transaction
coordinator does with log_xid() and unlog(), but I do not know if this code
still exists in Drizzle.
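
The first scheme seems workable to me.  In sketch form (the commit-sequence
numbering and the saved position are assumptions on my part):

    # saved_seq: commit sequence recorded transactionally with the engine
    # state, i.e. the last transaction known to be included in it.

    # Pass 1: from an "early enough" position, note which transactions
    # committed at or before the saved position (already applied).
    def already_applied(events, saved_seq):
        return {txn for txn, kind, seq in events
                if kind == "COMMIT" and seq <= saved_seq}

    # Pass 2: replay from the same position, skipping those transactions.
    def replay(events, saved_seq, apply_txn):
        skip = already_applied(events, saved_seq)
        for txn, kind, seq in events:
            if kind == "COMMIT" and txn not in skip:
                apply_txn(txn)

    replay([(7, "COMMIT", 41), (9, "COMMIT", 42), (8, "COMMIT", 43)],
           saved_seq=42, apply_txn=lambda txn: print("re-apply", txn))
    # re-applies only txn 8; 7 and 9 are already in the engine state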

BTW, I get the impression that you have in mind using multiple replication
streams, each with non-interleaved transactions. I understood #2 to mean
interleaving transaction events within a single stream.

Tungsten currently uses separate streams only at the very end, for apply on
the slave.  Parallelization solves the problem of slave apply being blocked by
I/O operations in single-threaded replication.  Before that point you are
typically better off with a single, very efficient stream of transactions, so
you get sequential writes and reads off the disk logs.  I guess that may
change with SSDs or disks with large caches, but that's not where most people
are right now.
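
As a rough illustration of the per-stream bookkeeping on restart (the
partition-by-key scheme and the commit-with-position step are assumptions,
not how Tungsten actually works):

    # Partition transactions across a few apply streams and record each
    # stream's position together with the applied data, so every stream
    # can restart independently after a crash.
    NUM_STREAMS = 4

    def stream_for(key):
        return sum(key.encode()) % NUM_STREAMS  # deterministic toy hash

    positions = [0] * NUM_STREAMS  # would be persisted transactionally

    def apply_on_stream(key, seq):
        s = stream_for(key)
        if seq <= positions[s]:
            return  # already applied before the crash; skip on restart
        # ... apply the transaction here, then record the new position
        # in the same commit as the data:
        positions[s] = seq

    for key, seq in [("accounts", 1), ("orders", 2), ("accounts", 3)]:
        apply_on_stream(key, seq)
    print(positions)  # per-stream restart points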

- Kristian.
