Hi Kristian,

On Sep 23, 2010, at 11:50 PM PDT, Kristian Nielsen wrote:
> Robert Hodges <[email protected]> writes:
>
>> #2 is faster but does have some hidden complexities. One tricky
>> problem is to ensure that you can easily recreate the original
>> serialized commit order in case you hit something that can't be
>> parallelized due to application-level dependencies between apply
>> streams. This occurs commonly [...]
>
> My thought is that it is sufficient to write the "COMMIT" event of
> each transaction in-order to the replication stream.

That seems right to me.

>> Another practical problem of #2 is how to deal with transactions that
>> run for some period of time and then fail. Would you not write them
>> to the log at all, or would you start to write them before they
>> actually commit? In that case you would need to do the standard DBMS
>> log thing and write a rollback at the end to cancel them.
>
> Right, a rollback must be written at the end for transactions that do
> not commit, but I read that Drizzle already does that (in one special
> case). You also need to write some kind of "server restart" event, so
> the slave knows to roll back any uncommitted transactions after the
> master crashes and comes back up.

At this point the binlog *does* look pretty much like a DBMS log, doesn't it?

One consequence, then, of design #2 is that fast and robust plugin implementations will be more complex, and hence fewer. You'll either have to demultiplex the interleaved transaction fragments, as others have described, or maintain a cache of connections that apply transactions in parallel. Either way you have to deal with a number of special cases. Personally, I think "fast and readily parallelizable" should win over a bit of implementation complexity. However, there is a real trade-off here that runs deeper than a bit of light buffering to iron things out.

>> Yet another parallel apply problem is restart after a crash, but this
>> one is less difficult. There are several ways for slaves to remember
>> their positions on different streams, as long as the transactions on
>> each stream are fully serialized with respect to the other
>> transactions within that stream.
>
> Yes, some care is needed, but it should be possible. The slave needs
> to record, transactionally, the replication stream position
> corresponding to the engine state. I think one way is to first scan
> the stream from an "early enough" position, recording all transactions
> whose commit sequence is earlier than the saved positions (that is,
> transactions that were already applied), and then start replication
> again from the same "early enough" position, skipping every
> transaction found to have been applied already. Another way is to
> handle it the way the MySQL transaction coordinator does with
> log_xid() and unlog(), but I do not know if that code still exists in
> Drizzle.
>
> BTW, I get the impression that you have in mind using multiple
> replication streams, each with non-interleaved transactions. I
> understood #2 to mean interleaving transaction events within a single
> stream.
>
> - Kristian.

Tungsten currently uses separate streams only at the very end, for apply on the slave. Parallelization there solves the problem of slave apply being blocked by I/O operations in single-threaded replication. Before that point you are (typically) better off with a single, very efficient stream of transactions, so that writes to and reads from the disk logs stay sequential. I guess that may change with SSDs or disks with large caches, but that's not where most people are right now.
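To make the demultiplexing option a bit more concrete, here is a minimal sketch of what a slave-side demultiplexer for an interleaved stream could look like. This is illustrative Python only; the event kinds (STMT, COMMIT, ROLLBACK, SERVER_RESTART) and every name in it are assumptions for the example, not Drizzle's actual replication API:

    from collections import namedtuple

    # One replication event; txn_id ties interleaved fragments together.
    # kind is one of "STMT", "COMMIT", "ROLLBACK", "SERVER_RESTART".
    Event = namedtuple("Event", ["txn_id", "kind", "payload"])

    def demultiplex(stream, apply_transaction):
        pending = {}  # txn_id -> statements buffered so far
        for ev in stream:
            if ev.kind == "STMT":
                pending.setdefault(ev.txn_id, []).append(ev.payload)
            elif ev.kind == "COMMIT":
                # COMMIT events appear in the log in master commit order,
                # so applying here reproduces the serialized commit order.
                apply_transaction(pending.pop(ev.txn_id, []))
            elif ev.kind == "ROLLBACK":
                # The transaction failed on the master; drop its fragments.
                pending.pop(ev.txn_id, None)
            elif ev.kind == "SERVER_RESTART":
                # The master crashed and came back; nothing still open at
                # this point can ever commit, so discard it all.
                pending.clear()

For example, feeding it two interleaved transactions, one of which rolls back:

    log = [Event(1, "STMT", "INSERT ..."),
           Event(2, "STMT", "UPDATE ..."),
           Event(2, "COMMIT", None),
           Event(1, "ROLLBACK", None)]
    demultiplex(log, lambda stmts: print("apply:", stmts))
    # prints: apply: ['UPDATE ...']

Because only COMMIT events trigger an apply, and COMMITs are written in-order, the slave reproduces the master's serialized commit order even though the fragments arrive interleaved.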
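On recording the stream position transactionally, the essential point is that the applied data and the position must move together, under one engine transaction, so a crash can never leave them inconsistent. A toy illustration, with sqlite3 standing in for the storage engine (the repl_pos table and all names are invented; the upsert syntax needs SQLite 3.24 or later):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.execute("CREATE TABLE repl_pos "
                 "(stream_id INTEGER PRIMARY KEY, commit_seq INTEGER)")

    def apply_and_record(stmts, stream_id, commit_seq):
        # One engine transaction covers both the replicated changes and
        # the position update; `with conn:` commits or rolls back both.
        with conn:
            for s in stmts:
                conn.execute(s)
            conn.execute(
                "INSERT INTO repl_pos (stream_id, commit_seq) VALUES (?, ?) "
                "ON CONFLICT(stream_id) DO UPDATE SET "
                "commit_seq = excluded.commit_seq",
                (stream_id, commit_seq))

    apply_and_record(["INSERT INTO t VALUES ('x')"], stream_id=0, commit_seq=42)

If the process dies anywhere inside apply_and_record(), either both the data change and the new position survive, or neither does, which is exactly what restart logic has to rely on.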
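Finally, a sketch of the two-pass restart scheme described above: scan once from an "early enough" position to learn which transactions each stream has already committed, then replay from the same position, skipping those. Again, stream_id, commit_seq, and saved_pos are assumed names for the example, not real Drizzle structures:

    from collections import namedtuple

    # Like Event above, but carrying the apply stream and the global
    # commit sequence number (both invented fields, for illustration).
    LogEvent = namedtuple("LogEvent",
                          ["txn_id", "stream_id", "commit_seq", "kind"])

    def find_already_applied(events, saved_pos):
        """Pass 1: collect ids of transactions whose COMMIT has a commit
        sequence at or below the position durably saved for its stream."""
        applied = set()
        for ev in events:
            if ev.kind == "COMMIT" and \
               ev.commit_seq <= saved_pos.get(ev.stream_id, -1):
                applied.add(ev.txn_id)
        return applied

    def restart_replication(events, saved_pos, apply_event):
        """Pass 2: replay from the same "early enough" position, skipping
        everything pass 1 proved to be applied already. `events` must be
        re-iterable (a list, or a log that can be rescanned)."""
        applied = find_already_applied(events, saved_pos)
        for ev in events:
            if ev.txn_id not in applied:
                apply_event(ev)

The only requirements are that the log can be rescanned from "early enough" and that saved_pos was recorded transactionally with the engine state, as in the previous sketch.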

