Thank you for the comments!

Re: Michael - Great point. One solution to the "two atomic increments of the same value" problem is simply to ROLLBACK the second transaction and re-run it after the first batch finishes committing. That way non-conflicting transactions can safely commit at the same time, while conflicting transactions remain serialized.
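Concretely, something like this minimal Python sketch (the Txn methods and the batch's conflict test are hypothetical stand-ins, not real server APIs):

import queue

retry_queue = queue.Queue()  # conflicting transactions waiting to re-run

def try_commit(txn, batch):
    """Commit txn as part of the current batch, or roll it back for retry."""
    txn.execute()                  # hypothetical: run the transaction's statements
    if batch.conflicts_with(txn):  # hypothetical: page-set overlap test
        txn.rollback()             # undo it...
        retry_queue.put(txn)       # ...and re-run it once this batch has committed
    else:
        txn.commit()
        batch.add(txn)             # replicas may apply these in parallel

def on_batch_durable(next_batch):
    """Once every transaction in the prior batch has committed, replay the
    rolled-back transactions into the next batch, serializing the conflicts."""
    while not retry_queue.empty():
        try_commit(retry_queue.get(), next_batch)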
Do you agree this would solve that problem?

Re: Adam - Huh, really great example. I don't see a solution off the cuff; I'll need to think on this (and determine whether this theoretical case would actually affect our real-world schema).

Thanks both, and anybody else who sees problems in the design, please share your thoughts!

-david

On Tue, Apr 19, 2016 at 6:11 AM, Adam Midvidy <[email protected]> wrote:

> David,
>
> I don't think this conflict detection mechanism works well with updates.
>
> Consider the following schedule of transactions, running against an empty
> table 'foo' with two integer-valued columns 'a' and 'b':
>
> 1) UPDATE foo SET b=2 WHERE a=1
>
> 2) INSERT INTO foo VALUES (1, 1), (1, 1)
>
> On the primary, the first update is a no-op since the table is empty.
> After the subsequent insert, the contents of foo are ((1, 1), (1, 1)).
>
> These transactions are batched together since they touch no common pages.
>
> When the replica applies the batch, 2 completes before 1, and the
> contents of 'foo' on the secondary are ((1, 2), (1, 2)). As you can see,
> the replicas have diverged!
>
> In general it will be hard to do this type of conflict analysis with
> physical replication; you would need to do some kind of analysis on the
> DML itself. Unfortunately, that boils down to detecting when arbitrary
> compositions of boolean predicates intersect, which is NP-complete.
>
> Adam
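To see the divergence concretely, here are both application orders from Adam's schedule simulated in a few lines of Python, with 'foo' modeled as a list of (a, b) tuples:

def update_b2_where_a1(table):      # UPDATE foo SET b=2 WHERE a=1
    return [(a, 2) if a == 1 else (a, b) for (a, b) in table]

def insert_rows(table, rows):       # INSERT INTO foo VALUES ...
    return table + list(rows)

# Primary order (1 then 2): the UPDATE is a no-op on the empty table.
primary = insert_rows(update_b2_where_a1([]), [(1, 1), (1, 1)])

# Replica order (2 then 1): the UPDATE now matches both inserted rows.
replica = update_b2_where_a1(insert_rows([], [(1, 1), (1, 1)]))

print(primary)             # [(1, 1), (1, 1)]
print(replica)             # [(1, 2), (1, 2)]
assert primary != replica  # diverged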
> On Tue, Apr 19, 2016 at 3:30 AM, David Barrett <[email protected]> wrote:
>
>> Thank you so much for the detailed response! I'm really surprised that
>> MySQL and Oracle are so... basic for replication. Here this whole time
>> I've assumed they had some magic capability, huh. Anyway, regarding this:
>>
>> On Mon, Apr 18, 2016 at 10:04 PM, Adam Midvidy <[email protected]> wrote:
>>
>>> I can't think of any CP systems used in industry that use your idea of
>>> batching non-conflicting replication updates and allowing replicas to
>>> apply them in parallel. ... The reason why this approach isn't popular
>>> is that efficiently detecting non-conflicting transactions is actually
>>> not a trivial task by any means
>>
>> Maybe I'm just naive, but it seems straightforward to detect conflicting
>> transactions. Basically:
>>
>> 1. Start out with:
>> - nonConflictingBatchID = 0
>>
>> 2. Set a callback that is called every time a query touches a page. In
>> this callback, maintain a pair of sets:
>> - pagesReadByThisTransaction
>> - pagesWrittenByThisTransaction
>>
>> 3. Before you commit, compare those sets to:
>> - pagesReadByCommittedTransactions
>> - pagesWrittenByCommittedTransactions
>>
>> 4. It's totally fine for the new transaction to read from a page read by
>> other transactions already committed. But if you read from a page written
>> by another transaction, or write to a page read by another transaction,
>> then you know that the new transaction conflicts with one or more of the
>> previous, already committed transactions.
>>
>> 5. If there are no conflicts, then commit this new transaction with
>> confidence that it didn't conflict with any of the other transactions in
>> this batch. Send out this transaction to slaves with the current
>> "nonConflictingBatchID", signaling to slaves that they are free to commit
>> this transaction simultaneously with other transactions in the same
>> batch. Also:
>> - pagesReadByCommittedTransactions += pagesReadByThisTransaction +
>> pagesWrittenByThisTransaction; // A write is also a read
>> - pagesWrittenByCommittedTransactions += pagesWrittenByThisTransaction;
>>
>> 6. If there is a conflict, however, then you can still go ahead and
>> commit this transaction on the master. However, increment the
>> "nonConflictingBatchID" before replicating out the transaction, to signal
>> "wait for all previous batches to finish committing before committing",
>> and then set:
>> - pagesReadByCommittedTransactions = pagesReadByThisTransaction;
>> - pagesWrittenByCommittedTransactions = pagesWrittenByThisTransaction;
>>
>> This doesn't actually seem that hard, and maintaining a few unordered
>> sets of page numbers seems like it'll be pretty fast -- much faster than
>> the queries themselves, at least.
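Pulled together, steps 1-6 amount to roughly the following Python sketch; the on_page_touched callback and the send_to_slaves hook are hypothetical placeholders for whatever the storage engine actually exposes:

non_conflicting_batch_id = 0
pages_read_by_committed = set()     # also holds written pages: a write is a read
pages_written_by_committed = set()

class Txn:
    def __init__(self):
        self.pages_read = set()
        self.pages_written = set()

    def on_page_touched(self, page_id, is_write):
        # Step 2: invoked by the engine every time this query touches a page.
        (self.pages_written if is_write else self.pages_read).add(page_id)

def commit_and_replicate(txn, send_to_slaves):
    # Assumes the commit path itself is serialized on the master.
    global non_conflicting_batch_id, pages_read_by_committed, \
           pages_written_by_committed
    reads = txn.pages_read | txn.pages_written  # a write is also a read
    # Steps 3-4: read/write or write/read overlap with the batch is a conflict.
    conflict = bool(reads & pages_written_by_committed
                    or txn.pages_written & pages_read_by_committed)
    if conflict:
        # Step 6: start a new batch; slaves drain all earlier batches first.
        non_conflicting_batch_id += 1
        pages_read_by_committed = set(reads)
        pages_written_by_committed = set(txn.pages_written)
    else:
        # Step 5: same batch; slaves may commit it in parallel with the rest.
        pages_read_by_committed |= reads
        pages_written_by_committed |= txn.pages_written
    send_to_slaves(txn, non_conflicting_batch_id)

Note the reset in the conflict branch: the new batch's sets start from this one transaction, which is what lets slaves treat each batch as an independent unit.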
>> As for:
>>
>>> and that when the workload is composed of non-conflicting transactions,
>>> partitioning (sharding) provides the same performance benefits with
>>> less complexity.
>>
>> True sharding isn't an option, as mentioned before. But this effectively
>> creates "ephemeral shards" by detecting non-conflicting queries in
>> realtime. I don't think it'll take much to get pretty massive
>> improvements: if I can get even *two* non-conflicting commits per batch
>> on average, that's the same performance advantage as having two
>> perfectly disjoint shards. And in practice I expect we'll get vastly
>> more concurrency than that, as we have millions of users across a large
>> database -- the odds that two back-to-back transactions even touch the
>> same company (let alone are linked in such a way as to conflict) seem
>> really low.
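As a back-of-envelope check on that claim, assume transactions pick one of N companies uniformly at random and only same-company transactions can conflict (both assumptions are illustrative, not from the thread). Then the expected batch size before the first repeat is the birthday problem, roughly sqrt(pi*N/2):

import random

def average_batch_size(n_companies, trials=1000):
    """A batch ends when a transaction hits a company already in the batch."""
    total = 0
    for _ in range(trials):
        seen = set()
        while True:
            company = random.randrange(n_companies)
            if company in seen:
                break              # conflict: this txn starts the next batch
            seen.add(company)
        total += len(seen)
    return total / trials

print(average_batch_size(1_000_000))  # ~1250 with a million companies

So under those (admittedly idealized) assumptions, the batches would be far larger than two.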
>> Anyway, that's my plan above. I'd be curious for your thoughts on
>> whether it'll work, or if I'm missing something obvious. Thanks!
>>
>> -david

_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers