Thank you for the comments!

Re: Michael - Great point.  One solution to the "two atomic increments of
the same value" problem is to ROLLBACK the second transaction and re-run it
after the first batch finishes committing.  This way non-conflicting
transactions can safely commit at the same time, while conflicting
transactions remain serialized.  Do you agree this would solve that problem?
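
As a minimal sketch of that rollback-and-requeue idea (Python-ish; the
Batch object and its conflicts_with/on_complete methods, plus txn.rerun,
are made-up names for illustration, not real server hooks):

    def try_commit(txn, current_batch):
        # If txn conflicts with anything already committed in this batch
        # (e.g. overlapping page sets), serialize it behind the batch.
        if current_batch.conflicts_with(txn):
            txn.rollback()
            # Re-run only once every transaction in the current batch has
            # finished committing, so the conflicting pair stays serialized.
            current_batch.on_complete(txn.rerun)
        else:
            # Non-conflicting: commit now, tagged with this batch's ID.
            txn.commit(batch_id=current_batch.id)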

Re: Adam - Huh, really great example.  I don't see a solution off the cuff;
I'll need to think on this (and determine if this theoretical case would
actually affect our real-world schema).

Thanks to you both; if anybody else sees problems in the design, please
share your thoughts!

-david



On Tue, Apr 19, 2016 at 6:11 AM, Adam Midvidy <[email protected]> wrote:

> David,
>
> I don't think this conflict detection mechanism works well with updates.
>
> Consider the following schedule of transactions, running against an empty
> table 'foo' with two integer valued columns 'a' and 'b':
>
> 1) UPDATE foo SET b=2 WHERE a=1
>
> 2) INSERT INTO foo VALUES (1, 1), (1, 1)
>
> On the primary the first update is a no-op since the table is empty.  After
> the subsequent insert, the contents of foo are ((1, 1), (1, 1)).
>
> These transactions are batched together since they touch no common pages.
>
> When the replica applies the batch in parallel, suppose 2 completes before
> 1: the update is no longer a no-op, and the contents of 'foo' on the
> secondary are ((1, 2), (1, 2)).  As you can see, the replicas have
> diverged!
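>
> (A toy re-enactment in Python makes the divergence easy to see; the two
> helper functions below just stand in for the two statements:)
>
>     def update_b_where_a1(foo):        # UPDATE foo SET b=2 WHERE a=1
>         return [(a, 2) if a == 1 else (a, b) for (a, b) in foo]
>
>     def insert_rows(foo):              # INSERT INTO foo VALUES (1,1),(1,1)
>         return foo + [(1, 1), (1, 1)]
>
>     primary = insert_rows(update_b_where_a1([]))   # update first: no-op
>     replica = update_b_where_a1(insert_rows([]))   # insert first
>     # primary == [(1, 1), (1, 1)]; replica == [(1, 2), (1, 2)]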
>
> In general it will be hard to do this type of conflict analysis with
> physical replication; you would need to do some kind of analysis on the DML
> itself. Unfortunately that boils down to detecting whether arbitrary
> compositions of Boolean predicates intersect, which is NP-complete.
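>
> (Concretely: deciding whether two WHERE clauses can ever select the same
> row is satisfiability of their conjunction.  A brute-force sketch over
> three hypothetical Boolean columns, exponential in the column count:)
>
>     from itertools import product
>     p1 = lambda x, y, z: x or y            # first query's predicate
>     p2 = lambda x, y, z: (not x) and z     # second query's predicate
>     # Can any possible row satisfy both predicates at once?
>     intersects = any(p1(*row) and p2(*row)
>                      for row in product([False, True], repeat=3))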
>
> Adam
>
> On Tue, Apr 19, 2016 at 3:30 AM, David Barrett <[email protected]>
> wrote:
>
>> Thank you so much for the detailed response!  I'm really surprised that
>> MySQL and Oracle are so... basic for replication.  Here this whole time
>> I've assumed they had some magic capability, huh.  Anyway, regarding this:
>>
>> On Mon, Apr 18, 2016 at 10:04 PM, Adam Midvidy <[email protected]>
>> wrote:
>>
>>> I can't think of any CP systems used in industry that use your idea of
>>> batching non-conflicting replication updates and allowing replicas to apply
>>> them in parallel. ... The reason why this approach isn't popular is that
>>> efficiently detecting non-conflicting transactions is actually not a
>>> trivial task by any means
>>>
>>
>> Maybe I'm just naive, but it seems straightforward to detect conflicting
>> transactions.  Basically (a rough code sketch follows these steps):
>>
>> 1. Start out with:
>> - nonConflictingBatchID = 0
>>
>> 2. Set a callback that is called every time a query touches a page.  In
>> this callback, maintain two sets:
>> - pagesReadByThisTransaction
>> - pagesWrittenByThisTransaction
>>
>> 3. Before you commit, compare those pages to:
>> - pagesReadByCommittedTransactions
>> - pagesWrittenByCommittedTransactions
>>
>> 4. It's totally fine for the new transaction to read from a page read by
>> other transactions already committed.  But if you read from a page written
>> by another transaction, or write to a page read by another transaction,
>> then you know that the new transaction conflicts with one or more of the
>> previous, already committed transactions.
>>
>> 5. If there are no conflicts, then commit this new transaction with
>> confidence that it didn't conflict with any of the other transactions in
>> this batch.  Send out this transaction to slaves with the current
>> "nonConflictingBatchID", signaling to slaves that they are free to commit
>> this transaction simultaneously with other transactions in the same batch.
>> Also:
>> - pagesReadByCommittedTransactions += pagesReadByThisTransaction +
>> pagesWrittenByThisTransaction; // A write is also a read
>> - pagesWrittenByCommittedTransactions += pagesWrittenByThisTransaction;
>>
>> 6. If there is a conflict, however, then you can still go ahead and
>> commit this transaction on the master.  However, increment the
>> "nonConflictingBatchID" before replicating out the transaction, to signal
>> "wait for all previous batches to finish being committed before
>> committing", and then set:
>> - pagesReadByCommittedTransactions = pagesReadByThisTransaction;
>> - pagesWrittenByCommittedTransactions = pagesWrittenByThisTransaction;
>>
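>> A rough sketch of steps 1-6 in Python (the txn.pagesRead/pagesWritten
>> fields and the replicate() stub are hypothetical stand-ins for the page
>> callback and the send-to-slaves step, not real MySQL hooks):
>>
>>     nonConflictingBatchID = 0
>>     pagesReadByCommitted = set()      # includes writes; see step 5
>>     pagesWrittenByCommitted = set()
>>
>>     def replicate(txn, batch_id):     # stand-in for "send to slaves"
>>         pass
>>
>>     def on_commit(txn):
>>         # txn.pagesRead / txn.pagesWritten are the per-transaction
>>         # sets maintained by the step-2 callback.
>>         global nonConflictingBatchID, pagesReadByCommitted, pagesWrittenByCommitted
>>         conflict = (txn.pagesRead & pagesWrittenByCommitted
>>                     or txn.pagesWritten & pagesReadByCommitted
>>                     or txn.pagesWritten & pagesWrittenByCommitted)
>>         if conflict:
>>             # Step 6: start a new batch; slaves must drain all earlier
>>             # batches before committing anything in this one.
>>             nonConflictingBatchID += 1
>>             pagesReadByCommitted = set(txn.pagesRead)
>>             pagesWrittenByCommitted = set(txn.pagesWritten)
>>         else:
>>             # Step 5: join the current batch; a write is also a read.
>>             pagesReadByCommitted |= txn.pagesRead | txn.pagesWritten
>>             pagesWrittenByCommitted |= txn.pagesWritten
>>         replicate(txn, nonConflictingBatchID)
>>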
>> This doesn't actually seem that hard, and maintaining a series of
>> unordered sets of page numbers seems like it'll be pretty fast -- much
>> faster than the queries themselves, at least.
>>
>> As for:
>>
>> and that when the workload is composed of non-conflicting transactions,
>>> partitioning (sharding) provides the same performance benefits with less
>>> complexity.
>>>
>>
>> True sharding isn't an option, as mentioned before.  But this effectively
>> creates "ephemeral shards" by detecting non-conflicting queries in
>> realtime.  In practice I don't think it'll take much to get pretty
>> massive improvements: if I can get even *two* non-conflicting commits on
>> average, that's the same performance advantage as having 2 perfectly
>> disjoint shards.  But in practice, I expect we'll get vastly more
>> concurrency than that, since we have millions of users across a large
>> database -- the odds that two back-to-back transactions even touch the
>> same company (let alone are linked in such a way as to conflict) seem
>> really low.
>>
>> Anyway, that's my plan above.  I'd be curious to hear your thoughts on
>> whether it'll work, or if I'm missing something obvious.  Thanks!
>>
>> -david
>>
>>
>>
>>
>