Great questions: On Wed, Apr 20, 2016 at 6:21 AM, Adam Midvidy <[email protected]> wrote:
> 1) Can follower nodes service reads, or are all reads serviced by the > leader? > All nodes CAN service reads, but only the master/leader does writes. In practice, we have a load balancing pool that contains all the current slaves (which can change, if the master changes), and they receive all direct requests -- both reads and writes -- and then escalate all writes to the master. However, the master *can* respond to reads, we just don't burden it with read traffic in normal operation. > 2) Is the APPROVE_COMMIT message sent after every transaction, or every > batch of transactions? > Actually... today we use a "selective synchronization" meaning we only actually wait for the commit to be approved by a quorum of the cluster on certain, high-sensitivity operations (eg, moving money). This way the master can "race ahead" without waiting on most transactions, and we allow it to get up to 100 transactions ahead of slaves before waiting for a transaction to be approved. And because today replication is serialized, in practice approving any given transaction retroactively approves committing all past transactions. (This means the master can "fork" from the cluster in some scenarios, but it gets kicked out automatically and we can apply corrective actions if necessary.) However, for the solution we're discussing here, we'd need to remove selective synchronization and do a separate APPROVE_COMMIT for every single transaction. That by itself would come with an enormous performance *reduction*, because approving every single transaction when serialized would reduce throughput to the inverse of the median latency of the cluster (to get quorum). However, I'm working on a way to run the approval logic in parallel, to get the benefit of separate approvals for each transactions, without the commensurate drop in replication performance. > 3) How does hashing work in the case of a delete? Do you send hashes of > the deleted pages as well? > Correct, I think a delete would be just like a write. > 4) In the case of a ROLLBACK, what do you rollback to? In my previous > example, even if you ROLLBACK the update (2) on the follower node after > committing the insert, this protocol is not guaranteed to make forward > progress since you would have to rollback the insert as well (which you > have already committed). > The only transactions that would be rolled back are those that touch a different set of pages than the master. So in the case above, the INSERT would always commit (because the pages it touch are the same regardless of whether it's before or after the UPDATE), and the UPDATE would roll back for *everyone* if *any* of the slaves attempted to commit it in a different order than the master. But if the master does the UPDATE before the INSERT -- and all the slaves follow suit -- then it'd go through just fine. (And if the master did the UPDATE *after* the INSERT, so long as all the slaves did the same, again, it'd work.) The only time it gets rolled back is if a slave commits in a different order than the master. -david
_______________________________________________ p2p-hackers mailing list [email protected] http://lists.zooko.com/mailman/listinfo/p2p-hackers
