kocolosk opened a new issue #3675:
URL: https://github.com/apache/couchdb/issues/3675


   ## Summary
   
   I'd like to be able to choose the starting sequence for a replication 
between a given source and target using more information than just the 
replication history between those two databases. Specifically, I'd like to be 
able to use other replication checkpoint histories to discover transitive 
relationships that could be used to accelerate the first replication between 
CouchDB databases that share a common peer.
   
   ## Desired Behaviour
   
   It might be simplest to provide an example. Consider a system where you have 
a pair of cloud sites (call them `us-east` and `us-west`) and a series of edge 
locations (e.g. `store1`):
   
   * `us-east` and `us-west` are replicating with each other
   * `store1` is pulling data from `us-east`
   * `us-east` experiences an outage, so we respond by initiating a 
`us-west` -> `store1` replication
   
   In the current version of CouchDB, the `us-west` -> `store1` replication 
will start from sequence 0 because those peers have no replication history 
between them. 
Going forward, it would be useful for us to recognize that `us-west` -> 
`us-east` has a history, and `us-east` -> `store1` has a history, so we can 
fast-forward `us-west` -> `store1` by analyzing the pair of those checkpoint 
histories to discover the maximum sequence on `us-west` guaranteed to have been 
observed on `store1` (by way of `us-east`).
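   
   The analysis above can be sketched roughly as follows. This is a minimal illustration, not CouchDB code: the `Checkpoint` type, its `source_seq` / `target_seq` fields, and `fast_forward_seq` are hypothetical names, sequences are treated as plain integers (real clustered sequences are opaque strings), and each checkpoint is assumed to record the target's sequence at the time it was written. Edge cases a real implementation would have to handle are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical checkpoint record for one source -> target replication."""
    source_seq: int  # source sequence covered by this checkpoint
    target_seq: int  # target sequence once those changes had been written


def fast_forward_seq(a_to_b, b_to_c):
    """Return the highest sequence on A assumed to be present on C via B.

    a_to_b: checkpoints for A -> B (e.g. us-west -> us-east)
    b_to_c: checkpoints for B -> C (e.g. us-east -> store1)
    Returns 0 when no transitive relationship can be established.
    """
    # Highest sequence on B that is known to have reached C.
    b_seen_on_c = max((cp.source_seq for cp in b_to_c), default=0)
    # An A -> B checkpoint is usable if everything it wrote to B
    # (up to target_seq) has already been replicated from B to C.
    usable = [cp.source_seq for cp in a_to_b if cp.target_seq <= b_seen_on_c]
    return max(usable, default=0)


# Made-up histories for the example deployment.
west_to_east = [Checkpoint(100, 40), Checkpoint(250, 95), Checkpoint(400, 160)]
east_to_store1 = [Checkpoint(30, 12), Checkpoint(120, 57)]

print(fast_forward_seq(west_to_east, east_to_store1))  # -> 250
```

   With these made-up numbers, `store1` has seen `us-east` up to sequence 120, which covers the `us-west` -> `us-east` checkpoints recorded at `us-east` sequences 40 and 95 but not the one at 160, so `us-west` -> `store1` could start from 250 instead of 0.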
   
   ## Possible Solution
   
   I believe we actually already employ this transitive analysis for 
fast-forwarding internal replications between shard copies in a cluster, so we 
may be able to refactor some of that code to apply it more generally.
   
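   I'm not sure if we track the _target_ sequence in the current external 
replication checkpoint schema. That's essential for this analysis to work: 
without it we can't tell which `us-east` -> `store1` checkpoints cover the 
changes written to `us-east` by a given `us-west` -> `us-east` checkpoint.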
   
   There's nothing fundamental that limits the analysis to first-order 
transitive relationships. One could build out an entire graph. I'm not sure the 
extra complexity that would bring is worth it in a first pass.
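   
   As a rough illustration of what going beyond first-order relationships might involve, the sketch below chains checkpoint histories along one explicit path of peers rather than searching a full graph. Everything here is an assumption made for illustration: the `Checkpoint` tuple, `compose`, `start_seq_via_path`, integer sequences, the availability of a target sequence in every checkpoint, and the intermediate peer `hub`.

```python
from collections import namedtuple

# Hypothetical checkpoint: source sequence covered, and the target
# sequence at the moment the checkpoint was recorded.
Checkpoint = namedtuple("Checkpoint", ["source_seq", "target_seq"])


def compose(first, second):
    """Combine X -> Y and Y -> Z checkpoints into implied X -> Z checkpoints."""
    implied = []
    for a in first:
        # Y -> Z checkpoints that already cover everything `a` wrote to Y.
        covering = [b.target_seq for b in second if b.source_seq >= a.target_seq]
        if covering:
            # Use the earliest point on Z at which `a` is known to be covered.
            implied.append(Checkpoint(a.source_seq, min(covering)))
    return implied


def start_seq_via_path(histories, path):
    """Return the starting sequence on path[0] for replicating to path[-1].

    histories maps (source, target) name pairs to lists of checkpoints.
    Returns 0 if any hop along the path breaks the chain.
    """
    if len(path) < 2:
        raise ValueError("path needs at least a source and a target")
    hops = [histories.get(pair, []) for pair in zip(path, path[1:])]
    combined = hops[0]
    for hop in hops[1:]:
        combined = compose(combined, hop)
    return max((cp.source_seq for cp in combined), default=0)


# Made-up histories with one extra hypothetical peer, `hub`.
histories = {
    ("us-west", "us-east"): [Checkpoint(250, 95)],
    ("us-east", "hub"): [Checkpoint(120, 57)],
    ("hub", "store1"): [Checkpoint(60, 33)],
}

print(start_seq_via_path(histories, ["us-west", "us-east", "hub", "store1"]))  # -> 250
```

   A full graph version would also have to enumerate candidate paths and pick the best result, which is part of the extra complexity mentioned above.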
   
   ## Additional context
   
   Proposing this enhancement after chatting with a user who is planning this 
kind of deployment and would benefit from the enhancement.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


