Hi,

I have two machines, A and B.  I am running an analysis on A, saving
the output to CouchDB, and then replicating by push from A to B.

B's database started life from this push replication, and has never
been otherwise modified.  So the two databases should currently be
identical.

What I want to do now is parallelize my workflow and run more
analyses on B, save locally (to B's CouchDB), and then replicate those
changes to A.  (The database is used both to save output, and also to
keep track of what has been done already, so it is useful to keep the
output db synchronized between all machines.)
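
To make the B-to-A step concrete, here is a rough sketch of how I would
expect to trigger it, assuming the standard _replicate endpoint on B's
CouchDB.  The host names, port, and database name (analysis_output) are
placeholders for illustration, not my real setup.

```python
import json
import urllib.request


def build_replication_request(couch_url, source, target):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        url=couch_url.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_replication(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical names: B pushes its local database to the copy on A.
B_LOCAL = "http://localhost:5984"
SOURCE = "http://localhost:5984/analysis_output"
TARGET = "http://machine-a:5984/analysis_output"

request = build_replication_request(B_LOCAL, SOURCE, TARGET)
# start_replication(request)  # uncomment on B to actually run the push
```

The request is sent to B's own CouchDB, so the analysis code on B would
still only ever talk to localhost.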

My question is whether replicating from B to A will require pushing
all of the docs to A.  This is an issue because my database is 21 GB
and growing, and I'd rather not push all that data from B to A when I
*know* the two are identical right now.

Is there a way to set up replication to skip everything already there?
Or to copy the replication state from A to B so that B knows that
replication with A can start with new data only?

If not, I can of course just save work done on B to the CouchDB on A
directly, but I'd rather set it up so that the computation process
always just hits CouchDB on localhost, and let CouchDB do the
machine-to-machine copying.

Thanks for any insights or pointers to the correct docs page, 

James Marca
