Hi there,

My team is designing a distributed health data capture system to be used in rural Africa, and we are planning to use CouchDB as a back end for its excellent replication features.
One concern I have is how replication will perform over a very unreliable internet connection. Is replication done in pieces, or does it require a large amount of data to make it through in a single attempt? If the connection goes down in the middle of replication, do you have to start over from the beginning, or is the replicator smart enough to resume from what has already made it across the wire?

Also, are there any numbers on how chatty replication is? Our system will likely be deployed with post-paid SIM cards and GSM modems providing the internet connection at many sites, so I would like a rough estimate of data usage. Is there a formula I could use, such as "syncing X bytes of data in Couch causes K * X bytes to go over the wire" (where K is some overhead factor)?

Seeing as JSON probably compresses quite well, is there any way to do compressed synchronization?
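In case it helps make the questions concrete, this is roughly how I'm planning to drive replication from each site: a one-shot pull through the _replicate endpoint, re-run after every dropped connection. The URLs and database names below are made up for illustration.

    import requests

    COUCH = "http://localhost:5984"  # local CouchDB node (placeholder URL)
    REMOTE = "http://central.example.org:5984/health"  # central server (placeholder)

    # Kick off a one-shot pull replication via the _replicate endpoint.
    resp = requests.post(
        COUCH + "/_replicate",
        json={"source": REMOTE, "target": "health", "create_target": True},
    )
    resp.raise_for_status()

    # I'd re-run this after a dropped connection and compare the replies;
    # if the replicator checkpoints its progress, the second run should
    # report far fewer documents transferred rather than starting from
    # sequence 0 again.
    print(resp.json())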
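For the compression question, I was thinking of probing the HTTP layer directly to see whether the server will gzip its responses. Something like the sketch below (again, the URL and database name are placeholders, and I don't know whether the replicator itself negotiates compression, which is really what I'm asking):

    import gzip
    import requests

    # Placeholder URL; _changes is the feed the replicator reads from.
    url = "http://localhost:5984/health/_changes?include_docs=true"

    # Ask for gzip explicitly and stream the response so we can look at
    # the bytes as they arrived, before the client library decodes them.
    resp = requests.get(url, headers={"Accept-Encoding": "gzip"}, stream=True)
    raw = resp.raw.read()  # undecoded bytes off the wire

    if resp.headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(raw)
        print(f"{len(raw)} bytes on the wire for {len(body)} bytes of JSON "
              f"({len(body) / len(raw):.1f}x saving)")
    else:
        print(f"server sent {len(raw)} bytes uncompressed")

thanks in advance,
Cory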