Forwarding from the user list upon suggestion.

---------- Forwarded message ----------
From: Chris Stockton <chrisstockto...@gmail.com>
Date: Wed, May 25, 2011 at 12:23 PM
Subject: Thoughts on server wide replication
To: u...@couchdb.apache.org

I was thinking that if there were server-wide replication, we could support many more users. We currently have a few thousand, and we are starting to feel the expense of all the TCP connections and replication tasks; the status calls needed to check that the replications are still running are also getting very expensive and noticeable. It seems to me that an API for server-wide replication would greatly benefit our use patterns, and I'm sure those of anyone else who scales through many databases (one database is one customer).

Here are a few ideas for such a feature; I'm throwing them out just to see if they spark interest. I will call this API _replicate_server for example purposes; the name is open to discussion.

To begin server-wide replication:

    curl -vX POST http://localhost:5984/_replicate_server -d '{"source":"example-database","target":"http://example.org/example-database"}'
    -> {"ok": true, <... other details>}

To begin server-wide replication with a filtering function. Here the function could return false to skip a database, true to replicate it, or perhaps an array of filters to apply a filtering function to that database? This could be simple or very robust:

    function(dbName, req) {
      return dbName.indexOf("my_interesting_dbs_prefix") == 0;
    }

    curl -vX POST http://localhost:5984/_replicate_server -d '{"source":"example-database","target":"http://example.org/example-database", "filter": "filters/server_filter"}'
    -> {"ok": true, <... other details>}

To begin server-wide replication for an array of dbs:

    curl -vX POST http://localhost:5984/_replicate_server -d '{"source":"example-database","target":"http://example.org/example-database", "database_names": ["db_1", "db_2", ..., "db_3050"]}'
    -> {"ok": true, <... other details>}

Other params for the request:

"persistent": true|false - should this replication job persist through a couchdb restart? Maybe this adds an entry to the config file or something?
"continuous": true|false - do a one time pass of all dbs or not, defaulting to true makes sense, but is inconsistent with _replicate, maybe just not support 1 time passes? my specific use cases don't require it but I don't want to just speak for myself. Just some thoughts from my last 1-2years or so experience with couchdb and my use patterns. If we could trim down and improve replication usability a bit I think couchdb could greatly benefit as a project. Right now having to tell replication to start, having to make sure it runs on restart (I know changes are coming/implemented for this of some sort), and monitoring your databases to make sure they are up to date is just a bit too much for the app tier to do and scares away DBA's from embracing the technology as much I think. Overall I love couchdb and find it to be a great product and has fit our needs very well. -Chris