Alexander Harm wrote:
I’ve been playing with CouchDB over the last few weeks and am running into serious problems. My idea was to have a central database and roughly 1.000 project dbs that stem from a filtered local one-way replication from the central db using the _replicator db. However, triggering 1.000 replications grinds CouchDB to a standstill. I read that Couch spawns a process for each replication, so that approach might not be feasible after all. I have now changed my setup to a node process that listens to the changes feed of the central database and then manually triggers a one-shot replication to the affected project dbs. That seems to work for now. However, the clients should now listen to the changes feeds of their relevant project dbs, and again CouchDB is thrashing and crashing all over the place. I increased "max_dbs_open" to 1.500 and added "+P 10.000" to the Erlang startup parameters. ULIMIT on OS X is set to unlimited. My central database only has 1.000 small documents, so I’m a bit scared of what will happen if it grows to millions of documents and hundreds of clients listening to the changes feeds. Are there any recommendations on (nr_of_dbs, nr_of_clients, nr_of_replications) => {couch_settings, erlang_settings, system_settings}? Or should I not even attempt to host Couch myself (I would really like to) and use Cloudant directly instead?
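For reference, the changes-feed workaround described above could be sketched roughly as follows. This is only an illustration under assumptions: the "project" field on each document, the "project_" database prefix and the filter name "replication/by_project" are all hypothetical and not taken from the original setup. The sketch only builds the request bodies that would be POSTed to /_replicate; the HTTP calls themselves are left out.

```javascript
// Collect the distinct projects touched by one batch from the changes feed.
// Assumes the feed was requested with include_docs=true and that every
// document carries a (hypothetical) "project" field.
function affectedProjects(changes) {
  const projects = new Set();
  for (const row of changes.results) {
    if (row.doc && row.doc.project) {
      projects.add(row.doc.project);
    }
  }
  return [...projects];
}

// Build the body for a one-shot replication (POST /_replicate).
// Leaving out "continuous" makes the replication run once and then stop.
function buildOneShotReplication(source, target, project) {
  return {
    source: source,
    target: target,
    filter: "replication/by_project",   // hypothetical "ddoc/filter" name
    query_params: { project: project }  // read by the filter as req.query.project
  };
}

// Sample batch shaped like a _changes response (values made up).
const sampleChanges = {
  results: [
    { id: "a", doc: { _id: "a", project: "p1" } },
    { id: "b", doc: { _id: "b", project: "p2" } },
    { id: "c", doc: { _id: "c", project: "p1" } },
    { id: "d", deleted: true }
  ]
};

// One replication job per affected project, not one per changed document.
const jobs = affectedProjects(sampleChanges).map(function (p) {
  return buildOneShotReplication("central", "project_" + p, p);
});
console.log(JSON.stringify(jobs, null, 2));
```

Deduplicating by project keeps the number of replications started per batch bounded by the number of projects that actually changed.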
Filtered replications are much more work for CouchDB, and I have run into issues with them in the past. In particular, it seems possible to deadlock CouchDB when there aren't enough query servers available. I don't have specific evidence to support this, but I found that it helped to define an extra query_servers language, e.g.

    [query_servers]
    javascript = /usr/bin/couchjs /usr/share/couchdb/server/main.js
    repljs = /usr/bin/repljs /usr/share/couchdb/server/main.js

and then to set the language to repljs on the design document that holds the replication filter function. /usr/bin/repljs is just a symlink to the same target as /usr/bin/couchjs. The design document with the filter function should not contain any other view/list/show/update functions.

You can then see the activity due to replications in process monitoring tools, and the normal javascript query servers are left free to handle other requests. You can repeat this if you are also using validate_doc_update functions.

James
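A minimal sketch of the kind of design document described above, assuming the repljs entry from the [query_servers] section is in place. The document name "_design/replication", the filter name "by_project" and the "project" field are hypothetical and only serve as an illustration; the actual filter logic depends on your data.

```javascript
// Build a design document that contains nothing but the replication filter,
// with its language set to the dedicated query server ("repljs" above).
function buildFilterDesignDoc(language) {
  return {
    _id: "_design/replication",  // hypothetical design document name
    language: language,          // must match the key under [query_servers]
    filters: {
      // CouchDB stores the filter as source text. "project" is a
      // hypothetical document field; req.query holds the query_params
      // passed along with the replication.
      by_project:
        "function (doc, req) { return doc.project === req.query.project; }"
    }
    // Deliberately no views, lists, shows or updates in this document.
  };
}

const ddoc = buildFilterDesignDoc("repljs");
console.log(JSON.stringify(ddoc, null, 2));
```

The resulting JSON would be saved to the central database (for example with a PUT to /central/_design/replication, where "central" is an assumed database name), and replications would then refer to the filter as "replication/by_project".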
