Alexander Harm wrote:
I’ve been playing with CouchDB over the last few weeks and am running into
serious problems. My idea was to have a central database and roughly 1,000
project dbs that are each filled by a filtered, local, one-way replication from
the central db using the _replicator db. However, triggering 1,000 replications
grinds CouchDB to a standstill. I read that Couch spawns a process for each
replication, so that approach might not be feasible after all.
I have now changed my setup to a Node process that listens to the changes feed
of the central database and manually triggers a one-shot replication to the
affected project dbs. That seems to work for now.
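
As a minimal sketch of this approach (written in Python rather than Node,
purely for illustration): the helpers below take a batch of rows from the
central _changes feed and build one request body per affected project for a
POST to /_replicate. The database name "central", the "project_<id>" naming
scheme, the "project" document field and the "repl/by_project" filter are all
assumptions made for the example, not details from the setup described above.

```python
import json

CENTRAL_DB = "central"           # assumed name of the central database
FILTER_NAME = "repl/by_project"  # assumed "designdoc/filtername"


def affected_projects(change_rows):
    """Collect the distinct project ids touched by a batch of _changes rows.

    Assumes the feed was requested with include_docs=true and that every
    document carries a hypothetical "project" field.
    """
    projects = []
    for row in change_rows:
        doc = row.get("doc") or {}
        project = doc.get("project")
        if project is not None and project not in projects:
            projects.append(project)
    return projects


def one_shot_replication(project):
    """Build the JSON body for a single POST to /_replicate.

    Leaving out "continuous" makes this a one-shot replication.
    """
    return {
        "source": CENTRAL_DB,
        "target": "project_" + str(project),
        "filter": FILTER_NAME,
        "query_params": {"project": project},
    }


if __name__ == "__main__":
    rows = [
        {"seq": 1, "id": "a", "doc": {"_id": "a", "project": "p1"}},
        {"seq": 2, "id": "b", "doc": {"_id": "b", "project": "p2"}},
        {"seq": 3, "id": "c", "doc": {"_id": "c", "project": "p1"}},
    ]
    for project in affected_projects(rows):
        print(json.dumps(one_shot_replication(project)))
```

Grouping a batch of changes by project before replicating means that a burst
of edits to one project results in a single replication rather than one per
changed document.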

However, the clients should now listen to the changes feeds of their relevant
project dbs, and again CouchDB is thrashing and crashing all over the place. I
increased "max_dbs_open" to 1,500 and added "+P 10000" to the Erlang startup
parameters. ulimit on OS X is set to unlimited.

My central database only has 1,000 small documents, so I’m a bit scared of what 
will happen if it grows to millions of documents and hundreds of clients 
listening to the changes feeds.

Are there any recommendations on (nr_of_dbs, nr_of_clients, nr_of_replications) 
=> {couch_settings, erlang_settings, system_settings}?
Or should I not even attempt to host Couch myself (I would really like to) and 
directly use Cloudant?

Filtered replications are much more work for CouchDB, and I have run
into issues with them in the past.  In particular, it seems possible to
deadlock CouchDB when there aren't enough query servers available. I
don't have specific evidence to support this, but I found that defining
an extra query_server language, e.g.

[query_servers]
javascript = /usr/bin/couchjs /usr/share/couchdb/server/main.js
repljs = /usr/bin/repljs /usr/share/couchdb/server/main.js

and then setting the language to repljs on the design document that
holds the replication filter function helped.  /usr/bin/repljs is just
a symlink to the same target as /usr/bin/couchjs.  The design document
with the filter function should not contain any other
view/list/show/update functions.  Replication activity then shows up
separately in process monitoring tools, and the normal javascript
query_servers stay free to handle other requests.  You can repeat this
with a further language name if you are also using validate_doc_update
functions.
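
For illustration, here is a minimal sketch in Python of what such a
filter-only design document could look like, together with a small check
that it carries no other function types.  The design document name
"repl", the filter name "by_project" and the "project" field are made up
for the example; only the "language" value corresponds to the repljs
entry in the config above.

```python
import json

# Hypothetical filter: pass documents whose "project" field matches the
# "project" query parameter supplied by the replication.
FILTER_SOURCE = (
    "function(doc, req) {"
    " return doc.project === req.query.project;"
    " }"
)

# Design document members that would also run on this query server language.
OTHER_FUNCTION_KEYS = (
    "views", "lists", "shows", "updates", "validate_doc_update",
)


def filter_design_doc(name="repl", language="repljs"):
    """Build a design document that contains only a replication filter."""
    return {
        "_id": "_design/" + name,
        "language": language,
        "filters": {"by_project": FILTER_SOURCE},
    }


def is_filter_only(ddoc):
    """Return True if the design document has no non-filter functions."""
    return not any(key in ddoc for key in OTHER_FUNCTION_KEYS)


if __name__ == "__main__":
    ddoc = filter_design_doc()
    assert is_filter_only(ddoc)
    # This JSON would be saved to the source database as _design/repl.
    print(json.dumps(ddoc, indent=2))
```

A replication would then refer to the filter as "repl/by_project".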

James