On Thursday, May 26, 2011 at 6:22 AM, Glenn Bech wrote:

> Hi,
> 
> I just want to ask if there are limits on the number of databases in Couch.
> I am playing around with embedded Couch on Android and am thinking along
> the lines of having
> one database per user, and using replication to push data from the client to
> the server. This will provide an excellent "offline" user experience.
> 
> This will of course not work if Couch does not handle unlimited databases
> very well, performance-wise or otherwise.
> 
> Does this sound like a feasible design solution?
> 
> Regards,
> 
> Glenn

I've done some testing and there are a couple of things to keep in mind.

First of all, CouchDB relies directly on the scalability of your filesystem. 
Every database in CouchDB means at least one file on disk. Since CouchDB
currently stores them all in one directory, you'll need to make sure you
select a filesystem that can handle your expected scale appropriately (many
filesystems should be fine at the millions-of-files level, but
characteristics can differ, so do test this).
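
A rough sketch of what such a test could look like, in Python. It is not a
CouchDB benchmark: it only creates many small files in one directory and times
creating them and reopening a sample. The function name and the "user_N.couch"
naming scheme are made up for illustration.

```python
import os
import tempfile
import time


def time_many_files(count):
    """Create `count` small files in one directory, then reopen a sample.

    Returns (files_listed, create_seconds, reopen_seconds).
    """
    with tempfile.TemporaryDirectory() as directory:
        start = time.perf_counter()
        for i in range(count):
            # Hypothetical naming scheme, one file per user database.
            path = os.path.join(directory, f"user_{i}.couch")
            with open(path, "wb") as handle:
                handle.write(b"x")
        create_seconds = time.perf_counter() - start

        # Reopen at most about 1000 evenly spaced files to time lookups.
        step = max(1, count // 1000)
        start = time.perf_counter()
        for i in range(0, count, step):
            path = os.path.join(directory, f"user_{i}.couch")
            with open(path, "rb") as handle:
                handle.read(1)
        reopen_seconds = time.perf_counter() - start

        files_listed = len(os.listdir(directory))
    return files_listed, create_seconds, reopen_seconds
```

Run it with increasing counts on the filesystem you plan to use and watch how
the timings grow. The temporary directory has to live on that filesystem for
the numbers to mean anything (tempfile.TemporaryDirectory accepts a dir
argument for this).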

Another problem, one which I don't have an immediate answer for, is backup.
While you could claim replication is enough for this, I'd say it isn't.
Backups also need to cover events like maliciously destroyed or manipulated
data, or simply bugs, and replication will copy that kind of damage along
with everything else. I'd rather not trust that my data will never get
screwed up by the code that accesses it. Many backup systems are designed
around a small number of files. Being able to roll back to a point in time
with millions of files could be an extremely painful process. (I have ideas
on how to solve this, but it's still not an easy problem.)

Last but not least, consider the number of active databases you'll need at any
single time. This can be split across many machines, of course, but it still
adds up quickly. Open file descriptors are great, but not if you have to close
and then reopen them all the time. A carefully tuned VM can manage many
thousands without a problem, but I wouldn't push this too much higher. So if
you have 15 machines and 30k active users in any one-minute window, that would
be 2k files open and active per machine.
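
The arithmetic above as a small Python sketch, in case you want to plug in
your own numbers. The function name is made up, and it assumes one open file
per active database and an even spread of users across machines, which real
traffic won't quite give you.

```python
import math


def open_files_per_machine(active_users, machines, files_per_database=1):
    """Estimate open database files per machine for one time window.

    Assumes one database per active user and an even spread of users
    across machines (both are simplifying assumptions).
    """
    if machines <= 0:
        raise ValueError("machines must be positive")
    return math.ceil(active_users * files_per_database / machines)


# The example from above: 30k active users on 15 machines.
print(open_files_per_machine(30_000, 15))  # 2000
```

Compare the result against the open-file limit on each machine, leaving
headroom for sockets and whatever else the process keeps open.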

Brian. 
