I'm running CouchDB 1.0.2 on CentOS 5.5. The databases live on an ext4-formatted drive.
I have 209 databases, but they're never all truly active at the same time. Our stack is written in Ruby. The web layer switches between databases depending on the URL, and since we have 16 web processes, in theory at most 16 databases are truly active at once. We also have a daemon process that periodically loops through a chunk of the databases, but it's single-threaded, so it too only works with one database at a time.

My understanding is that CouchRest doesn't keep HTTP connections alive across requests, but I don't know that for sure. I've even gone so far as to put manual garbage collection calls into my daemon to ensure that any stranded connection objects get collected.

Despite all that, I eventually end up in a state where I get the all_dbs_active error. It doesn't happen often -- the last time was nearly three weeks ago -- but once it's in that state, restarting all of my clients doesn't release the databases. The only way to recover is to restart Couch. open_os_files was at 2308 before I restarted it this morning, which is below the current limit (4096).

I feel like this is an issue inside Couch, because even if I quit all of my server processes that connect to it, Couch never frees up the open databases. I can hit it one-off from my browser and still get the error, even though I'm the only active connection.

Has anyone else seen this? Any ideas for what I can try to prevent it from happening?

Thanks!
-Jon
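P.S. In case it helps, here's a simplified sketch of the daemon loop with the manual GC calls I mentioned. The database names, chunk size, and the `process_database` body are illustrative, not our real code; the real daemon does its work through CouchRest:

```ruby
# Illustrative sketch of the daemon's pass over the databases.
DB_NAMES = (1..209).map { |i| "customer_#{i}" } # names are made up

def process_database(name)
  # Real code opens the database via CouchRest, e.g.
  #   db = CouchRest.database("http://localhost:5984/#{name}")
  # and then does its work; stubbed out here.
  name.length
end

def daemon_pass(names, chunk_size = 50)
  processed = 0
  names.each_slice(chunk_size) do |chunk|
    chunk.each do |name|
      process_database(name)
      processed += 1
    end
    # My attempted workaround: force a GC between chunks so any
    # stranded connection objects get collected.
    GC.start
  end
  processed
end
```

As noted above, even with the `GC.start` calls in place, Couch eventually ends up in the stuck state.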
