In the end, the errors turned out (ironically) to be caused by a bad call to our error monitoring service. We run CouchDB behind a Rails proxy, and we were randomly seeing 500 errors on some proxy requests with no backtrace, so the cause was hard to track down.
Benoit, by graceful I mean something like Unicorn's USR2 signal. The original process would immediately stop accepting new requests, fire up a new instance, finish handling any existing requests, and then terminate. Is there a strong argument against graceful restarts? CouchDB restarts so quickly that it's a shame to have a restart result in failed requests.
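To make the sequence I have in mind concrete, here is a rough Python sketch of just the bookkeeping. It is not how CouchDB or Unicorn actually implement anything; `GracefulWorker`, `try_begin`, `end`, `drain`, and the `start_replacement` callback are all hypothetical names I made up for illustration.

```python
import threading


class GracefulWorker:
    """Hypothetical bookkeeping for a graceful restart (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self._draining = False
        self._idle = threading.Event()
        self._idle.set()  # no requests in flight yet

    def try_begin(self):
        """Register a new request; return False once draining has started."""
        with self._lock:
            if self._draining:
                return False
            self._in_flight += 1
            self._idle.clear()
            return True

    def end(self):
        """Mark one in-flight request as finished."""
        with self._lock:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.set()

    def drain(self, start_replacement, timeout=None):
        """Stop accepting, start the replacement, wait for in-flight work.

        Returns True if all in-flight requests finished before the timeout,
        at which point the old process could safely terminate.
        """
        with self._lock:
            self._draining = True   # 1. stop accepting new requests
        start_replacement()         # 2. fire up the new instance (assumed callback)
        return self._idle.wait(timeout)  # 3. let existing requests finish
```

A signal handler in the old process would call `drain(...)` and exit once it returns. The sketch leaves out the hard part, which is handing the listening socket to the new instance so that no connection is refused in between.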