Hi,

I'm having a hard time figuring out the cause of high memory usage on a CouchDB server.

What I'm observing is that the memory consumption of the "beam.smp" process rises gradually until it triggers the kernel's OOM (Out-Of-Memory) killer, which kills "beam.smp". It also seems that many databases are not compacted: I wrote a script that iterates over the databases and computes their fragmentation factor, and it looks like around 2100 databases have a fragmentation above 70%.

We have a single CouchDB v2.1.1 server (configured with q=8 n=1) and around 2770 databases. The server initially had 4 GB of RAM; it now has 16 GB and 8 vCPUs, and it still regularly reaches OOM. From the monitoring I can see that with 16 GB the OOM killer is triggered roughly once a week (cf. the attached graph), with memory usage climbing steadily until it hits the limit.

The Couch server is mostly used by web clients through the PouchDB JS API. We have ~1300 distinct users, and by monitoring established TCP connections with netstat I estimate a maximum of around 100 concurrent users at any given time. From what I understand of the application's logic, each user accesses 2 private databases (read/write) plus 1 common database (read-only). On-disk usage of CouchDB's data directory is around 40 GB.

Any ideas on what could cause such behavior (memory usage increasing over the course of a week)? Or how to find out what is happening behind the scenes?

Regards,
Jérôme
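
P.S. In case it helps, here is roughly what my fragmentation-check script does. This is a minimal Python sketch: the host and credentials are placeholders for my setup, and it reads the "sizes" object that CouchDB 2.x returns from GET /{db}.

    #!/usr/bin/env python3
    # Minimal sketch of the fragmentation check, assuming CouchDB 2.x
    # on localhost:5984; the credentials below are placeholders.
    import base64
    import json
    import urllib.request
    from urllib.parse import quote

    BASE = "http://localhost:5984"
    AUTH = "Basic " + base64.b64encode(b"admin:secret").decode()

    def get(path):
        req = urllib.request.Request(BASE + path,
                                     headers={"Authorization": AUTH})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    for db in get("/_all_dbs"):
        info = get("/" + quote(db, safe=""))
        sizes = info.get("sizes", {})
        file_size = sizes.get("file", 0)   # bytes on disk, incl. old revisions
        active = sizes.get("active", 0)    # bytes of live data
        if file_size:
            frag = 100.0 * (file_size - active) / file_size
            if frag > 70:
                print("%s: %.0f%% fragmented" % (db, frag))

The same loop could also POST to /{db}/_compact (with a Content-Type: application/json header) to compact the worst offenders, though with ~2100 fragmented databases I would throttle that rather than fire them all at once.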
What I'm observing is that the memory consumption from the "beam.smp" process gradually rises until it triggers the kernel's OOM (Out-Of-Memory) which kill the "beam.smp" process. It also seems that many databases are not compacted: I've made a script to iterate over the databases to compute de fragmentation factor, and it seems I have around 2100 databases with a frag > 70%. We have a single CouchDB v2.1.1server (configured with q=8 n=1) and around 2770 databases. The server initially had 4 GB of RAM, and we are now with 16 GB w/ 8 vCPU, and it still regularly reaches OOM. From the monitoring I see that with 16 GB the OOM is almost triggered once per week (c.f. attached graph). The memory usage seems to increase gradually until it reaches OOM. The Couch server is mostly used by web clients with the PouchDB JS API. We have ~1300 distinct users and by monitoring the netstat/TCP established connections I guess we have around 100 (maximum) users at any given time. >From what I understanding of the application's logic, each user access 2 private databases (read/write) + 1 common database (read-only). On-disk usage of CouchDB's data directory is around 40 GB. Any ideas on what could cause such behavior (increasing memory usage over the course of a week)? Or how to find what is happening behind the scene? Regards, Jérôme