Hi Jérôme, definitely useful.

The “run_queue” is the number of Erlang processes in a runnable state that are
not currently executing on a scheduler. A value greater than zero means the node
is bumping into compute limits at that moment; a small positive value from time
to time is no problem, but a sustained high value means the schedulers cannot
keep up with the incoming work.
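
If you want to sample it interactively, erlang:statistics/1 from a remsh should
report the same figure the _system endpoint does — a minimal sketch:

%% number of runnable processes currently waiting in the scheduler run queues
erlang:statistics(run_queue).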

Your last six snapshots show a message queue backlog in couch_server. That 
could be what caused the node to OOM. The couch_server process is a singleton 
and if it accumulates a large message backlog there are limited backpressure or 
scaling mechanisms to help it recover. I noticed you’re running 2.1.1; there 
were a couple of important enhancements to reduce the message flow through 
couch_server in more recent releases:

2.2.0: https://github.com/apache/couchdb/pull/1118
2.3.1: https://github.com/apache/couchdb/pull/1593

The change in 2.2.0 is just a change in the default configuration; you can try 
applying it to your server by setting:

[couchdb]
update_lru_on_read = false
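
If you'd rather not edit the ini and restart, I believe the same setting can be
flipped at runtime from a remsh via the config application — a sketch, untested
on 2.1.1, so double-check that couch_server actually picks the change up:

%% config:set/4 takes an extra Persist flag if you don't want it written to the ini
config:set("couchdb", "update_lru_on_read", "false").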

The changes in 2.3.1 offer additional benefits for couch_server message 
throughput but you’ll need to upgrade to get them.
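
In the meantime you can watch couch_server's mailbox directly from a remsh to
confirm it is the process that is backing up — a quick sketch, assuming the
usual locally registered couch_server name:

%% mailbox length and memory footprint of the registered couch_server process
process_info(whereis(couch_server), [message_queue_len, memory]).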

Cheers, Adam

P.S. I don’t know what’s going on with the negative memory.other value there;
it’s not intentionally meaningful :)


> On Jun 18, 2019, at 11:30 AM, Jérôme Augé <jerome.a...@anakeen.com> wrote:
> 
> "beam.smp" just got killed by OOM, but I was not in front of the machine to
> perform this command...
> 
> However, here is the CouchDB log of "/_node/_local/_system" for the 30
> minutes preceding the OOM:
> -
> https://gist.github.com/eguaj/1fba3eda4667a999fa691ff1902f04fc#file-log-couchdb-system-2019-06-18-log
> 
> I guess the spike that triggers the OOM is so quick (< 1 min) that it does
> not get logged (I log every minute).
> 
> Is there anything that can be used/deduced from the last line logged at
> 2019-06-18T16:00:14+0200?
> 
> At 15:55:25, the "run_queue" is at 36: what does it mean? Number of active
> concurrent requests?
> 
> From 15:56 to 16:00 the "memory"."other" value is negative: does it
> mean something special, or is it just an integer overflow?
> 
> 
> 
> Le lun. 17 juin 2019 à 14:09, Vladimir Ralev <vladimir.ra...@gmail.com> a
> écrit :
> 
>> Alright, I think the issue will be more visible towards the OOM point,
>> however for now since you have the system live with a leak, it will be
>> useful to repeat the same steps, but replace
>> "message_queue_len" with "total_heap_size" then with "heap_size" then with
>> "stack_size" and then with "reductions".
>> 
>> For example:
>> 
>> MQSizes2 = lists:map(fun(A) -> {_,B} = case process_info(A,total_heap_size)
>> of {XV,XB} -> {XV, XB}; _ERR -> io:format("~p",[_ERR]),{ok, 0} end, {B,A}
>> end, processes()).
>> 
>> Then same with the other params.
>> 
>> That can shed some light, otherwise someone will need to monitor process
>> count and go into them by age and memory patterns.
>> 
>> On Mon, Jun 17, 2019 at 2:55 PM Jérôme Augé <jerome.a...@anakeen.com>
>> wrote:
>> 
>>> The 2G consumption is from Adrien's system.
>>> 
>>> On mine, since I set up the logging of "/_node/_local/_system" output:
>>> - on june 14th max memory.processes was 2.6 GB
>>> - on june 15th max memory.processes was 4.7 GB
>>> - on june 16th max memory.processes was 7.0 GB
>>> - today (june 17th) max memory.processes was 8.0 GB (and with an
>>> interactive top I see spikes at 12 GB)
>>> 
>>> The memory.processes seems to be steadily increasing over the days, and
>> I'm
>>> soon expecting the out-of-memory condition to be triggered in a couple of
>>> days.
>>> 
>>> Le lun. 17 juin 2019 à 11:53, Vladimir Ralev <vladimir.ra...@gmail.com>
>> a
>>> écrit :
>>> 
>>>> Nothing to see here, the message queue stat from Adam's advice is
>>> accurate.
>>>> Note that you should run this only when there is already an
>> unreasonable
>>>> amount memory leaked/consumed.
>>>> 
>>>> But now I realise you had "processes":1877591424 before restart from
>> the
>>>> stats above which is less than 2G. Are you using only 2 gigs of RAM? I
>>> got
>>>> confused by the initial comment and I thought you had 15GB RAM. If you
>>> are
>>>> only using 2 gigs of RAM, it's probably not enough for your workload.
>>>> 
>>>> On Mon, Jun 17, 2019 at 12:15 PM Jérôme Augé <jerome.a...@anakeen.com>
>>>> wrote:
>>>> 
>>>>> That command seems to work, and here is the output:
>>>>> 
>>>>> --8<--
>>>>> # /opt/couchdb/bin/remsh < debug.2.remsh
>>>>> Eshell V7.3  (abort with ^G)
>>>>> (remsh22574@127.0.0.1)1> [{0,<0.0.0>},
>>>>> {0,<0.3.0>},
>>>>> {0,<0.6.0>},
>>>>> {0,<0.7.0>},
>>>>> {0,<0.9.0>},
>>>>> {0,<0.10.0>},
>>>>> {0,<0.11.0>},
>>>>> {0,<0.12.0>},
>>>>> {0,<0.14.0>},
>>>>> {0,<0.15.0>},
>>>>> {0,<0.16.0>},
>>>>> {0,<0.17.0>},
>>>>> {0,<0.18.0>},
>>>>> {0,<0.19.0>},
>>>>> {0,<0.20.0>},
>>>>> {0,<0.21.0>},
>>>>> {0,<0.22.0>},
>>>>> {0,<0.23.0>},
>>>>> {0,<0.24.0>},
>>>>> {0,<0.25.0>},
>>>>> {0,<0.26.0>},
>>>>> {0,<0.27.0>},
>>>>> {0,<0.28.0>},
>>>>> {0,<0.29.0>},
>>>>> {0,<0.31.0>},
>>>>> {0,<0.32.0>},
>>>>> {0,<0.33.0>},
>>>>> {0,...},
>>>>> {...}]
>>>>> (remsh22574@127.0.0.1)2> {0,<0.38.0>}
>>>>> (remsh22574@127.0.0.1)3> [{current_function,{erl_eval,do_apply,6}},
>>>>> {initial_call,{erlang,apply,2}},
>>>>> {status,running},
>>>>> {message_queue_len,0},
>>>>> {messages,[]},
>>>>> {links,[<0.32.0>]},
>>>>> {dictionary,[]},
>>>>> {trap_exit,false},
>>>>> {error_handler,error_handler},
>>>>> {priority,normal},
>>>>> {group_leader,<0.31.0>},
>>>>> {total_heap_size,5172},
>>>>> {heap_size,2586},
>>>>> {stack_size,24},
>>>>> {reductions,24496},
>>>>> {garbage_collection,[{min_bin_vheap_size,46422},
>>>>>                      {min_heap_size,233},
>>>>>                      {fullsweep_after,65535},
>>>>>                      {minor_gcs,1}]},
>>>>> {suspending,[]}]
>>>>> (remsh22574@127.0.0.1)4> *** Terminating erlang ('
>> remsh22574@127.0.0.1
>>> ')
>>>>> -->8--
>>>>> 
>>>>> What should I be looking for in this output?
>>>>> 
>>>>> Le ven. 14 juin 2019 à 17:30, Vladimir Ralev <
>> vladimir.ra...@gmail.com
>>>> 
>>>> a
>>>>> écrit :
>>>>> 
>>>>>> That means your couch is creating and destroying processes too
>>>> rapidly. I
>>>>>> haven't seen this, however I think Adam's message_queues stat above
>>>> does
>>>>>> the same thing. I didn't notice you can get it from there.
>>>>>> 
>>>>>> Either way it will be useful if you can get the shell to work:
>>>>>> Try this command instead for the first, the rest will be the same:
>>>>>> 
>>>>>> MQSizes2 = lists:map(fun(A) -> {_,B} = case
>>>>>> process_info(A,message_queue_len) of {XV,XB} -> {XV, XB}; _ERR ->
>>>>>> io:format("~p",[_ERR]),{ok, 0} end, {B,A} end, processes()).
>>>>>> 
>>>>>> On Fri, Jun 14, 2019 at 5:52 PM Jérôme Augé <
>> jerome.a...@anakeen.com
>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> I tried the following, but it seems to fail on the first command:
>>>>>>> 
>>>>>>> --8<--
>>>>>>> # /opt/couchdb/bin/remsh
>>>>>>> Erlang/OTP 18 [erts-7.3] [source-d2a6d81] [64-bit] [smp:8:8]
>>>>>>> [async-threads:10] [hipe] [kernel-poll:false]
>>>>>>> 
>>>>>>> Eshell V7.3  (abort with ^G)
>>>>>>> (couchdb@127.0.0.1)1> MQSizes2 = lists:map(fun(A) -> {_,B} =
>>>>>>> process_info(A,message_queue_len), {B,A} end, processes()).
>>>>>>> ** exception error: no match of right hand side value undefined
>>>>>>> -->8--
>>>>>>> 
>>>>>>> 
>>>>>>> Le ven. 14 juin 2019 à 16:08, Vladimir Ralev <
>>>> vladimir.ra...@gmail.com
>>>>>> 
>>>>>> a
>>>>>>> écrit :
>>>>>>> 
>>>>>>>> Hey guys. I bet it's a mailbox leaking memory. I am very
>>> interested
>>>>> in
>>>>>>>> debugging issues like this too.
>>>>>>>> 
>>>>>>>> I can suggest to get an erlang shell and run these commands to
>>> see
>>>>> the
>>>>>>> top
>>>>>>>> memory consuming processes
>>>>>>>> 
>>> https://www.mail-archive.com/user@couchdb.apache.org/msg29365.html
>>>>>>>> 
>>>>>>>> One issue I will be reporting soon is if one of your nodes is
>>> down
>>>>> for
>>>>>>> some
>>>>>>>> amount of time, it seems like all databases independently try
>> and
>>>>> retry
>>>>>>> to
>>>>>>>> query the missing node and fail, resulting in printing a lot of
>>>> logs
>>>>>> for
>>>>>>>> each db which can overwhelm the logger process. If you have a
>> lot
>>>> of
>>>>>> DBs
>>>>>>>> this makes the problem worse, but it doesn't happen right away
>>> for
>>>>> some
>>>>>>>> reason.
>>>>>>>> 
>>>>>>>> On Fri, Jun 14, 2019 at 4:25 PM Adrien Vergé <
>>>>> adrien.ve...@tolteck.com
>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Jérôme and Adam,
>>>>>>>>> 
>>>>>>>>> That's funny, because I'm investigating the exact same
>> problem
>>>>> these
>>>>>>>> days.
>>>>>>>>> We have two CouchDB setups:
>>>>>>>>> - a one-node server (q=2 n=1) with 5000 databases
>>>>>>>>> - a 3-node cluster (q=2 n=3) with 50000 databases
>>>>>>>>> 
>>>>>>>>> ... and we are experiencing the problem on both setups. We've
>>>> been
>>>>>>> having
>>>>>>>>> this problem for at least 3-4 months.
>>>>>>>>> 
>>>>>>>>> We've monitored:
>>>>>>>>> 
>>>>>>>>> - The number of open files: it's relatively low (both the
>>>> system's
>>>>>>> total
>>>>>>>>> and or fds opened by beam.smp).
>>>>>>>>>  https://framapic.org/wQUf4fLhNIm7/oa2VHZyyoPp9.png
>>>>>>>>> 
>>>>>>>>> - The usage of RAM, total used and used by beam.smp
>>>>>>>>>  https://framapic.org/DBWIhX8ZS8FU/MxbS3BmO0WpX.png
>>>>>>>>>  It continuously grows, with regular spikes, until killing
>>>> CouchDB
>>>>>>> with
>>>>>>>> an
>>>>>>>>> OOM. After restart, the RAM usage is nice and low, and no
>>> spikes.
>>>>>>>>> 
>>>>>>>>> - /_node/_local/_system metrics, before and after restart.
>>> Values
>>>>>> that
>>>>>>>>> significantly differ (before / after restart) are listed
>> here:
>>>>>>>>>  - uptime (obviously ;-))
>>>>>>>>>  - memory.processes : + 3732 %
>>>>>>>>>  - memory.processes_used : + 3735 %
>>>>>>>>>  - memory.binary : + 17700 %
>>>>>>>>>  - context_switches : + 17376 %
>>>>>>>>>  - reductions : + 867832 %
>>>>>>>>>  - garbage_collection_count : + 448248 %
>>>>>>>>>  - words_reclaimed : + 112755 %
>>>>>>>>>  - io_input : + 44226 %
>>>>>>>>>  - io_output : + 157951 %
>>>>>>>>> 
>>>>>>>>> Before CouchDB restart:
>>>>>>>>> {
>>>>>>>>>  "uptime":2712973,
>>>>>>>>>  "memory":{
>>>>>>>>>    "other":7250289,
>>>>>>>>>    "atom":512625,
>>>>>>>>>    "atom_used":510002,
>>>>>>>>>    "processes":1877591424,
>>>>>>>>>    "processes_used":1877504920,
>>>>>>>>>    "binary":177468848,
>>>>>>>>>    "code":9653286,
>>>>>>>>>    "ets":16012736
>>>>>>>>>  },
>>>>>>>>>  "run_queue":0,
>>>>>>>>>  "ets_table_count":102,
>>>>>>>>>  "context_switches":1621495509,
>>>>>>>>>  "reductions":968705947589,
>>>>>>>>>  "garbage_collection_count":331826928,
>>>>>>>>>  "words_reclaimed":269964293572,
>>>>>>>>>  "io_input":8812455,
>>>>>>>>>  "io_output":20733066,
>>>>>>>>>  ...
>>>>>>>>> 
>>>>>>>>> After CouchDB restart:
>>>>>>>>> {
>>>>>>>>>  "uptime":206,
>>>>>>>>>  "memory":{
>>>>>>>>>    "other":6907493,
>>>>>>>>>    "atom":512625,
>>>>>>>>>    "atom_used":497769,
>>>>>>>>>    "processes":49001944,
>>>>>>>>>    "processes_used":48963168,
>>>>>>>>>    "binary":997032,
>>>>>>>>>    "code":9233842,
>>>>>>>>>    "ets":4779576
>>>>>>>>>  },
>>>>>>>>>  "run_queue":0,
>>>>>>>>>  "ets_table_count":102,
>>>>>>>>>  "context_switches":1015486,
>>>>>>>>>  "reductions":111610788,
>>>>>>>>>  "garbage_collection_count":74011,
>>>>>>>>>  "words_reclaimed":239214127,
>>>>>>>>>  "io_input":19881,
>>>>>>>>>  "io_output":13118,
>>>>>>>>>  ...
>>>>>>>>> 
>>>>>>>>> Adrien
>>>>>>>>> 
>>>>>>>>> Le ven. 14 juin 2019 à 15:11, Jérôme Augé <
>>>> jerome.a...@anakeen.com
>>>>>> 
>>>>>> a
>>>>>>>>> écrit :
>>>>>>>>> 
>>>>>>>>>> Ok, so I'll set up a cron job to log (every minute?)
>>> the
>>>>>> output
>>>>>>>>> from
>>>>>>>>>> "/_node/_local/_system" and wait for the next OOM kill.
>>>>>>>>>> 
>>>>>>>>>> Any property from "_system" to look for in particular?
>>>>>>>>>> 
>>>>>>>>>> Here is a link to the memory usage graph:
>>>>>>>>>> https://framapic.org/IzcD4Y404hlr/06rm0Ji4TpKu.png
>>>>>>>>>> 
>>>>>>>>>> The memory usage varies, but the general trend is to go up
>>> with
>>>>>> some
>>>>>>>>>> regularity over a week until we reach OOM. When "beam.smp"
>> is
>>>>>> killed,
>>>>>>>>> it's
>>>>>>>>>> reported as consuming 15 GB (as seen in the kernel's OOM
>>> trace
>>>> in
>>>>>>>>> syslog).
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Jérôme
>>>>>>>>>> 
>>>>>>>>>> Le ven. 14 juin 2019 à 13:48, Adam Kocoloski <
>>>>> kocol...@apache.org>
>>>>>> a
>>>>>>>>>> écrit :
>>>>>>>>>> 
>>>>>>>>>>> Hi Jérôme,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for a well-written and detailed report (though the
>>>>> mailing
>>>>>>>> list
>>>>>>>>>>> strips attachments). The _system endpoint provides a lot
>> of
>>>>>> useful
>>>>>>>> data
>>>>>>>>>> for
>>>>>>>>>>> debugging these kinds of situations; do you have a
>> snapshot
>>>> of
>>>>>> the
>>>>>>>>> output
>>>>>>>>>>> when the system was consuming a lot of memory?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> http://docs.couchdb.org/en/stable/api/server/common.html#node-node-name-system
>>>>>>>>>>> 
>>>>>>>>>>> Adam
>>>>>>>>>>> 
>>>>>>>>>>>> On Jun 14, 2019, at 5:44 AM, Jérôme Augé <
>>>>>>> jerome.a...@anakeen.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm having a hard time figuring out the high memory
>> usage
>>>> of
>>>>> a
>>>>>>>>> CouchDB
>>>>>>>>>>> server.
>>>>>>>>>>>> 
>>>>>>>>>>>> What I'm observing is that the memory consumption from
>>> the
>>>>>>>> "beam.smp"
>>>>>>>>>>> process gradually rises until it triggers the kernel's
>> OOM
>>>>>>>>>> (Out-Of-Memory)
>>>>>>>>>>> which kills the "beam.smp" process.
>>>>>>>>>>>> 
>>>>>>>>>>>> It also seems that many databases are not compacted:
>> I've
>>>>> made
>>>>>> a
>>>>>>>>> script
>>>>>>>>>>> to iterate over the databases to compute the fragmentation
>>>>> factor,
>>>>>>> and
>>>>>>>>> it
>>>>>>>>>>> seems I have around 2100 databases with a frag > 70%.
>>>>>>>>>>>> 
>>>>>>>>>>>> We have a single CouchDB v2.1.1 server (configured with
>>> q=8
>>>>> n=1)
>>>>>>> and
>>>>>>>>>>> around 2770 databases.
>>>>>>>>>>>> 
>>>>>>>>>>>> The server initially had 4 GB of RAM, and we are now
>> with
>>>> 16
>>>>> GB
>>>>>>> w/
>>>>>>>> 8
>>>>>>>>>>> vCPU, and it still regularly reaches OOM. From the
>>>> monitoring I
>>>>>> see
>>>>>>>>> that
>>>>>>>>>>> with 16 GB the OOM is almost triggered once per week
>> (c.f.
>>>>>> attached
>>>>>>>>>> graph).
>>>>>>>>>>>> 
>>>>>>>>>>>> The memory usage seems to increase gradually until it
>>>> reaches
>>>>>>> OOM.
>>>>>>>>>>>> 
>>>>>>>>>>>> The Couch server is mostly used by web clients with the
>>>>> PouchDB
>>>>>>> JS
>>>>>>>>> API.
>>>>>>>>>>>> 
>>>>>>>>>>>> We have ~1300 distinct users and by monitoring the
>>>>> netstat/TCP
>>>>>>>>>>> established connections I guess we have around 100
>>> (maximum)
>>>>>> users
>>>>>>> at
>>>>>>>>> any
>>>>>>>>>>> given time. From what I understand of the
>> application's
>>>>> logic,
>>>>>>>> each
>>>>>>>>>> user
>>>>>>>>>>> accesses 2 private databases (read/write) + 1 common
>> database
>>>>>>>>> (read-only).
>>>>>>>>>>>> 
>>>>>>>>>>>> On-disk usage of CouchDB's data directory is around 40
>>> GB.
>>>>>>>>>>>> 
>>>>>>>>>>>> Any ideas on what could cause such behavior (increasing
>>>>> memory
>>>>>>>> usage
>>>>>>>>>>> over the course of a week)? Or how to find what is
>>> happening
>>>>>> behind
>>>>>>>> the
>>>>>>>>>>> scenes?
>>>>>>>>>>>> 
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Jérôme
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
