Not to pile on, but one other point: if you're logging to a slow volume, especially if it's the same volume that your CouchDB data is on, the problem will be compounded.
If you absolutely need debug-level logging to be on for a while, a better
approach is to use the syslog-udp backend for the Couch logger, and have it
send the data to a second machine running rsyslogd for collection (a rough
config sketch is at the bottom of this mail). If you can't do that, at least
get another drive/spindle/virtual volume with more bandwidth and log to
that, local SSD/NVMe preferred.

-Joan

----- Original Message -----
> From: "Jan Lehnardt" <[email protected]>
> To: [email protected]
> Sent: Thursday, 21 March, 2019 2:43:23 PM
> Subject: Re: Leaking memory in logger process couch 2.3.1
>
> In particular, if you have views, each design doc will cause a full
> database scan to be dumped into the logs.
>
> Cheers
> Jan
> —
>
> > On 21. Mar 2019, at 19:40, Robert Newson <[email protected]> wrote:
> >
> > Hi,
> >
> > Eek. This queue should never get this big; it indicates that there is
> > far too much logging traffic generated and your target (file or syslog
> > server) can't take it. It looks like you have 'debug' level set, which
> > goes a long way to explaining it. I would return to the default level
> > of 'notice' for a significant reduction in logging volume.
> >
> > --
> > Robert Samuel Newson
> > [email protected]
> >
> >> On Thu, 21 Mar 2019, at 18:34, Vladimir Ralev wrote:
> >> Hello,
> >>
> >> I am testing couch 2.3.1 in various configurations, and while loading
> >> a high number of test DBs I notice a ton of memory being eaten at some
> >> point and never recovered. More than 20 gigs, and going into swap, at
> >> which point I kill the machine.
> >>
> >> So I went into the remsh to see where the memory goes, and it is the
> >> logging process. Take a look at the message_queue_len of 4671185:
> >>
> >> ([email protected])65> MQSizes2 = lists:map(fun(A) -> {_,B} =
> >>     process_info(A,message_queue_len), {B,A} end, processes()).
> >> ([email protected])66> {_,BadProcess} =
> >>     hd(lists:reverse(lists:sort(MQSizes2))).
> >> ([email protected])67> process_info(BadProcess).
> >> [{registered_name,couch_log_server},
> >>  {current_function,{prim_file,drv_get_response,1}},
> >>  {initial_call,{proc_lib,init_p,5}},
> >>  {status,running},
> >>  {message_queue_len,4671185},
> >>  {messages,[{'$gen_cast',{log,{log_entry,debug,<0.8973.15>,
> >>                 [79,83,32,80,114,111,99,101,115,115,32,[...]|...],
> >>                 "--------",
> >>                 ["2019",45,["0",51],45,"21",84,["0",50],58,"40",58|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8973.15>,
> >>                 [79,83,32,80,114,111,99,101,115,115,32|...],
> >>                 "--------",
> >>                 ["2019",45,["0",51],45,"21",84,["0",50],58,[...]|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.15949.9>,
> >>                 [79,83,32,80,114,111,99,101,115,115|...],
> >>                 "--------",
> >>                 ["2019",45,["0",51],45,"21",84,[[...]|...],58|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8971.15>,
> >>                 [79,83,32,80,114,111,99,101,115|...],
> >>                 "--------",
> >>                 ["2019",45,["0",51],45,"21",84,[...]|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.9015.15>,
> >>                 [79,83,32,80,114,111,99,101|...],
> >>                 "--------",
> >>                 ["2019",45,["0",51],45,"21",84|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.9015.15>,
> >>                 [79,83,32,80,114,111,99|...],
> >>                 "--------",
> >>                 ["2019",45,["0",51],45,[...]|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8973.15>,
> >>                 [79,83,32,80,114,111|...],
> >>                 "--------",
> >>                 ["2019",45,[[...]|...],45|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.15949.9>,
> >>                 [79,83,32,80,114|...],
> >>                 "--------",
> >>                 ["2019",45,[...]|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8971.15>,
> >>                 [79,83,32,80|...],
> >>                 "--------",
> >>                 ["2019",45|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8973.15>,
> >>                 [79,83,32|...],
> >>                 "--------",
> >>                 [[...]|...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.15949.9>,
> >>                 [79,83|...],
> >>                 "--------",
> >>                 [...]}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.9015.15>,
> >>                 [79|...],
> >>                 [...],...}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8971.15>,[...],...}}},
> >>             {'$gen_cast',{log,{log_entry,debug,<0.8973.15>,...}}},
> >>             {'$gen_cast',{log,{log_entry,debug,...}}},
> >>             {'$gen_cast',{log,{log_entry,...}}},
> >>             {'$gen_cast',{log,{...}}},
> >>             {'$gen_cast',{log,...}},
> >>             {'$gen_cast',{...}},
> >>             {'$gen_cast',...},
> >>             {...}|...]},
> >>  {links,[<0.122.0>,#Port<0.2149>]},
> >>  {dictionary,[{'$initial_call',{couch_log_server,init,1}},
> >>               {'$ancestors',[couch_log_sup,<0.121.0>]}]},
> >>  {trap_exit,true},
> >>  {error_handler,error_handler},
> >>  {priority,normal},
> >>  {group_leader,<0.120.0>},
> >>  {total_heap_size,10957},
> >>  {heap_size,4185},
> >>  {stack_size,29},
> >>  {reductions,292947037857},
> >>  {garbage_collection,[{max_heap_size,#{error_logger => true,kill => true,size => 0}},
> >>                       {min_bin_vheap_size,46422},
> >>                       {min_heap_size,233},
> >>                       {fullsweep_after,65535},
> >>                       {minor_gcs,591}]},
> >>  {suspending,[]}]
> >>
> >> This last line took 1 hour to finish because it was dumping the whole
> >> mailbox into swap once again.
> >>
> >> I can see I have debug logs enabled, which exaggerates the problem, but
> >> I am assuming this can happen with any log level over time. Is this
> >> known behaviour and do you have any suggestions?
> >
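For concreteness, here is the rough shape of the syslog setup I mean, in the
[log] section of local.ini. I haven't tested these exact values against
2.3.1, so check the option names against the couch_log docs for your
version; the hostname below is just a placeholder:

    [log]
    ; drop back to the default verbosity once you're done debugging
    level = notice
    ; ship log lines off-box over UDP instead of writing to local disk
    writer = syslog
    syslog_host = logcollector.example.com
    syslog_port = 514
    syslog_appid = couchdb
    syslog_facility = local2

On the collector machine, rsyslog needs its UDP input enabled, e.g. in
/etc/rsyslog.conf:

    # load the UDP input module and listen on the standard syslog port
    module(load="imudp")
    input(type="imudp" port="514")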

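One more aside on the diagnostics above: a bare process_info(BadProcess)
copies the entire mailbox (the messages field) into the shell, which is why
that last call took an hour. Asking only for the fields you care about is
much cheaper, and if the recon library is on the node (I believe recent
CouchDB releases bundle it) its proc_count is handier still. Untested
sketch, run from the same remsh:

    %% only the queue length and a few cheap fields, not the messages themselves
    process_info(BadProcess, [message_queue_len, memory, current_function]).

    %% top 5 processes by mailbox size, if recon is available
    recon:proc_count(message_queue_len, 5).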