Hi Gustaf,
Thanks for the very detailed response!
I had an inkling that nsproxy might be the answer - we'll look at making
this change.

The RSS of our nsd process is often ~2GB but does occasionally stray
as far as 4GB. (The small spike on the right of the attached graph is when
this particular problem occurred.)
Looking at your graph, I have little doubt we'd see a benefit from using
TCMalloc.
I found this comment from you about how to go about it... is the patch
it refers to available online somewhere?
https://sourceforge.net/p/naviserver/mailman/message/31805358/

[image: image.png]

Thanks again Gustaf,
Regards,
David



On Fri, 1 Nov 2019 at 20:46, Gustaf Neumann <neum...@wu.ac.at> wrote:

> Dear David,
>
> Technically, every Unix fork() creates a new process by duplicating the
> calling process.
> If the calling process is large, this can lead to the problem you are
> mentioning.
> However, Linux uses copy-on-write: while both processes might show an RSS
> of 4GB right after the fork, that does not mean that 8GB are really
> used. Still, depending on the overcommit_memory setting, it might be
> the case that the OOM killer kicks in.
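>
> To see quickly which overcommit policy is active on the box, a tiny
> helper like the following works from tclsh or from within nsd (just a
> sketch; it simply reads the standard Linux proc file):
>
>    # read the current Linux overcommit policy (0, 1, or 2)
>    set f [open /proc/sys/vm/overcommit_memory r]
>    set mode [string trim [read $f]]
>    close $f
>    puts "vm.overcommit_memory = $mode"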
>
> Concerning your configuration: are you using nsproxy? This module was made
> to address exactly this issue: instead of forking on every Tcl "exec", the
> command is sent via a pipe to a second process that executes it. That
> process typically has a much smaller memory footprint, so the problem does
> not get worse when nsd itself has a huge memory footprint.
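>
> On the Tcl side, the usage is roughly like this (a minimal sketch; the
> pool name "exec" is just a placeholder, and the exact pool configuration
> depends on your setup):
>
>    # get a handle from an nsproxy pool, run the external command in the
>    # small worker process, and return the handle to the pool
>    set handle [ns_proxy get exec]
>    set output [ns_proxy eval $handle [list exec $cmd << $input]]
>    ns_proxy release $handle
>
> The module itself is loaded like any other module in the config file
> (an ns_param entry pointing to nsproxy.so in the modules section).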
>
> Do you monitor the size of nsd over time? Is it normal for nsd to reach
> 4GB RSS with the given configuration? There might be a problem with the
> application code causing some memory growth. The chart below is
> generated with munin and the munin-plugins-ns from
> https://github.com/gustafn/munin-plugins-ns (some plugins are for OpenACS,
> others work for every NaviServer installation).
>
> Do you use the system malloc? One can reduce the memory
> footprint significantly by using SYSTEM_MALLOC together with
> TCMalloc (see e.g.
> https://next-scripting.org/2.3.0/doc/misc/thread-mallocs).
> The following chart shows the effect on openacs.org, when I switched
> to SYSTEM_MALLOC + TCMalloc around August (same code, same
> number of requests, etc.).
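>
> Whether TCMalloc actually ends up in the process (typically via
> LD_PRELOAD or by linking against it) can be verified from inside nsd
> with a quick check like this (just a sanity-check sketch; assumes a
> Linux /proc filesystem):
>
>    # look for tcmalloc among the shared objects mapped into this process
>    set f [open /proc/self/maps r]
>    set maps [read $f]
>    close $f
>    if {[string match *tcmalloc* $maps]} {
>        ns_log notice "tcmalloc is loaded"
>    } else {
>        ns_log notice "tcmalloc is NOT loaded; the default allocator is in use"
>    }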
>
> -g
>
>
> [image: yearly graph]
>
> On 01.11.19 18:02, David Osborne wrote:
>
> Hi,
>
> I was wondering if anyone could point us in the right direction with an
> intermittent problem we're seeing. Apologies for the woolly problem
> description - we don't have a firm test case as yet.
>
> These instances are running NaviServer/4.99.16d10 on Debian Jessie 8.10
>
> The problem usually manifests itself as memory exhaustion on the server in
> question where the system's OOM killer is invoked.
>
> What seems to be causing the memory exhaustion is a copy of the main nsd
> process (sometimes several) which can use large amounts of memory.
>
> For example, we captured a snapshot of this copy of the nsd daemon (also
> using 4GB RSS) after it had been running for about 5 hours.
>
> [image: image.png]
>
> It appears as if there are 2 main nsd daemons running. Usually only 1
> nsd daemon is running on this server.
>
> When this child was killed by the OOM killer, a Tcl exec of an
> external command was running within it - not a command that would
> normally take 5 hours to complete.
>
> child killed: kill signal
>     while executing
> "exec $cmd << $input"
>
> In testing, when running an exec from NaviServer, I see the forked process
> being created and initially named "nsd"; it then takes on the name of the
> underlying command. I'm not sure why this isn't happening in these cases.
>
> Any insights?
>
>
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
