As I've observed using the mwrap-perl LD_PRELOAD wrapper,
permanent 4080-byte arenas allocated by Perl late in a
process's lifetime impede consolidation of freed adjacent blocks.
In long-lived processes, this fragmentation from immortal
arenas near the "wilderness"[1] area can force excessive
memory to be requested from the kernel (via sbrk or mmap).

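Preallocating SV and HE arenas at startup (enabled by setting
PREALLOC_NR) keeps these immortal allocations away from the
wilderness.  Combining this with MALLOC_MMAP_THRESHOLD_=131072
appears to reduce memory use of long-lived, heavily-trafficked
processes.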

[1] this is dlmalloc terminology:
    https://gee.cs.oswego.edu/dl/html/malloc.html
---

 I've been testing and refining various iterations of this on and
 off for a few weeks now.  Immortal allocations sprinkled throughout
 the heap are bad, and Perl's arenas are a major remaining culprit...
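
 For anyone wanting to try this, here is a minimal sketch of starting
 a daemon with both settings.  `run_with_prealloc' is a hypothetical
 helper name, and `public-inbox-httpd' stands in for whichever daemon
 loads PublicInbox::Daemon.  The values are the ones mentioned in this
 patch and probably need tuning per instance.  MALLOC_MMAP_THRESHOLD_
 is only honored by glibc malloc.

```shell
# Hypothetical wrapper: run any command with the two settings from
# this patch in its environment.  PREALLOC_NR is read in a BEGIN
# block, so it must be set before the daemon starts.
run_with_prealloc() {
	env PREALLOC_NR=500000 MALLOC_MMAP_THRESHOLD_=131072 "$@"
}

# Assumed usage (daemon name and arguments depend on your setup):
# run_with_prealloc public-inbox-httpd
```

 Leaving PREALLOC_NR unset skips the preallocation entirely, so
 the default behavior is unchanged.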

 lib/PublicInbox/Daemon.pm | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index e578f2e8..a80f2cbb 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -36,6 +36,22 @@ my %SCHEME2PORT = map { $KNOWN_TLS{$_} => $_ + 0 } keys %KNOWN_TLS;
 for (keys %KNOWN_STARTTLS) { $SCHEME2PORT{$KNOWN_STARTTLS{$_}} = $_ + 0 }
 $SCHEME2PORT{http} = 80;
 
+# Preallocate SV and HE arenas early so they don't end up in/near
+# the "wilderness" (in dlmalloc terminology).  This should result
+# in lower fragmentation and memory use.
+# Not sure if other arenas can benefit from this.
+# using PREALLOC_NR=500000 for the yhbt.net/lore mirror
+BEGIN {
+       if (my $nr = $ENV{PREALLOC_NR}) {
+               my @tmp = map { $_ => $_ } (0..$nr);
+               # some extra arenas for HVs and AVs
+               $nr >>= 5;
+               $nr = 10000 if $nr < 10000;
+               @tmp = map { +{} } (0..$nr);
+               @tmp = map { [ $_ ] } (0..$nr);
+       }
+}
+
 our ($parent_pipe, %POST_ACCEPT, %XNETD);
 our %WORKER_SIG = (
        INT => \&worker_quit,
