Re: [Mimedefang] Embedded Perl (continued)
On 2015-9-22 17:16 , Steffen Kaiser wrote:
> I had SpamAssassin rules allocating about 100MB, the forked children
> only shared the C libraries after some time. That's a problem of
> Perl's way to handle rereferences to data.

It might help not to integrate SpamAssassin, but to use spamc to communicate with spamd.

However, that only saves memory if you have other stuff in your filter rules that takes time processing messages, like DNS blacklist checks, virus scanners, SPF/DKIM/DMARC processing, SMTP forward lookups, etc. That way, perl processes doing those other things do not have a big lump of SpamAssassin rules sitting in memory (which is usually quite a lot of memory due to the way SpamAssassin works). You'd generally need fewer SpamAssassin slaves than mimedefang slaves (if you don't, then this won't save memory but eat a bit more memory instead because of the extra perl processes involved).

On the other hand, it does make things a bit more complex because you have to manage another daemon, monitor it, restart it when rules change, maintain configs, etc. Per-recipient rules might be somewhat harder.

Oh, and stock mimedefang doesn't support it. I've attached the SpamC.pm that we use for spamd communication. Also make sure that you set $Features{"SpamAssassin"} = 0 in your filter, to prevent Mail::SpamAssassin from loading (otherwise your mimedefang slaves would still eat memory for SpamAssassin).

You will need to modify this SpamC.pm as it uses a modular Mimedefang.pm, but the changes should be trivial.

-- 
Jan-Pieter Cornet
"Any sufficiently advanced incompetence is indistinguishable from malice." - Grey's Law

package MailFilter::SpamC;

# provide spamc interface to spamassassin, call-compatible with mimedefang
# API
# ... mostly. It actually only provides spam_assassin_check().

use Mimedefang qw(gen_msgid_header synthesize_received_header
                  :global :logging :config);
use IPC::Open2;
use base 'Exporter';

our @SpamAssassinExtraHeaders;
our @EXPORT_OK = qw(
    spam_assassin_check @SpamAssassinExtraHeaders
);

my $spamc = "/usr/bin/spamc";
my @spamc_opts = qw(-F /etc/spamd/spamc.conf);

sub spam_assassin_check {
    ### open communications to spamc
    my $in;
    unless ( open $in, "<", "./INPUTMSG" ) {
        md_syslog('err', "$MsgID: Spamc error: Cannot read INPUTMSG: $!");
        return;
    }
    my($sprd, $spwr);
    ### open2 dies on failure instead of returning false, so trap it
    my $sp_pid = eval { open2($sprd, $spwr, $spamc, @spamc_opts) };
    unless ( $sp_pid ) {
        md_syslog('err', "$MsgID: Spamc error: Cannot fork $spamc: $@");
        return;
    }

    ### note: the lines below duplicate the effect in the real
    ### spam_assassin_check somewhat

    ### build complete headers
    my $hdrs = "Return-Path: $Sender\n" . synthesize_received_header();
    $hdrs .= gen_msgid_header() if ($MessageID eq "NOQUEUE");

    ### get message headers, remember if we had a "To:" header
    my($seen_to, $seen_eoh);
    while ( <$in> ) {
        if ( /^$/ ) {
            $seen_eoh++;
            last;
        }
        $seen_to++ if /^To:/i;
        $hdrs .= $_;
    }
    $hdrs .= "To: undisclosed-recipients:;\n" if !$seen_to;
    if ( $AddApparentlyToForSpamAssassin and @Recipients ) {
        $hdrs .= "Apparently-To: " . join(", ", @Recipients) . "\n";
    }
    $hdrs .= join("", @SpamAssassinExtraHeaders);
    ### add header-body separation line that we ate in the loop above
    $hdrs .= "\n";
    ### $hdrs now contains the complete headers as sent to spamc

    ### send headers to spamc
    print $spwr $hdrs;
    ### send rest of message (if there was any left)
    if ( $seen_eoh ) {
        print $spwr $_ while <$in>;
    }
    close $spwr;

    ### wait for result
    my $output = join("", <$sprd>);
    close $sprd;
    waitpid($sp_pid, 0);
    if ( $? ) {
        md_syslog('err', "$MsgID: spamc returned non-zero exit code: $?\n");
        return;
    }

    my($hits, $req, $names, $report, %sa_tags);
    ### first line is hits/req
    if ( $output =~ s!\A(-?\d+(?:\.\d+)?)/(-?\d+(?:\.\d+)?)\r?\n!! ) {
        ($hits, $req) = ($1, $2);
    }
    else {
        my $sample = $output;
        if ( length($sample) > 80 ) {
            $sample = substr($sample, 0, 80) . "...";
        }
        ### capture the character so that ord() has something to work on
        $sample =~ s{([^ -~])}{sprintf("\\x%02x", ord $1)}ge;
        md_syslog('err', "$MsgID: Error: spamc returned invalid output: $sample");
        return;
    }

    ### process rest of output
    while ( $output =~ s/\A(\w+):\s+(.*)\r?\n// ) {
        my($k,$v) = ($1,$2);
        $hits = $v, next if $k eq "Score";
        $req = $v, next if $k eq "Required";
        $names = $v, next if $k eq "Tests";
        $sa_tags{$k} = $v;
    }

    ### anything that is left now is the full report
    $output =~ s/^\s+//;
    $report = $output;

    return($hits, $req, $names, $report, \%sa_tags);
}

1;
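To make the output layout that the attached module parses easier to see, here is a minimal standalone sketch of the same parsing steps. The function name parse_spamc_output, the sample text and the "Language" tag are made up for illustration; the layout (a "hits/required" first line, then "Key: value" lines, then the report) is simply what spam_assassin_check() above expects, and whether your spamc produces it depends on your spamc configuration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIMEDefang or SpamAssassin): parses text
# in the layout that the spam_assassin_check() shown above expects.
sub parse_spamc_output {
    my ($output) = @_;
    my ($hits, $req, $names, %tags);

    # First line must look like "hits/required", e.g. "7.3/5.0".
    return unless $output =~ s!\A(-?\d+(?:\.\d+)?)/(-?\d+(?:\.\d+)?)\r?\n!!;
    ($hits, $req) = ($1, $2);

    # Then zero or more "Key: value" lines; three keys are special-cased.
    while ( $output =~ s/\A(\w+):\s+(.*?)\r?\n// ) {
        my ($k, $v) = ($1, $2);
        if    ( $k eq "Score" )    { $hits  = $v }
        elsif ( $k eq "Required" ) { $req   = $v }
        elsif ( $k eq "Tests" )    { $names = $v }
        else                       { $tags{$k} = $v }
    }

    # Whatever remains is treated as the full report.
    $output =~ s/^\s+//;
    return ($hits, $req, $names, $output, \%tags);
}

# Made-up sample input, for illustration only.
my $sample = "7.3/5.0\n"
           . "Tests: BAYES_99,URIBL_BLACK\n"
           . "Language: en\n"
           . "\n"
           . "Content analysis details\n";

my ($hits, $req, $names, $report, $tags) = parse_spamc_output($sample);
printf "hits=%s required=%s tests=%s\n", $hits, $req, $names;
```

The function returns an empty list when the first line does not match, which mirrors the bare "return" in the module after it logs the invalid output.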
Re: [Mimedefang] Embedded Perl (continued)
On Tue, 22 Sep 2015, Amit Gupta wrote:

> My situation is that the number of mimedefang.pl processes jumps to
> about 70 during peak loads (we are processing a couple hundred
> messages per minute on average). Our filter file is in need of some
> optimizations (since each mimedefang.pl is taking about 125mb of
>                                                         ^^^
> resident memory), but I'm wondering if using embedded perl will help
> in this situation. I see you mentioned using embedded perl prevents
> forking entire processes. So does this mean each request is handled
> by a thread within the main process instead? So would my RAM
> requirements be reduced drastically?

Read Dianne's response about the garbage collector. Unless the script uses very few different values of your loaded data, or uses weak references, you will not notice any reduction in the long run.

I had SpamAssassin rules allocating about 100MB, the forked children only shared the C libraries after some time. That's a problem of Perl's way to handle rereferences to data.

-- 
Steffen Kaiser
___
NOTE: If there is a disclaimer or other legal boilerplate in the above message, it is NULL AND VOID. You may ignore it.

Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
[Mimedefang] Embedded Perl (continued)
Apologies for starting a new thread. I couldn't find any messages in my inbox to reply to.

Thanks Paul, Bill and Dianne for your replies.

My situation is that the number of mimedefang.pl processes jumps to about 70 during peak loads (we are processing a couple hundred messages per minute on average). Our filter file is in need of some optimizations (since each mimedefang.pl is taking about 125MB of resident memory), but I'm wondering if using embedded perl will help in this situation. I see you mentioned using embedded perl prevents forking entire processes. So does this mean each request is handled by a thread within the main process instead? So would my RAM requirements be reduced drastically?

In my peak case, I roughly calculate my RAM usage just for mimedefang.pl to be about 8GB. If embedded perl makes this go down a lot, it's a major win for me.

Thanks again for your help.
Re: [Mimedefang] Embedded Perl (continued)
On Tue, 22 Sep 2015 07:57:18 -0700
Amit Gupta wrote:

> My situation is that the number of mimedefang.pl processes jumps to
> about 70 during peak loads (we are processing a couple hundred
> messages per minute on average).

How much RAM do you have? 70 parallel scanners is not outlandish on busy machines. Our biggest scanning machine is configured to allow up to 400 scanners. It's a pretty powerful machine with 48GB of RAM, though, and our volume is 5-10x yours.

> I see you mentioned using embedded perl prevents
> forking entire processes.

No... it still forks each time, but it doesn't exec a new program.

> So would my RAM requirements be reduced drastically?

Probably not. As I said, embedded Perl helps a little bit, but not dramatically.

Regards,

Dianne.
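The difference between "fork" and "fork plus exec" can be shown with a small generic Perl sketch. This is not MIMEDefang's code: the %rules table is a stand-in for whatever expensive start-up work a real filter does, and scan_in_child is a made-up name. The point is only that a forked child which does not exec starts with everything the parent had already loaded.

```perl
use strict;
use warnings;
use POSIX ();

# Stand-in for an expensive start-up step (compiling a filter, loading rules).
my %rules = map { ( "RULE_$_" => $_ * 2 ) } 1 .. 1000;

# Fork a child, let it look up one value in the preloaded table, and read
# the answer back through a pipe.
sub scan_in_child {
    my ($rule) = @_;
    pipe( my $rd, my $wr ) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: there is no exec here, so %rules is already in memory,
        # inherited copy-on-write from the parent.
        close $rd;
        print {$wr} $rules{$rule};
        close $wr;
        POSIX::_exit(0);    # leave without running the parent's END blocks
    }

    # Parent: collect the child's answer and reap it.
    close $wr;
    my $answer = do { local $/; <$rd> };
    close $rd;
    waitpid( $pid, 0 );
    return $answer;
}

print "child saw value: ", scan_in_child("RULE_42"), "\n";
```

If the child called exec on a fresh perl instead, it would have to rebuild %rules from scratch before doing any work. The pages inherited by fork stay shared only until one side writes to them, which is where the reference-count point raised elsewhere in this thread comes in.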
Re: [Mimedefang] Embedded Perl (continued)
We have 16GB of RAM, though there are other processes running on this machine, such as the DB, that will be moved off later. I'm curious how much resident memory each of your mimedefang.pl processes uses?

I haven't been tracking my mimedefang.pl memory usage over time, so I was a little surprised to see it at 125MB. Before I go down a rabbit hole of minimizing it, I want to make sure it's actually significantly higher than your situation.

Also, am I right in thinking the forking issue is not such a big deal because the processes are pre-forked and stay running for some amount of time and eventually get cleared down to your minimum setting? I have my min processes set to 10 and max to 100, and my monitoring system shows that I have about 20 running mimedefang.pl processes on average.

On Tue, Sep 22, 2015 at 8:12 AM, Dianne Skoll wrote:
> On Tue, 22 Sep 2015 07:57:18 -0700
> Amit Gupta wrote:
>
>> My situation is that the number of mimedefang.pl processes jumps to
>> about 70 during peak loads (we are processing a couple hundred
>> messages per minute on average).
>
> How much RAM do you have? 70 parallel scanners is not outlandish on
> busy machines. Our biggest scanning machine is configured to allow
> up to 400 scanners. It's a pretty powerful machine with 48GB of RAM,
> though, and our volume is 5-10x yours.
>
>> I see you mentioned using embedded perl prevents
>> forking entire processes.
>
> No... it still forks each time, but it doesn't exec a new program.
>
>> So would my RAM requirements be reduced drastically?
>
> Probably not. As I said, embedded Perl helps a little bit, but not
> dramatically.
>
> Regards,
>
> Dianne.
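For a rough sense of scale, the figures quoted in this thread (125MB resident per process; about 20 processes on average, about 70 at peak, a configured maximum of 100) can be plugged into a worst-case estimate. The sketch below assumes that no memory at all is shared between processes, which overstates real usage; worst_case_gb is a made-up helper name.

```perl
use strict;
use warnings;

# Upper bound on scanner memory, assuming every process holds its full
# resident size privately (i.e. nothing is shared between processes).
sub worst_case_gb {
    my ( $processes, $rss_mb ) = @_;
    return $processes * $rss_mb / 1024;
}

my $rss_mb = 125;    # per-process resident size reported in this thread
for my $case ( [ "average", 20 ], [ "peak", 70 ], [ "configured max", 100 ] ) {
    my ( $label, $n ) = @$case;
    printf "%-15s %3d processes: up to %.1f GB\n",
        $label, $n, worst_case_gb( $n, $rss_mb );
}
```

Under that pessimistic assumption the peak comes to about 8.5 GB and the configured maximum of 100 processes to about 12.2 GB, which is most of a 16GB machine that also hosts a database. How far the real numbers fall below this depends on how much of each process's resident memory is actually shared.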
Re: [Mimedefang] Embedded Perl (continued)
On Tue, 22 Sep 2015 08:20:16 -0700
Amit Gupta wrote:

> We have 16GB of ram, though there are other processes running on this
> machine such as DB that will be segmented later. I'm curious how much
> resident memory each of your mimedefang.pl processes uses?

About 110MB, but not sure how much of that is shared.

> Also, Am I right in thinking the forking issue is not such a big deal
> because the processes are pre-forked and stay running for some amount
> of time and eventually get cleared down to your minimum setting.

Forking is not a big deal at all. Exec'ing may be more of a big deal, but still probably not a major performance factor.

Regards,

Dianne.
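One way to find out how much of a process's resident memory is shared, assuming a Linux host, is to sum the per-mapping counters in /proc/<pid>/smaps. The sketch below is a minimal example of that; smaps_totals is a made-up helper name, and reading another user's smaps file normally requires running as that user or as root.

```perl
use strict;
use warnings;

# Sum selected counters (in kB) from text in /proc/<pid>/smaps format.
sub smaps_totals {
    my ($text) = @_;
    my %kb = ( Rss => 0, Shared => 0, Private => 0 );
    for my $line ( split /\n/, $text ) {
        if ( $line =~ /^Rss:\s+(\d+) kB/ ) {
            $kb{Rss} += $1;
        }
        elsif ( $line =~ /^Shared_(?:Clean|Dirty):\s+(\d+) kB/ ) {
            $kb{Shared} += $1;
        }
        elsif ( $line =~ /^Private_(?:Clean|Dirty):\s+(\d+) kB/ ) {
            $kb{Private} += $1;
        }
    }
    return \%kb;
}

# Usage: perl smaps.pl PID   (defaults to this script's own process)
my $pid = shift(@ARGV) // $$;
if ( open my $fh, "<", "/proc/$pid/smaps" ) {
    my $t = smaps_totals( do { local $/; <$fh> } );
    printf "pid %s: rss=%d kB shared=%d kB private=%d kB\n",
        $pid, @{$t}{qw(Rss Shared Private)};
}
else {
    warn "cannot read /proc/$pid/smaps (Linux only, needs permission): $!\n";
}
```

Running it against a few long-lived mimedefang.pl processes and comparing the "private" figures gives a more realistic per-process cost than the resident size alone, since resident size counts shared pages once for every process that maps them.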