dccproc/dccifd error
Hi,

I am using perl-5.10.1, amavisd-new 2.7.0, Mail-SpamAssassin-3.3.2 and dcc-dccd-1.3.140.

When I receive and scan a message with an 'X-DCC-xxx-Metrics' header, the following error is logged to maillog:

Dec 23 01:04:53 mx dccproc[81847]: unrecognized many usage: [-VdAQCHER] [-h homedir] [-m map] [-w whiteclnt] [-T tmpdir][-a IP-address] [-f env_from] [-t targets] [-x exitcode][-c type,[log-thold,][spam-thold]] [-g [not-]type] [-S header][-i infile] [-o outfile] [-l logdir] [-B dnsbl-option][-L ltype,facility.level] ; fatal error

After modifying DCC.pm the error is gone:

--- DCC.pm.bak 2011-12-22 23:03:34.0 +0100
+++ DCC.pm 2011-12-22 23:22:11.0 +0100
@@ -859,7 +859,7 @@
   }
   if ($tag eq "dcc:") {
     # query instead of report if there is an X-DCC header from upstream
-    unshift(@opts, '-Q', 'many') if defined $permsgstatus->{dcc_raw_x_dcc};
+    unshift(@opts, '-Q') if defined $permsgstatus->{dcc_raw_x_dcc};
   } else {
     # learn or report spam
     unshift(@opts, '-t', 'many');

Is this the correct fix? Or is my setup broken?

Thanks.

-- 
Herbert
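One way to read the error: the usage line lists -Q among the bare flags ([-VdAQCHER]), while -t is shown with a value ([-t targets]). A word placed after -Q would then be left over as a stray argument. The Python sketch below models only that reading. It is not dccproc's real parser, and the names build_opts and stray_arguments are made up for illustration.

```python
# Options the usage line prints as bare flags: [-VdAQCHER]
BARE_FLAGS = set("VdAQCHER")
# Options the usage line prints with a value: [-h homedir] [-t targets] ...
VALUE_FLAGS = set("hmwTaftxcgSiolBL")


def build_opts(tag, has_upstream_x_dcc):
    """Mirror the patched branch: '-Q' alone for checks, '-t many' otherwise."""
    opts = []
    if tag == "dcc:":
        if has_upstream_x_dcc:
            opts[:0] = ["-Q"]          # same effect as Perl's unshift
    else:
        opts[:0] = ["-t", "many"]
    return opts


def stray_arguments(opts):
    """Return words that are neither an option nor the value of an option.

    Assumption: single-letter options, parsed getopt-style.
    """
    stray = []
    expect_value = False
    for word in opts:
        if expect_value:
            expect_value = False       # this word is the previous option's value
        elif word.startswith("-") and len(word) == 2:
            expect_value = word[1] in VALUE_FLAGS
        else:
            stray.append(word)
    return stray


if __name__ == "__main__":
    print(stray_arguments(["-Q", "many"]))             # unpatched check branch
    print(stray_arguments(build_opts("dcc:", True)))   # patched check branch
    print(stray_arguments(build_opts("report", False)))
```

Under this model only the unpatched '-Q', 'many' pair leaves a leftover word, which matches the "unrecognized many" message; '-t', 'many' is unaffected because -t consumes its value.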
Re: Bayes and MySQL - does it actually work?
On 12/21/2011 10:58 AM, Robert Schetterer wrote:
> On 21.12.2011 19:10, Kris Deugau wrote:
>> Marc Perkel wrote:
>>> I've been trying for a long time to get bayes/mysql to actually work. Running a dedicated server with MySQL. Several servers running SA configured to talk to it. I'm running big servers with lots of ram and raid 0 flash drives for speed. Also using InnoDB. I'm beginning to wonder if it is ever going to work and if someone is going to fix it?
>>
>> I'm not sure what official testing has been done, but some testing I did about a year ago when upgrading the SA cluster here showed pretty much the same IO load for a global Bayes no matter what combination of MyISAM, InnoDB, generic SQL, or MySQL-specific SA modules I used. Enabling MySQL replication also bogged things down pretty badly. Performance with the database on physical disks simply wasn't keeping up with more than about double the average message rate (if that...), so I fell back to the "good enough" setup of putting the SA database on a RAMdisk, and tweaking the MySQL init script to reload the database on startup. A database dump is done once a day, about a half-hour after a Bayes expiry run.
>>
>> This is handling ~250K messages/day, although with some tweaks to serialize mail delivery a little more to level off the extreme peaks in messages/second it should probably be able to handle a lot more volume.
>>
>> We also have several SA instances - on the inbound side, the first pass has ~25 of the top-scoring only-hits-spam rules (mostly DNSBLs) to skim off the junk that would usually score 15+ on a full ruleset. Anything that gets past that is then passed to a full SA instance with a long list of local rules targeted at the ones reported as missed spam by customers. That first pass tags more than 80% of the junk for far less processing cost than feeding it all through the full ruleset.
>>
>> Occasional mail spikes[1] sometimes cause SA to slw dooowwwnnn due to CPU contention (60+ spamd threads are simply going to take a while to chew through mail if you've only got 16 logical CPU cores), but otherwise a pair of dual-socket, quad-core Xeon E5630 machines with 12G of RAM are mostly idle. (RAM usage is fairly steady at just over 4G.) Average scan times are just under a second.
>>
>> -kgd
>>
>> [1] I'm looking at you, Rocket Science Group - hundreds of messages per second from netblocks all over the US, all nominally operated by (AKA tagged in WHOIS for) the same group - and quite a lot of it spam. Unfortunately MailChimp seems to buy rack space, hosting, or managed email servers from them or I'd drop all of their netblocks in the local reject-at-the-border DNSBL and be done with it.
>
> Interesting info. By the way, does anyone know whether PostgreSQL performs better, e.g. with Bayes clusters?
>
> At least using postscreen has helped here to stop bots, so those mails never reach spamd. But for sure, in large mail systems a SpamAssassin setup always has to be configured very carefully and analysed at runtime to find performance tweaks. However, 250K messages/day does not seem that much to me.
>
> Scanning outbound mail with spamd was slow here too; I only use clamav-milter with sanesecurity for that, and also for inbound before spamass-milter.
>
> But no flames: for performance issues you always need to look at the whole mail setup. There is no straight right or wrong; in most cases only analysing the bottlenecks will help.

Maybe it's time for me to try PostgreSQL. Can you provide a link to how to optimize SA for it?

-- 
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400
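For scale, the figures quoted in this message can be turned into rates with a few lines of arithmetic. This is only a back-of-the-envelope sketch in Python using the numbers from Kris's description (~250K messages/day, 60 spamd threads, 16 logical cores); the function names are made up for illustration, and real traffic is bursty, so averages understate the peaks.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_rate(messages_per_day):
    """Average messages per second over a full day."""
    return messages_per_day / SECONDS_PER_DAY


def threads_per_core(threads, logical_cores):
    """How many scanner threads compete for each logical core."""
    return threads / logical_cores


if __name__ == "__main__":
    avg = average_rate(250_000)
    print(f"average rate:     {avg:.2f} msg/s")      # roughly 2.9 msg/s
    print(f"double that rate: {2 * avg:.2f} msg/s")  # roughly 5.8 msg/s
    print(f"threads per core: {threads_per_core(60, 16):.2f}")
```

So "about double the average message rate" is somewhere near 6 messages per second, and 60 threads on 16 logical cores means nearly four CPU-bound scans contending for each core during a spike.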
Re: dccproc/dccifd error
A new DCC.pm from the author of DCC was added to trunk on November 14th:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6698

Looks like it already handles your case:

DCC.pm:863: unshift(@opts, '-Q', 'many') if defined $permsgstatus->{dcc_raw_x_dcc};

That will be included in the next SpamAssassin release, v3.4.0 (which doesn't have a specific planned release date).

On 12/23, Herbert J. Skuhra wrote:
> After modifying DCC.pm the error is gone:
>
> - unshift(@opts, '-Q', 'many') if defined $permsgstatus->{dcc_raw_x_dcc};
> + unshift(@opts, '-Q') if defined $permsgstatus->{dcc_raw_x_dcc};
>
> Is this the correct fix? Or is my setup broken?

-- 
"Let's just say that if complete and utter chaos was lightning, then he'd be the sort to stand on a hilltop in a thunderstorm wearing wet copper armour and shouting 'All gods are bastards'." - The Color of Magic
http://www.ChaosReigns.com
Re: dccproc/dccifd error
On 12/22, dar...@chaosreigns.com wrote:
> DCC.pm:863: unshift(@opts, '-Q', 'many') if defined $permsgstatus->{dcc_raw_x_dcc};
>
>> I am using perl-5.10.1, amavisd-new 2.7.0, Mail-SpamAssassin-3.3.2 and dcc-dccd-1.3.140.
>>
>> Dec 23 01:04:53 mx dccproc[81847]: unrecognized many usage: [-VdAQCHER] [-h homedir] [-m map] [-w whiteclnt] [-T tmpdir][-a IP-address] [-f env_from] [-t targets] [-x exitcode][-c type,[log-thold,][spam-thold]] [-g [not-]type] [-S header][-i infile] [-o outfile] [-l logdir] [-B dnsbl-option][-L ltype,facility.level] ; fatal error
>>
>> - unshift(@opts, '-Q', 'many') if defined $permsgstatus->{dcc_raw_x_dcc};
>> + unshift(@opts, '-Q') if defined $permsgstatus->{dcc_raw_x_dcc};

Yeah, I read that backwards.

Maybe it's handled by this?

./lib/Mail/SpamAssassin/Plugin/DCC.pm:692: $x_dcc =~ s/many/99/ig;

http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/DCC.pm?view=markup

Info on trunk: http://wiki.apache.org/spamassassin/DownloadFromSvn

The author did say "I believe it is entirely upward compatible." in November, which was well after the DCC 1.3.140 release, so it probably works.

I'd be interested to hear how that works if you try it. Might be worth posting the results to that bug.

-- 
"Whom God wishes to destroy, he first makes mad." - Euripides (c.480 - 406 BC).
http://www.ChaosReigns.com
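For readers who do not speak Perl, the substitution s/many/99/ig replaces every occurrence of "many", in any letter case, with "99". Judging by the variable name it is applied to X-DCC header text rather than to the dccproc options, though the thread does not confirm what role it plays. A rough Python equivalent follows; the sample header value is invented for illustration and normalize_many is not a name from DCC.pm.

```python
import re


def normalize_many(x_dcc):
    """Python counterpart of the Perl substitution s/many/99/ig."""
    return re.sub(r"many", "99", x_dcc, flags=re.IGNORECASE)


if __name__ == "__main__":
    # Hypothetical header text, not taken from a real message.
    sample = "X-DCC-xxx-Metrics: mx 1234; Body=many Fuz1=MANY Fuz2=3"
    print(normalize_many(sample))
    # prints: X-DCC-xxx-Metrics: mx 1234; Body=99 Fuz1=99 Fuz2=3
```

The /g flag corresponds to re.sub replacing all matches by default, and /i corresponds to re.IGNORECASE.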
Re: Bayes and MySQL - does it actually work?
On 23.12.2011 02:45, Marc Perkel wrote:
> Maybe it's time for me to try postgresql. Can you provide a link to how to optimize SA for it?

Sorry, no, I have no links besides the official ones, but good DB people have told me that PostgreSQL is handier in cluster setups.

But as I said, try to limit the amount of mail reaching SpamAssassin by putting other filter techniques in front of it. That should help anyway, apart from the DB stuff.

-- 
Best Regards
Robert Schetterer
Germany/Munich/Bavaria
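The advice to filter before SpamAssassin, like the cheap first pass Kris describes earlier in the thread, boils down to a two-stage decision: run a small, inexpensive rule set first and only hand the remainder to the full ruleset. The Python sketch below shows that control flow only. The rules, scores, cutoffs and addresses are all invented for illustration and do not correspond to real SpamAssassin rules or to anyone's actual configuration.

```python
def score(message, rules):
    """Sum the scores of all rules whose predicate matches the message."""
    return sum(points for matches, points in rules if matches(message))


def two_pass_scan(message, cheap_rules, full_rules,
                  cheap_cutoff=5.0, full_cutoff=5.0):
    """Return (verdict, stage), where stage records which pass decided."""
    if score(message, cheap_rules) >= cheap_cutoff:
        return "spam", "first pass"      # full ruleset is never run
    if score(message, full_rules) >= full_cutoff:
        return "spam", "full ruleset"
    return "ham", "full ruleset"


# Hypothetical rules as (predicate, score) pairs.
BLOCKLISTED = {"192.0.2.10"}             # stand-in for a DNSBL lookup
cheap_rules = [(lambda m: m["client_ip"] in BLOCKLISTED, 6.0)]
full_rules = cheap_rules + [
    (lambda m: "viagra" in m["body"].lower(), 5.5),
]

if __name__ == "__main__":
    for msg in (
        {"client_ip": "192.0.2.10", "body": "hello"},
        {"client_ip": "198.51.100.1", "body": "Cheap VIAGRA"},
        {"client_ip": "198.51.100.1", "body": "Meeting at ten"},
    ):
        print(two_pass_scan(msg, cheap_rules, full_rules))
```

The saving comes from the early return: any message caught by the first pass costs only the few cheap checks, which is the effect Kris reports as tagging more than 80% of the junk before the full ruleset is involved.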