Re: Scoring Philosophy?

Bill Cole Tue, 21 Nov 2017 21:40:51 -0800

On 21 Nov 2017, at 16:01 (-0500), Jerry Malcolm wrote:

> I have the Bayesian filter working, with a simple way to train it. I have sent over 5000 training messages to it over the past 6-8 months.

That's good, but only if it has a mix of ham and spam. If you only tell the Bayes filter about one or the other, it will work poorly. Also, while there is widely differing opinion on the "autolearn" functionality, I find it keeps my Bayes DB pretty accurate, with bayes_auto_learn_threshold_nonspam set to -0.1 (it defaults to 0.1) and bayes_auto_learn_threshold_spam set to 8 (default is 12).
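
As a rough illustration, the thresholds mentioned above might look like this in local.cf. The two threshold values are the ones quoted in this message; the use_bayes and bayes_auto_learn lines only spell out what are normally the defaults, and where local.cf lives is an assumption that varies by installation.

```
# local.cf (location varies, e.g. /etc/mail/spamassassin/local.cf)

# Normally on by default; shown only to make the dependency explicit.
use_bayes 1
bayes_auto_learn 1

# Autolearn as ham only when the score is below -0.1 (default 0.1).
bayes_auto_learn_threshold_nonspam -0.1

# Autolearn as spam when the score reaches 8 (default 12).
bayes_auto_learn_threshold_spam 8.0
```

These are values that work on one person's systems, not a recommendation for every site. Lowering the spam threshold makes autolearning more aggressive, so it is worth watching the Bayes DB for mislearned mail after a change like this.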

On the larger issue of whether SpamAssassin is an "out-of-the-box" total solution for spam control, the simple answer is "no." No such thing exists or CAN exist. I've worked with about a dozen mail systems handling mail for about a hundred distinct domains (I'm old) and while they have had many commonalities, each has needed its own tweaking to optimize. In cases where I've run multi-tenant systems, seemingly similar customers have required divergent filtering: multiple small businesses of similar scale in the same metropolitan area have each needed domain-specific filtering that their neighbors using the same infrastructure & services provider couldn't tolerate. The FUSSP (Final Ultimate Solution to the Spam Problem) is a myth. (See https://www.rhyolite.com/anti-spam/you-might-be.html for signs of FUSSP delusions.)

Beyond the fact that similar domains and even similar individuals can have starkly different anti-spam needs, there is a blind spot in SpamAssassin which is the result of its contributors generally practicing layered defense in such a way that they never even show to SpamAssassin a large fraction of the spam which targets them. SA will not catch some of the most blatant and even dangerous spam because it is easily caught by safe MTA (or pre-MTA, e.g. postscreen) anti-spam tactics or even network layer tools like the Spamhaus DROP and EDROP lists. If you do not use mechanisms that end the majority of SMTP sessions before the DATA phase, you will need to be especially careful about correct and customized configuration of SA.

One common area to be particularly careful about in configuring SA is network classification: the trusted_networks and internal_networks settings. If you do not set those correctly, you can end up never hitting any of the DNSBL rules or hitting them improperly because SA isn't working with the right Received header for a particular DNSBL. A related and increasingly common (dunno why) source of never hitting DNSBL rules is a form of firewall/router NAT sometimes called "Secure NAT" where inbound connections have their source IPs replaced with the IP of the device handling the NAT. This typically kills any ability of an MTA or a filter like SA to use DNSBLs or any other anti-spam tactic that requires knowing the client IP (or the client IP of the last external-client transport hop).
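
For illustration only, a minimal sketch of those two settings in local.cf. The layout is hypothetical: a single MX host at 192.0.2.10 that accepts mail directly from the Internet, plus one outside forwarding relay at 198.51.100.25 whose Received headers you are willing to believe. Both addresses come from the reserved documentation ranges and must be replaced with your own.

```
# local.cf -- hypothetical addresses, substitute your own

# Hosts that act as MX or internal relays for your domains.
# The first relay outside this set is treated as the external
# client that DNSBL rules should look up.
internal_networks 192.0.2.10

# Hosts whose Received headers are believed. This must include
# everything in internal_networks, plus relays you do not run
# yourself but trust not to forge headers.
trusted_networks 192.0.2.10 198.51.100.25
```

Running "spamassassin --lint" after editing will catch syntax errors, though it cannot tell you whether the addresses match your real mail path. Note that no setting here can recover the client IP if a NAT device has already rewritten it before the MTA sees the connection.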


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole
