Re: (newbie question) Increasing SA effectiveness

Marcin Krol Fri, 12 Dec 2008 01:36:06 -0800

Karsten Bräckelmann wrote:

Well, isn't it better to use them before SA, provided your MTA does have
this feature (I recommend Exim to everyone)?


No -- unless you ultimately trust the RBL to produce a *negligible*
amount of FPs. Every single RBL does have FPs to a highly variable
degree. Instead of outright blocking on a hit, it is a good idea to
assign a score for the hit only, and see what the result is after all
tests have been performed...
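
A minimal sketch of the difference between the two approaches, in Python. The rule names, scores, and threshold below are made up for illustration and are not SpamAssassin's actual rules or defaults; the point is only that a lone RBL hit rejects the mail under outright blocking but does not under scoring.

```python
# Hypothetical rule names and scores, for illustration only;
# these are NOT SpamAssassin's real rules or default scores.
SCORES = {
    "RBL_HIT": 2.5,
    "BAYES_LOW": -1.9,
    "SUSPICIOUS_SUBJECT": 1.5,
    "FORGED_HEADERS": 2.0,
}
THRESHOLD = 5.0  # assumed spam threshold


def reject_on_rbl(hits):
    """Outright blocking: a single RBL hit rejects the message."""
    return "RBL_HIT" in hits


def is_spam_by_score(hits, scores=SCORES, threshold=THRESHOLD):
    """Scoring: sum the scores of all hits and compare to the threshold."""
    return sum(scores[h] for h in hits) >= threshold


# A legitimate mail whose sending host is wrongly listed (an RBL FP).
ham_with_rbl_fp = ["RBL_HIT", "BAYES_LOW"]
# A spam mail that trips several independent tests.
spam = ["RBL_HIT", "SUSPICIOUS_SUBJECT", "FORGED_HEADERS"]

for name, hits in (("ham with RBL FP", ham_with_rbl_fp), ("spam", spam)):
    print(name, "| blocked outright:", reject_on_rbl(hits),
          "| spam by score:", is_spam_by_score(hits))
```

With these assumed numbers the wrongly listed ham is rejected by outright blocking but stays under the threshold when scored, while the spam is caught either way.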


Actually, I think there are good reasons to reject mail based on RBLs:

First, it has a strong policing effect on the internet: nobody except hardcore spammers dares to send spam.

In hosting, where I worked for some time (another admin was taking care of SA-related issues), the few false positives we had were generally quickly taken care of. With literally thousands of customers, we didn't find RBL false positives to be any major issue.

Another "policing" issue that is a positive side effect of commonly rejecting mail by RBLs: the major shared hosting providers do not dare to do business with spammers. We all know the reality of it: if it made a few nickels of profit for providers, they would not hesitate to host spammers. Were it not for the (granted, drastic) phenomenon of mail rejection due to RBLs, spam would be even more of a problem.

Suppose everyone used your approach: most of the mail would be accepted, which is good enough for spammers (few MTAs do SA-scanning at SMTP time, a la sa-exim). Maybe it would be filtered, maybe it wouldn't, but the policing effect would mostly disappear without outright rejection of mail coming from RBL-damned addresses.

Second: SA-scanning is a MAJOR cost. At hosting we found that the *majority* of overall server load was generated by SA, even after most spam was eliminated by RBLs and sender-verify before it even reached SA!

Face it, SA is effective, but that comes at a cost: all those tests burn huge, and I mean huge, amounts of CPU and time. Even scanning time on a hosting server is a somewhat important issue, as it greatly increases the number of concurrent connections to your server and the number of forked MTA software instances (memory, etc). Anything that cuts that cost down is worth it, even at the price of an occasional FP, especially as FPs are resolvable nowadays -- I have taken quite a number of addresses off RBLs (mostly Spamhaus and Spamcop). Sure, it was never pleasant. But IMO, it's well worth it.

Exactly the SA approach. A single (or even a few) rules and RBLs can
misfire, without affecting the overall deliverability of a particular
mail.

With all due respect, I disagree, in this sense: there are very few cases where it would produce an overall benefit, while many other benefits (above) would disappear and many problems would be much more common had your recommended approach been common.

Also look at setting up Bayes and train it well. A well trained Bayes setup can hit 99% plus spam (for me) and can be highly effective.
Except I found that while it often gets positive identification right,
it sometimes produces false negatives (BAYES_00 negative scoring gets
fired on what it should classify as spam -- I reduced BAYES_00 scoring
for that reason).


As mentioned a few times already -- do train Bayes instead. That's a
mis-fire of Bayes, and needs to be corrected.

The problem is paradoxically the lack of spam - my spamtraps do not get enough spam.



Regards,
Marcin Krol
