Re: SA Problem: spam with random words to defeat Baysian filtering ...

Matt Kettler 11 Feb 2004 16:32:50 -0000

At 11:05 AM 2/11/2004, Robert S. Sciuk wrote:

I've just joined the list, and requested FAQ and info from the majordomo.
In the absence of either one, I am forced to ask the following of the list
with no knowledge of whether it is an FAQ or not -- sorry.

The FAQ is actually a wiki web, and it's linked from the spamassassin.org main page.

http://wiki.spamassassin.org/w/

As indicated in the subject line, I'm getting negative hit rates on spam
which uses random dictionary words.  Obviously sa-learn cannot learn how
to deal with such an approach, and my formerly brilliant
sendmail/spamassassin configuration is now next to useless - as I'm
getting 200 - 300 spams per day.

Can anyone point me to a solution or a counter-counter measure to kill
this damn spam??

This is quite surprising to me. I've been getting a lot of the "random word" spams too, but feeding them to sa-learn has been quite effective.

If you've got a lot of input to bayes, the random-word attacks wind up being more-or-less a wash.
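To illustrate why the random words end up as a wash, here is a minimal sketch. It assumes a simplified naive-Bayes combination of per-token probabilities by summed log-odds, which is not SpamAssassin's actual combining code, and every token probability in it is made up for illustration. The function name spam_probability is likewise hypothetical.

```python
import math

def spam_probability(token_probs):
    """Combine per-token spam probabilities by summing their log-odds.

    Simplified naive-Bayes sketch; NOT SpamAssassin's real combining method.
    """
    log_odds = sum(math.log(p / (1.0 - p)) for p in token_probs)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical probabilities for the genuinely spammy tokens in a message.
spammy_tokens = [0.99, 0.97, 0.95]

# Assumption: in a well-fed database, dictionary words have been seen about
# equally in spam and nonspam, so each sits at 0.5 and adds zero log-odds.
neutral_padding = [0.5] * 40

# Assumption: in a thinly trained database the same words have mostly been
# seen in nonspam, so each leans slightly "hammy" and the padding adds up.
hammy_padding = [0.4] * 40

well_trained = spam_probability(spammy_tokens + neutral_padding)
thinly_trained = spam_probability(spammy_tokens + hammy_padding)
no_padding = spam_probability(spammy_tokens)

print(f"no padding:           {no_padding:.6f}")
print(f"well-trained padding: {well_trained:.6f}")
print(f"thin-trained padding: {thinly_trained:.6f}")
```

In this toy model the padding only matters when the database still treats the dictionary words as nonspam indicators, which is why feeding the random-word spams back into training helps.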

So far this month, I've had 7 false negatives, 0 false positives. Most of the "dictionary bayes poison" spams are getting BAYES_99 for me.

For reference, and for those wondering exactly how I get those numbers, my config consists of:

        DCC, razor2 and RBLs used.
        habeas_swe score forced down to -1.0
        bayes_ignore_header statements for all the habeas SWE headers
        bayes_auto_learn_threshold_nonspam -0.3

A few add-on rules: antidrug.cf (gee, there's a shock, since I wrote it ;) http://mywebpages.comcast.net/mkettler/sa/antidrug.cf

A collapsed version of popcorn, based on http://www.emtinc.net/includes/popcorn.cf but edited by me down to just 2 rules.

A few rules from http://www.merchantsoverseas.com/wwwroot/gorilla/body.txt
        L_b_MaskedW0rds*

A few rules from http://www.exit0.us/index.php/FredsRules-SUBJECT
        FVGT_s_OBFU_*

One of the blackholes.us blacklists added, with score set fairly low to avoid FPs.

        header RCVD_IN_CHINA_KR eval:check_rbl('country', 'cn-kr.blackholes.us.')
        describe RCVD_IN_CHINA_KR Received from China or Korea
        score RCVD_IN_CHINA_KR 1.0

About 15 negative-scoring rules containing "industry specific" phrases for my company's business.

I feed bayes with some spamtraps and nonspamtraps each day, giving it about 100 spams and 25 nonspams in manual training daily.
