I'm running spamc/spamd 3.0.2 on Debian.  I have Bayesian tests turned on,
and network tests off.

Lately a lot of spam has been getting through to my mailbox.  SA's false
negative rate used to be about 1%; now it's about 50%.  Looking at the
headers for the spam that's getting through, I see that the Bayesian filter
is working correctly: almost all of the spam is tagged as BAYES_95 or
BAYES_99.  My score threshold is 5.  The BAYES_99 test alone (at its
default score) is worth 4.07, and a few other tests usually fire as
well.  Yet the total score is around 2.5.  Here's a sample from today:

X-Spam-Status: No, score=2.7 required=5.0 tests=BAYES_99,HTML_20_30,
 HTML_FONT_INVISIBLE,HTML_IMAGE_ONLY_24,HTML_MESSAGE autolearn=no 
 version=3.0.2

The scores from the tests listed here should add up to about 5.3, but as you
can see, the total is only 2.7.  So this one gets through.
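
To make that arithmetic easy to check, here is a minimal Python sketch that
unfolds the header above, pulls out the reported score and test names, and
compares them with the figures quoted in this message.  The only per-test
score it uses is the 4.07 for BAYES_99 mentioned above; the 5.3 total is my
own estimate, and the function name parse_spam_status is purely illustrative
(it is not part of SpamAssassin).

```python
import re

# The sample header from this message, folded across three lines.
HEADER = (
    "X-Spam-Status: No, score=2.7 required=5.0 tests=BAYES_99,HTML_20_30,\n"
    " HTML_FONT_INVISIBLE,HTML_IMAGE_ONLY_24,HTML_MESSAGE autolearn=no \n"
    " version=3.0.2"
)


def parse_spam_status(header):
    """Return (score, required, tests) from an X-Spam-Status header."""
    # Join continuation lines. Dropping the leading whitespace as well
    # keeps the comma-separated test list in one unbroken token.
    flat = re.sub(r"\r?\n[ \t]+", "", header)
    score = float(re.search(r"score=(-?[\d.]+)", flat).group(1))
    required = float(re.search(r"required=(-?[\d.]+)", flat).group(1))
    tests = re.search(r"tests=(\S*)", flat).group(1).split(",")
    return score, required, tests


score, required, tests = parse_spam_status(HEADER)

BAYES_99_SCORE = 4.07   # value quoted in this message
EXPECTED_TOTAL = 5.3    # estimated sum quoted in this message

# What the four non-Bayes tests would have to contribute together.
others = round(EXPECTED_TOTAL - BAYES_99_SCORE, 2)
# How far the reported total falls short of the estimate.
shortfall = round(EXPECTED_TOTAL - score, 2)

print("tests hit:", ", ".join(tests))
print("reported score:", score, "threshold:", required)
print("implied contribution of the other tests:", others)
print("gap between expected and reported total:", shortfall)
```

Note that the reported 2.7 is lower than the 4.07 I expect from BAYES_99 by
itself, so the gap cannot be explained by the HTML tests alone.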

I understand that the individual test scores are fed through a neural
network to derive the final score.  So it seems that this network has
started to behave badly.  

Can anyone shed any light on this?  Is it a well-known problem?  What's the
preferred way to address it?  Should I remove all of SA's learned
information and retrain the network?

Thanks,
Andrew.
