Re: Lots of missed spam

Theo Van Dinter Wed, 28 Jun 2006 20:06:48 -0700

On Wed, Jun 28, 2006 at 06:55:07PM -0700, jdow wrote:
> >1) all of this spam is hitting BAYES_00.. you really should check your
> >bayes training and correct it.
> 
> THAT is a bad thing. Getting down to BAYES_00 for spam takes some
> doing. At the very least a whole lot of spam got trained as ham.


Well, that's not necessarily true.  Another possibility is that the spam
message comes in but only a few of its tokens are also in the DB.
At that point Bayes has little to go on, and if the tokens it does find
in the DB are hammy, then the message is scored as ham.

For example:

Message has tokens a, b, c, d, ..., z.
Of those, Bayes DB has tokens a, c, z, which are statistically ham.
Therefore with the information available to Bayes, the Message is ham.
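
To make that concrete, here is a rough sketch in Python.  It is NOT
SpamAssassin's actual Bayes code (the real combining math is more
involved); it is a plain naive-Bayes product, and the token database,
its probabilities, and the function name are all made up for
illustration.  The only point is that tokens missing from the DB
contribute nothing, so three hammy tokens decide the whole result:

```python
import math

# Hypothetical token DB: token -> estimated P(spam | token).
# The values are invented; all three known tokens lean strongly ham.
TOKEN_DB = {"a": 0.05, "c": 0.10, "z": 0.02}


def spam_probability(message_tokens, token_db, neutral=0.5):
    """Combine per-token spam probabilities with a naive Bayes product.

    Tokens that are not in the DB are skipped entirely, so they have
    no influence on the result.  With no known tokens at all, the
    neutral value is returned.
    """
    known = [token_db[t] for t in message_tokens if t in token_db]
    if not known:
        return neutral
    # Work in log space: P = prod(p) / (prod(p) + prod(1 - p)).
    log_spam = sum(math.log(p) for p in known)
    log_ham = sum(math.log(1.0 - p) for p in known)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


# Message has tokens a, b, c, ..., z; only a, c and z are in the DB.
message = [chr(c) for c in range(ord("a"), ord("z") + 1)]
score = spam_probability(message, TOKEN_DB)
print(f"known tokens: a, c, z  ->  P(spam) = {score:.6f}")
```

The other 23 tokens could be as spammy as you like; since they were
never learned, the score still lands near zero.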


This could even account for "lots" of messages all being marked as ham
if there's no learning of the tokens going on in between receipt of
the messages.

In the end, running the message through "spamassassin -D bayes" is
likely the only way to debug what is going on, but even that probably
won't be much help if the DB has since changed (through learning, etc.),
because the result will no longer match what happened at delivery time.

-- 
Randomly Generated Tagline:
"I think Ultra Slimfast powered the SCUD missile." - Bob Lazarus
